On Wed, 2008-09-24 at 03:17 -0700, John W. Krahn wrote:
> [EMAIL PROTECTED] wrote:
> > Hi,
>
> Hello,
>
> > We receive a text file with the following entries.
> >
> > "000001","item1","apple one","apple two","apple three"
> > "000002","item2","body one","body two","body three"
> > "000003","item2","body one","body two","body three"
> > "000004","item2","body one","body two","body three"
> > "000005","item1","orange one","orange two","orange three"
> > "000006","item2","body one","body two","body three"
> > "000007","item2","body one","body two","body three"
> > "000008","item2","body one","body two","body three"
> > "000009","item2","body one","body two","body three"
> > "000010","item2","body one","body two","body three"
> >
> > How do I use perl to convert the above to the following? I'm a novice
> > perl user.
> >
> > "apple three","body one","body two","body three"
> > "apple three","body one","body two","body three"
> > "apple three","body one","body two","body three"
> > "orange three","body one","body two","body three"
> > "orange three","body one","body two","body three"
> > "orange three","body one","body two","body three"
> > "orange three","body one","body two","body three"
> > "orange three","body one","body two","body three"
>
> $ echo '"000001","item1","apple one","apple two","apple three"
> "000002","item2","body one","body two","body three"
> "000003","item2","body one","body two","body three"
> "000004","item2","body one","body two","body three"
> "000005","item1","orange one","orange two","orange three"
> "000006","item2","body one","body two","body three"
> "000007","item2","body one","body two","body three"
> "000008","item2","body one","body two","body three"
> "000009","item2","body one","body two","body three"
> "000010","item2","body one","body two","body three"' | \
> perl -lne'
>     my @data = split /,/;
>     if ( $data[ 1 ] eq q/"item1"/ ) {
>         $field1 = $data[ 4 ];
>         }
>     elsif ( $data[ 1 ] eq q/"item2"/ ) {
>         print join q/,/, $field1, @data[ 2, 3, 4 ];
>         }
>     '
> "apple three","body one","body two","body three"
> "apple three","body one","body two","body three"
> "apple three","body one","body two","body three"
> "orange three","body one","body two","body three"
> "orange three","body one","body two","body three"
> "orange three","body one","body two","body three"
> "orange three","body one","body two","body three"
> "orange three","body one","body two","body three"

You have two problems here: 1) parsing the CSV file and 2) rearranging
the data.

There is no standard for CSV files. There is a MIME type; its
definition is available at http://tools.ietf.org/html/rfc4180

Data can be categorized by how it has to be parsed. The simplest is
context-free data. An example is tab-separated values (TSV). These can
be parsed using only regular expressions.

The next most complex type is bounded-recursive contexts. An example is
CSV. These require a finite-state automaton (FSA). FSAs are also called
state machines.

The most complex are unbounded-recursive contexts. An example is the
algebra expressions you learnt in high school. These require an FSA
with a push-down stack.

To create a state machine for CSV:

1. Identify all the contexts.
2. Identify all the symbols in each context.
3. Identify all the transitions from one context to another.
4. Identify the start and end states.
5. Create the code to implement the state machine.

Or download an appropriate module from CPAN http://search.cpan.org/
that does all this for you. And no, I'm not going to recommend one
because CSV is not standardized. You'll have to decide which one fits
your needs.

As for your second problem, I think John has answered it.

-- 
Just my 0.00000002 million dollars worth,
Shawn

Linux is obsolete.
-- Andrew Tanenbaum
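
P.S. A rough sketch of how the five steps above might turn into Perl,
for anyone who wants to see the shape of it. It is only an
illustration under some assumptions: RFC 4180-style quoting (a doubled
quote inside a quoted field means one literal quote), no newlines
embedded in fields, and every "item2" line preceded by an "item1"
line. The names parse_csv_line, quote_field and regroup are made up
for this example; they do not come from any CPAN module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one CSV record into fields with a small
# state machine. Contexts (states): START, UNQUOTED, QUOTED, QUOTE_SEEN.
sub parse_csv_line {
    my ($line) = @_;
    my @fields;
    my $field = '';
    my $state = 'START';    # start state: beginning of a field

    for my $ch ( split //, $line ) {
        if ( $state eq 'START' ) {
            if    ( $ch eq '"' ) { $state = 'QUOTED' }
            elsif ( $ch eq ',' ) { push @fields, $field; $field = '' }
            else                 { $field .= $ch; $state = 'UNQUOTED' }
        }
        elsif ( $state eq 'UNQUOTED' ) {
            if ( $ch eq ',' ) { push @fields, $field; $field = ''; $state = 'START' }
            else              { $field .= $ch }
        }
        elsif ( $state eq 'QUOTED' ) {
            if ( $ch eq '"' ) { $state = 'QUOTE_SEEN' }    # end of field, or ""?
            else              { $field .= $ch }            # commas are data here
        }
        else {    # QUOTE_SEEN: the previous character was a quote inside quotes
            if    ( $ch eq '"' ) { $field .= '"'; $state = 'QUOTED' }
            elsif ( $ch eq ',' ) { push @fields, $field; $field = ''; $state = 'START' }
            else  { die "unexpected character '$ch' after closing quote\n" }
        }
    }

    # Every state except QUOTED is an acceptable end state.
    die "unterminated quoted field\n" if $state eq 'QUOTED';
    push @fields, $field;
    return @fields;
}

# Hypothetical helper: put the quotes back on output.
sub quote_field {
    my ($text) = @_;
    $text =~ s/"/""/g;
    return qq("$text");
}

# The rearranging step: remember the fifth field of each "item1" record
# and prepend it to the body fields of the "item2" records that follow.
sub regroup {
    my @out;
    my $label;
    for my $record (@_) {
        my @f = parse_csv_line($record);
        next unless @f >= 5;    # assumption: skip short records
        if ( $f[1] eq 'item1' ) {
            $label = $f[4];
        }
        elsif ( $f[1] eq 'item2' ) {
            push @out, join ',', map { quote_field($_) } $label, @f[ 2 .. 4 ];
        }
    }
    return @out;
}

my @sample = (
    '"000001","item1","apple one","apple two","apple three"',
    '"000002","item2","body one","body two","body three"',
    '"000005","item1","orange one","orange two","orange three"',
    '"000006","item2","body one","body two","body three"',
);
print "$_\n" for regroup(@sample);
```

Unlike a plain split /,/, this keeps a comma inside a quoted field as
part of the field. For real data a CPAN module is still likely the
safer choice, since it will cover cases this sketch ignores.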