On Sat, Jul 11, 2009 at 03:57:26PM +0100, Penguin Lover Mick squawked:
> > > Hmm, I don't think it gets anywhere:
> > > =======================================
> > > cat test.xml | grep -v NaN | grep '<row>' | tr e ' ' | awk
> > > {'print "Q"$2"qcq"$3"qcq"$9"Q"'} | tr Q '"' | tr c ',' | tr q '"'" >
> > > test.csv
> > >
> > > =======================================
> > >
> > > It just sits there at the > cursor.  I think it needs something more to
> > > it, or
> >
> > Looks like a syntax error with improperly nested quotations marks.
> >
> > The last command in the sequence, which reads
> >
> >   tr q '"'"
> >
> > try replacing that with
> >
> >   tr q '"'
> >
> > (remove the final double quote)
> >
> > W
> 
> Thank you both!  It works to a point.  This is what the xml file contains:
> 
>  <database>
>                         <!-- 2009-07-02 07:41:00 EDT / 1246534860 --> 
> <row><v> 
> 7.3395000000e+01 </v><v> 4.7990000000e+01 </v></row>
> 
> The CSV file only shows the first value and then it does not pick up the fact 
> that it is exponential:
> 
> "2009-07-02","07:41:00","7.3395000000"
> 
> How could it be tweaked to a)account for e+01, b)include additional value 
> fields?
> 

Try:

cat test.xml | grep -v NaN | grep '<row>' | \
  awk '{print "Q"$2"qcq"$3"qcq"$9"qcq"$11"Q"}' | \
  tr Q '"' | tr c ',' | tr q '"' > test.csv

Just to help you help yourself later: the 'tr' command "translates".
So the command 

  tr e ' '

replaces every occurrence of the letter e with a blank space, which is
what cut the exponent off your numbers. Removing that command keeps
the e in the numbers. (Though I am not certain how CSV files deal
with e notation...) awk splits each line into whitespace-separated
fields; $2, $3, etc. refer to the second field, the third field, and
so on. So adding $11 prints the one additional value.
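
To see the field numbering for yourself, here is a small sketch. The
sample line is my assumption of what one row of your file looks like
once the comment and the <row> sit on a single line; adjust it if your
real layout differs. The last command is optional and only matters if
whatever reads the CSV turns out not to accept e notation.

```shell
# Assumed sample row: comment and <row> on one line, as grep '<row>' needs.
line='<!-- 2009-07-02 07:41:00 EDT / 1246534860 --> <row><v> 7.3395000000e+01 </v><v> 4.7990000000e+01 </v></row>'

# tr turns every 'e' into a space, so each number splits into two fields.
printf '%s\n' "$line" | tr e ' '

# Without tr, the full numbers are fields 9 and 11.
printf '%s\n' "$line" | awk '{ print $2, $3, $9, $11 }'

# Optional: let awk convert e notation to plain decimals.
printf '%s\n' "$line" | awk '{ printf "%.4f,%.4f\n", $9, $11 }'
```
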

This, of course, only works if you have the same number of values in
each row. 
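
If the number of values does vary from row to row, something along
these lines might cope. It is only a sketch: rows_to_csv is a name I
made up, and it assumes the values always sit in every second field
starting at field 9, as in your sample. It also does the quoting
inside awk, so the tr Q/q/c steps are not needed.

```shell
# Hypothetical helper: reads XML lines on stdin, writes CSV rows on stdout.
rows_to_csv() {
    awk '/<row>/ && !/NaN/ {
        # Date and time are fields 2 and 3 of the comment.
        out = "\"" $2 "\",\"" $3 "\""
        # Assumption: values are fields 9, 11, 13, ... up to the end of the line.
        for (i = 9; i <= NF; i += 2)
            out = out ",\"" $i "\""
        print out
    }'
}

# Try it on a made-up row with three values.
printf '%s\n' '<!-- 2009-07-02 07:41:00 EDT / 1246534860 --> <row><v> 1.0e+00 </v><v> 2.0e+00 </v><v> 3.0e+00 </v></row>' | rows_to_csv
```

On the real file that would be run as: rows_to_csv < test.xml > test.csv
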

HTH, 

W
-- 
A gossip is someone with a great sense of rumour.
Sortir en Pantoufles: up 947 days, 22:39
