On Sun, Feb 22, 2009 at 4:57 PM, Willie Wong <ww...@princeton.edu> wrote:
> On Sun, Feb 22, 2009 at 03:15:09PM -0800, Penguin Lover Mark Knecht squawked:
>> 1) My actual input data starts with two fields which are date & time.
>> For lines 2 & 3 I need to exclude the 2nd & 3rd date & time from the
>> output corresponding to line 1, so these 3 lines:
>>
>> Date1,Time1,A,B,C,D,0
>> Date2,Time2,E,F,G,H,1
>> Date3,Time3,I,J,K,L,2
>>
>> should generate
>>
>> Date1,Time1,A,B,C,D,E,F,G,H,I,J,K,L,2
>>
>> Essentially Date & Time from line 1, results from line 3.
>>
>> 2) The second is that possibly I don't need attribute G in my output
>> file. I'm thinking that possibly a 3rd sed script could count a
>> certain number of commas and then not copy up through the next
>> comma? That's messy in the sense that I probably need to drop 10-15
>> columns, as my real data is maybe 100 fields wide, so I'd have 10-15
>> additional scripts, which is too much of a hack to be maintainable.
>> Anyway, I appreciate the ideas. What you sent worked great.
>>
>
> For both of these cases, since you are dropping columns and not
> re-organizing, you'd have a much easier time just piping the output
> through "cut". Try 'man cut' (it is only a few hundred words) for
> usage. With the sample you gave me, you just need to post-process
> with
>
> .... | cut -d , -f 1-6,9,10,12,15-
>
> and the Date2, Time2, G, Date3, Time3 columns will be dropped.

Thanks. I'll investigate that tomorrow.
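Willie's suggestion can be sketched end-to-end. The earlier sed one-liner isn't shown in this message, so the join step below is an assumption: it folds each group of three rows into one line and strips the per-row trailing counter, which makes the field positions line up with the cut list above (the filename `sample.csv` is made up for the example, and the sed syntax assumes GNU sed):

```shell
# Sample of the three-rows-per-record input described above.
printf '%s\n' \
  'Date1,Time1,A,B,C,D,0' \
  'Date2,Time2,E,F,G,H,1' \
  'Date3,Time3,I,J,K,L,2' > sample.csv

# Join every 3 lines into one: 'N;N' pulls the next two lines into the
# pattern space, then the substitution turns ",<counter>\n" into a
# single comma. The third row's counter survives because it has no
# embedded newline after it. cut then keeps Date1/Time1 plus the data
# fields, dropping Date2, Time2, G, Date3, and Time3.
sed 'N;N;s/,[0-9]*\n/,/g' sample.csv | cut -d , -f 1-6,9,10,12,15-
# -> Date1,Time1,A,B,C,D,E,F,H,I,J,K,L,2
```

For the real 100-field files, the same single cut field list replaces the 10-15 per-column sed scripts Mark was worried about.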

>
> As to your problem with the first two lines being mangled: I suspect
> that the first two lines were formatted differently? Maybe stray
> control characters got into your file or maybe there are leading
> spaces? It's bizarre for both Etaoin's and my scripts to
> coincidentally mess up the same lines.
>
> (Incidentally, where did you get the csv files from? When I worked in
> a physics lab and collected data, I found that a lot of the
> processing of data using basic command-line tools like sed, bash,
> perl, and bc could be done much more quickly if the initial datasets
> were formatted in a sensible fashion. Of course there are times when
> such luxury cannot be afforded.)

They are primarily coming from TradeStation. The data that I'm
working with is stock pricing data along with technical indicators
coming off of charts. Unfortunately I don't seem to have any control
at all over the order in which the columns show up. It doesn't seem to
be based on how I build the chart, and certain things on the chart I
don't need are still output to the file. It's pretty much take 100% of
what's on the chart or take nothing.

Fortunately the csv files are very good in terms of not dropping out
data. At least every row has all the data.

Cheers,
Mark

>
> Best,
>
> W
> --
> "What's the Lagrangian for a suction dart?"
> ~DeathMech, Some Student. P-town PHY 205
> Sortir en Pantoufles: up 807 days, 23:29
>
>
