Hi,

This is quite off topic, except that I'd do it on my Gentoo box before using R on the same machine. Please ignore if not of interest.
I've got a really big data file in essentially *.csv (comma-delimited) format. I need to scan this file and create a new output file. I'm wondering if there is a reasonably easy command-line way of doing this using something like sed or awk, which I know nothing about. Thanks in advance.

The basic idea goes something like this:

1) The input file might look like the following, where some of it is attributes (shown as letters) and other parts are results (shown as numbers):

   A,B,C,D,1
   E,F,G,H,2
   I,J,K,L,3
   M,N,O,P,4
   Q,R,S,T,5
   U,V,W,X,6

2) From the above input file I want to take the attributes from a few consecutive lines (say 3 in this example) and write them to the output file on one line, along with the result from the last of the 3 lines. The output file might look like this:

   A,B,C,D,E,F,G,H,I,J,K,L,3
   E,F,G,H,I,J,K,L,M,N,O,P,4
   I,J,K,L,M,N,O,P,Q,R,S,T,5
   M,N,O,P,Q,R,S,T,U,V,W,X,6

3) This must be done as a read/process/write operation of some sort because the input file may be far larger than system memory. (Currently it isn't, but it likely will be eventually.)

4) In my example above I suggested that there is a single result, but there may be more than one. (Don't know yet.) I showed 3 lines but might be doing 10. I don't know. It's important to me to pick a moderately flexible way of dealing with this, as the order of columns and number of results will likely change over time and I'll certainly need to adjust.

Thanks in advance for any pointers. Happy to buy a good book if someone knows what I should look for.

Cheers,
Mark
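
P.S. To make the requirement in 1) through 4) concrete, here is a rough sketch of the read/process/write loop I have in mind. It is written in Python rather than sed or awk, purely as an illustration. The function names (window_rows, convert) and the parameters (window, n_results) are made up for this example, and it assumes the result columns are always the last n_results columns of each line, with n_results at least 1. Only the most recent `window` lines are held in memory, so the size of the input file should not matter.

```python
import csv
from collections import deque


def window_rows(rows, window=3, n_results=1):
    """Yield one output row for every run of `window` consecutive input rows.

    Each output row holds the attribute columns of all rows in the window,
    followed by the result columns of the newest row in the window.
    Assumption: the last `n_results` columns (n_results >= 1) are results.
    """
    recent = deque(maxlen=window)  # keeps only the newest `window` rows
    for row in rows:
        recent.append(row)
        if len(recent) == window:
            out = []
            for r in recent:
                out.extend(r[:-n_results])  # attribute columns only
            out.extend(row[-n_results:])    # results from the newest row
            yield out


def convert(in_path, out_path, window=3, n_results=1):
    """Stream in_path to out_path one line at a time."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for out_row in window_rows(csv.reader(src), window, n_results):
            writer.writerow(out_row)
```

Something like convert("input.csv", "output.csv", window=10, n_results=2) would then cover the "10 lines, more than one result" case (the file names are placeholders). If the column order changes, the two slicing lines are the only part that would need adjusting.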