Hi,
If this question would be better posted to another perl list, please let me know.
I have a very large text file (~2 GB) in the following format:
header line
header line
header line
marker 1
header line
header line
header line
marker 2
line type 1
line type 1
line type 1
...
line type 1
line type 2
line type 2
line type 2
...
line type 2
end of file marker line
My objective is to write all "line type 1" lines to file1.txt and all "line type 2" lines to file2.txt. The "header line" lines and the marker lines should not appear in either file1.txt or file2.txt. Note that there is no marker line between where line type 1 ends and where line type 2 starts, but the boundary can be determined by examining a field in each line.
I have a script to do this. Essentially, it visits each line in the file and decides which output file to write it to. The problem is that it takes a long time to run (roughly 45 min on a dual P4 with 512 RAM), and I'd like to cut this running time down as much as possible. What I'm looking for is either a better way to do this in Perl, or techniques I could use to speed up my current script. I have pasted the relevant parts of the script below. I noticed I could shave a bit off the runtime by reading the original file in a buffered manner instead of line by line. At this point, my output to file1.txt and file2.txt is done with prints to their respective file handles.
It's around 1:30 in the morning here, so I haven't looked very closely at your code, but...
I'd write a simple script to read your input file and write it alternately ($. % 2) to two output files. Benchmark it with system buffering turned on for both the input and output files, with your buffering technique, no buffering, etc. (I'm guessing system buffering should be fastest). If there is no significant increase in speed, then the problem is likely in the program logic somewhere. I'll leave that to someone who is still awake ;-)
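Something along these lines would do as a baseline test. It is only a rough sketch, not a tuned benchmark: the sub name split_alternating and the three file names on the command line are placeholders made up for illustration, and turning autoflush on is just one way to approximate "no output buffering". It does not include your own read-buffering technique, which would need to be timed the same way.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;                 # provides the autoflush method on filehandles
use Time::HiRes qw(time);       # sub-second wall-clock timing

# Copy $in to two files, sending odd-numbered lines to $out1 and
# even-numbered lines to $out2. With $unbuffered true, every print is
# flushed immediately, which approximates "no output buffering".
# Returns the number of lines read.
# (split_alternating is a hypothetical helper name, not from the original script.)
sub split_alternating {
    my ($in, $out1, $out2, $unbuffered) = @_;

    open my $src, '<', $in   or die "Cannot read $in: $!";
    open my $fh1, '>', $out1 or die "Cannot write $out1: $!";
    open my $fh2, '>', $out2 or die "Cannot write $out2: $!";

    if ($unbuffered) {
        $fh1->autoflush(1);
        $fh2->autoflush(1);
    }

    my $count = 0;
    while (my $line = <$src>) {
        # $. holds the current line number of the filehandle last read
        if ($. % 2) { print {$fh1} $line; }
        else        { print {$fh2} $line; }
        $count++;
    }

    close $src;
    close $fh1 or die "Cannot close $out1: $!";
    close $fh2 or die "Cannot close $out2: $!";
    return $count;
}

# Usage: perl split_test.pl input.txt out1.txt out2.txt
# Times the same copy twice: default buffering, then autoflush on.
if (@ARGV == 3) {
    my ($in, $out1, $out2) = @ARGV;
    for my $unbuffered (0, 1) {
        my $start = time();
        my $lines = split_alternating($in, $out1, $out2, $unbuffered);
        printf "%s: %d lines in %.2f s\n",
            ($unbuffered ? 'autoflush on' : 'default buffering'),
            $lines, time() - $start;
    }
}
```

Run each variant more than once, since the operating system's file cache can make whichever run comes second look faster than it really is. If this plain copy finishes in a small fraction of your 45 minutes, the time is going into the per-line logic rather than the I/O.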
Regards, Randy.