Hello List,

I am working on a script to parse large files. By large I mean 4 million+
lines, and when splitting on the delimiter ( ; ) there are close to 300
fields per record, but I am only interested in the first 44.

I have begun testing to see how fast the file can be read in a few
different scenarios:

while( <> ) {
}

It only takes about 6 seconds to read 4,112,220 lines.
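For concreteness, these are rough wall-clock numbers. The kind of timing
harness I have in mind is something like the following (just a sketch,
with data.txt standing in for the real input file):

#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Time the read-only loop; data.txt is a placeholder name.
my $start = time();
open my $fh, '<', 'data.txt' or die "Cannot open data.txt: $!";
my $lines = 0;
while (<$fh>) {
    $lines++;
}
close $fh;
printf "Read %d lines in %.2f seconds\n", $lines, time() - $start;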

But when I introduce split, like this:

while (<>) {
    chomp($_);
    my @tokens = split( ";", $_ );
}

It takes around 7 minutes to reach eof.

I also tried using a LIMIT on split, as shown below. That helped greatly,
taking only a little over 1 minute, but I am curious whether there is a way
to improve the read time further, or whether this is a reasonable time for a
file of this size.

while (<>) {
    chomp($_);
    my @tokens = split( ";", $_, 44 );
}
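
One thing I realized about the LIMIT version: with a LIMIT of 44, the 44th
element of @tokens still carries the entire unsplit remainder of the record,
so to get exactly the first 44 clean fields I presumably need a LIMIT of 45
plus a slice, something like:

while (<>) {
    chomp($_);
    # LIMIT of 45: elements 0..43 are the first 44 fields; the 45th
    # element (the unsplit rest of the record) is dropped by the slice.
    my @tokens = ( split( ";", $_, 45 ) )[ 0 .. 43 ];
}

I do not know whether that changes the timing much, but it keeps the fields
clean.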

Thank you,

Chris
