Re: opening a big file

Gunnar Hjalmarsson Mon, 21 Apr 2008 08:18:44 -0700

Mr. Shawn H. Corey wrote:

The fastest way to do this is to read every line into Perl and disregard everything not relevant.


Don't think so.

I ran a benchmark on a text file with 100,000 lines, where I'm actually only interested in the last 5 lines. Except for Tie::File, which proved to be awfully slow, reading and testing every line against a regex took hundreds of times longer than seek() or File::ReadBackwards. Please see below.


C:\home>type test.pl
use File::ReadBackwards;
use Tie::File;
use Benchmark 'cmpthese';

open my $fh, '>', 'test.txt' or die $!;
for ( 1..100000 ) {
    print $fh
      join('', map { (0..9,'A'..'Z','a'..'z')[rand 62] } 1..80), "\n";
}
close $fh;

cmpthese -10, {
    bw        => sub {
        my $bw = File::ReadBackwards->new('test.txt') or die $!;
        for ( 1..5 ) { my $line = $bw->readline; $line =~ /abc/ }
    },
    seek      => sub {
        open my $fh, '<', 'test.txt' or die $!;
        seek $fh, -500, 2 or die $!;
        <$fh>;
        while ( <$fh> ) { /abc/ }
    },
    tf        => sub {
        tie my @file, 'Tie::File', 'test.txt' or die $!;
        for ( -5 .. -1 ) { $file[$_] =~ /abc/ }
    },
    all_lines => sub {
        open my $fh, '<', 'test.txt' or die $!;
        while ( <$fh> ) { /abc/ }
    },
};

C:\home>test.pl
             Rate        tf all_lines        bw      seek
tf        0.877/s        --      -87%     -100%     -100%
all_lines  6.54/s      646%        --     -100%     -100%
bw         1781/s   202999%    27116%        --      -74%
seek       6743/s   769060%   102971%      279%        --

C:\home>
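
The seek() variant above hardcodes 500 bytes, which works here because every line is 81 bytes long. With a shorter file, seeking 500 bytes back from the end fails, and with longer lines the chunk may hold fewer than 5 complete lines. A minimal sketch of a more general version follows; tail_lines() and its parameters are names made up for illustration, not part of the benchmark or of any module, and it assumes the file is not being written to while it is read.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last $n lines of $path while reading
# only a chunk from the end of the file. $chunk is a first guess at how
# many bytes cover $n lines; if the guess is too small, it is doubled
# and the read is retried.
sub tail_lines {
    my ( $path, $n, $chunk ) = @_;
    $chunk ||= 4096;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $size = -s $fh;

    while (1) {
        # Never seek to before the start of the file.
        my $start = $size > $chunk ? $size - $chunk : 0;
        seek $fh, $start, 0 or die "Cannot seek in $path: $!";

        # Unless we are at the very start, the first line read is
        # probably a fragment, so discard it.
        if ( $start > 0 ) { my $discard = <$fh>; }

        my @lines = <$fh>;
        return @lines[ -$n .. -1 ] if @lines >= $n;   # enough lines
        return @lines if $start == 0;                 # whole file read
        $chunk *= 2;                                  # try a larger chunk
    }
}
```

Calling print tail_lines('test.txt', 5); should print the same 5 lines that the bw and seek subs test. Doubling the chunk keeps the common case to a single small read, and the worst case falls back to reading the whole file.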

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
