Mr. Shawn H. Corey wrote:
> The fastest way to do this is to read every line into Perl and disregard everything not relevant.

Don't think so.

I ran a benchmark on a text file with 100,000 lines, where I'm actually only interested in the last 5 lines. Tie::File proved to be awfully slow, even slower than reading the whole file. Apart from that, reading and testing every line against a regex took hundreds of times longer than using seek() or File::ReadBackwards. Please see below.

C:\home>type test.pl
use File::ReadBackwards;
use Tie::File;
use Benchmark 'cmpthese';

# Build the test file: 100,000 lines of 80 random alphanumeric characters
open my $fh, '>', 'test.txt' or die $!;
for ( 1..100000 ) {
    print $fh
      join('', map { (0..9,'A'..'Z','a'..'z')[rand 62] } 1..80), "\n";
}
close $fh;

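# Run each candidate for at least 10 CPU seconds and compare the rates
cmpthese -10, {
    # Read the last 5 lines with File::ReadBackwards
    bw        => sub {
        my $bw = File::ReadBackwards->new('test.txt') or die $!;
        for ( 1..5 ) { my $line = $bw->readline; $line =~ /abc/ }
    },
    # Jump to 500 bytes before the end of the file (whence 2 = SEEK_END),
    # discard the first, probably partial, line and read the rest
    seek      => sub {
        open my $fh, '<', 'test.txt' or die $!;
        seek $fh, -500, 2 or die $!;
        <$fh>;
        while ( <$fh> ) { /abc/ }
    },
    # Access the last 5 lines through a tied array
    tf        => sub {
        tie my @file, 'Tie::File', 'test.txt' or die $!;
        for ( -5 .. -1 ) { $file[$_] =~ /abc/ }
    },
    # Read and test every line in the file
    all_lines => sub {
        open my $fh, '<', 'test.txt' or die $!;
        while ( <$fh> ) { /abc/ }
    },
};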

C:\home>test.pl
             Rate        tf all_lines        bw      seek
tf        0.877/s        --      -87%     -100%     -100%
all_lines  6.54/s      646%        --     -100%     -100%
bw         1781/s   202999%    27116%        --      -74%
seek       6743/s   769060%   102971%      279%        --

C:\home>
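
For what it's worth, the seek() approach could be wrapped up roughly as follows. This is only a sketch and was not part of the benchmark: last_lines() and its $max_len parameter are names made up for illustration, and it assumes that no line is longer than $max_len bytes, line ending included. Unlike the hard-coded -500 above, it falls back to reading from the start when the file is shorter than the seek offset.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);

# Hypothetical helper: return the last $n lines of $path.
# Assumption: no line is longer than $max_len bytes (line ending included).
sub last_lines {
    my ($path, $n, $max_len) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";

    my $size   = -s $fh;
    my $offset = ($n + 1) * $max_len;    # one extra line as a safety margin

    if ($offset < $size) {
        # Jump close to the end, then discard the first, probably partial, line.
        seek $fh, -$offset, SEEK_END or die "Cannot seek in $path: $!";
        <$fh>;
    }
    # Otherwise the file is small enough to be read from the start.

    my @lines = <$fh>;
    close $fh;

    # Keep only the last $n lines if more were read.
    splice @lines, 0, @lines - $n if @lines > $n;
    return @lines;
}

# Usage with the test file from the benchmark, if it exists
print last_lines('test.txt', 5, 82) if -e 'test.txt';
```

The margin of one extra line is there because the line discarded after the seek may turn out to be a complete one. If the length assumption does not hold, fewer than $n lines may come back, and File::ReadBackwards is probably the safer choice.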

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
