Mr. Shawn H. Corey wrote:
> The fastest way to do this is to read every line into Perl and disregard
> everything not relevant.
Don't think so.
I ran a benchmark on a text file with 100,000 lines, where I'm actually
only interested in the last 5 lines. Reading every line and testing it
against a regex was hundreds of times slower than seek() or
File::ReadBackwards; only Tie::File, which proved to be awfully slow,
did worse.
Please see below.
C:\home>type test.pl
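use File::ReadBackwards;
use Tie::File;
use Benchmark 'cmpthese';

# Create a test file: 100,000 lines of 80 random alphanumeric characters.
open my $fh, '>', 'test.txt' or die $!;
for ( 1..100000 ) {
    print $fh
      join('', map { (0..9,'A'..'Z','a'..'z')[rand 62] } 1..80), "\n";
}
close $fh;

# Run each variant for at least 10 CPU seconds and compare the rates.
cmpthese -10, {
    # Read the last 5 lines with File::ReadBackwards.
    bw => sub {
        my $bw = File::ReadBackwards->new('test.txt') or die $!;
        for ( 1..5 ) { my $line = $bw->readline; $line =~ /abc/ }
    },
    # Jump to 500 bytes before the end of the file (whence 2 = SEEK_END).
    seek => sub {
        open my $fh, '<', 'test.txt' or die $!;
        seek $fh, -500, 2 or die $!;
        <$fh>;    # discard the first, probably partial, line
        while ( <$fh> ) { /abc/ }
    },
    # Index the last 5 lines through a tied array.
    tf => sub {
        tie my @file, 'Tie::File', 'test.txt' or die $!;
        for ( -5 .. -1 ) { $file[$_] =~ /abc/ }
    },
    # Read and test every line of the file.
    all_lines => sub {
        open my $fh, '<', 'test.txt' or die $!;
        while ( <$fh> ) { /abc/ }
    },
};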
C:\home>test.pl
             Rate        tf all_lines        bw      seek
tf        0.877/s        --      -87%     -100%     -100%
all_lines  6.54/s      646%        --     -100%     -100%
bw         1781/s   202999%    27116%        --      -74%
seek       6743/s   769060%   102971%      279%        --
C:\home>
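The seek variant above hard-codes 500 bytes, which is only enough because the test lines are about 80 characters long. As a rough sketch of how the same idea might be generalized, the sub below seeks back from the end in growing chunks until it has collected enough lines. The name last_lines and its $chunk parameter are my own illustration and were not part of the benchmark; only core Perl is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the benchmark above): return the last
# $n lines of $path by seeking back from the end of the file in growing
# chunks, so that long lines do not break the fixed-offset assumption.
sub last_lines {
    my ($path, $n, $chunk) = @_;
    $chunk ||= 500;                      # initial guess, as in the benchmark

    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $size = -s $fh;

    while (1) {
        # Start $chunk bytes before the end, or at the beginning if the
        # file is smaller than that.
        my $start = $size > $chunk ? $size - $chunk : 0;
        seek $fh, $start, 0 or die "Cannot seek in $path: $!";

        # Unless we are at the very beginning, the first line read is
        # probably partial, so throw it away.
        if ( $start > 0 ) { my $partial = <$fh>; }

        my @lines = <$fh>;
        if ( @lines >= $n or $start == 0 ) {
            # Keep only the last $n lines (or all of them, if fewer).
            splice @lines, 0, @lines - $n if @lines > $n;
            return @lines;
        }
        $chunk *= 2;                     # not enough lines yet; look further back
    }
}
```

It would be called like this: my @tail = last_lines('test.txt', 5); The returned lines keep their line endings. I have not benchmarked this sub, so the figures above apply only to the fixed-offset version.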
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl