>>>>> "AY" == Alan Young <[EMAIL PROTECTED]> writes:
  AY> I know, replying to myself.

  AY> Parsing the KJV Bible took about 7 seconds with this:

  AY> #!/usr/bin/perl -w
  AY> use strict;

  AY> my $text = do {
  AY>   open my $T, '<./kjv10.txt' or die "Couldn't open kjv10.txt: $!\n";
  AY>   local $/;
  AY>   <$T>;
  AY> };

	use File::Slurp ;
	my $text = read_file( 'kjv10.txt' ) ;

much faster that way.

  AY> my %unique;

  AY> $text =~ s{(
  AY>             (\b\w+(?:['-]+\w+)*\b)

why the multiple ['-] inside the words? could those chars ever begin or
end words? if not, just [\w'-]+ should be fine there.

  AY>             (??{!$unique{$^N}++?"(?=)":"(?!)"})

i am not sure why you do that boolean trick there. i have seen it before
(and actually use it somewhere) but what is its purpose here?

  AY>            )
  AY>           }{
  AY>             $1

since you just replace the word with itself, why use s///? m// will get
the same results and should be much faster.

  AY>           }xg;

  AY> print "$_ => $unique{$_}\n" for sort keys %unique;

if you want raw speed, that makes lots of calls to print, which is very
slow as it needs to invoke stdio code for each call. this should be
faster (even with the extra ram usage):

	print map "$_ => $unique{$_}\n", sort keys %unique;

i am curious how much faster it will run with all those changes. :)

uri

-- 
Uri Guttman  ------  [EMAIL PROTECTED]  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
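
for reference, here is a minimal sketch of how the suggestions above might fit together: m//g instead of s///, the simplified [\w'-]+ class, and a single print. the sub names count_words and format_counts are made up for illustration, and the file is slurped with core perl so the sketch runs without File::Slurp. note that [\w'-]+ is not equivalent to the original pattern: it also counts words with leading or trailing quotes and bare runs like "--", so the totals may differ. i have not timed it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count each distinct word in a string. Uses the simplified [\w'-]+
# class, so runs of ' or - on their own are also counted as "words".
sub count_words {
    my ($text) = @_;
    my %unique;
    # m//g in scalar context walks the string one match at a time;
    # nothing is substituted, so the text is never rewritten.
    $unique{$1}++ while $text =~ /([\w'-]+)/g;
    return \%unique;
}

# Build the whole report as one string so print is called only once.
sub format_counts {
    my ($counts) = @_;
    return join '', map { "$_ => $counts->{$_}\n" } sort keys %$counts;
}

# Slurp the file named on the command line (for example kjv10.txt).
if (@ARGV) {
    my $text = do {
        open my $fh, '<', $ARGV[0] or die "Couldn't open $ARGV[0]: $!\n";
        local $/;    # undef the record separator: read the file in one go
        <$fh>;
    };
    print format_counts(count_words($text));
}
```

run it as `perl count.pl kjv10.txt` (the script name is arbitrary) and compare the timing against the s/// version.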