>>>>> "K" == Kryten <kryte...@googlemail.com> writes:
K> Here is what I have ( the least I can do is give you all a chance to K> laugh K> at my code! ):- here comes the laughter! :) K> #!/usr/bin/perl K> use warnings ; put use strict in there too. you are declaring some vars, strict enforces that you declare all of them. also i changed some names to use _ chars which makes them more readable. K> my $names_file = 'C:\names.log' ; K> my $exclude_list = 'C:\exclude.txt' ; K> my %names_hash ; K> open ( NAMES, $names_file ) ; K> open ( EXCLUDE, $exclude_list ) ; always check open calls for success. use lexical handles and not global ones. K> @allnames = <NAMES> ; K> @allexcludes = <EXCLUDE> ; if you want speed, that is not the best way to read in the file lines. File::Slurp (on cpan) can do that for you and is cleaner as well: use File::Slurp ; my @all_names = read_file( $exclude_file ) ; my @all_excludes = read_file( $names_file ) ; and you can chomp an array which is faster than doing it for each line: chomp @all_names ; chomp @all_excludes ; K> #---> Create the hash set #---> K> foreach $exclusion ( @allexcludes ) { K> chomp( $exclusion ) ; K> $names_hash{ $exclusion } = 1 K> } this may be more advanced than you want but you can use a slice for this: my %excluded_names ; @excluded_names{ @all_excludes } = () ; notice i didn't assign anything, all the entries will be undef. but they will also i picked a better name. names_hash tells me nothing about the hash K> #---> Walk the names array #---> K> foreach $name ( @allnames ) { K> chomp( $name ) ; K> if ( ($dn) = $name =~ /(\d{5,6})/ ) { K> if ( $names_hash{ $dn } ) { if ( exists( $excluded_names{ $dn } ) ) { K> print "This would be excluded: $name\n" K> } # Close inner IF useless comment. the indent tells you that. and any decent code editor will handle checking of paired chars. K> } # Close outer IF K> } K> This seems to work; I'm aghast at how quickly it tears through the K> data. K> My Powershell script was taking 2-3 minutes. This method completes in K> about 4 seconds. Though to be fair I'm not using an associative array K> in K> my Powershell code, so it's not this same. it isn't the language but the algorithm. a double loop as you likely wrote in powershell is VERY slow. it would be N * M checks. with the hash and one loop it is M times. so this is N times faster, easily speeding up to 4 seconds vs 120-180 seconds. K> This is an order of magnitude better than what I am doing now, so K> big, big thanks and I'd be delighted for any pointers for how to K> make my code more .. perl'ish. my improvements will make your code more solid and even do a little more speedup. uri -- Uri Guttman ------ u...@stemsystems.com -------- http://www.sysarch.com -- ----- Perl Code Review , Architecture, Development, Training, Support ------ --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com --------- -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/