Re: Recursively filter an array

Uri Guttman Thu, 02 Sep 2010 01:40:36 -0700

>>>>> "K" == Kryten  <kryte...@googlemail.com> writes:


  K> Here is what I have ( the least I can do is give you all a chance to
  K> laugh
  K> at my code! ):-

here comes the laughter! :)


  K> #!/usr/bin/perl
  K> use warnings ;

put use strict in there too. you are declaring some vars, strict
enforces that you declare all of them.

also i changed some names to use _ chars which makes them more readable.

  K> my $names_file = 'C:\names.log' ;
  K> my $exclude_list = 'C:\exclude.txt' ;
  K> my %names_hash ;

  K> open ( NAMES, $names_file ) ;
  K> open ( EXCLUDE, $exclude_list ) ;

always check open calls for success. use lexical handles and not global
ones.

  K> @allnames = <NAMES> ;
  K> @allexcludes = <EXCLUDE> ;

if you want speed, that is not the best way to read in the file
lines. File::Slurp (on cpan) can do that for you and is cleaner as well:

use File::Slurp ;

        my @all_names = read_file( $exclude_file ) ;
        my @all_excludes = read_file( $names_file ) ;

and you can chomp an array which is faster than doing it for each line:

        chomp @all_names ;
        chomp @all_excludes ;

  K> #---> Create the hash set #--->
  K> foreach $exclusion ( @allexcludes ) {
  K>    chomp( $exclusion ) ;
  K>    $names_hash{ $exclusion } = 1
  K> }

this may be more advanced than you want but you can use a slice for
this:


        my %excluded_names ;
        @excluded_names{ @all_excludes } = () ;

notice i didn't assign anything, all the entries will be undef. but they will
also i picked a better name. names_hash tells me nothing about the hash

  K> #---> Walk the names array #--->
  K> foreach $name ( @allnames ) {
  K>    chomp( $name ) ;
  K>    if ( ($dn) = $name =~ /(\d{5,6})/ ) {
  K>             if ( $names_hash{ $dn } ) {


        if ( exists( $excluded_names{ $dn } ) ) {
  K>                 print "This would be excluded: $name\n"
  K>             } # Close inner IF

useless comment. the indent tells you that. and any decent code editor
will handle checking of paired chars.

  K>     } # Close outer IF
  K> }

  K> This seems to work;  I'm aghast at how quickly it tears through the
  K> data.
  K> My Powershell script was taking 2-3 minutes. This method completes in
  K> about 4 seconds. Though to be fair I'm not using an associative array
  K> in
  K> my Powershell code, so it's not this same.

it isn't the language but the algorithm. a double loop as you likely
wrote in powershell is VERY slow. it would be N * M checks. with the
hash and one loop it is M times. so this is N times faster, easily
speeding up to 4 seconds vs 120-180 seconds.

  K> This is an order of magnitude better than what I am doing now, so
  K> big, big thanks and I'd be delighted for any pointers for how to
  K> make my code more .. perl'ish.

my improvements will make your code more solid and even do a little more
speedup.

uri

-- 
Uri Guttman  ------  u...@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------

-- 
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/

Re: Recursively filter an array

Reply via email to