Re: matching values of one hash to another

Harry Putnam Thu, 29 Apr 2010 08:35:01 -0700

"John W. Krahn" <jwkr...@shaw.ca> writes:
>> I need to do some matching of filenames in two top level directories.
>> We expect to find a number of cases where the endnames ($_) are the
>> same in both hierarchies but the full name is different.
>>
>>   base1/my/file
>>   base2/my/different_path/file

This should have been clarified better as Uri has noted...
The match was to be the part in brackets against the other part in
brackets below:

   base2/my/different_path/[file]
   base1/my/[file]

so the hashes were to have these pairs:

   for base2:
    file => base2/my/different_path/file
or: ($_) =>   ($File::Find::name)  

   for base1:
    file => base1/my/file
or: ($_) => ($File::Find::name)  

And the matching would be 
  each  $value from base1 hash matched against
   all  $values from base2 hash.

Where a match occurs... extract that value from base2  

There may be several or none.  Further processing would follow on the
matches, and further processing again on multiple matches, that code
is not present yet.

I posted my effort at digging out the matches.

I see that the way you did it, even though it doesn't give the exact
results I was after, is a far better way to write it.

I'm curious though if the overhead is different in your compact code
compared to mine. That is, if all that spinning through dir2:
,----
|   my ($r1full,$r1end);
|  while (($r1full,$r1end) = each(%r1h)) {
|    foreach my $key (keys %r2h) {
`----

 is more costly than your compact example: 

,----
|     if ( exists $r2h{ $r1end } ) {
|        print "$r2h{$r1end} MATCHES $r1end\n";
`----

In other words is the perl interpreter working harder in one case?
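
A minimal sketch of the difference, assuming %r1h maps full path => end
name (as in the each() loop above) and %r2h maps end name => full path
(as in the exists test). The sample paths are made up for illustration.
The nested loop does one comparison per pair of entries, while the
exists test does one hash lookup per entry of %r1h:

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the real directories.
my %r1h = (
    'base1/my/file'  => 'file',
    'base1/my/other' => 'other',
    'base1/lonely'   => 'lonely',
);
my %r2h = (
    'file'  => 'base2/my/different_path/file',
    'other' => 'base2/x/other',
    'extra' => 'base2/extra',
);

# Approach 1: for every entry of %r1h, walk every key of %r2h.
my @scan_hits;
my $comparisons = 0;
for my $r1full ( sort keys %r1h ) {
    my $r1end = $r1h{$r1full};
    for my $key ( keys %r2h ) {
        $comparisons++;
        push @scan_hits, "$r2h{$key} MATCHES $r1full" if $key eq $r1end;
    }
}

# Approach 2: ask the hash directly whether the end name is a key.
my @lookup_hits;
my $lookups = 0;
for my $r1full ( sort keys %r1h ) {
    my $r1end = $r1h{$r1full};
    $lookups++;
    push @lookup_hits, "$r2h{$r1end} MATCHES $r1full"
        if exists $r2h{$r1end};
}

print "$_\n" for @lookup_hits;
print "nested scan: $comparisons comparisons, lookup: $lookups lookups\n";
```

Both approaches find the same two matches here, but the scan needs 9
comparisons against 3 lookups, and that gap grows with the size of both
hashes. Note that a hash keyed on end name can hold only one path per
name, so collecting several matches per name would need something like a
hash of arrays.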

[...]

Harry wrote:

>> #!/usr/local/bin/perl
>>   use strict;
>> use warnings;
>>
>> use File::Find;
>> use Cwd;
>>
>>  my $r1 = shift;
>>  my $r2 = shift;

John K. replied:

> ( my ( $r1, $r2 ) = @ARGV ) == 2
>     or die "usage: $0 dir1 dir2\n";

That is a nice compact way of putting it.  Thanks
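
The idiom works because a list assignment evaluated in scalar (here
numeric) context returns the number of elements on its right-hand side.
A small sketch of that behaviour, using a hypothetical helper
count_args() in place of @ARGV:

```perl
use strict;
use warnings;

# count_args() is a made-up helper; it stands in for the @ARGV check.
sub count_args {
    my @args = @_;
    # The list assignment in scalar context yields the number of
    # elements in @args, not the number of variables on the left.
    my $n = ( my ( $first, $second ) = @args );
    return $n;
}

print count_args('dir1'), "\n";
print count_args( 'dir1', 'dir2' ), "\n";
print count_args( 'dir1', 'dir2', 'dir3' ), "\n";
```

So the "== 2" test fails both when too few and when too many arguments
are given.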

[...]

Harry wrote:
>> find(
>>   sub {
>>     ## For use in guaranteeing the -f command uses the
>>     ## right path
>>     my $dir = getcwd;
>>       if (-f $dir . '/'. $_) {
>

John K. replied:
> The current working directory is already in $File::Find::dir
[...]
>         return unless -f;

What made me go with Cwd was tests I tried during tinkering where I
got the apparently wrong headed notion that $File::Find::dir didn't do
what I wanted.

After getting unexpected `undefined' errors, I did a less elaborate
and uglier test like this one (now trying to use (some) of your style of
writing a find() function):
------- 8< snip ---------- 8< snip ---------- 8<snip ------- 
cat ex1.pl
#!/usr/local/bin/perl

use strict;
use warnings;
use File::Find;
use Cwd;

my $topdir2recurse = shift;

## Trying to use JKs' notation

my $cnt1 = 0;
find  sub {
        if(-f $File::Find::name){
          $cnt1++;
        }
      },$topdir2recurse;

print "Finished 1st recurse, count was <$cnt1>\n";
# -------       -------       ---=---       -------      ------- 
my $cnt2 = 0;
find  sub {
       my $dir = getcwd;
       if(-f $dir . '/' . $_){
          $cnt2++;
       }
      },$topdir2recurse;

print "Finished 2nd recurse, count was <$cnt2>\n";

            __END__
-------        ---------       ---=---       ---------      -------- 
run ex1.pl ./dir1
  Finished 1st recurse, count was <0>
  Finished 2nd recurse, count was <625>
-------        ---------       ---=---       ---------      -------- 
I'm sure my test is invalid since you've shown that $File::Find::dir
is all I needed.  But I'm not sure I see why it works in one case but
not in this case.  (Something wrong with the example, no doubt.)

>>         ## Determine if base dir matches r1 or r2
>>         (my $base) = $File::Find::name =~ m/^(\.*\/*\/[^\/]+)\//;
>
> Instead of all the "leaning toothpicks" use a different delimiter:

>          (my $base) = $File::Find::name =~ m!^(\.*/*/[^/]+)/!;

Thanks... point taken.
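
A quick check that the two spellings are the same pattern, run against
a few made-up path names. Note that the pattern only matches names that
begin with dots and/or a slash, so a bare dir1/my/file gives no match:

```perl
use strict;
use warnings;

# Hypothetical sample names, not real output from find().
my @paths = ( './dir1/my/file', '/tmp/x/y', 'dir1/my/file' );
my @results;

for my $name (@paths) {
    my ($slashes) = $name =~ m/^(\.*\/*\/[^\/]+)\//;    # original form
    my ($bang)    = $name =~ m!^(\.*/*/[^/]+)/!;         # m!...! form
    push @results, [ $name, $slashes, $bang ];
    printf "%-16s => %s\n", $name, defined $bang ? $bang : '(no match)';
}
```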

[...] snipped great example code.

(
Note: There are apparently some typos or such in there that I'm trying
to figure out. As posted it doesn't actually work when fed 2 directories:

  jkex.pl ./dir1 ./dir2

  Global symbol "$dirs" requires explicit package name at ./jkex.pl
  line 25.
  Execution of ./jkex.pl aborted due to compilation errors.

And if I create a global symbol for $dirs (my $dirs;)

  jkex.pl ./dir1 ./dir2
 <after a few moments pause>

  Can't use an undefined value as an ARRAY reference at ./jkex.pl line
  26.
)

John offered:
> Perhaps you want something like:

[...] script snipped

Yes, once I get it working... and figure out what is actually going on
in that compact code it is the kind of code I'd like to be able to
dash off (and read) some day.

I don't see (yet) what is supposed to be happening here.  There is a
lot happening that isn't obvious to me.

       $data{ $_ }{ $r2 }++
   $data{ $_ } .. (ok, that's what I called the end name)

Are {} playing the role of regex delimiters in `{ $r2 }'?

Or is it the same as saying:
   $data{ $_ } eq $r2

I'm not sure what role the `++' plays there either.
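
For reference, in Perl the second pair of braces is another hash
subscript, not a regex delimiter or a comparison: $data{ $_ }{ $r2 } is
short for $data{ $_ }->{ $r2 }, an element of a hash stored inside
%data. The ++ adds one to that element, and Perl creates the inner hash
and the element on first use. The script itself is snipped above, so the
sketch below uses made-up end names and directory names purely to show
the mechanics:

```perl
use strict;
use warnings;

my %data;
my ( $r1, $r2 ) = ( 'dir1', 'dir2' );    # hypothetical top directories

# Each pair is (end name, top directory it was seen under).
for my $pair ( [ 'file', $r1 ], [ 'file', $r2 ],
               [ 'other', $r2 ], [ 'other', $r2 ] ) {
    my ( $endname, $base ) = @$pair;
    $data{$endname}{$base}++;    # count sightings per name per directory
}

for my $endname ( sort keys %data ) {
    for my $base ( sort keys %{ $data{$endname} } ) {
        print "$endname seen under $base: $data{$endname}{$base}\n";
    }
}
```

This prints a count of 1 for file under each of dir1 and dir2, and 2 for
other under dir2. An end name whose inner hash has keys for both
directories is one that occurs in both hierarchies.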

-- 
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
