RE: Grep in Perl

King, Jason G Wed, 18 Dec 2002 15:29:35 -0800

Kenneth writes..

>I have two files, FILE1 and FILE2. Each of the two files is an array of
>numbers. I want to check for the occurrence of each element of FILE1
>in FILE2. In other words, FILE2 is checked for the occurrence of each
>of the elements of FILE1. For any element of FILE1 that occurs in
>FILE2, this element is saved in a file, FILE3.txt.


Very simple.

>I tried the attached Perl script to perform this exercise but the
>script failed with the following error message:
>
>Name "main::database" used only once: possible typo at ./GREP1.pl line 8.

That's just a warning about the @database array. Your real problem is in
your 'grep' statement. Code shown below with correction:

>Kindly assist me with a solution to actualising this task.
>
>#!/usr/bin/perl -w
>
>open FILE1, "</tmp/AAA.txt" or die "Can't open AAA.txt: $!";
>open FILE2, "</tmp/DATA.dat" or die "Can't open DATA.dat: $!";
>open OUT, ">>/tmp/FILE3.txt" or die "Can't open FILE3.txt: $!";
>
>@numbers = <FILE1>;
>@database = <FILE2>;
>close(FILE1);
>
>       foreach $number (@numbers) {
>               chomp($number);

                # $number was chomped above, so put the newline back on output
                print OUT "$number\n" if grep { $_ == $number } @database;

>}
>close(FILE2);
>close(OUT);


But that's messy. This is how I'd code your solution:

  use warnings;
  use strict;

  open IN, '/tmp/AAA.txt' or die "Bad open: $!";
  open DB, '/tmp/DATA.dat' or die "Bad open: $!";
  open OUT, '>>/tmp/FILE3.txt' or die "Bad open: $!";

  chomp( my @numbers = <IN>);  # chomp each element as we read them in
  chomp( my @database = <DB>);

  # here we take a little detour from your method
  my %db_lookup;
  @db_lookup{ @database } = ();

  for( @numbers )
  {
     # the numbers were chomped on input, so add the newline back
     print OUT "$_\n" if exists $db_lookup{$_};
  }

  close OUT or die "Bad write: $!";
  __END__

In my suggested solution you use Perl's VERY fast internal hash lookup
code to do your comparisons instead of using grep to go through the
whole list each time. The only really tricky thing here for people who
are unfamiliar with this syntax is:

  @db_lookup{ @database } = ();

All this does is set up a whole heap of keys in the %db_lookup hash in
one pass. It looks like we've got an array called @db_lookup, but in
fact this is a hash slice: Perl recognises it by the {} braces, so it
knows it's a hash and that we're setting multiple keys at once. The
values are all left undefined, which is fine because we only ever test
the keys with 'exists'.
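
To make the slice concrete, here is a small self-contained sketch. The
sample values and the %by_loop and @found names are made up for
illustration; only the slice syntax itself comes from the code above.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of DATA.dat.
my @database = qw(10 20 30);

my %db_lookup;
@db_lookup{ @database } = ();   # hash slice: creates keys 10, 20, 30 with undef values

# The slice above has the same effect as this explicit loop:
my %by_loop;
$by_loop{$_} = undef for @database;

# 'exists' tests for the key itself, so the undef values do not matter.
my @found = grep { exists $db_lookup{$_} } qw(20 25 30);
print "@found\n";   # prints "20 30"
```

The slice and the loop build the same hash; the slice just does it in a
single statement.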

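You'll find that this is probably the fastest way to do what you want.
The exception is when both your lists are sorted and VERY large, in
which case shelling out to a 'diff' might be quicker. For most
situations, though, the hash approach above is best. Bear in mind that
hash keys are compared as strings, so '007' will not match '7' the way
it does with ==.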
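
If you want to check the speed difference on your own data, the core
Benchmark module can compare the two approaches. This is only a sketch:
the data is made up, the sub names with_grep and with_hash are mine, and
the timings will vary with your machine and the size of your lists.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up data: 2000 "database" entries (even numbers) and 200 numbers to look up.
my @database = map { $_ * 2 } 1 .. 2000;
my @numbers  = 1 .. 200;

# Approach 1: scan the whole list with grep for every number.
sub with_grep {
    my @hits;
    for my $n (@numbers) {
        push @hits, $n if grep { $_ == $n } @database;
    }
    return @hits;
}

# Approach 2: build the lookup hash once, then test each number against it.
sub with_hash {
    my %db_lookup;
    @db_lookup{ @database } = ();
    return grep { exists $db_lookup{$_} } @numbers;
}

# Run each approach for at least one CPU second and print a comparison table.
cmpthese( -1, { grep_scan => \&with_grep, hash_lookup => \&with_hash } );
```

Note that with_hash rebuilds the hash on every call, so the cost of
setting up the lookup is included in its timing.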

-- 
  Jason King