Roman Daszczyszak wrote:
> Hello all,

Hello,

> I have several text files with a few thousand contacts in each, and I
> am trying to pull out all the contacts from certain email domains
> (about 15 of them).  I wrote a script that loops through each file,
> then loops through matching each domain to the line and writes the
> results to two files, one for matches, one for non-matches.
>
> I am just curious if there is a way to match all the domains in turn,
> without having a foreach looping through them?

Yes, there is, but it is usually considered slower than using a for loop.

> Here's my code:
> #!/perl/bin/perl
> use strict;
> use warnings;
>
> my $program_time = time();

Perl already provides this value: the special variable $^T holds the
time at which the program started.

> die "SYNTAX: strip_email_addresses.pl FILE1 FILE2 .. FILE(N)\n" unless
> (@ARGV);
> my $domain_filename = "intel_addresses.txt";
> my @email_domains;
>
> open(DOMAINS, "<$domain_filename") or die "Cannot open $domain_filename:
> $!\n";
> chomp(@email_domains = <DOMAINS>);
> LINE: while (<>)
> {
>     my $filename = $ARGV;
>     $filename =~ s/\.csv//gi;

Do you really want to remove ALL occurrences of /\.csv/i from the file
name?  If you only want to remove /\.csv/i at the end of the file name
(the file name extension), you should anchor the pattern and drop the
/g modifier:

    $filename =~ s/\.csv\z//i;

>     open(FOUND, ">>${filename}_match.csv") or die "Cannot open
> ${filename}_match.csv\n";
>     open(NOTFOUND, ">>${filename}_nomatch.csv") or die "Cannot open
> ${filename}_nomatch.csv\n";
>
>     foreach my $domain (@email_domains)
>     {
>         if (m/$domain/i)
>         {
>             print(FOUND $_);
>             next LINE;
>         }
>     }
>     print(NOTFOUND $_);
> }
> print("Run time: ",time() - $program_time,"\n");
> ---------------------------------------------------------------------------
>
> Additionally, does anyone know of a better way to open the results
> files, keeping the practice of making two files for each original,
> without having to reopen the file on each iteration of the while loop?
> Does reopening the file cause a performance hit each open?

You probably want something like:

#!/perl/bin/perl
use strict;
use warnings;

die "SYNTAX: strip_email_addresses.pl FILE1 FILE2 .. FILE(N)\n" unless @ARGV;
my $domain_filename = 'intel_addresses.txt';

open DOMAINS, '<', $domain_filename
    or die "Cannot open $domain_filename: $!\n";

# Escape regex metacharacters (such as '.') in each domain, then combine
# all the domains into a single case-insensitive alternation.
my $email_domains = join '|', map { chomp; quotemeta } <DOMAINS>;
my $domains = qr/$email_domains/i;

while ( <> ) {
    if ( $. == 1 ) {  # only open once at beginning of each input file
        ( my $filename = $ARGV ) =~ s/\.csv\z//i;
        open FOUND, '>', "${filename}_match.csv"
            or die "Cannot open ${filename}_match.csv: $!\n";
        open NOTFOUND, '>', "${filename}_nomatch.csv"
            or die "Cannot open ${filename}_nomatch.csv: $!\n";
    }
    # Typeglobs rather than barewords, so the block is valid under strict.
    print { /$domains/ ? *FOUND : *NOTFOUND } $_;
    close ARGV if eof;  # must close so $. will work correctly
}
print 'Run time: ', time() - $^T, "\n";


John
--
use Perl;
program
fulfillment
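
To see what quotemeta and the combined pattern do without touching any files, here is a minimal sketch. The domain list and the contact lines are made up for illustration (they stand in for intel_addresses.txt and the CSV input), and the @found and @notfound arrays are hypothetical stand-ins for the two output files:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical domain list, standing in for the lines of intel_addresses.txt.
my @email_domains = ( 'example.com', 'example.org' );

# quotemeta escapes the dots, so 'example.com' cannot match 'exampleXcom'.
my $alternation = join '|', map { quotemeta } @email_domains;
my $domains     = qr/$alternation/i;

# Made-up contact lines for illustration only.
my @lines = (
    "Alice,alice\@EXAMPLE.com\n",
    "Bob,bob\@exampleXcom.net\n",
    "Carol,carol\@example.org\n",
);

# Sort each line into one of two lists with a single match per line.
my ( @found, @notfound );
for my $line (@lines) {
    if ( $line =~ $domains ) { push @found,    $line }
    else                     { push @notfound, $line }
}

print 'matched: ', scalar @found, ', unmatched: ', scalar @notfound, "\n";
```

With these three sample lines it prints "matched: 2, unmatched: 1". Bob's line is rejected only because the dot is escaped; without quotemeta the '.' would match any character, including the 'X'.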