John, thank you so much for generously going through my program and 
making suggestions. As I typed in your suggestions, I tried to make sense of what you 
were proposing, and most of the time I understood it clearly.

I'm still having trouble with the regular expression. I've pasted in the files and 
program at the end, for reference.

In your message, you said:
    my %hash = ( abstract => '', author => '', endtitle => '' );
    $hash{ lc $2 } = $3
        while $query =~
            /QF(\d+)=[^&]*\b(abstract|author|endtitle)\b(?=.*?&QI\1=([^&]*))/ig;

My interpretation of the match statement is:
Match the letters 'QF' then one or more digits. Capture the digits in \1
Match anything, or nothing, as long as it's not '&', then look for the start of a 
word, followed by one of the words 'abstract', 'author' or 'endtitle' followed by the 
end of a word. Capture that in the \2 or $2 variable.
Here's where I have a problem: I don't understand the '(' (start the third group?) 
followed by the '?'. I'll skip up to the 'QI' for now
Match the letters "QI" followed by the digits captured in \1, then an equals sign,
then anything or nothing, as long as it's not '&', and capture that in the \4 or $4 
variable.
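
For reference, perlre documents '(?=...)' as a zero-width positive lookahead: it checks that the enclosed pattern matches at that point without consuming any text, and it is not itself a numbered capture group. If that reading is right, the parentheses inside the lookahead form group 3, so the QI value would land in $3 rather than $4. A minimal sketch, using a hypothetical shortened query string rather than the real log data:

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical, shortened query string (not taken from the log file).
my $query = 'QF0=Abstract&QI0=China&QF1=Author&QI1=Smith';

my @found;
while ( $query =~
    /QF(\d+)=[^&]*\b(abstract|author|endtitle)\b(?=.*?&QI\1=([^&]*))/ig )
{
    # pos() reports where the match ended. That is right after the
    # keyword, because the lookahead inspects the following text
    # without consuming it.
    push @found, sprintf( '%s=%s (match ended at offset %d)',
        lc($2), $3, pos($query) );
}

print "$_\n" for @found;
```

Each printed line pairs the keyword from $2 with the value from $3, and the reported offset falls before the matching 'QI' field, not after it.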

I tried to rewrite this statement as:
   my %hash = (abstract =>'', author =>'', endtitle =>'');
   $hash{lc $2}=$3 while $query =~ \
      /QF(\d+)=[^&]*\b(abstract|author|endtitle)\b.*?&QI\1=([^&]*)/ig;

Which I think means:
Match the letters 'QF' then one or more digits. Capture the digits in \1
Match anything, or nothing, as long as it's not '&', then look for the start of a 
word, followed by one of the words 'abstract', 'author' or 'endtitle' followed by the 
end of a word. Capture that in the \2 or $2 variable.
Match anything, but as little as possible (".*?"), up to '&QI' then the same digits as 
before, then an equals sign, then anything or nothing, as long as it's not '&', and 
capture that in the \3 or $3 variable.
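
A minimal sketch of this rewritten statement, run against a shortened copy of the query string from the data file 'v' (the shortening is an assumption made for readability). It also assumes the trailing backslash after '=~' is dropped: Perl has no line-continuation character, and as far as I understand it a backslash in that position is parsed as the reference operator, so $query would no longer be tested against the intended pattern.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Shortened copy of the query string from data file 'v'; fields that
# play no part in the match have been left out (an assumption).
my $query = 'QB0=AND&QF0=Abstract+%7C+KeywordsMajor&QI0=China%0D%0A'
          . '&QB1=AND&QF1=Author+%7C+CN&QI1=&MR=10&TN=popline';

my %hash = ( abstract => '', author => '', endtitle => '' );

# Same pattern as the rewritten statement, but with no backslash
# after =~; the statement simply continues on the next line.
$hash{ lc $2 } = $3
    while $query =~
        /QF(\d+)=[^&]*\b(abstract|author|endtitle)\b.*?&QI\1=([^&]*)/ig;

print "$_ => '$hash{$_}'\n" for qw( abstract author endtitle );
```

With this input the abstract entry should pick up the still-encoded value from QI0, while author stays empty because QI1 is empty in the log line.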

However, this still isn't matching anything, as the output of the program with the 
data file 'v' below shows.

Did I misunderstand your original suggestion, or have I made another mistake in trying 
to correct it?

Thanks, again, for all the time and effort you spent trying to help me.

-Kevin
=============================================================
[EMAIL PROTECTED]:/opt/analog/logdata/db$ ./listqueries3.pl v|cat -vet
main::translate() called too early to check prototype at ./listqueries3.pl line 20.
B^I2004-03-28^I00:38:31^I^I^I$
[EMAIL PROTECTED]:/opt/analog/logdata/db$ cat listqueries3.pl 
#!/usr/bin/perl

use warnings;
use strict;

my $debug = 1;

while (<>) {
   next unless /&TN=popline&/i; #Just analyze the records for the POPLINE database
   
   my ($date, $time, $source, undef, undef, $host, $FQDN, $method, $file, $query,
       undef) = split;
   next unless $method eq 'GET' and substr($file, -12) eq 'dbtwpcgi.exe';
   my $type = $query =~ /QI2/? 'A': 'B';
     
   my %hash = (abstract =>'', author =>'', endtitle =>'');
   $hash{lc $2}=$3 while $query =~ \
      /QF(\d+)=[^&]*\b(abstract|author|endtitle)\b.*?&QI\1=([^&]*)/ig;

   my $outstring = join("\t", $type, $date, $time,
      @hash{ qw( abstract author endtitle ) }) . "\n";
   print translate($outstring);
}# while there are more lines in the input file   

sub translate() {
   $_ = $_[0];
   s/%22/\"/g;
   s/%2C/,/g;
   s/%20/ /g;
   s#%2F#/#g;
   s/%3D/=/g;
   s/%3B/;/g;
   s/%26/&/g;
   s/%0D//g;
   s/%0A//g;
   s/\+/ /g;
   s/%29/)/g;
   s/%28/(/g;
   s/%27/'/g;
   s/%2b/+/g;
   s/%7C/|/g;
   s/%3A/:/g;
   #Debbie requested that all Boolean logical words and symbols be replaced with '|'
   s/\band\b/|/ig;
   s/\bor\b/|/ig;
   tr!&/!|!;
   $_;
   }
[EMAIL PROTECTED]:/opt/analog/logdata/db$ cat v
2004-03-28 00:38:31 d7.facsmf.utexas.edu - W3SVC1 DB db.jhuccp.org GET 
/dbtw-wpd/exec/dbtwpcgi.exe 
XC=%2Fdbtw-wpd%2Fexec%2Fdbtwpcgi.exe&BU=http%3A%2F%2Fdb.jhuccp.org%2Fpopinform%2Fbasic.html&QB0=AND&QF0=Abstract+%7C+KeywordsMajor+%7C+KeywordsMinor+%7C+Notes+%7C+EngTitle+%7C+TT+%7C+FREAb+%7C+SPAAb&QI0=China%0D%0A&QB1=AND&QF1=Author+%7C+CN&QI1=&MR=10&TN=popline&RF=ShortRecordDisplay&DF=LongRecordDisplay&DL=1&RL=1&NP=0&AC=QBE_QUERY&x=37&y=4
 200 0 21248 814 19391 80 HTTP/1.1 
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+.NET+CLR+1.0.3705) - 
http://db.jhuccp.org/popinform/basic.html
[EMAIL PROTECTED]:/opt/analog/logdata/db$ 