I was given a Perl script that opens a log file, parses each line, and
does a pattern match on it.
Since almost 200 patterns need to be identified, the script I was handed
actually uses an if/elsif chain of 199 branches to do the task.

Since this approach seemed extremely inefficient, I identified some
sub-patterns and planned to match on those first, to reduce the number
of matches needed to identify a line.  After implementing this, I
observed that at most 6 to 7 comparisons are necessary to identify a
pattern.

Much to my surprise, the new version took 4 to 5 times the original
execution time.

The code changed from:

>while (<LOG>) {
>  if ($_ =~ m/$tPattern1/) {
>    ...
>  } elsif ($_ =~ m/$tPattern2/) {
>    ...
>  } elsif .....

to 

>while (<LOG>) {
>  if ($_ =~ m/$tSubPattern1/) {
>    if ($_ =~ m/$tPattern1/) {
>      ...
>    } elsif ...
>      ...
>    }
>  } elsif ($_ =~ m/$tSubPattern2/) {
>    if ....

Is there something about pattern matching with regular expressions that I
should know about?  And is there a better way to parse these logs?  Thank
you.


SzeKit Hsu
Assistant Software Engineer, Common Services
TREEV, Inc.
(703) 708-7112
[EMAIL PROTECTED]
