I was given a Perl script that opens a log file, parses each line, and
does a pattern match on it.
Since almost 200 patterns need to be identified, the script I was handed
actually uses an if/elsif chain with 199 branches to do the task.
Since this approach seemed extremely inefficient, I identified some
sub-patterns and planned to use substring matching to reduce the number of
matches needed to identify a line. After implementing this, I observed
that at most 6 to 7 comparisons are needed to identify a pattern.
Much to my surprise, the new version took 4 to 5 times as long to run as
the original.
I changed the code from:
>while (<LOG>) {
>    if ($_ =~ m/$tPattern1/) {
>        ...
>    } elsif ($_ =~ m/$tPattern2/) {
>        ...
>    } elsif .....
to
>while (<LOG>) {
>    # outer test: sub-pattern shared by a group of full patterns
>    if ($_ =~ m/$tSubPattern1/) {
>        if ($_ =~ m/$tPattern1/) {
>            ...
>        } elsif ...
>            ...
>        }
>    } elsif ($_ =~ m/$tSubPattern2/) {
>        if ....
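To make the structure concrete, here is a minimal, self-contained sketch
of the two-level version. The pattern strings, the sample lines, and the
classify() helper are all made up for illustration. Only the
$tPattern/$tSubPattern naming comes from the real script, which has almost
200 patterns and reads from the LOG filehandle instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sub-patterns and full patterns (not the real ones).
my $tSubPattern1 = 'login';
my $tSubPattern2 = 'disk';
my $tPattern1    = 'login failed for user (\w+)';
my $tPattern2    = 'login ok for user (\w+)';
my $tPattern3    = 'disk (\S+) is full';

# Classify one line: test a sub-pattern first, then only the full
# patterns that belong to that group.
sub classify {
    my ($line) = @_;
    if ($line =~ m/$tSubPattern1/) {
        if ($line =~ m/$tPattern1/) {
            return "login_failed:$1";
        } elsif ($line =~ m/$tPattern2/) {
            return "login_ok:$1";
        }
    } elsif ($line =~ m/$tSubPattern2/) {
        if ($line =~ m/$tPattern3/) {
            return "disk_full:$1";
        }
    }
    return 'unknown';
}

# Sample lines stand in for the real log file.
my @sample = (
    'Jan 1 login failed for user bob',
    'Jan 1 login ok for user alice',
    'Jan 1 disk /dev/sda1 is full',
    'Jan 1 something else',
);
print classify($_), "\n" for @sample;
```

In this sketch every line that is identified costs two or three matches
(one or two sub-pattern tests plus one or two full-pattern tests), which
is the kind of saving I expected from the real script.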
Is there something about pattern matching with regular expressions that I
should know about? And is there a better way to parse these logs? Thank
you.
SzeKit Hsu
Assistant Software Engineer, Common Services
TREEV, Inc.
(703) 708-7112