If you want a generalized pattern matcher (simple generalizations might cover whitespace; complex ones might cover "number" patterns or "Last Name" patterns), I don't think there is a module made just for that. What I would recommend is to start going over the log manually and build your own list of patterns: enumerate them, and count how often each one occurs in an array. Each time you add something to the list, grep for the lines that don't match any of your patterns to see what you have left.
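
As a rough sketch of that loop (not a ready-made tool), something like the following could work. The pattern names, the regexes and the sample lines are all made up for illustration, since the real log format was not shown in the thread; classify() is a hypothetical helper, not a function from any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pattern list: names and regexes are illustrative guesses
# at what SQL trace lines might look like, not taken from a real log.
my @patterns = (
    [ select_by_id => qr/^SELECT\s+.+\s+FROM\s+\w+\s+WHERE\s+\w+\s*=\s*\d+\s*$/i ],
    [ insert       => qr/^INSERT\s+INTO\s+\w+/i ],
    [ update       => qr/^UPDATE\s+\w+\s+SET\s+/i ],
);

# Count how many lines match each pattern (first match wins) and
# collect the lines that match none of them.
# Returns (\@counts, \@unmatched).
sub classify {
    my ($patterns, @lines) = @_;
    my @counts = (0) x @$patterns;
    my @unmatched;
    LINE: for my $line (@lines) {
        for my $i (0 .. $#$patterns) {
            if ($line =~ $patterns->[$i][1]) {
                $counts[$i]++;
                next LINE;
            }
        }
        push @unmatched, $line;
    }
    return (\@counts, \@unmatched);
}

# Made-up sample lines standing in for the real log.
my @sample = (
    "SELECT name FROM users WHERE id = 42",
    "SELECT name FROM users WHERE id = 7",
    "INSERT INTO audit VALUES (1, 'login')",
    "DELETE FROM sessions WHERE expired = 1",
);

my ($counts, $unmatched) = classify(\@patterns, @sample);

# Report the count per pattern, then what is still uncovered.
for my $i (0 .. $#patterns) {
    printf "%-14s %d\n", $patterns[$i][0], $counts->[$i];
}
print "unmatched: $_\n" for @$unmatched;
```

The unmatched lines are the ones to look at next: pick the most common shape among them, add a pattern for it, and run again. For a real log you would read the lines from a file or STDIN instead of @sample.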
I would guess that pretty quickly you would cover over 90% of your logs. After that it would get slower, perhaps never reaching 100%.

In one word: manually.

2009/3/18 Gabor Szabo <[email protected]>:
> 2009/3/18 Yossi Itzkovich <[email protected]>:
>> Gabor,
>> The problem is that I don't know the patterns - I want the script to find.
>> Let me explain the need:
>> We have a big tracing log of SQL queries to DB. We want to analyze it and
>> find if there are repeating sequences of same queries, and optimize them
>> (make one big query, or change application code).
>
>
> Well, if you give a real example instead of an abstract one then you
> can get more help.
>
> I guess there are tools out there to analyze such log if that is a standard
> log.
>
> So you could either show us a few lines of the real log file or tell
> us what tool
> produced it. Is it the logging message of DBI ?
>
> If you really have no clue of what a repeating string can look like
> then just go with
> if ($str =~ /(.+).*\1/) {
>     print $1;
> }
>
> or better yet tell the thing you want the repeating string to be at
> least 10 characters long:
>
> if ($str =~ /(.{10,}).*\1/) {
>     print $1;
> }
>
> Gabor
> _______________________________________________
> Perl mailing list
> [email protected]
> http://mail.perl.org.il/mailman/listinfo/perl
>

--
-- vish
