Mike Blomgren wrote:
> Hi,
>
> I'm trying to write a pattern-matching regexp with two optional
> parentheses, but I can't figure out how to have an 'optional' match.
> I.e. I want a match regardless of whether the last two fields are
> available or not. But if they are available, I want to use them...
> I'm confident there is a simple solution - I just haven't found it yet...
>
> In practice:
> The logfiles are from several Apache webservers. Some files contain two
> additional fields containing Referer and Browser type, which are last on
> each line (example below, may be wrapped).
>
> 10.0.0.1 - - [30/Aug/2001:14:58:16 +0200] "GET /banner_1.gif HTTP/1.1"
> 200 12796
> "http://example.com/" "Mozilla/5.0 (Windows; U; Win98; en-US; m18)
> Gecko/20010131 Netscape6/6.01"
>
> My pattern-matching code looks as follows:
>
> if ( m/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) # IP Address
>      \x20(.+?)              # User
>      \x20(.+?)              # unused
>      \x20(\[.+\])           # Date
>      \x20\"(.*?\n*?.*?)     # Request
>      (HTTP\/.*?|)\"         # Match regardless of HTTP version
>      \x20(\d+?)             # Statuscodes
>      \x20([\-\d]+?)         # Size
>      \x20(\".*?\")          # Optional Referer
>      \x20(\".*?\")          # Optional Browser type
>      /ox )
>
> However, it's the last two fields ($9 and $10) that I want to be
> optional. If they don't exist in the current line being matched, I still
> want the rest of the fields to be populated ($1 - $8). I.e. an
> 'optional' match...
>
> One alternative is to have two different pattern-matching statements, but
> that would complicate matters. There are more 'optionals' than just
> these examples...
>
> Any help would be greatly appreciated. And yes, I have read the docs,
> but simply not understood them.
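For the regexp route the question asks about, the usual idiom is to wrap the optional part in a non-capturing group followed by `?`. The capturing parentheses inside it then come back undefined when the fields are missing. What follows is only a rough sketch under some assumptions: `parse_line` is a made-up helper name, and the pattern is a simplified variant of the one quoted above (the request and HTTP version are captured as one field, so the referer and browser land in $8 and $9 rather than $9 and $10).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the nine
# captured fields, or an empty list if the line does not match at all.
sub parse_line {
    my ($line) = @_;
    return $line =~ m/^(\d{1,3}(?:\.\d{1,3}){3})    # IP address
        \x20(\S+)                       # user
        \x20(\S+)                       # unused
        \x20(\[[^\]]+\])                # date
        \x20"(.*?)"                     # request, including HTTP version
        \x20(\d+)                       # status code
        \x20([-\d]+)                    # size
        (?:\x20"(.*?)"\x20"(.*?)")?     # optional referer and browser
        \s*$/x;
}
```

Called in list context, e.g. `my @f = parse_line($_);`, the last two elements are `undef` for lines without the extra fields, so a `defined` check tells the two log formats apart.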
The simpler way to do this is to split on whitespace; it is then a simple
matter to pick the fields out of the resulting array. Everything after
field 11 (counting from 0) can be rejoined as the browser. Basically:

    # $f10 is the referer; $f10 and $browser are undef if the line lacks them
    my ($ip, $f1, $f2, $dt, $tz, $meth, $page, $proto, $status, $len,
        $f10, $browser, @rest) = split /\s+/, $_;

    # <do stuff here>

    # recombine browser info (only when the optional fields are present)
    $browser = join ' ', $browser, @rest if defined $browser;

-- 
$Bill Luebkert
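As a rough, self-contained sketch of the split approach above: `split_line` and the hash keys are made-up names for illustration, and the field positions assume the sample line from the question (a request with exactly three space-separated parts). Note that the referer and browser keep their surrounding double quotes this way.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split one log line
# into named fields. The optional fields come back undef when absent.
sub split_line {
    my ($line) = @_;

    # split ' ' ignores leading whitespace and splits on runs of whitespace
    my ($ip, $f1, $f2, $dt, $tz, $meth, $page, $proto,
        $status, $len, $referer, @browser) = split ' ', $line;

    return {
        ip      => $ip,
        status  => $status,
        len     => $len,
        referer => $referer,                                 # undef if absent
        browser => @browser ? join(' ', @browser) : undef,   # rejoined pieces
    };
}
```

A caller can then test `defined $rec->{referer}` instead of maintaining two separate patterns. A request whose URL contains spaces would shift the later fields, so this is only as robust as the input is regular.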
