On Tue, Jan 30, 2007 at 05:06:11PM +1300, Nigel Stanger wrote:
> ^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(.*?)\] "GET /(\d{1,4}).*?
> HTTP/1.." 200 .*("[^"]+")?$
>
> and here's an example of the kind of line that it should match (again,
> insert spaces in place of line breaks):
>
> 139.80.75.149 - - [17/May/2006:14:32:57 +1200] "GET
> /1/01/EnvironmentalHarmonyandCommercialSuccess1987.pdf HTTP/1.0" 200 246158
> "http://cardrona.eprints.otago.ac.nz/perl/latest" "Mozilla/4.0 (compatible;
> MSIE 6.0; Windows NT 5.1; SV1)"
>
> Now the pattern *does* match the above just fine; the problem is that it
> doesn't capture correct sub-patterns. From the above line, I get the
> following:
>
> \1 = 139.80.75.149 (correct)
> \2 = 17/May/2006:14:32:57 +1200 (correct)
> \3 = 1 (correct)
> \4 = (incorrect)
>
> \4 should contain the user-agent string from the end of the line, and did up
> until yesterday, but for some reason it's no longer working. If I remove the
> ? just before the $, \4 is then correct, but then the pattern fails to match
> lines that have no trailing user-agent string (of which there are many). I
> added the ? yesterday for precisely that reason, *and it worked correctly*
> in both BBEdit and PHP. Today it doesn't work in either, and I don't know
> why.

In .*("[^"]+")?$ the greedy .* first consumes everything up to the end of
the line. The optional group ("[^"]+")? then matches the empty string, $
succeeds, and the regex engine never needs to backtrack. Anything the group
could have captured has already been consumed by .*, so \4 is always empty.
I don't think this can be the same regex that was working for you yesterday.

If you change the .* to be non-greedy, it should work: .*? consumes as few
characters as it can, so the group gets the chance to capture the final
quoted string before $ is tested.

^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(.*?)\] "GET /(\d{1,4}).*?
HTTP/1.." 200 .*?("[^"]+")?$
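
If you want to check the difference outside BBEdit, here is a minimal sketch
in Python. It uses Python's re module rather than BBEdit's or PHP's regex
engine, so treat it as an illustration of greedy versus non-greedy matching,
not as a test of those tools. The variable names (LINE, NO_AGENT, PREFIX,
GREEDY, LAZY) are my own, and NO_AGENT is just your sample line with the two
trailing quoted fields cut off.

```python
import re

# Sample log line from the original message, with line breaks replaced by spaces.
LINE = (
    '139.80.75.149 - - [17/May/2006:14:32:57 +1200] "GET '
    '/1/01/EnvironmentalHarmonyandCommercialSuccess1987.pdf HTTP/1.0" 200 246158 '
    '"http://cardrona.eprints.otago.ac.nz/perl/latest" "Mozilla/4.0 (compatible; '
    'MSIE 6.0; Windows NT 5.1; SV1)"'
)

# Hypothetical line with no trailing quoted fields (sample line, truncated).
NO_AGENT = LINE[:LINE.index(' "http://')]

# The part of the pattern that is the same in both versions.
PREFIX = (
    r'^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(.*?)\] '
    r'"GET /(\d{1,4}).*? HTTP/1.." 200 '
)

GREEDY = re.compile(PREFIX + r'.*("[^"]+")?$')   # original: greedy .*
LAZY = re.compile(PREFIX + r'.*?("[^"]+")?$')    # fix: non-greedy .*?

greedy_match = GREEDY.match(LINE)
lazy_match = LAZY.match(LINE)
bare_match = LAZY.match(NO_AGENT)

# Greedy: the line matches, but group 4 never takes part in the match.
print(greedy_match.group(4))   # None

# Non-greedy: group 4 captures the final quoted string, quotes included.
print(lazy_match.group(4))

# Non-greedy still matches a line that has no trailing quoted string.
print(bare_match is not None, bare_match.group(4))
```

Note that the capture includes the surrounding double quotes, because the
quotes are inside the parentheses in the pattern.
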
Ronald