On Tue, Jan 30, 2007 at 05:06:11PM +1300, Nigel Stanger wrote:
> ^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(.*?)\] "GET /(\d{1,4}).*?
> HTTP/1.." 200 .*("[^"]+")?$
> 
> and here's an example of the kind of line that it should match (again,
> insert spaces in place of line breaks):
> 
> 139.80.75.149 - - [17/May/2006:14:32:57 +1200] "GET
> /1/01/EnvironmentalHarmonyandCommercialSuccess1987.pdf HTTP/1.0" 200 246158
> "http://cardrona.eprints.otago.ac.nz/perl/latest"; "Mozilla/4.0 (compatible;
> MSIE 6.0; Windows NT 5.1; SV1)"
> 
> Now the pattern *does* match the above just fine; the problem is that it
> doesn't capture correct sub-patterns. From the above line, I get the
> following:
> 
> \1 = 139.80.75.149              (correct)
> \2 = 17/May/2006:14:32:57 +1200 (correct)
> \3 = 1                          (correct)
> \4 =                            (incorrect)
> 
> \4 should contain the user-agent string from the end of the line, and did up
> until yesterday, but for some reason it's no longer working. If I remove the
> ? just before the $, \4 is then correct, but then the pattern fails to match
> lines that have no trailing user-agent string (of which there are many). I
> added the ? yesterday for precisely that reason, *and it worked correctly*
> in both BBEdit and PHP. Today it doesn't work in either, and I don't know
> why.

The trailing .*("[^"]+")?$ is a greedy .*, followed by an optional
("[^"]+"), followed by the end of the line.  The greedy .* consumes
everything up to the end of the line, user-agent string included.  The
optional group can then match the empty string and $ succeeds, so the
engine never has to backtrack, and \4 always comes out empty.  I don't
think this can be the same regex that was working for you yesterday.

If you change that last .* to be non-greedy, it should work:

^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(.*?)\] "GET /(\d{1,4}).*?
HTTP/1.." 200 .*?("[^"]+")?$
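
To see the difference between the two forms, here is a minimal sketch in
Python.  Python's re module is my substitution, not something from this
thread; for this particular pattern it should treat greedy and non-greedy
quantifiers the same way as the PCRE-style engines in BBEdit and PHP.  The
names HEAD, GREEDY, LAZY, LINE and BARE_LINE are made up for the
illustration, and BARE_LINE is an invented log line with no referrer or
user-agent.

```python
import re

# Sample line modeled on the one in the quoted message, joined with spaces.
LINE = (
    '139.80.75.149 - - [17/May/2006:14:32:57 +1200] '
    '"GET /1/01/EnvironmentalHarmonyandCommercialSuccess1987.pdf HTTP/1.0" '
    '200 246158 "http://cardrona.eprints.otago.ac.nz/perl/latest" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"'
)

# Hypothetical line that ends right after the byte count (assumption).
BARE_LINE = (
    '139.80.75.149 - - [17/May/2006:14:32:57 +1200] '
    '"GET /1/01/x.pdf HTTP/1.0" 200 246158'
)

# Shared front part of the pattern, up to and including the status code.
HEAD = (
    r'^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(.*?)\] '
    r'"GET /(\d{1,4}).*? HTTP/1.." 200 '
)

# Original tail: greedy .* swallows the user-agent before the group is tried.
GREEDY = re.compile(HEAD + r'.*("[^"]+")?$')

# Suggested tail: non-greedy .*? grows one character at a time, so the
# optional group gets a chance to match the final quoted string.
LAZY = re.compile(HEAD + r'.*?("[^"]+")?$')

greedy_match = GREEDY.match(LINE)
lazy_match = LAZY.match(LINE)
bare_match = LAZY.match(BARE_LINE)

print("greedy \\4:", greedy_match.group(4))
print("lazy   \\4:", lazy_match.group(4))
print("bare   \\4:", bare_match.group(4))
```

Both patterns match the sample line; only the non-greedy one fills \4, and
it still matches the line without a user-agent.  Python reports a group
that took no part in the match as None, whereas PHP would give you an
empty or absent entry in the matches array.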


Ronald
