On Tue 12-Jul-05 11:50am -0500, Konrad Szkudlarczyk wrote:

> Hello, Bill. On 12 July 2005 you wrote:
>
>> Also both examples have a `"` after the
>> naughty extension, not a
>> `>`.
>
> I have seen html messages without `"` chars in links. See "test"
> message attached.

Yes, but that html has no "hidden" reference as in some form of "href".

>> Try this as your regex - the (?s) will cause lines to be spanned:
>> the (?: ... ) is a non-capturing subpattern - everything else looks
>> good:
>
>> (?s)http://.*\.(?:scr|exe|etc)"
>
> If message has link (to the simple html page, for example) AND text
> ".exe", this regexp will catch this (false alarm).

I see that I left off the leading `"` - but that would still leave
the problem of finding the extension embedded in some text (false
positive). Better would be:

"http://[^"]*\.(?:scr|exe|etc)"

Note: the negated character class [^"] also matches newlines, which
obviates the need to specify DOTALL.

-- 
Best regards,
Bill

Beta 3.51 Pro BayesIt! 0.8.1 X-Ray 1.4.0.0 XMP 0.9.6 XP Pro SP2 POP3
________________________________________________________
http://www.silverstones.com/thebat/TBUDLInfo.html
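
P.S. A minimal sketch of the difference between the two patterns, written in Python rather than in the mail client's own filter engine. It assumes Python's `re` module handles (?s), (?: ... ) and negated classes the same way for this purpose. Both message bodies are made up for illustration and do not come from the attached "test" message.

```python
import re

# Hypothetical message bodies, invented for this illustration.
# benign: an ordinary link plus an unrelated quoted file name in the text.
benign = ('<a href="http://example.com/page.html">page</a>\n'
          'Please do not run "setup.exe" from mail.')
# hostile: the link itself points at an .exe and is wrapped across lines.
hostile = '<a href="http://example.com/files/\nupdate.exe">click</a>'

# Earlier pattern: DOTALL plus a greedy .* runs from the link to any later .exe"
greedy = re.compile(r'(?s)http://.*\.(?:scr|exe|etc)"')

# Revised pattern: [^"]* cannot cross the closing quote of the URL,
# but it does match newlines, so no (?s) is needed.
bounded = re.compile(r'"http://[^"]*\.(?:scr|exe|etc)"')

greedy_hits_benign = greedy.search(benign) is not None
bounded_hits_benign = bounded.search(benign) is not None
bounded_hits_hostile = bounded.search(hostile) is not None

print(greedy_hits_benign, bounded_hits_benign, bounded_hits_hostile)
# True False True
```

The greedy pattern raises the false alarm on the benign body, while the bounded one fires only when the extension sits inside the quoted URL itself.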
