On Tue 12-Jul-05 11:50am -0500, Konrad Szkudlarczyk wrote:

> Hello, Bill. On the twelfth of July 2005 you wrote:
>
>> Also both examples have a `"` after the naughty extension, not a
>> `>`.
>
> I have seen html messages without `"` chars in links. See "test"
> message attached.

Yes, but that HTML has no "hidden" reference, that is, no
form of "href".

>> Try this as your regex. The (?s) will cause lines to be spanned, and
>> the (?: ... ) is a non-capturing subpattern. Everything else looks
>> good:
>
>> (?s)http://.*\.(?:scr|exe|etc)"
>
> If a message has a link (to a simple HTML page, for example) AND the
> text ".exe", this regexp will catch it (false alarm).

I see that I left off the leading `"` - but that would
still leave the problem of finding the extension
embedded in some text (false positive).

Better would be:

   "http://[^"]*\.(?:scr|exe|etc)"

Note: using a negated character class removes the need
to specify DOTALL, since [^"] already matches newlines.
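
To make the difference concrete, here is a minimal sketch of the two
patterns, using Python's re module as a stand-in and assuming it treats
(?s), (?: ... ) and [^"] the same way as the regex engine in the mail
client. The message bodies and example.com URLs are made up for
illustration.

```python
import re

# First suggestion: (?s) lets "." match newlines, so ".*" can run from a
# link on one line to an unrelated '.exe"' much later in the message.
greedy = re.compile(r'(?s)http://.*\.(?:scr|exe|etc)"')

# Revised pattern: [^"]* cannot leave the quoted URL, and a negated class
# already matches newlines, so no (?s) is needed.
strict = re.compile(r'"http://[^"]*\.(?:scr|exe|etc)"')

# Hypothetical message bodies (not taken from any real mail).
harmless = (
    '<a href="http://example.com/index.html">page</a>\n'
    'Please run "setup.exe" later.'
)
suspicious = '<a href="http://example.com/files/\nupdate.exe">click</a>'

print("greedy on harmless:  ", bool(greedy.search(harmless)))
print("strict on harmless:  ", bool(strict.search(harmless)))
print("strict on suspicious:", bool(strict.search(suspicious)))
```

The first pattern flags the harmless message because the link and the
".exe" text are unrelated, while the second only fires when the
extension sits inside the same quoted URL, even if that URL is wrapped
across lines.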

-- 
Best regards,
Bill

Beta 3.51 Pro  BayesIt! 0.8.1  X-Ray 1.4.0.0  XMP 0.9.6  XP Pro SP2  POP3



