Hi list,

I'm receiving a lot of spam of a very particular sort.

The messages essentially hit FREEMAIL_FROM, and the body contains only a
fake YouTube link such as:

    <html><a
    href=3D"http://www.probono.fr/95280_pdf";>http://www.youtube.com/wa=
    tch?v=3D3VvOFqaHbL5&feature=3Dg-vrec&feature=3Dg-vrec</a></B><BR></html>


I ended up with a regex for this kind of thing:

    full       AJB_UTUBE_BADLINK   m'\shref=.{0,3}(https?://)?(www\.)?(?!youtube)[^\.]+\.[^>]+>(https?://)?(www\.)?youtube\.'mi
    score      AJB_UTUBE_BADLINK   0 # 3.0

I've been poking around with negative/positive lookaheads/lookbehinds,
full or rawbody rules.

I have some samples for my tests (FPs and FNs). Sometimes the rule
matches, often it does not: the results are quite inconsistent and a
large source of FPs.
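
To make the inconsistency easier to reproduce outside SpamAssassin, here
is a rough Python sketch that runs the same pattern over the sample above,
both still quoted-printable encoded (roughly what I understand a "full"
rule sees) and after decoding (closer to "rawbody"). The HAM string is a
made-up legitimate link, not one of my real samples, and Python's re
module is only standing in for Perl here:

```python
import quopri
import re

# The pattern from the rule above; Perl's /mi flags become re.M | re.I.
PATTERN = re.compile(
    r"\shref=.{0,3}(https?://)?(www\.)?(?!youtube)[^\.]+\.[^>]+>"
    r"(https?://)?(www\.)?youtube\.",
    re.M | re.I,
)

# Spam sample, still quoted-printable encoded.
SPAM_RAW = (
    '<html><a\n'
    'href=3D"http://www.probono.fr/95280_pdf";>http://www.youtube.com/wa=\n'
    'tch?v=3D3VvOFqaHbL5&feature=3Dg-vrec&feature=3Dg-vrec</a></B><BR></html>\n'
)

# The same sample after quoted-printable decoding.
SPAM_DECODED = quopri.decodestring(SPAM_RAW.encode("ascii")).decode("ascii")

# Hypothetical legitimate link: href and visible text both point at YouTube.
HAM = (
    '<a href="http://www.youtube.com/watch?v=abc">'
    'http://www.youtube.com/watch?v=abc</a>'
)


def hits(text):
    """Return True if the rule's pattern matches anywhere in text."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for name, text in [("spam raw", SPAM_RAW),
                       ("spam decoded", SPAM_DECODED),
                       ("ham", HAM)]:
        print(name, hits(text))
```

In this sketch all three strings match, including the legitimate link.
My guess is that the optional prefixes and the .{0,3} let the engine
backtrack so that (?!youtube) is tested at a position that does not start
with "youtube", but I have not confirmed this is what happens inside
SpamAssassin.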

I sense that this simple regex could be adapted to many domains (bank
phishes, eBay phishes, UPS phishes, etc.), but it does not work as is.
It could eventually be turned into a plugin of some sort, though that
would require much more work and thought.
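
To illustrate the kind of check such a plugin might do, here is a minimal
Python sketch that compares the host in the href with the host shown in
the link text, for any domain. This is only an assumption about how it
could work: a real SpamAssassin plugin would be written in Perl, and the
names host_of, LinkMismatchFinder and find_mismatches are invented for the
example.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Visible text that looks like a URL or bare host name (e.g. "www.example.com/x").
URL_LIKE = re.compile(r"(https?://)?[\w-]+(\.[\w-]+)+", re.I)


def host_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    if "://" not in url:
        url = "http://" + url
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


class LinkMismatchFinder(HTMLParser):
    """Collect (href_host, shown_host) pairs where the two hosts differ."""

    def __init__(self):
        super().__init__()
        self.href = None
        self.text = []
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            shown = "".join(self.text).strip()
            # Only compare when the visible text itself looks like a URL.
            if URL_LIKE.match(shown):
                real, claimed = host_of(self.href), host_of(shown)
                if real and claimed and real != claimed:
                    self.mismatches.append((real, claimed))
            self.href = None


def find_mismatches(html):
    finder = LinkMismatchFinder()
    finder.feed(html)
    finder.close()
    return finder.mismatches


if __name__ == "__main__":
    spam = ('<a href="http://www.probono.fr/95280_pdf">'
            'http://www.youtube.com/watch?v=abc</a>')
    print(find_mismatches(spam))
```

The comparison here is exact host equality after stripping "www.", so
legitimate redirectors and subdomains would need extra handling before
this could be scored.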

What do you think of this? Am I missing something obvious in my regex?

-- 
Alex, from prypiat.
Yes, I recycle.
