Ooops, hehe; sorry about that...

Interesting subject for a technical mailing list, huh? :-)

I started fiddling around with the PORN_3 rule because it wasn't catching any of the 
porn spam that I got.

body PORN_3                     /(?:(?:\bcum|\borg[iy]|\bwild|fuck|\bteen|\baction\b|spunk|\bpussy\b|\bpussies\b|suck\b|sucking\b|\bhot\b|\bhottest\b|\bvoyeur|\ble[sz]b(?:ian|o)|\banal\b|\binterracial|\basian\b|\bamateur|\bsex+\b|\bslut|explicit|xxx[^x]|\blive\b|celebrity|\blick|\bsuck|\bdorm\b|webcam|\bass\b|\bschoolgirl|\bstrip|\bhorny\b|\bhorniest\b|\berotic|\boral\b|\bpenis\b|\bhard-?core\b|\bblow[ -]*job|\bnast(?:y|iest)\b|\bporn|\bwhore|\bsexfest|\bnaked|\bnude|\bvirgin|\bnaughty\b|\bnaughtiest\b|\bgirls\b|\bcelebs?\b|\bbabes\b|\badult\b).{0,15}){3,}/i

I got rid of the trailing "\b" in "\bschoolgirl\b", because it prevented the rule from 
matching "schoolgirls", and I added a "-?" to the middle of "hardcore", because it's 
occasionally written "hard-core".  Then I added a lot more words. They are (excluding the "\b"s):

- whore
- sexfest
- naked
- nude
- virgin
- naughty
- naughtiest
- girls
- celebs?
- babes
- adult
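
Here is a quick way to see what those two tweaks do, sketched in Python rather than in SpamAssassin itself. The patterns are cut down to the single terms in question, and the "old" hardcore pattern is my assumption about what the term looked like before the change.

```python
import re

# Single-term versions of the patterns, for illustration only; the real
# rule wraps these in a much larger alternation.
old_schoolgirl = re.compile(r"\bschoolgirl\b", re.I)
new_schoolgirl = re.compile(r"\bschoolgirl", re.I)
old_hardcore = re.compile(r"\bhardcore\b", re.I)   # assumed earlier form
new_hardcore = re.compile(r"\bhard-?core\b", re.I)


def hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


for label, pattern, text in [
    ("old schoolgirl", old_schoolgirl, "schoolgirls"),
    ("new schoolgirl", new_schoolgirl, "schoolgirls"),
    ("old hardcore", old_hardcore, "hard-core"),
    ("new hardcore", new_hardcore, "hard-core"),
]:
    print(label, "matches" if hits(pattern, text) else "misses", repr(text))
```

The trailing "\b" fails on the plural because the character after "schoolgirl" is still a word character, so there is no word boundary there.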

However, that still only caught a few of the porn spams I got.  I figured that we 
could add some more sensitive rules that would catch things PORN_3 misses, but 
would also fire on anything that PORN_3 catches; the points assigned to PORN_3 would 
automatically be redistributed among the different porn rules by the GA.  In fact, 
we could make several layers of porn rules, from very sensitive, to slightly 
sensitive, to PORN_3 sensitive, and as a spam triggers less and less sensitive rules, 
the points would add up.
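
To make the layering idea concrete, here is a rough Python sketch of how points from several tiers would add up. The tier names, patterns and scores are all made up for illustration; in practice the GA would assign the scores, not me.

```python
import re

# Hypothetical tiers, from most sensitive (fires easily, low score) to
# least sensitive (fires rarely, high score). Scores are invented.
TIERS = [
    ("TIER_LOOSE", re.compile(r"hard-?core", re.I), 0.5),
    ("TIER_MEDIUM", re.compile(r"\bporn|\bxxx\b", re.I), 1.0),
    ("TIER_STRICT", re.compile(r"sexfest", re.I), 2.0),
]


def score(text):
    """Return (total points, names of the tiers that fired) for a message body."""
    fired = [(name, points) for name, pattern, points in TIERS
             if pattern.search(text)]
    total = sum(points for _, points in fired)
    return total, [name for name, _ in fired]


print(score("a hardcore punk band"))          # only the loose tier fires
print(score("hardcore xxx sexfest tonight"))  # all three tiers fire
```

A message that only trips the loosest tier picks up a small score, while one that trips every tier collects the sum of all of them.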

Here's what I came up with:

---------------

# Words which will almost never be seen outside of porn spam
body     PORN_9                 /(?:sexfest|cumshot)/i
describe PORN_9                 Uses words and phrases which indicate porn (9)

# Words which mainly show up in porn spam
body     PORN_10                /(?:\bslut|\bwhore|\bxxx\b|\bporn|\b(?:Asian|Japanese|oriental)\s+(?:girls|schoolgirls)\b|\bbabes\b|gang[ -]?bang)/i
describe PORN_10                Uses words and phrases which indicate porn (10)

# Words which often show up in spam, but also show up somewhat often in
# non-spam
body     PORN_11                /hard-?core/i
describe PORN_11                Uses words and phrases which indicate porn (11)

# Like PORN_3, but it only needs to match two terms instead of three. To
# compensate for only matching two terms, so as to not generate too many
# false positives, there can only be up to seven characters between terms,
# rather than the 15 for PORN_3; it also uses a short list of terms.
body     PORN_12                /(?:(?:\bxxx|\bsex|\bslut|\bwhore|\bhottest\b|hard-?core|\bhorny\b|\bhorniest\b|\bvirgin|\bnaughty\b|\bnaughtiest\b|\bwebcam|\ble[sz]b(?:ian|o)
describe PORN_12                Uses words and phrases which indicate porn (12)

-------
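
The trade-off between PORN_3 and PORN_12 (number of terms required versus the gap allowed between them) can be tried out with a small Python sketch. The helper name `proximity_rule` and the short term list are mine; the list is only a subset chosen for illustration, not the actual PORN_12 list.

```python
import re


def proximity_rule(terms, max_gap, min_terms):
    """Build a PORN_3-style pattern: at least min_terms terms in a row,
    each followed by at most max_gap arbitrary characters."""
    alternation = "|".join(terms)
    return re.compile(
        r"(?:(?:%s).{0,%d}){%d,}" % (alternation, max_gap, min_terms), re.I
    )


# Illustrative subset of terms only.
TERMS = [r"\bxxx", r"\bsex", r"hard-?core", r"\bhorny\b", r"\bwebcam"]

three_loose = proximity_rule(TERMS, 15, 3)  # PORN_3 style
two_tight = proximity_rule(TERMS, 7, 2)     # PORN_12 style

for text in ["xxx webcam",
             "free xxx and some webcam",
             "hardcore xxx webcam"]:
    print(repr(text),
          "three_loose:", bool(three_loose.search(text)),
          "two_tight:", bool(two_tight.search(text)))
```

With only two terms close together, the two-term rule fires and the three-term rule does not; once the two terms are more than seven characters apart, neither fires.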

Oh, and I also added "lust", "panty" and "panties" to the PORN_4 rule:

uri PORN_4  /^https?:\/\/[\w\.]*(?:xxx|sex|anal|slut|pussy|cum|nympho|suck|porn|hardcore|taboo|whore|voyeur|lesbian|gurlpages|naughty|lolita|teen|schoolgirl|kooloffer|erotic|lust|panty|panties)\w*\./
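
If you want to poke at the PORN_4 pattern outside of SpamAssassin, a rough Python equivalent looks like this. The hostnames are invented examples under example.com/example.org, and Python's `re` stands in for Perl here, so treat it as an approximation.

```python
import re

# Same pattern as the PORN_4 rule above, written for Python's re module.
# Like the original, it is case-sensitive.
PORN_4 = re.compile(
    r"^https?://[\w.]*(?:xxx|sex|anal|slut|pussy|cum|nympho|suck|porn|"
    r"hardcore|taboo|whore|voyeur|lesbian|gurlpages|naughty|lolita|teen|"
    r"schoolgirl|kooloffer|erotic|lust|panty|panties)\w*\."
)

for url in [
    "http://panties.example.com/",          # new keyword in the hostname
    "http://www.example.com/lust.html",     # keyword only in the path
    "https://www.wanderlust.example.org/",  # keyword inside a longer label
]:
    print(url, "->", "match" if PORN_4.search(url) else "no match")
```

Note that the keyword has to appear in the hostname part, since "[\w\.]*" cannot cross a "/", and that it matches as a substring of a longer label as well.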

-- 
Visit http://dmoz.org, the world's   | Give a man a match, and he'll be warm
largest human edited web directory.  | for a minute, but set him on fire, and
                                     | he'll be warm for the rest of his life.
[EMAIL PROTECTED]  ICQ: 132152059 |

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
