Here's an update on my gibberish.py courier pythonfilter module. It takes into account the latest metastasis of this form of gibberish spam, in which different random patterns occur on alternate lines. The module looks at successive lines, one at a time, and if it finds no run of matching lines that way, it looks at every other line for matches, then every 3rd line, etc., up to skipLines lines. If a line is shorter than gibChars, it eats an extra line and continues with the same skip value.
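To make the skip idea concrete, here is a rough sketch on toy data. It is not the module's code: `longest_run` and the sample `body` are made up for illustration, and the sketch ignores the short-line rule. It only shows why comparing adjacent lines misses alternating patterns while sampling every other line catches them.

```python
def longest_run(lines, skip, width=10):
    """Longest run of consecutive equal prefixes among every (skip+1)-th line.

    Hypothetical helper for illustration only. `width` plays the role of
    gibChars; prefixes containing a space are ignored, as in the module.
    """
    sampled = lines[skip::skip + 1]
    best = run = 0
    last = None
    for line in sampled:
        prefix = line[:width]
        if prefix == last and " " not in prefix:
            run += 1
            best = max(best, run)
        else:
            last = prefix
            run = 0
    return best


# Toy message body: two random-looking patterns on alternate lines.
body = ["xqzvkjwpqm 001", "plmoknijbu 002"] * 5

for skip in range(3):
    print("skip=%d longest run=%d" % (skip, longest_run(body, skip)))
```

With skip=0 the prefixes alternate, so no run ever builds up; with skip=1 every sampled line starts with the same prefix and the run grows with each line.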
I think this code is OK, but Gordon might want to sign off on it. I'm tired, and it took me a long time to get this working :( But it does work here. Comment out the syslog invocation to save log space. It logs debugging information, which may also be useful for automated log analysis. This code could probably be tightened up considerably. I'm using a couple of python iterators, and there are probably faster ways to do this using simple list index arithmetic. I HATE spammers!

-- 
Lindsay Haisley       | "UNIX is user-friendly, it just
FMP Computer Services |       chooses its friends."
512-259-1190          |          -- Andreas Bogk
http://www.fmp.com    |
#!/usr/bin/python
# vim: set expandtab ai ts=4:

import sys
import os.path
import syslog as S
import courier.config
import courier.control

maxMsgSize = 2000000  # Maximum message size. Pass if larger.
checkLines = 100      # Number of lines (including headers) to check for repetitive gibberish
gibLines = 40         # Number of consecutive gibberish lines required for rejection
gibChars = 10         # Number of characters to check in each line for repetitive gibberish
skipLines = 4         # Number of lines to scan for repetitive duplicates


def initFilter():
    courier.config.applyModuleConfig('gibberish.py', globals())
    # Record in the system log that this filter was initialized.
    sys.stderr.write('Initialized the "gibberish" python filter\n')


def piter(arr, n):
    # Yield every (n+1)th line of arr. A line shorter than gibChars is
    # dropped and one extra line is eaten before carrying on.
    iterLines = iter(arr)
    iterArr = []
    for foo in iterLines:
        for r in range(n):
            foo = next(iterLines, False)
        if foo and len(foo) >= gibChars:
            iterArr.append(foo)
        else:
            foo = next(iterLines, False)
    return iter(iterArr)


def gibDetect(bf):
    global gLskip
    a = []
    # Read the first checkLines lines (headers included).
    with open(bf) as bfh:
        for i in range(checkLines):
            a.append(bfh.readline())
    lfcount = 0
    lcount = 0
    lastlf = ''
    subject = ''
    # Pick up the subject for the log message.
    for l in a:
        if not subject:
            if l[:8] == "Subject:":
                subject = l[9:]
                break
    # Try each skip value in turn, counting consecutive matching prefixes.
    for lskip in range(skipLines):
        for l in piter(a, lskip):
            lf = l[:gibChars]
            if lf == lastlf and " " not in lf:
                lfcount += 1
                if lfcount >= gibLines:
                    gLskip = lskip
                    return ("gibberish: %s: match: %s" % (subject.rstrip(), lastlf))
            else:
                lastlf = lf
                lfcount = 0
    gLskip = lskip
    return None


def doFilter(bodyFile, controlFileList):
    msgSize = os.path.getsize(bodyFile)
    if msgSize > maxMsgSize:
        return ''
    n = gibDetect(bodyFile)
    if n:
        sender = courier.control.getSendersMta(controlFileList)
        # Debugging output; comment out to save log space.
        S.syslog(S.LOG_INFO | S.LOG_MAIL, n + "; " + sender[5:] + ": lskip=%s" % gLskip)
        return "500 gibberish spam from %s" % sender
    return ''
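As for the list index arithmetic mentioned earlier, one possible shape for it is sketched below. This is an untested-in-production guess at a replacement for piter, based on my reading of what the iterator version does: start at index `skip`, step by `skip + 1`, and step one further after a short line. The names `sample_lines` and `min_len` are made up for this sketch; `min_len` stands in for gibChars.

```python
def sample_lines(lines, skip, min_len=10):
    """Index-based stand-in for piter (hypothetical, for illustration).

    Picks every (skip+1)-th line starting at index `skip`. A line shorter
    than `min_len` is dropped and one extra line is eaten before resuming.
    """
    picked = []
    i = skip
    while i < len(lines):
        if len(lines[i]) >= min_len:
            picked.append(lines[i])
            i += skip + 1
        else:
            # Short line: skip it and eat one extra line.
            i += skip + 2
    return picked


# Toy input with one short line in the middle.
sample = [
    "aaaaaaaaaa1",
    "bbbbbbbbbb1",
    "short",
    "aaaaaaaaaa2",
    "bbbbbbbbbb2",
    "aaaaaaaaaa3",
]

print(sample_lines(sample, 0))
print(sample_lines(sample, 1))
```

It returns a plain list rather than an iterator, which is all the loop in gibDetect needs. Whether it is actually faster on 100 lines is something I have not measured.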