On Wed, 2014-09-03 at 23:36 +0200, Axb wrote:
> On 09/03/2014 11:17 PM, Jesse Norell wrote:
> > Hello,
> >
> > Looking at recent botnet spam, comparing messages from one day to the
> > next, I see new URLs being advertised that resolve to the same IP
> > address as ones in the past. E.g. some at http://pastie.org/9525224
> >
> > The first of those was already on URIBL/RBL lists when it hit, but the
> > others were not - they all resolve to the same IP address. The messages
> > are hitting BAYES_50, on fairly well trained databases. I dug around
> > some and as best I can tell, SpamAssassin does not resolve the IP
> > addresses of URLs and add them to Bayes when training, is that correct?
> > Would it not make sense to do so?
>
> SA does query BLs for a domain's A record's IP.
> There are not many public lists which make a point of listing these.
> The SBL lookups are probably the most efficient:
> URIBL_SBL_A for the A rec's IP and
> URIBL_SBL for the NS rec's IP
>
> > I could write a program to extract URLs and add an X-URL-IP header or
> > something which bayes could use, but would this not be useful enough to
> > be in the normal part of training?
>
> Imo, unless you have hundreds of these within a couple of minutes it
> won't make much of a difference

Hmm, ok. Without "hapaxes" enabled, how many hits on a token do you
need for it to start being useful?

> > Also in the discussion, am I correct that a spamassassin "rule" wouldn't
> > be what does that, you would have to write a plugin?
>
> iirc, there isn't a _URI_ template tag for addheader "rules"
> You could open a bug & request such a feature to be added.
> (https://issues.apache.org/SpamAssassin/)

I actually meant to clarify that a plugin is what would need to
perform the IP lookup and add it as a bayes token. You can't say
"increment URL_IP:x.x.x.x spam token count when training" in the rule
language. (I've written some rules, never delved into plugins.)

-- 
Jesse Norell
Kentec Communications, Inc.
970-522-8107 - www.kci.net
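
P.S. For reference, here is a rough sketch of the external "extract URLs and add an X-URL-IP header" program mentioned above, run on a message before it is handed to training. It is in Python rather than a SpamAssassin plugin, and everything in it is an assumption for illustration: the function names are made up, the header name is just the one floated in this thread, and whether Bayes actually tokenizes that header in a useful way would need to be checked against the local configuration.

```python
import email
import re
import socket
from urllib.parse import urlsplit

# Header name floated in this thread; not an existing SpamAssassin convention.
URL_IP_HEADER = "X-URL-IP"

# Deliberately simple http(s) URL matcher; real spam needs something sturdier.
URL_RE = re.compile(r"""https?://[^\s<>"']+""", re.IGNORECASE)


def extract_hosts(text):
    """Return unique hostnames of http(s) URLs found in text, in order seen."""
    hosts = []
    for url in URL_RE.findall(text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. unbalanced IPv6 brackets
        if host:
            host = host.rstrip(".")
            if host and host not in hosts:
                hosts.append(host)
    return hosts


def resolve_ipv4(host):
    """Resolve a hostname to its IPv4 addresses; return [] if lookup fails."""
    try:
        return sorted(set(socket.gethostbyname_ex(host)[2]))
    except OSError:
        return []


def body_text(msg):
    """Concatenate the decoded text/* parts of a message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset label
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def add_url_ip_headers(msg, resolver=resolve_ipv4):
    """Add one X-URL-IP header per distinct IP that the body's URLs resolve to."""
    ips = []
    for host in extract_hosts(body_text(msg)):
        for ip in resolver(host):
            if ip not in ips:
                ips.append(ip)
    for ip in ips:
        msg[URL_IP_HEADER] = ip  # assignment appends; it does not replace
    return ips
```

Usage would be along the lines of `msg = email.message_from_string(raw)`, then `add_url_ip_headers(msg)`, then piping `msg.as_string()` into the usual training command. The resolver is passed in as a parameter so lookups can be cached or stubbed out; resolving at training time also means the recorded IP is whatever the name points to at that moment, which may differ from what it pointed to when the spam arrived.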