John Rudd wrote:
> [EMAIL PROTECTED] wrote:
>> John Rudd wrote:
>
>> You *will* not be getting a BAYES_90 or
>> BAYES_99 from that.
>
> My first one got BAYES_80, without having seen that zombie/relay before.
> That's enough for 2 points.

Which only tells me it had more than just the PDF attachment, which is not
what we're seeing here. You're also avoiding the point I was making by
saying "hey this spam I got which has all this additional content for bayes
to work with happened to score high". Well of course it did.

> It does matter, because it's not a "late receiver effect" unless
> someone, anyone, has received spam from that host before. And there's
> no relationship between "previous email from that host at all" and
> "being listed in the PBL".
>
> Show me that the "that they have received spam from" part is how they
> built their list, and not just "that appear to be end-user IP space".
>

"Additional IP address ranges are added and maintained by the Spamhaus PBL
Team, particularly for networks which are not participating themselves
(either because the ISP/block owner does not know about, is proving
difficult to contact, or because of language difficulties), and where spam
received from those ranges, rDNS and server patterns are consistent with
end-user IP space which typically contain high concentrations of "botnet
zombies", a major source of spam."

>
>> And? That's still a late-receiver effect, this particular message scored
>> X points because of what Y host did Z minutes ago, where Z could be
>> days/weeks/months of minutes.
>
> Any time I have seen the phrase "late receiver effect" used before, it
> is about the message itself, and not the relay. Thus, Razor, which is
> entirely about the message and not even remotely about the relay, is
> entirely "late receiver effect" based.

It applies to bayes as well. They both operate on matching hashes and
coming to some sort of confidence factor for the new message based on
historical data. The bayes system just happens to work at a finer-grained
level than Razor/Pyzor/DCC and isn't a distributed system.

The only "late receiver effect" you can experience with bayes, in the
absence of content in the body of the message for it to work with, is from
auto-training from a local spamtrap+greylisting. If there's additional
content besides headers+attachment, bayes might work if you're using
per-user. If you've got site-wide and some of your users subscribe to
certain common mailing lists, well, you could just as easily end up with
BAYES_00.

>> Yes I failed to exclude BOTNET from that, it's the only score from the
>> original message that started this that is solid. The reason is because
>> BOTNET is proactive, all the others are either 100% reactionary or
>> nearly so (PBL).
>
> My first one was caught by Botnet, Bayes_80 (again, no previous pdf
> spam, and no previous activity from that relay), and UNIQUE_WORDS. Even
> if Botnet alone hadn't been enough, and only had a score of 3 ...
> _either_ of the other two would have been enough to push it up to 5.

So it hit UNIQUE_WORDS, which means it had more than just the attachment,
so yeah, BAYES had something more to work with than just the headers.
Consider yourself fortunate.
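
For anyone following the arithmetic in the quoted paragraph, here is a rough sketch of how additive rule scores reach the threshold. It is an illustration only: the 3-point Botnet score and the 2-point BAYES_80 score are just the figures mentioned in this thread, the function names are made up for the example, and the threshold of 5 is the one John refers to (it matches SpamAssassin's default required_score).

```python
# Illustrative only: the scores below are the figures quoted in this thread,
# not the shipped defaults of SpamAssassin or the Botnet plugin.
REQUIRED_SCORE = 5.0  # threshold referred to in the thread


def total_score(hits, scores):
    """Sum the scores of every rule that fired on a message."""
    return sum(scores[name] for name in hits)


def is_spam(hits, scores, required=REQUIRED_SCORE):
    """A message is tagged as spam once its total reaches the threshold."""
    return total_score(hits, scores) >= required


# Hypothetical score table built from the numbers in the discussion.
scores = {"BOTNET": 3.0, "BAYES_80": 2.0}

botnet_only = ["BOTNET"]
botnet_plus_bayes = ["BOTNET", "BAYES_80"]

print(is_spam(botnet_only, scores))        # Botnet alone at 3 points
print(is_spam(botnet_plus_bayes, scores))  # Botnet plus BAYES_80
```

With these assumed numbers, Botnet alone stays under the threshold and only the extra 2 points from a second rule tips the message over, which is why it matters whether bayes had any body content to work with.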