[sniffer] Re: Rules for Large International ISPs
Hello Andy,

Thursday, December 28, 2006, 10:34:15 AM, you wrote:

> Hi,
> This morning I had to file two false positive reports because emails from Wanadoo.FR and UOL.COM.BR were triggering SNIFFER-IP. I don't know if this is a coincidence or if this is a worrisome new trend

<snip/>

Our IP rule coding policies have not changed in quite some time, and the false positive rates for IP rules have dropped significantly since the last change. IP rules are now coded only by a specialized bot, which has very strict rules and looks only at clean spamtraps for recurring abuse.

    20061228150347 16 0 Match 799799 63 1 48 75
    20061228150347 16 0 Final 799799 63 0 174475

The above rule had been in place for 346 days without any false positive reports. The rule was coded by the previous robot and at the time was verified by 3 additional blacklists.

    20061228110558 15 16 Match 1235160 63 1 46 73
    20061228110558 15 16 Final 1235160 63 0 298073

This was coded by the new bot (F001) approximately 28 days ago - no prior false positives.

IP rules are currently coded by the F001 bot based on direct, repeated observations at clean spamtraps. IP rules are excluded on the first false positive report so that they cannot be reactivated without direct human intervention.

It is not practical for us to keep tabs on, or deeply research, every possible IP that may be used by any large (or otherwise) ISP. Instead we have the above policy and very strict observational rules to prevent the addition of IPs that are likely to produce significant legitimate traffic, and to quickly and permanently remove IPs that cause false positives. (Some exceptions, of course, apply.)

It is inevitable that there will be a nonzero error rate - but that error rate is demonstrably small given our current process, and we are constantly researching and developing techniques to improve on that rate.

Hope this helps,

_M

--
Pete McNeil
Chief Scientist,
Arm Research Labs, LLC.
# This message is sent to you because you are subscribed to the mailing list sniffer@sortmonster.com. To unsubscribe, E-mail to: [EMAIL PROTECTED] To switch to the DIGEST mode, E-mail to [EMAIL PROTECTED] To switch to the INDEX mode, E-mail to [EMAIL PROTECTED] Send administrative queries to [EMAIL PROTECTED]
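For readers who want to tally rule hits from log excerpts like the ones in the message above, here is a minimal Python sketch. It rests on assumptions that the thread does not state: that the first token is a YYYYMMDDhhmmss timestamp, and that the two tokens following Match or Final are the rule ID and a result symbol. All other fields are ignored, and the names LogRecord and parse_record are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogRecord:
    timestamp: datetime
    kind: str      # "Match" or "Final"
    rule_id: int   # ASSUMPTION: first token after the record kind
    symbol: int    # ASSUMPTION: second token after the record kind


def parse_record(line: str) -> LogRecord:
    """Parse one log line shaped like the excerpts quoted in the thread."""
    tokens = line.split()
    # ASSUMPTION: the leading token is a YYYYMMDDhhmmss timestamp.
    timestamp = datetime.strptime(tokens[0], "%Y%m%d%H%M%S")
    for i, token in enumerate(tokens):
        if token in ("Match", "Final"):
            return LogRecord(timestamp, token, int(tokens[i + 1]), int(tokens[i + 2]))
    raise ValueError("no Match/Final record found in line: " + line)


if __name__ == "__main__":
    record = parse_record("20061228150347 16 0 Match 799799 63 1 48 75")
    print(record.kind, record.rule_id, record.timestamp.isoformat())
```

Under those assumptions, the two excerpts in the message refer to rules 799799 and 1235160 respectively.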
[sniffer] Re: Rules for Large International ISPs
Hi Pete,

Thanks. Let me apologize for the accusatory tone of my message. Someone pointed out to me that my annoyance made me cross the line into being offensive.

I would suggest adding some intelligence to the F001 bot, so that it compares implicated address ranges against a table of excepted IPs, which you would build over time (or use some public sources of known-good IP ranges to get a start).

I understand the reporting rate of false positives is low. But that may just be because most false positives simply are never reported.

In my case, I couldn't use Sniffer to block outright - so for years I never worried much about false positives. Only very recently, I have tightened some weights AND I have improved the reporting to the point that it's now easier for me to spot certain false positives, and I have started to report them more consistently. Yet, I only review ONE out of a thousand mailboxes and many hundreds of domains - so chances are a large number of false positives are never even noticed by me on a daily basis (and I'm a very small operation).

So - the FP rates might be misleading, because they only reflect REPORTED FPs - no one knows how tiny or possibly how huge UNREPORTED FPs might be. Consequently, it may be worthwhile to improve F001 as mentioned before.

Best Regards
Andy Schmidt

Phone: +1 201 934-3414 x20 (Business)
Fax: +1 201 934-9206

-----Original Message-----
From: Message Sniffer Community [mailto:[EMAIL PROTECTED]] On Behalf Of Pete McNeil
Sent: Thursday, December 28, 2006 12:04 PM
To: Message Sniffer Community
Subject: [sniffer] Re: Rules for Large International ISPs

<snip/>
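Andy's suggestion above, checking a candidate address against a table of excepted ranges before a rule is coded, could look roughly like the following Python sketch. This is an illustration only and not how the F001 bot works: the table contents, the function names, and the min_hits threshold are all hypothetical, and the ranges shown are reserved documentation blocks standing in for real known-good ranges.

```python
import ipaddress

# HYPOTHETICAL exception table. These are documentation-only address
# blocks used as placeholders, not real ISP mail server ranges.
EXCEPTED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_excepted(ip: str, ranges=EXCEPTED_RANGES) -> bool:
    """Return True if the address falls inside any excepted range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)


def should_code_ip_rule(ip: str, spamtrap_hits: int, min_hits: int = 3) -> bool:
    """Decide whether a candidate IP may get a rule.

    ASSUMPTION: "repeated observations at clean spamtraps" is modelled
    here as a simple hit count compared against min_hits.
    """
    if is_excepted(ip):
        return False  # known-good range: never code a rule automatically
    return spamtrap_hits >= min_hits


if __name__ == "__main__":
    for candidate, hits in [("192.0.2.10", 50), ("203.0.113.5", 5)]:
        print(candidate, should_code_ip_rule(candidate, hits))
```

The exception check runs before the hit count on purpose, so that an address in a known-good range is never coded however often it shows up at a spamtrap.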
[sniffer] Re: Rules for Large International ISPs
Well, I guess I will ruffle someone's feathers again with my response here, but like your original message, I think we need to be honest here. This is not a Message Sniffer 'popularity' contest after all; we are paying customers and need to ensure SNF causes no False Positives.

Over the last few months, I've seen more and more false positives from Message Sniffer. The few that I sent to their FALSE address have always been challenged as legitimate. It's difficult at best for me to believe that our Local Newspaper and other legitimate sites that are classified by the SNF EXPERIMENTAL-IP rule are solid. As a result, I've constructed SA rules to counteract SNF False Positives. It got so bad within the last two weeks or so that I completely disabled SNF lookups to avoid complaints from our users.

To add insult to injury, last year they drastically upped the service price. Now my subscription is up for renewal. I am honestly thinking of NOT renewing it.

IMO, it seems that things have gone downhill since ARM bought the little company that could. Couple that with two years' worth of promises to update the MDaemon Plugin code, and all the various improvements that Spam Assassin and SARE rulesets have made... well, I question if it's worth the inflated cost anymore.

Shoot away, Sniffer Cheer-leaders... at least I am being honest.

-----Original Message-----
From: Message Sniffer Community [mailto:[EMAIL PROTECTED]] On Behalf Of Andy Schmidt
Sent: Thursday, December 28, 2006 1:26 PM
To: Message Sniffer Community
Subject: [sniffer] Re: Rules for Large International ISPs

<snip/>
[sniffer] Re: Rules for Large International ISPs
Hello Andy,

Thursday, December 28, 2006, 3:16:57 PM, you wrote:

<snip/>

>> need to ensure SNF causes no False Positives

> I agree here. While I can excuse the occasional accidental FP - there should NOT be the mindset that customers just have to live with the fact that the IP rules WILL always catch a certain amount of good emails, because no effort has been made to exempt known good IP/RevDNS ranges.

The bot does make this effort, though that can always be improved. Most IP FPs these days are for older rules that were valid at the time they were created and have shown consistent activity without FP reports over their lifetime. Those where activity has fallen off have been automatically removed.

> I also think that the low false positive argument is built on unproven assumptions. To me, researching and reporting a single false positive takes a very significant amount of time. Bigger users may simply have no practical way of reporting their false positives and instead just cope with it by using weight-based systems to compensate.

To be sure, larger systems do tend to have large weight-based systems in place. Nonetheless, we do hear from them when false positives occur, and we also hear from smaller systems that are more focused on individual customers and domains.

Where we get our FP data:

We have a range of customers who reliably report false positives to us, including a number of larger ISPs who consistently research and report their FPs in detail. We also have smaller service providers -- guys who live in their systems and do the same thing -- so we get a fairly wide perspective. In addition to that, we have links into a number of systems to provide us with rule IDs for messages that are released from quarantines, etc...

In the new version of SNF we are adding an automated reputation system component called GBUdb (Good, Bad, Ugly / Unknown, Ignore / Infrastructure). This system will (among other things) learn the good IP sources for a given system and automatically override pattern matching rules that hit known good messages. The system will also report these conflicts to us and in extreme cases will be able to auto-panic bad pattern rules so that they not only have no effect on the local systems but are also automatically withdrawn from the core rulebase. (Rule panics are rare, but also destructive. The auto-panic mechanism should completely mitigate them if/when one slips through.)

All that by way of saying - we are constantly working to improve our access to good sources of FP data - even while reducing the system admin's workload.

> The process of finding clues in the header, then finding the correct log file and then matching log file lines in Sniffer, then creating an evidence email, is just far too cumbersome. I should be able to forward any falsely identified emails (with SMTP headers) as easily as I can submit real spam for analysis. If that requires that Sniffer has to insert header information with the rule number - so be it. My inclination is, if it's currently 10 times harder to report false positives than it is to report missed spam, then I suspect that the false positive rates could be 10 times higher than what's actually being reported.

In many cases this is true -- the cases tend to be platform specific. In MDaemon, for example, rule ID information is injected into the headers so that FP reporting is a relatively painless process (no research required). The same is true on most *nix implementations. On IMail/SmarterMail type implementations it may be possible to add the ability to add headers to the message - but only at a significant I/O cost (rewriting the entire message with the new headers more than once).

I should also note that in most cases our system is able to identify the rules that matched an FP submission without any additional research on the part of the submitting admin. Our FP system re-scans each submission with every known rule -- it is unfortunately also true that there are some systems that, for a variety of reasons, modify the message during the submission process so that the rules no longer match -- in those cases the research is required in order to move forward.

The good news (if it can be called that) is that the need to do the research tends to be consistent -- if you are able to submit an FP without finding the matching log lines then you are likely to be able to do this consistently, and most folks do fall into this category.

---

Along with the new engine I am considering some mechanisms that might be able to store rule matching data along with a message ID hash on the local SNF node for a period of time. If research on this mechanism indicates that it would be useful and desirable, then we may be able to add a feature that would allow an SNF node to provide the data upon request when an FP is submitted without having to modify the message in any way -- provided the FP is discovered and submitted within the storage window...

This is all