Rob McEwen wrote:

>(2) ivmSIP/24 is attempting a very dangerous mission... which is to
>preemptively block snowshoe spam by listing entire /24 blocks when
>only a handful of IPs on that block have sent spam so far. But keep
>in mind that (a) specifically--ivmSIP is going to block some spam
>where that snowshoer hadn't sent from enough IPs to possibly be
>listed (yet!) on ivmSIP/24 AND (b) The reason I call ivmSIP/24's
>mission as "dangerous" is because there is a high risk of FPs
>whereby spammers and legit senders share blocks of IPs within the
>*same* /24 block. I've taken steps to greatly minimize that amount
>of time that happens... but it is almost impossible to prevent this
>altogether. Therefore, both medium-to-large ISPs and those who are
>extremely concerned about FPs should use ivmSIP/24 for scoring
>instead of blocking--in spite of my continued attempts to get
>ivmSIP/24 to have just as few FPs as ivmSIP. (and I'm still working
>on that!)

Rob, yes, I'm with you there. :)

I'm also sympathetic to your lawsuit concerns. There are abundant horror stories, here and elsewhere, about unskilled sysadmins improperly implementing an RBL and outright blocking on DNS data that was meant to be ADVISORY only.

However, the snowshoe problem has gotten so bad that I've started "labelling" all ranges of any host when I find enough "pure" snowshoe blocks in their space. I do NOT score on these merely "labelled" ranges, but use them as the equivalent of an SA "meta", in combination with the other tests I mentioned previously (i.e. on Barracuda, has an unsubscribe phrase, has a "teaser" phrase in From/Subject). I'm finding that is extremely effective, and some combos (reliable "teaser" + any other single test) have zero FPs (so far).

My own IP-to-Nation data file (both real and hand-classified "virtual" nations) is only used by my own people (all somewhat cautiously screened). I write all the "base" rules, and we have a kick-butt FP pipeline, so I don't have to worry about a random user misunderstanding what a particular IP block classification is for. I can be far more aggressive than most. :)

What I want to do is expand my merely-labelled IP ranges, and I was hoping I could do a straight import of your /24 list into its own unique country code, then run some MassChecks and see how that goes. Ideally, that should be helpful to both us and you.

>(4) And I'm about to implement a large improvement to ivmSIP. I
>found a bug in the programming (that had been there all along) which
>was preventing some deserving IP from getting into ivmSIP. So ivmSIP
>is about to get better. Therefore, substantial improvements are
>about to happen to BOTH ivmSIP and ivmSIP.24 --therefore, I'd prefer
>that any publicly available stats/testing be done in a week or two
>from now--AFTER these improvements are made.

I understand that you want to review your data first, so no pressure.
:)

I would be happy to do a non-published "quick" look if you like, then send you any FPing IPs I see, and wait until you're happy with your own data before I share any public results.

>(5) regarding the "shared hosting environment"... if ALL of these
>mail servers resolve their queries using the *same* locally hosted
>DNS server for resolving queries, then there is only need for a
>single setup of the lists, for that one DNS server--and then there'd
>be a single price based on the cumulative total number of
>mailboxes--and, therefore, many quantity discounts would apply (or,
>am I not understanding you? Aren't these all hosted at the same
>physical location?... or multiple datacenters owned by the same
>company?)

I should have been clearer: I am _NOT_ a sysadmin/mailadmin. I'm justaprogrammer. :)

About five years ago, a volunteer-written filter at the main host I was using broke. I ended up fixing it, which started me down the path of filter programming. :)

Initially, my goal was merely to "fill in the holes" that exist in a shared hosting environment (where SA's full potential is limited by the need to target the lowest-common-denominator). It turned into a much larger project when I realized the data analysis potential of hand-classified data from a diverse group of smallish domains. All my volunteers grasp that they're helping each other, and are very motivated & enthusiastic. The project is still rather small (about forty domains, with about half a million spams per month); however, it's a nice size and quality for doing serious research. :)

We're split among several different hosts, so the only way it would be viable to use your lists in real time would be to set up our own DNS server, known only to project members. Since most of us are only receiving a trickle of snowshoe spam, that's not viable at this time. The ones who receive more than a trickle receive a FLOOD. As I mentioned, in some cases 80% of their FNs are from snowshoers.

- "Chip"
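
For anyone following along, here is a rough Python sketch of the "label, don't score" idea described above: import a list of /24 ranges under one label, then fire only when a labelled range coincides with at least one other test. Everything named here is an assumption for illustration: the label code "XS", the function names, the one-CIDR-per-line input format, and the test names are all hypothetical. It is not the actual ivmSIP/24 export format, nor the project's real filter code.

```python
import ipaddress

# Hypothetical "virtual nation" code for ranges imported from a /24 list.
SNOWSHOE_LABEL = "XS"


def load_labelled_ranges(lines, label=SNOWSHOE_LABEL):
    """Map each network in `lines` to `label`.

    Assumes one CIDR entry per line; blank lines and '#' comments are skipped.
    """
    ranges = {}
    for raw in lines:
        entry = raw.split("#", 1)[0].strip()
        if not entry:
            continue
        # strict=False lets "198.51.100.7/24" stand for 198.51.100.0/24.
        ranges[ipaddress.ip_network(entry, strict=False)] = label
    return ranges


def label_for(ip, ranges):
    """Return the label of the first range containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for net, label in ranges.items():
        if addr in net:
            return label
    return None


def meta_hit(ip, ranges, other_tests):
    """Meta-style combination: the label alone never fires.

    `other_tests` maps a (hypothetical) test name to True/False. A hit needs
    a labelled range AND at least one other positive test.
    """
    labelled = label_for(ip, ranges) is not None
    return labelled and any(other_tests.values())


if __name__ == "__main__":
    ranges = load_labelled_ranges(["192.0.2.0/24", "# comment", "198.51.100.7/24"])
    tests = {"on_barracuda": False, "has_unsubscribe": False, "has_teaser": True}
    print(meta_hit("192.0.2.55", ranges, tests))
    print(meta_hit("203.0.113.1", ranges, tests))
```

The linear scan in label_for is fine for a few thousand ranges; a real MassCheck run over a large list would probably want an indexed lookup instead.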