> Can you recommend an alternate process, or changes to the existing
> process that would be an improvement and would continue to achieve
> these goals? We are always looking for ways to improve.
I've been thinking about this recently. I'm mostly concerned with FPs on the highest-weighted tests, like Sniffer, so I was thinking about grouping held messages by the highest-weight test they failed... something I'm considering for the spam queue review app I'm working on in "spare" time. I do have a framework in place in the app to assign a keystroke to an action, such as copying the message, altering the copy to send to your false@ address, and releasing the original back for delivery. That makes FP processing on my end much easier, with one keystroke doing everything we need.

This also got me thinking about the flip side: spam reporting. There's a significant untapped load of spam that Sniffer doesn't fail but that we filter. I was thinking about creating a filter that would copy your spam@ address on messages that get moved to our archive (we archive held spam for 30 days in case we missed an FP) but that did not fail Sniffer. This would be after we have already processed for FPs.

Thoughts?

Darin.

----- Original Message -----
From: "Pete McNeil" <[EMAIL PROTECTED]>
To: "Message Sniffer Community" <sniffer@sortmonster.com>
Sent: Tuesday, June 06, 2006 7:29 PM
Subject: [sniffer]Re[2]: [sniffer]A design question - how many DNS based tests?

Hello Matt,

Tuesday, June 6, 2006, 12:37:56 PM, you wrote:

<snip/>

> appropriately and tend to hit less often, but the FP issues with
> Sniffer have grown due to cross checking automated rules with other
> lists that I use, causing two hits on a single piece of data. For
> instance, if SURBL has an FP on a domain, it is possible that
> Sniffer will pick that up too based on an automated cross reference,
> and it doesn't take but one additional minor test to push something
> into Hold on my system.

Please note: it has been quite some time now since the cross-reference style rule-bots were removed from our system. In fact, at the present time we have no automated systems that add new domain rules.
Another observation I might point out is that many RBLs will register a hit on the same IP - weighting systems using RBLs actually depend on this. An IP rule hit in SNF should be treated similarly to other RBL-type tests. This is one of the reasons we code IP rules to group 63 - so that they are "tumped" by a rule hit in any other group and are therefore easily isolated from the other rules.

<snip/>

> handling false positive reports with Sniffer is cumbersome for both
> me and Sniffer.

The current process has a number of important goals:

* Capture as much information as possible about any false positive so that we can improve our rule coding processes.
* Preserve the relationship with the customer and ensure that each case reaches a well-informed conclusion with the customer's full knowledge.
* Protect the integrity of the rulebase.

This link provides a good description of our false positive handling process:

http://kb.armresearch.com/index.php?title=Message_Sniffer.FAQ.FalsePositives

Can you recommend an alternate process, or changes to the existing process that would be an improvement and would continue to achieve these goals? We are always looking for ways to improve.

> I would hope that any changes
> seek to increase accuracy above all else. Sniffer does a very good
> job of keeping up with spam, and its main issues with leakage are
> caused by not being real-time, but that's ok with me. At the same
> time Sniffer is the test most often a part of false positives, being
> a contributing factor in about half of them.

Log data shows that SNF tags on average more than 74% of all email traffic, and typically a significantly higher percentage of spam. Since SNF is represented so highly in email traffic as a whole, it is likely that SNF would also be represented highly in the false positives on any given system, relative to other tests with lower capture rates.
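The base-rate point above can be made concrete with a quick back-of-the-envelope calculation. Except for the 74% capture rate mentioned above, all figures are hypothetical and chosen only for illustration; this is a deliberate simplification, not a model of any real system:

```python
# Hypothetical illustration: a test that fires on most traffic will
# appear in most false positives simply because of its volume, even
# if it is no less accurate than lower-volume tests.

p_high = 0.74     # fraction of all mail tagged by a high-volume test
p_low = 0.15      # assumed fraction tagged by a lower-volume test

n_fps = 10_000    # imagine 10,000 messages held in error system-wide

# If a test's firing were independent of whatever caused each error,
# it would be a "contributing factor" roughly in proportion to its rate.
print(f"high-volume test present in ~{p_high * n_fps:.0f} FPs")
print(f"low-volume test present in ~{p_low * n_fps:.0f} FPs")
```

The point is only that counting how often a test appears in FPs, without normalizing by how often it fires at all, will always make the highest-volume test look like the biggest offender.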
You've also indicated that you weight SNF differently than your other tests - presumably giving it more weight (this is frequently the case on many systems). How much do you feel these factors contribute to your findings?

> About 3/4 of all FP's (things that are blocked by my system) are
> some form of automated or bulk E-mail. That's not to say that other
> tests are more accurate; they are just scored more appropriately and
> tend to hit less often, but the FP issues with Sniffer have grown
> due to cross checking automated rules with other lists that I use,
> causing two hits on a single piece of data,

With regard to "causing two hits on a single piece of data": SNF employs a wide variety of techniques to classify messages, so it is likely that a match in SNF will coincide with a match in some other test. In fact, as I pointed out earlier, filtering systems that apply weights to tests depend on this very fact to some extent.

What makes weighting systems powerful is that when more than one test triggers on a piece of data, such as an IP or URI fragment, the events leading up to that match were distinct for each of the matching tests. This is the critical component in reducing errors through a "voting process": Test A uses process A to reach conclusion Z. Test B uses process B to reach conclusion Z. Process A is different from process B, so the inherent errors in process A are different from the errors in process B, and we presume it is unlikely that an error in Test A will occur under the same conditions as an error in Test B.

If a valid test result is the "signal" we want, and an erroneous test result is "noise" on top of that signal, then it follows: by combining the results of Test A and Test B we have the opportunity to increase the signal-to-noise ratio, to the extent that our assumptions about the errors are true.
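The voting argument can be sketched numerically. The 2% error rates below are hypothetical, chosen only to illustrate how independent errors multiply while correlated (copied) errors do not:

```python
# Sketch of the "voting" argument: two tests whose error modes are
# independent, versus one test that merely copies the other.

fp_a = 0.02   # assumed false-positive rate of Test A on legitimate mail
fp_b = 0.02   # assumed false-positive rate of Test B on legitimate mail

# Independent errors: the chance that both misfire on the same ham
# message is the product of the individual rates.
both_wrong_independent = fp_a * fp_b    # 0.04% instead of 2%

# Correlated errors (Test B is effectively a copy of Test A): requiring
# agreement buys nothing, because both fail on the same messages.
both_wrong_copied = fp_a                # still 2%

print(f"independent errors: {both_wrong_independent:.4%}")
print(f"copied errors:      {both_wrong_copied:.4%}")
```

This is why the distinctness of each test's underlying process matters more than the raw number of tests that agree.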
In fact, if no error occurs in both A and B under the same circumstances, then defining a new test C as (A+B)/2 will produce a signal that is "twice as clear" as test A or B on its own.

If I follow what you have said about false positives and SNF matching other tests, then you are describing a situation where the process for SNF and the alternate tests are the same - or, put another way, that SNF somehow represents a copy of the other test and so will also contain the same errors. If that were the case, then the equation would change and the advantage of combining the tests would evaporate, because the errors (noise) would be amplified as much as the desired result (signal).

I assure you that this is NOT the case with SNF. There are no components of SNF's filtering scheme that are copied from any other system. This is one of our primary design constraints, precisely because "starting from scratch" ensures SNF's results are distinct from other tests.

Previously, when we had bots that cross-referenced other tests as part of their validation process, the result of an SNF rule being present _DID_ represent a unique process (unique perspective) for that result. That is, the "meaning" of an IP matching in SNF was distinct from the meaning of the same IP matching in another test - even if that test had been used as part of the validation process. This is because the origination of the SNF rule was based on a distinct process - most commonly a spam reaching our spamtraps some significant number of times from the same source IP, and more commonly, several different spam reaching our spamtraps from a given source IP. Additionally, IP rules removed from SNF are removed permanently, so the absence of an IP match in SNF where one existed in the alternate test carries its own special meaning. In that context, an SNF IP match would add significant value to the equation because it would serve to validate the match in the alternate test.
Put another way: the vast majority of IP matches in the alternate test would not be present in SNF, so those that are present in SNF are significant because they represent that BOTH distinct testing processes agreed on the result.

These days, our new bots use entirely different processes to create IP rules, and they do not validate these rules against any other single RBL. To the extent that validation might have made the previous test cases less distinct, the new test cases are certainly more distinct, so you should probably reconsider how you view SNF's bot-generated rules.

---

In summary, if FPs with SNF have grown, it is not due to cross checking, since that process is no longer used. Also, if FPs with SNF have grown at all, then we need to understand your data better: overall, the rate of reported false positives is measurably lower even while our subscriber base has grown significantly.

http://reports.messagesniffer.com/Performance/FalseReportsRates.jsp

> and the growth of the Sniffer userbase which has become more likely
> to report first-party advertising as spam, either manually or
> through an automated submission mechanism.

Firstly, we handle spam submitted by humans (any humans) with different rules than spam that hits our spamtraps. We also generally discourage the use of broadly deployed, automated spam submission, precisely because these kinds of submissions tend not to agree between individual users, and frequently not with the policies of the system's administrator(s).

In any case, first-party advertising is not automatically considered legitimate traffic on many systems, and a significant portion of our userbase does appear to hold this view. Another segment of our customer base seems to prefer to take each case on its own - this seems to be the largest group, and it is also our strategy for the core rulebase.
The fact that many "first-party advertisements" run afoul of our spamtraps is also an important factor, since the only way those addresses could make it onto their lists is through harvesting - either directly or indirectly. This is a clear indicator that some of this content is reaching people who may not have a first-party relationship at all, or who, if they do, may have explicitly opted out of any further contact with the advertiser and especially its affiliates. (I have heard this complaint more than once, and have made the complaint myself.) It is also true that this kind of traffic frequently contains obfuscation and tracking mechanisms that are also used by hard-core spammers, and that there is a segment of advertisers that will leverage both legitimate and illegitimate bulk mail providers and "marketing services", either by choice or by accident. All of these things make the subject of "first-party advertising" problematic at best.

Nonetheless, we almost never code a rule for what appears to be legitimate first-party advertising, and even the questionable items must be heavily submitted before we will consider coding for them. Layered on top of this is the fact that our system prevents us from repeating rules, our protocols tend to force us to create very specific forms of rules (that would likely match if sourced from similar messages), and rules that have already been removed due to false positives remain in the system as reminders of what not to code. As a result we almost never make the same "mistake" twice, and we tend to learn quickly as a group.

Our strategy in these cases is to keep the core rulebase focused on the preferences of the greatest segment of our subscriber base, and to customize for individual subscribers in cases where their policy disagrees. This customization process most frequently occurs as a result of our false positive handling process...
though it is worth noting that the vast majority of reported false positives result in rules being removed from the core rulebase. To date, only a very small fraction of our customers have any customization.

Ongoing development work and upcoming features are focused on improving accuracy (on both the spam and ham sides of the equation), improving response time, increasing SNF's flexibility and breadth, reducing complexity, maintenance & administration, and improving speed & efficiency.

_M

--
Pete McNeil
Chief Scientist,
Arm Research Labs, LLC.

#############################################################
This message is sent to you because you are subscribed to
the mailing list <sniffer@sortmonster.com>.
To unsubscribe, E-mail to: <[EMAIL PROTECTED]>
To switch to the DIGEST mode, E-mail to <[EMAIL PROTECTED]>
To switch to the INDEX mode, E-mail to <[EMAIL PROTECTED]>
Send administrative queries to <[EMAIL PROTECTED]>
#############################################################