Hello Loren,

Monday, July 25, 2005, 9:55:36 PM, you wrote:
>> That's why we use 70_sare_name_eng.cf files, to indicate that these
>> rules work well only on systems which expect almost 100% English ham,
>> and little to no ham in other languages.

>> I've begun to wonder whether it might be worthwhile having
>> 50_scores.cf for English emails, and then 50_scores_de.cf for German
>> emails, and have SA pick the score appropriately depending upon the
>> language of the email...

LW> This is why I'd like to see a report-home option in SA that was
LW> enabled by default.

LW> We could invent a class of rules that were 'test rules'. They
LW> would have nil score and wouldn't report on the mail summary if
LW> they hit. But they would show up in the report-home summary as to
LW> whether they hit, and whether it was ham or spam.

How would we determine ham/spam? At this point all we have is SA's
first estimation, with no way of knowing whether that verdict is
accurate, a false negative, or a false positive. It would be more
accurate to report after human verification, but that would greatly
reduce the amount of feedback.

LW> Then we can make rules that pass initial testing and stick
LW> them out for what we believe is good use, or maybe even for pure
LW> testing purposes. SA systems around the world would pick up these
LW> rules with sa-update, and would report home on the hit stats. If
LW> we have a good hitter that sucks in 'de', then we move it to an
LW> english-only ruleset, or we have an exclude-de option on the front
LW> of the rule or rule grouping. If the sysadmin has set his local
LW> language correctly, things should work out correctly.

The ideal sounds great to me. It'd be really good to figure out how to
distribute "rules under consideration" around the world, and get
feedback on how they work in real life, before giving them a score.
The difficulty, as I see it, is determining just how well they do work.

Bob Menschel
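For what it's worth, something close to a "nil score" test rule can already be expressed in today's ruleset syntax: a near-zero score keeps the rule name visible in X-Spam-Status (and thus in logs that a report-home mechanism could harvest) without letting it move a message across the threshold. A minimal sketch, assuming a made-up rule name and pattern purely for illustration (and, if memory serves, SA already treats names beginning with T_ as in-testing rules with a tiny default score):

```
# Hypothetical "test rule" sketch -- rule name and regex are invented.
# The 0.001 score is effectively nil: it cannot tip a message into spam,
# but the rule name still shows up in the X-Spam-Status tests list,
# which is where per-rule hit stats could be collected from.
body     T_SARE_EXAMPLE_DE   /Beispiel-Spamtext/i
describe T_SARE_EXAMPLE_DE   Candidate rule under evaluation (German corpus)
score    T_SARE_EXAMPLE_DE   0.001
```

Whether the hit was ham or spam would still have to come from somewhere else, which is exactly the estimation problem discussed above.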
