On Tue, 2004-05-25 at 15:55, E. Falk wrote:
> Hello Bram,
>
> I'll try to answer you at least insofar as why my organization wouldn't
> allow a hoax ruleset installed on the MTA...

Thanks for responding. I like this line of thought, but I got quite
confused reading the thread. The other messages in this thread already
explained that Bayes does indeed look at the From header, but in a
different way than AWL does. Things are starting to make sense to me, and
the clouds surrounding Bayes and AWL appear to be clearing a bit further! :)

[...]
> Signatures - most corporate employees have sigs (fairly distinct ones,
> often). That's a Bayes problem right there, because your hoaxes are
> going to get pretty high scores due to the limited variety of them (they
> are much easier to identify than spam, in general).

I can see that signatures can be a problem when using Bayes *when treating
hoaxes as spam*, because the same sigs - and therefore the same Bayes
tokens - would appear both in ham and in hoaxes (spam in this scenario).
[a light starts shining!][deletes more questions]
I see it now! Because we're trying to increase the score for hoaxes, all
tokens found in a hoax would get a higher score in Bayes, including the
tokens found in the (corporate) sigs!
Of course, it makes sense now. We would need some way (negative scoring
rules, perhaps) to compensate for this, but that would cause problems as
well, because it would be easy for spammers to forge...
This is indeed proving more difficult than I first imagined!

[...]
> As above, hoaxes will probably score fairly high, and so AWL will
> compound that problem. On a personal e-mail system, this is an
> acceptable level of risk, since volume is probably low and FP's can be
> examined. On a corporate e-mail system, you're looking at risk vs.
> benefit scenarios. When a lost e-mail could cost a lot of money, you've
> got to be a little careful.

Indeed, for larger companies the cost of having to manually delete hoaxes
is much lower than the cost of losing a (potential) client.

[...]
> Hope this helps - this is by no means authoritative, just one company's
> policy. :)

It did indeed, I even figured some things out while replying!

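A rough sketch of the signature effect described above, in Python. This is
not how SpamAssassin's Bayes actually computes token probabilities; it is a
simplified per-token ratio, and the messages, the "acme" signature tokens
and the function name are all made up for illustration. It only shows the
direction of the effect: once hoaxes carrying the corporate sig are trained
as spam, the sig tokens stop looking purely hammy.

```python
def token_spam_probability(token, spam_docs, ham_docs):
    """Simplified spamminess of a token: its rate in spam relative to
    its combined rate in spam and ham (0.0 = pure ham, 1.0 = pure spam)."""
    spam_rate = sum(token in doc for doc in spam_docs) / len(spam_docs)
    ham_rate = sum(token in doc for doc in ham_docs) / len(ham_docs)
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: treat as neutral
    return spam_rate / (spam_rate + ham_rate)


# Hypothetical corporate signature tokens, present in every internal mail.
SIG = {"acme", "corp", "confidential"}

# Each message is modelled as a set of tokens.
ham = [{"meeting", "tomorrow"} | SIG,
       {"invoice", "attached"} | SIG,
       {"lunch", "friday"} | SIG]
spam = [{"viagra", "cheap"}, {"lottery", "winner"}, {"cheap", "loans"}]

# Hoaxes forwarded by colleagues, so they carry the same signature.
hoaxes = [{"virus", "warning", "forward"} | SIG,
          {"forward", "everyone", "virus"} | SIG]

# Before: the sig token only ever appears in ham.
before = token_spam_probability("acme", spam, ham)

# After: hoaxes are trained as spam, so the sig token appears in spam too.
after = token_spam_probability("acme", spam + hoaxes, ham)

print(f"'acme' before training hoaxes as spam: {before:.2f}")
print(f"'acme' after training hoaxes as spam:  {after:.2f}")
```

In this toy data the sig token moves away from "pure ham" as soon as the
hoaxes are learned as spam, which is the problem with legitimate corporate
mail that the quoted text warns about.
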
Thanks again
Bram
-- 
# Mertens Bram "M8ram" <[EMAIL PROTECTED]>   Linux User #349737 #
# SuSE Linux 8.2 (i586)  kernel 2.4.20-4GB  i686  256MB RAM #
# 7:33pm up 64 days 23:09, 7 users, load average: 0.01, 0.04, 0.00 #
