--On 9 November 2006 18:28:10 +0000 Chris Lightfoot <[EMAIL PROTECTED]> wrote:

> On Thu, Nov 09, 2006 at 06:05:28PM +0000, Ian Eiloart wrote:
>> --On 9 November 2006 17:28:07 +0000 Chris Lightfoot
>> <[EMAIL PROTECTED]> wrote:
> [...]
>> > If a user decides a piece of mail is spam, it's spam (if
>> > they change their decision then obviously the most recent
>> > decision holds).
>>
>> Ah, well by this definition, a human can never make a wrong decision,
>> just a decision that they might later revise.
>
> yes, that's kind of the point -- it's a bit futile for the
> machine to try to tell the user what kind of email they do
> or don't want. the flow of information is the other way
> around, and the best the machine can do is to make the
> same decisions that the user would if presented with the
> mail.

Oh, FFS. You said humans never made errors. Well, they do. They do
accidentally delete messages they wouldn't want to, when they're buried in
spam. That's why spam filtering is desirable.

And, actually, a machine can do it better in the case where a user gets a
lot of spam. It's been measured: <http://crm114.sourceforge.net/>

"For comparison, I measured my human accuracy to be around 99.84%, by
classifying the same set of about 3000 messages twice over a period of
about a week, reading each message from the top until I feel "confident" of
the message status, (one message per screen unless I want more than one
screen to decide on a message.) and doing the classification in small
batches with plenty of breaks and other office tasks to avoid fatigue. Then
I diff()ed the two passes to generate a result. Assuming I never duplicate
the same mistake, I, as an unassisted human, under nearly optimal
conditions, am 99.84% accurate.). CRM114 was more than ten times better."

-- 
Ian Eiloart
IT Services, University of Sussex

-- 
## List details at http://www.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://www.exim.org/eximwiki/
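
The two-pass measurement described in the quoted CRM114 passage can be sketched roughly as follows. This is only one reading of it, not the CRM114 author's actual script: it assumes each disagreement between the passes is exactly one mistake in exactly one pass (the "never duplicate the same mistake" assumption), so the error rate per decision is disagreements / (2 * messages). The function names and the toy labels are made up for illustration.

```python
def count_disagreements(pass_one, pass_two):
    """Count messages that got a different label in the two passes."""
    if len(pass_one) != len(pass_two):
        raise ValueError("both passes must cover the same messages")
    return sum(1 for a, b in zip(pass_one, pass_two) if a != b)


def estimated_accuracy(pass_one, pass_two):
    """Estimate per-decision accuracy from two labelling passes.

    Assumption (not from the source): every disagreement is one error
    in one of the two passes, and no error is made in both passes.
    """
    if not pass_one:
        raise ValueError("need at least one message")
    errors = count_disagreements(pass_one, pass_two)
    decisions = 2 * len(pass_one)  # each message was classified twice
    return 1.0 - errors / decisions


# Toy data only: 8 messages, labelled twice, one disagreement (index 3).
first = ["spam", "ham", "spam", "spam", "ham", "ham", "spam", "ham"]
second = ["spam", "ham", "spam", "ham", "ham", "ham", "spam", "ham"]

print(count_disagreements(first, second))  # 1
print(estimated_accuracy(first, second))   # 0.9375
```

Note that mistakes made identically in both passes are invisible to this method, so the figure is an upper bound on the real accuracy.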
