On Tue, May 23, 2006 19:04, [EMAIL PROTECTED] said:
>
>     Amedee> I have noticed that a lot of spam contains disclaimer-ish text.
>     Amedee> If I train spambayes with "disclaimed" ham, I fear this will
>     Amedee> "pollute" the sb database.  The result might be that any email
>     Amedee> with a disclaimer-ish text will get a relatively high ham score.
>     Amedee> At the moment, I don't see a solution for this possible problem.
>     Amedee> I *could* not train on disclaimed ham, but if most of my
>     Amedee> correspondents have such boilerplates, training spambayes won't
>     Amedee> be very efficient.
>
> That depends.  Most common English words (most of the words in disclaimers
> are probably pretty common) should probably score around 0.5 and thus not be
> used in ranking messages, e.g.:

Interesting.
However, English is not my native language, and most of my correspondence
is in Dutch.
As a consequence, most common English words are quite uncommon in my ham,
so they will score a bit above 0.5. Perhaps not by much, but enough to
become significant after a while.
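
To make the concern concrete, here is a minimal sketch of a Robinson-style
smoothed token score. It is not spambayes' actual code: the function name
token_score, the smoothing values s and x, and all the counts are
assumptions picked purely for illustration.

```python
def token_score(spam_count, ham_count, nspam, nham, s=0.45, x=0.5):
    """Smoothed spam probability for a single token (illustrative sketch).

    spam_count / ham_count: trained spam / ham messages containing the token.
    nspam / nham: total trained spam / ham messages.
    s, x: assumed smoothing strength and prior for unseen tokens.
    """
    n = spam_count + ham_count
    spam_ratio = spam_count / nspam if nspam else 0.0
    ham_ratio = ham_count / nham if nham else 0.0
    if n == 0 or spam_ratio + ham_ratio == 0:
        return x  # no evidence: fall back to the neutral prior
    p = spam_ratio / (spam_ratio + ham_ratio)
    # Pull the raw estimate toward x; the pull weakens as n grows.
    return (s * x + n * p) / (s + n)


# Hypothetical counts: a common English word seen equally often in
# ham and spam stays neutral.
balanced = token_score(50, 50, 1000, 1000)

# Hypothetical counts for a mostly Dutch ham corpus: the same word is
# in most spam but only a small share of ham, so it leans spammy.
dutch_user = token_score(900, 100, 1000, 1000)

print(balanced, dutch_user)
```

With the made-up counts above, the balanced word stays at 0.5, while the
same word in a mostly Dutch ham corpus ends up well above 0.5.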

-- 
Disclaimer:
By sending an email to ANY of my addresses you are agreeing that:

   1. I am by definition, "the intended recipient"
   2. All information in the email is mine to do with as I see fit and
make such financial profit, political mileage, or good joke as it lends
itself to. In particular, I may quote it on usenet.
   3. I may take the contents as representing the views of your company.
   4. This overrides any disclaimer or statement of confidentiality that
may be included on your message.

_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
