From: "Robert Brooks" <[EMAIL PROTECTED]>
> Jeremy Kister wrote:
> > On Monday, May 17, 2004 1:38 AM, I wrote:
> >
> >>while 20 words does seem accurate in detecting poison, there are too many
> >>false positives in ham, such as
> >
> >
> > playing with the regex just a bit more produces a far more accurate result
> > in both ham and spam:
> >
> > rawbody BAYES_POISON_01 /([a-z]{3,}\s+){20}/
>
> a lot of the bayes poison is either in a tiny font or in the background colour,
> if we could exclude text on this basis then it would be a winner - SA3.0?

There are already sets of external rules that handle each of these issues quite well individually. It'd be duck soup to write meta rules that combine the two: strings of long words with no short words or punctuation, together with micro or invisible text.

{^_^}
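
For anyone who wants to poke at the idea outside SpamAssassin, here is a rough Python sketch of that combination. The regex is copied from the quoted BAYES_POISON_01 rule; the function names and the `has_hidden_text` flag are made up for illustration and just stand in for whatever external rule flags tiny or background-coloured text.

```python
import re

# Pattern copied from the quoted BAYES_POISON_01 rule: 20 consecutive
# lowercase words of 3+ letters, each followed by whitespace.
POISON_RE = re.compile(r"([a-z]{3,}\s+){20}")


def looks_like_poison(text: str) -> bool:
    """Return True if the text contains a run matching the quoted regex."""
    return POISON_RE.search(text) is not None


def meta_hit(text: str, has_hidden_text: bool) -> bool:
    """Hypothetical meta rule: the word-run test AND a hidden-text test.

    has_hidden_text is an assumed input standing in for an external
    tiny-font or invisible-text rule; it is not a real rule name.
    """
    return looks_like_poison(text) and has_hidden_text


# 20 long lowercase words with no short words or punctuation.
poison = "quartz lantern " * 10
# Ordinary prose: short words and punctuation break up every run.
ham = "I went to the shop, and it was shut by the time we got there. " * 3

print(looks_like_poison(poison), looks_like_poison(ham))
print(meta_hit(poison, True), meta_hit(poison, False))
```

Requiring both conditions is the point of the meta rule: the word-run test alone produced the ham false positives mentioned earlier in the thread, so it only counts when the hidden-text test fires as well.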