On Wed, 20 Jan 2016, Dianne Skoll wrote:

> On Wed, 20 Jan 2016 11:52:35 -0800
> Marc Perkel <[email protected]> wrote:
>
>> Again - Bayes compares what matches. My filter compares what doesn't
>> match.
>
> Your filter is exactly equivalent to Bayes if you do the following
> things:
>
> 1) Use combinations of up to four words as tokens, instead of just
> single tokens.
>
> 2) Throw out any tokens whose probability is not either 100% spam or 100% ham.
>
> Idea (1) is probably good.  We use words and word-pairs.  I'm not sure the
> extra storage for more than pairs is justifiable.
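
For illustration only, the two steps above can be sketched in Python. This is not code from SA, from Marc's filter, or from any other product; the names ngram_tokens, train and classify are made up, and the final "count the hits on each side" rule is an assumption, since the thread does not say how the surviving tokens are scored.

```python
from collections import Counter


def ngram_tokens(text, max_n=4):
    """Return every run of 1..max_n consecutive words in text (step 1)."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    }


def train(spam_msgs, ham_msgs, max_n=4):
    """Keep only tokens that are 100% spam or 100% ham (step 2)."""
    spam_counts = Counter(t for m in spam_msgs for t in ngram_tokens(m, max_n))
    ham_counts = Counter(t for m in ham_msgs for t in ngram_tokens(m, max_n))
    # A token seen on both sides has a probability strictly between 0 and 1,
    # so it is thrown out.
    spam_only = {t for t in spam_counts if t not in ham_counts}
    ham_only = {t for t in ham_counts if t not in spam_counts}
    return spam_only, ham_only


def classify(text, spam_only, ham_only, max_n=4):
    """Assumed scoring rule: whichever side has more matching tokens wins."""
    tokens = ngram_tokens(text, max_n)
    spam_hits = len(tokens & spam_only)
    ham_hits = len(tokens & ham_only)
    if spam_hits > ham_hits:
        return "spam"
    if ham_hits > spam_hits:
        return "ham"
    return "unsure"
```

With max_n=4 a message of W words yields up to 4W-6 distinct tokens (for W >= 4), versus up to 2W-1 for words plus word-pairs, which is the storage cost being weighed above.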

Personally I'd rather see SA implement *that*.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [email protected]    FALaholic #11174     pgpk -a [email protected]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 3 days until John Moses Browning's 161st Birthday