Hi,

> Might anyone be in a position to offer an authoritative response to
> these questions?
>
> I continue to see messages that are very similar to dozens of messages
> that have been marked as SPAM slipping through with *no Bayes scoring*
> (this is *after* fixing the SQL syntax error issue):
>
> bayes: cannot use bayes on this message; not enough usable tokens found
> bayes: not scoring message, returning undef
>

Have you tried to find out how many tokens are in your bayes DB? As the
user specified by bayes_sql_username (actually, it probably doesn't
matter, but you should, to be sure), run the following:

# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0     466417          0  non-token data: nspam
0.000          0     508868          0  non-token data: nham
0.000          0   10788203          0  non-token data: ntokens
0.000          0 1320901921          0  non-token data: oldest atime
0.000          0 1366385643          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1366348380          0  non-token data: last expiry atime
0.000          0   28651364          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

This should show you the number of messages learned as spam (nspam) and
as ham (nham), along with the number of tokens (ntokens) in the db.

> Is this normal? If so, what is the explanation for this behavior? I have
> marked dozens of nearly-identical messages with the subject "Garden hose
> expands up to three times its length" as SPAM (over the course of
> several weeks) as SPAM, and yet SA reports "not enough usable tokens
> found".
>

If they are identical, I don't believe learning them again will create
new tokens, per se.

> Is SA referring to the number of tokens in the message? Or the Bayes DB?
>

I believe it would be talking about the database, not the message.

Regards,
Alex
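
P.S. In case it's useful, here is a rough Python sketch for pulling the
counters out of the `sa-learn --dump magic` output so they can be checked
against the Bayes minimums. The function names are my own (hypothetical),
the regex assumes the column layout of the dump shown above, and the
200/200 thresholds are the documented defaults for bayes_min_spam_num and
bayes_min_ham_num; adjust them if your configuration overrides those.

```python
import re

# Assumed thresholds: the documented defaults for
# bayes_min_spam_num and bayes_min_ham_num.
MIN_SPAM = 200
MIN_HAM = 200

# Matches lines such as:
#   0.000  0  466417  0  non-token data: nspam
# In the dump above, the value of interest sits in the third column.
_MAGIC_LINE = re.compile(
    r"^\s*[\d.]+\s+\d+\s+(\d+)\s+\d+\s+non-token data:\s*(.+?)\s*$"
)


def parse_dump_magic(text):
    """Return {label: value} for each 'non-token data' line in the dump."""
    stats = {}
    for line in text.splitlines():
        match = _MAGIC_LINE.match(line)
        if match:
            stats[match.group(2)] = int(match.group(1))
    return stats


def has_enough_training(stats, min_spam=MIN_SPAM, min_ham=MIN_HAM):
    """True if both learned-message counts meet the minimums."""
    return (stats.get("nspam", 0) >= min_spam
            and stats.get("nham", 0) >= min_ham)
```

You would feed parse_dump_magic() the text printed by the command, for
example the stdout of subprocess.run(["sa-learn", "--dump", "magic"],
capture_output=True, text=True).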