Hi,

> Might anyone be in a position to offer an authoritative response to
> these questions?
>
> I continue to see messages that are very similar to dozens of messages
> that have been marked as SPAM slipping through with *no Bayes scoring*
> (this is *after* fixing the SQL syntax error issue):
>
> bayes: cannot use bayes on this message; not enough usable tokens found
> bayes: not scoring message, returning undef
>

Have you tried to find out how many tokens are in your bayes DB? Run the
following as the user specified by bayes_sql_username (it probably doesn't
matter which user you run it as, but do it that way to be sure):

# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0     466417          0  non-token data: nspam
0.000          0     508868          0  non-token data: nham
0.000          0   10788203          0  non-token data: ntokens
0.000          0 1320901921          0  non-token data: oldest atime
0.000          0 1366385643          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1366348380          0  non-token data: last expiry atime
0.000          0   28651364          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

This should show you the number of messages learned as spam (nspam) and
as ham (nham), plus the total number of tokens (ntokens) in the db.
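
If you only want those three counters, here is a rough sketch that filters
the dump output. It assumes the column layout shown above (value in the
third column, label as the last word on the line), and bayes_summary is
just a name I made up for illustration, not something that ships with
SpamAssassin.

```shell
#!/bin/sh
# Sketch: summarise `sa-learn --dump magic` output read from stdin.
# bayes_summary is a hypothetical helper name, not part of SpamAssassin.
bayes_summary() {
    awk '
        # The counter value is field 3; the label is the last field.
        $NF == "nspam"   { nspam = $3 }
        $NF == "nham"    { nham = $3 }
        $NF == "ntokens" { ntokens = $3 }
        END {
            printf "nspam=%d nham=%d ntokens=%d\n", nspam, nham, ntokens
        }
    '
}

# Usage:
#   sa-learn --dump magic | bayes_summary
```

With the output above this prints a single line with the three counts.
As far as I know, the stock defaults (bayes_min_spam_num and
bayes_min_ham_num) want at least 200 of each learned before Bayes is used
at all, so those are the first numbers worth checking.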

> Is this normal? If so, what is the explanation for this behavior? I have
> marked dozens of nearly-identical messages with the subject "Garden hose
> expands up to three times its length" as SPAM (over the course of
> several weeks), and yet SA reports "not enough usable tokens found".
>

If they are identical, I don't believe learning them again will create new
tokens, per se.


> Is SA referring to the number of tokens in the message? Or the Bayes DB?
>

I believe it would be talking about the database, not the message.

Regards,
Alex
