Amir Caspi wrote:
On Nov 9, 2018, at 8:10 AM, Matus UHLAR - fantomas <uh...@fantomas.sk> wrote:
how many spams and hams did you train then?
As of right now:
0.000 0 258427 0 non-token data: nspam
0.000 0 106813 0 non-token data: nham
0.000 0 438310 0 non-token data: ntokens
I have increased it to this number, and on some servers even to double
that number.
I increased it to your recommendation, so per the above I am now storing more tokens.
Hopefully this helps.
My target for tweaking bayes_expiry_max_db_size at work has been to try
to hit no more than 5-10% daily churn in tokens; IIRC I've asked once
or twice but nobody else has spoken up with any of their own rules of
thumb. Right now it's probably a bit high at 2450000 (given that every
so often, there are a couple of days with no tokens expired), but the
default of 250K was far too low.
-kgd
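
For anyone wanting to check the 5-10% churn rule of thumb against their own numbers, here is a minimal sketch in Python. It assumes "daily churn" means tokens expired per day divided by ntokens, which is my reading of the above and not something the thread spells out. The function names and the expired-token figure are hypothetical; the sample text is just the nspam/nham/ntokens lines quoted earlier.

```python
import re

# Matches the "non-token data" lines quoted above:
# <prob> <spam count> <value> <atime> non-token data: <name>
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s+(.+?)\s*$"
)


def parse_dump_magic(text):
    """Return {name: value} for every 'non-token data' line in text."""
    counts = {}
    for line in text.splitlines():
        match = MAGIC_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def daily_churn_percent(tokens_expired_per_day, ntokens):
    """Churn as a percentage of the token count (assumed definition)."""
    if ntokens <= 0:
        raise ValueError("ntokens must be positive")
    return 100.0 * tokens_expired_per_day / ntokens


sample = """\
0.000 0 258427 0 non-token data: nspam
0.000 0 106813 0 non-token data: nham
0.000 0 438310 0 non-token data: ntokens
"""

counts = parse_dump_magic(sample)

# Hypothetical figure: take this from your own expiry logs.
expired_per_day = 30000

churn = daily_churn_percent(expired_per_day, counts["ntokens"])
in_target = 5.0 <= churn <= 10.0
print(f"{churn:.1f}% daily churn, within 5-10% target: {in_target}")
```

If the churn comes out well above the target, that would suggest raising bayes_expiry_max_db_size; if it sits at zero for days at a time, the limit is probably higher than it needs to be.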