Steven Stern wrote:
We had a server go crazy last night and reset its date into August of
2277. In any case, we've resolved that, but now I can't get bayes to
expire.
After the clock was correctly set, I deleted all tokens that had a
lastupdate in the future, and also removed similar bayes_seen rows. I
then reset the token count in bayes_vars to the correct value.
When I try to run sa-learn --force-expire, nothing gets expired and
the token list keeps growing. Will this get better on its own or do I
need to intervene?
You might need to ditch your bayes database.
The database will, over time, partially fix itself, but right now any
"one off" tokens learned while the date was off are stuck in your bayes
DB until 2277. SA's expiry method works on the "age" of a token,
measured from when it was last accessed (its atime). That method has
absolutely no way to deal with atimes that are in the future, so it
will never try to expire those tokens.
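To illustrate why a future atime never expires, here is a minimal
sketch of an age-based cutoff. This is not SpamAssassin's actual expiry
code; the function name, the token names and the max_age value are all
made up for the example.

```python
def expirable(tokens, now, max_age):
    """Return the names of tokens whose age (now - atime) exceeds max_age.

    Hypothetical helper: 'tokens' maps a token name to its atime in
    epoch seconds. A token with an atime in the future has a negative
    age, so it can never exceed any positive max_age.
    """
    return [name for name, atime in tokens.items() if now - atime > max_age]


# Hypothetical data: one token last seen a few years ago, one stamped
# while the clock said 2277.
tokens = {"common": 1_100_000_000, "oneoff": 9_706_348_800}
print(expirable(tokens, now=1_200_000_000, max_age=50_000_000))
```

Only "common" is reported, because the age of "oneoff" comes out
negative.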
It can partially fix itself, because every time a token gets accessed,
its atime gets updated. So as the more common tokens get used, they'll
start rotating out as they would normally. However, any unique tokens
are stuck there.
If you're *really* desperate to preserve the bayes DB, you could wait a
couple days, do a sa-learn --backup, use grep to remove all the lines
with absurd atimes, then use sa-learn --restore. That's a good bit of
work to go through...
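If grep feels too blunt, the filtering step could also be sketched in
Python. This assumes the backup file uses tab-separated token lines of
the form "t, spam count, ham count, atime, token"; check your own
backup file before trusting that layout. The function name is
hypothetical.

```python
import time


def drop_future_tokens(lines, now=None):
    """Return the backup lines, minus token lines with an atime after 'now'.

    Assumed layout (verify against your backup): token lines start with
    't' and carry the atime in the fourth tab-separated field. All other
    lines are passed through untouched.
    """
    if now is None:
        now = int(time.time())
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "t" and len(fields) >= 5:
            try:
                atime = int(fields[3])
            except ValueError:
                # Not the layout we assumed; keep the line rather than guess.
                kept.append(line)
                continue
            if atime > now:
                continue  # atime is in the future, drop this token
        kept.append(line)
    return kept
```

You would read the backup file into a list of lines, pass it through
this function, write the result to a new file, and feed that file to
sa-learn --restore.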
If you decide to go this route: for reference, and assuming my
scratchpad math is right, the atimes for 2277 should be around 9.7
billion, while the ones for 2008 should be around 1.2 billion. Of
course, that's assuming the atimes are stored as 64-bit values and
aren't wrapping as 32-bit numbers. However, if that were the case,
they'd be wrapping to around 2005, and your expire runs should show
really high token eliminations, not really low ones.
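For anyone who wants to check that scratchpad math, here is a quick
sketch. It assumes atimes are plain Unix epoch seconds in UTC and takes
1 August 2277 and 1 January 2008 as stand-in dates; the epoch() helper
is just for this example.

```python
from datetime import datetime, timezone


def epoch(year, month=1, day=1):
    """Unix epoch seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


bad = epoch(2277, 8)      # stand-in for an atime written with the wrong clock
good = epoch(2008)        # stand-in for a sane atime
wrapped = bad % 2**32     # what a 32-bit counter would hold instead

print(bad)                # 9706348800, about 9.7 billion
print(good)               # 1199145600, about 1.2 billion
print(wrapped)            # 1116414208
print(datetime.fromtimestamp(wrapped, tz=timezone.utc).year)  # 2005
```

By this calculation a 32-bit wrap would land in 2005, which is old
enough that those tokens would look stale rather than new.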