At 08:31 PM (-0500) 2/27/2004 (Friday), Vandervort, David wrote:
> I've had this problem several times when I fed spamassassin several hundred messages at once. The only solution I found (admittedly quite an annoying one) was to blow away the old database files and retrain from scratch - in smaller increments than previously so as not to break it again.
>
> I have no clue why it happens. Sometimes I can get away with large numbers of messages, sometimes I can't. Generally, I try to stick to no more than one or two hundred at a time so as to avoid the issue entirely.


I've been training it with 2000 to 3000 spam and 100 to 300 ham every week for the last ~7 weeks with no problems.

I just broke my ham down into smaller batches, and sa-learn didn't learn from any of them:

# sa-learn --ham --mbox Ham-001.mbox
Learned from 0 message(s) (450 message(s) examined).

# sa-learn --ham --mbox Ham-002.mbox
Learned from 0 message(s) (541 message(s) examined).

# sa-learn --ham --mbox Ham-003.mbox
Learned from 0 message(s) (540 message(s) examined).

# sa-learn --ham --mbox Ham-004.mbox
Learned from 0 message(s) (542 message(s) examined).

# sa-learn --ham --mbox Ham-005.mbox
Learned from 0 message(s) (540 message(s) examined).

# sa-learn --ham --mbox Ham-006.mbox
Learned from 0 message(s) (214 message(s) examined).
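
To see the totals across all the batches at a glance, the result lines can be summed with a small helper. This is only a sketch: `tally_sa_learn` is a hypothetical name, and it assumes every result line has exactly the "Learned from N message(s) (M message(s) examined)." shape shown above.

```shell
# Hypothetical helper: sum the "learned" and "examined" counts from
# sa-learn result lines read on stdin. Assumes the exact line format
# quoted in this message; any other lines are ignored.
tally_sa_learn() {
    awk '
        /^Learned from / {
            learned += $3        # third field is the learned count
            n = $5               # fifth field looks like "(450"
            sub(/^\(/, "", n)    # drop the leading parenthesis
            examined += n
        }
        END { printf "learned=%d examined=%d\n", learned, examined }
    '
}
```

Fed the six result lines above (for example from a saved log file), it reports learned=0 examined=2827.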


So if I just delete bayes_seen and bayes_toks and then retrain, will that solve my problem? Should I set use_bayes to 0 while I'm deleting and retraining so that stuff doesn't barf in the meantime?
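
In case it helps anyone trying the same thing, here is a rough sketch of moving the two files aside rather than deleting them outright, so the old database can be put back if retraining goes badly. `stash_bayes_db` is a hypothetical helper, and the directory passed to it is an assumption: use whatever directory your bayes_path setting actually writes to.

```shell
# Hypothetical helper: move bayes_seen and bayes_toks into a timestamped
# backup directory instead of deleting them. The argument is the directory
# holding the Bayes files (an assumption; check your own bayes_path).
stash_bayes_db() {
    bayes_dir=$1
    backup_dir="$bayes_dir/bayes-backup-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    for f in bayes_seen bayes_toks; do
        # Only move files that are actually present.
        if [ -e "$bayes_dir/$f" ]; then
            mv "$bayes_dir/$f" "$backup_dir/" || return 1
        fi
    done
    # Print where the old files went so they can be restored later.
    echo "$backup_dir"
}
```

After something like `stash_bayes_db "$HOME/.spamassassin"` (that path is a guess at a typical per-user setup), retraining would start again with the same `sa-learn --ham --mbox Ham-001.mbox` style commands as above.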


Thx.

-JR




