> -----Original Message-----
> From: Mark Martinec [mailto:mark.martinec...@ijs.si]
> Sent: Wednesday, August 29, 2012 5:00 PM
> To: users@spamassassin.apache.org
> Subject: Re: When force-expire doesn't work...
>
> Rob,
>
> > Because bayes_seen was also quite big I read up on that too.
> > Since the table doesn't include any age information and (most)
> > everything I found says "just delete it", I emptied the table.
> > Although I think it's strange to just throw away information
> > about previous seen messages that have been classified as
> > either spam or ham. Any other insight in this would be
> > valued..
>
> No need to bother with bayes_seen, just purge it every once
> in a while when it grows large.
>
> > > Some people include atime information for that purpose.
> >
> > Yes, thanks.. I ran into a post that mentioned that some time
> > after I posted, and added such field which will indeed do what
> > I want. (It isn't going to help with the imported data though,
> > because that info is not available in the original bdb
> > files.)
>
> The main purpose of bayes_seen is to prevent a stream of
> same-contents messages arriving in a short succession from
> polluting a bayes database.
> It is unlikely that a same contents message arrives more than once
> during a long interval, and even if it does, there's not much harm
> done even if re-learnt.
>
> I believe the bayes_seen had its purpose when mail viruses were
> frequent and spam messages were arriving in non-personalized
> batches. These times have long since gone.

Mark, thanks for the explanation!

-- Rob