On Sep 8, 2008, at 1:25 PM, Jeremy Hughes wrote:
SpamSieve seems to have become really slow for me recently, and I
suspect it is because my spam corpus is pretty large (345,672 messages,
2,390,729 words).
Does this seem unreasonably large?
Yes, it's normal to have under 2,000 messages and 200,000 words in the
corpus.
There used to be a Prune Corpus option, but this has disappeared -
should I just open the corpus and delete everything that was "Last Used"
before 2008?
That would certainly make it faster. It would probably be better for
the accuracy, however, if you reset the corpus and then re-trained
SpamSieve with a smaller number of recent messages:
<http://c-command.com/spamsieve/manual-ah/using-spamsieve-with-yo>
SpamSieve's auto-training feature has been improved since you started
using it, so the corpus will no longer grow so large by itself.
I first noticed the slowness after moving from Tiger to Leopard a few
weeks back - not sure if this is connected.
Yes, it runs faster on Tiger. Apple made some of the APIs much slower
in Leopard.
--
Michael Tsai <http://c-command.com>