On Wed, 26 Nov 2003, alan premselaar wrote:
> I've recently noticed something I think is a little strange but I'd
> like to confirm it with the list.
>
> My bayes database seems excessively large at 967M:
>
> -rw-rw-rw- 1 defang defang 61k Nov 26 16:34 bayes_journal
> -rw-rw-rw- 1 defang defang 624k Nov 26 15:58 bayes_seen
> -rw-rw-rw- 1 defang defang 967M Nov 26 15:58 bayes_toks
>
> sa-learn --dump magic
> 0.000 0 2 0 non-token data: bayes db version
> 0.000 0 3236 0 non-token data: nspam
> 0.000 0 2628 0 non-token data: nham
> 0.000 0 121176 0 non-token data: ntokens
> 0.000 0 1066969971 0 non-token data: oldest atime
> 0.000 0 1069829904 0 non-token data: newest atime
> 0.000 0 1069829905 0 non-token data: last journal sync atime
> 0.000 0 1069735390 0 non-token data: last expiry atime
> 0.000 0 2764800 0 non-token data: last expire atime delta
> 0.000 0 38065 0 non-token data: last expire reduction count
>
>
> is this really larger than it should be? or am i delusional?
>
> i'm running redhat 7.3 , sendmail 8.12.10 , mimedefang 2.37 and
> spamassassin 2.60
>
> any ideas are welcome
Yes, that size seems way out of line. It should be using about 30~50
bytes per token, assuming typical token size.
According to your 'non-token data: ntokens' value (121176), that
bayes_toks file should be using about 5~6 Mbytes, unless something is
whacko or you have some -very- large tokens in there.
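For anyone who wants to redo that arithmetic, here is a rough sketch in Python. The 30~50 bytes per token is only my ballpark figure, not a measured value, and I am assuming the "967M" in the listing means mebibytes. The names (estimate_bytes and the constants) are made up for this example. With these numbers the token data works out to roughly 3.6 to 6.1 MB, so the file is far larger than the token count alone would explain.

```python
# Rough size estimate for bayes_toks, using the figures quoted in this thread.
NTOKENS = 121176                  # "non-token data: ntokens" from the dump
BYTES_PER_TOKEN = (30, 50)        # ballpark per-token range; an assumption
ACTUAL_BYTES = 967 * 1024 * 1024  # 967M from the listing, assuming M = MiB


def estimate_bytes(ntokens, per_token):
    """Return the (low, high) expected size in bytes for the token data."""
    low, high = per_token
    return ntokens * low, ntokens * high


low, high = estimate_bytes(NTOKENS, BYTES_PER_TOKEN)
print(f"expected: {low / 1e6:.1f} to {high / 1e6:.1f} MB")
print(f"actual is about {ACTUAL_BYTES / high:.0f}x the high estimate")
```

This ignores any per-file overhead of the underlying database format, so treat it as an order-of-magnitude check only.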
One possibility: the "--dump magic" may be looking at a different set
of files. Just to double-check, run "sa-learn -D --dump magic" to see
which set of files it is looking at.
Dave
--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk