Package: spamassassin
Version: 3.3.1-1
Severity: normal

Hello,

I am having trouble with Bayes token expiration. I use a per-user MySQL database to store Bayes tokens, with `bayes_auto_expire 0` and `bayes_expiry_max_db_size 150000`. A cron job forces Bayes expiry for each user like this:

----8<----
sa-learn --username=exam...@example.com --force-expire
---->8----

The problem is that Bayes tokens do not seem to expire, and the `bayes_token` table recently exceeded 80 million records. For example, one user has 288278 tokens:

----8<----
$ sa-learn --username=us...@example.com --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0        261          0  non-token data: nspam
0.000          0       6339          0  non-token data: nham
0.000          0     288278          0  non-token data: ntokens
0.000          0 1277480932          0  non-token data: oldest atime
0.000          0 1381762106          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1381754818          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
---->8----

So I ran an expire manually for this user and got these debug logs:

----8<----
$ sa-learn --username=us...@example.com --force-expire -D
dbg: bayes: bayes journal sync starting
dbg: bayes: bayes journal sync completed
dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x22da088) implements 'learner_expire_old_training', priority 0
bayes: expiry starting
dbg: bayes: expiry check keep size, 0.75 * max: 112500
dbg: bayes: token count: 288103, final goal reduction size: 175603
dbg: bayes: first pass? current: 1381754809, Last: 1381716636, atime: 0, count: 0, newdelta: 0, ratio: 0, period: 43200
dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)
dbg: bayes: expiry max exponent: 9
dbg: bayes: atime     token reduction
dbg: bayes: ========  ===============
dbg: bayes: 43200     287217
dbg: bayes: 86400     287217
dbg: bayes: 172800    287189
dbg: bayes: 345600    285030
dbg: bayes: 691200    281466
dbg: bayes: 1382400   276177
dbg: bayes: 2764800   266007
dbg: bayes: 5529600   256971
dbg: bayes: 11059200  237786
dbg: bayes: 22118400  201142
dbg: bayes: couldn't find a good delta atime, need more token difference, skipping expire
---->8----

It seems that sa-learn tries to reduce the number of tokens but finally skips the expire process because it "couldn't find a good delta atime". I tried both reducing and increasing `bayes_expiry_max_db_size`, but that does not fix the issue, and the number of tokens keeps growing beyond the 150000 limit.

Is this a bug, or have I misunderstood something? Should I manually purge old tokens using something like `DELETE FROM bayes_token WHERE bayes_token.atime <= ...`?

Thanks in advance for your help.

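For reference, the figures in the debug log above can be reproduced with a short Python sketch. It only redoes the arithmetic printed by sa-learn (keep size, goal reduction, candidate atime deltas) and compares the logged reductions against the goal. The helper names `keep_size`, `goal_reduction`, and `candidate_deltas` are hypothetical, chosen for illustration; they are not SpamAssassin functions. The doubling rule `period * 2**k` is an assumption inferred from the table in the log.

```python
# Values copied from the configuration and the debug log above.
MAX_DB_SIZE = 150_000   # bayes_expiry_max_db_size
TOKEN_COUNT = 288_103   # "token count" in the debug log
PERIOD = 43_200         # "period" in the debug log (seconds)
MAX_EXPONENT = 9        # "expiry max exponent" in the debug log

# (atime delta -> token reduction) pairs copied from the debug table.
REDUCTIONS = {
    43200: 287217,
    86400: 287217,
    172800: 287189,
    345600: 285030,
    691200: 281466,
    1382400: 276177,
    2764800: 266007,
    5529600: 256971,
    11059200: 237786,
    22118400: 201142,
}


def keep_size(max_db_size):
    """Return 0.75 * max, the 'expiry check keep size' value in the log."""
    return max_db_size * 3 // 4


def goal_reduction(token_count, max_db_size):
    """Return how many tokens must go to get down to the keep size."""
    return token_count - keep_size(max_db_size)


def candidate_deltas(period, max_exponent):
    """Assumed rule: the period doubled for each exponent 0..max_exponent."""
    return [period * 2 ** k for k in range(max_exponent + 1)]


if __name__ == "__main__":
    goal = goal_reduction(TOKEN_COUNT, MAX_DB_SIZE)
    print("keep size:", keep_size(MAX_DB_SIZE))
    print("goal reduction:", goal)
    for delta in candidate_deltas(PERIOD, MAX_EXPONENT):
        removed = REDUCTIONS[delta]
        left = TOKEN_COUNT - removed
        print(delta, removed, "tokens left:", left, "over goal:", removed > goal)
```

With the logged numbers, every candidate delta would remove more tokens than the goal of 175603; even the largest delta (22118400) would remove 201142 tokens and leave 86961. The sketch does not establish whether this is what makes sa-learn skip the expire.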
Best regards,
Thomas Pierson

-- System Information:
Debian Release: 6.0.7
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable'), (100, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/8 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/dash

Versions of packages spamassassin depends on:
pn  libarchive-tar-perl     <none>             (no description available)
ii  libdigest-sha1-perl     2.13-1             NIST SHA-1 message digest algorith
ii  libhtml-parser-perl     3.66-1             collection of modules that parse H
ii  libnet-dns-perl         0.66-2             Perform DNS queries from a Perl sc
ii  libnetaddr-ip-perl      4.028+dfsg-1       IP address manipulation module
ii  libsocket6-perl         0.23-1             Perl extensions for IPv6
ii  libsys-hostname-long-p  1.4-2              Figure out the long (fully-qualifi
ii  libwww-perl             5.836-1            Perl HTTP/WWW client/server librar
ii  perl                    5.10.1-17squeeze6  Larry Wall's Practical Extraction
ii  perl-modules [libio-zl  5.10.1-17squeeze6  Core Perl modules

Versions of packages spamassassin recommends:
ii  gcc                     4:4.4.5-1          The GNU C compiler
ii  gnupg                   1.4.10-4+squeeze3  GNU privacy guard - a free PGP rep
ii  libc6-dev               2.11.3-4           Embedded GNU C Library: Developmen
ii  libio-socket-inet6-per  2.65-1.1           Object interface for AF_INET6 doma
ii  libmail-spf-perl        2.007-1            Perl implementation of Sender Poli
ii  make                    3.81-8             An utility for Directing compilati
ii  perl [libsys-syslog-pe  5.10.1-17squeeze6  Larry Wall's Practical Extraction
ii  re2c                    0.13.5-1           tool for generating fast C-based r
ii  spamc                   3.3.1-1            Client for SpamAssassin spam filte

Versions of packages spamassassin suggests:
ii  libdbi-perl             1.612-1            Perl Database Interface (DBI)
pn  libio-socket-ssl-perl   <none>             (no description available)
ii  libmail-dkim-perl       0.38-1             cryptographically identify the sen
pn  libnet-ident-perl       <none>             (no description available)
ii  perl [libcompress-zlib  5.10.1-17squeeze6  Larry Wall's Practical Extraction
pn  pyzor                   <none>             (no description available)
pn  razor                   <none>             (no description available)