Kris Deugau <[EMAIL PROTECTED]> writes:

> I've been having user quota problems due to AWL bloat in a growing
> number of accounts.  Most customers' AWL files include a *long* list of
> one-off spam addresses which *SIGNIFICANTLY* increase disk usage.

Definitely!
 
> I finally got disgusted with this, and hacked check_whitelist into
> trim_whitelist.  It makes a backup copy of the "old" AWL db, creates a
> fresh db and copies only those addresses that have a count greater than
> 1 from old to new.  It then moves the new db over the old one and makes
> sure ownership of the new db is correct if running as root.  I didn't
> want to autodelete the old db in case something broke.

Makes sense to me.
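
For anyone following along, here is a minimal Python sketch of the
copy-and-swap approach described above.  It is not the trim_whitelist
script itself.  The key layout (one key per address holding the count,
plus a companion key ending in "|totscore"), the function names, and
the ".bak" suffix are all assumptions made for illustration, and the
databases are treated as plain dict-like mappings with string keys.

```python
import os
import shutil

# Assumed key layout: "<addr>" -> count, "<addr>|totscore" -> total score.
TOTSCORE_SUFFIX = "|totscore"


def trim_entries(old, new, cutoff=1):
    """Copy entries whose count is greater than `cutoff` from old to new.

    `old` and `new` are dict-like mappings. Returns (kept, dropped).
    """
    kept = dropped = 0
    for key in list(old.keys()):
        if key.endswith(TOTSCORE_SUFFIX):
            continue  # companion keys are copied with their address
        try:
            count = int(old[key])
        except ValueError:
            count = 0  # unparseable count: treat as a one-off and drop it
        if count > cutoff:
            new[key] = old[key]
            score_key = key + TOTSCORE_SUFFIX
            if score_key in old:
                new[score_key] = old[score_key]
            kept += 1
        else:
            dropped += 1
    return kept, dropped


def replace_with_backup(old_path, new_path, owner=None):
    """Keep a backup of the old db, then move the new db over it.

    `owner` is an optional (uid, gid) pair, applied only when running as root.
    """
    shutil.copy2(old_path, old_path + ".bak")  # never auto-deleted
    os.replace(new_path, old_path)
    if owner is not None and os.geteuid() == 0:
        os.chown(old_path, owner[0], owner[1])
```

The backup is left in place on purpose, so a broken run can be undone
by copying the ".bak" file back.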
 
> At the moment, it only understands AWL files in "Berkeley DB (Hash,
> version 5, native byte-order)" format (or any other file-based hash with
> files that end with .db), but it could probably be expanded to
> understand others without too much trouble;  and could probably accept
> other options to control which addresses it discards (i.e., anything with
> a *really* high AWL entry likely doesn't need to be kept; choose the
> count cutoff, etc.).  It could also be adapted to upgrade AWL dbs as
> necessary.

Why not keep really high AWL entries?  It can't hurt.
 
> Size reduction varied a LOT;  I checked it on a number of users whose
> AWL db has grown to over 8M.  Typical reduction was ~8:1, with a few
> dropping to ~300K (~27:1).  Smaller dbs showed even more drastic
> reductions;  one went from 4500K to 86K (!!!).  Given that I have this
> server set up for per-user AWLs, and a 20M per-user quota on the home
> directory, this is pretty significant.  (I've had to move quite a few
> users' SA directories into another partition, and symlink them back in
> order to allow them 20M of "non-inbox" email folder space.)
> 
> If you or your users are running short on disk space due to ballooning
> AWL files (in total, or within the system quota), you may want to play
> with this.
> 
> Download at http://www.deepnet.cx/~kdeugau/spamtools/trim_whitelist

Sounds like an initial version of what has been proposed in this bug:

  http://bugzilla.spamassassin.org/show_bug.cgi?id=3082

A separate program seems like the way to go, but I am very hesitant about
adding new commands/options to handle expiry rather than just doing it
all automatically behind the scenes.

Daniel

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/    and open source consulting
