Hi Folks,

We recently deployed amavis in a setup where the daemon is reloaded every 5 minutes to pick up updated preferences read using read_hash.  About 5 times in the past fortnight, the daemon has failed to restart.

I have tracked this down to a stale __db.nanny.db file in the /var/lib/amavis/db directory.  Watching the daemon with strace, I can see that during startup __db.nanny.db is created first and then renamed to nanny.db.  I haven’t been able to reproduce the problem on demand, but I imagine that on restart __db.nanny.db is created, the rename then fails, and the daemon exits, leaving the stale file behind.  More investigation is needed here, but a likely cause is that a connection is still in progress and happens to write nanny.db, or something similar.

The db cleanup on start only removes nanny.db and not __db.nanny.db, so the stale file survives every restart attempt and the daemon never starts, which leaves it in a broken state.
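
To illustrate the idea behind the fix, here is a minimal sketch in Python (amavisd itself is Perl, and this is not the attached patch).  The function name cleanup_db_dir and the exact pattern are my own assumptions for illustration; the point is only that the startup cleanup should match the __db.-prefixed temporary file as well as nanny.db.

```python
import os
import re
import tempfile

# Hypothetical pattern (an assumption, not taken from amavisd):
# matches nanny.db and its __db.-prefixed temporary file, nothing else.
STALE_DB_RE = re.compile(r"(?:__db\.)?nanny\.db")


def cleanup_db_dir(db_dir):
    """Remove nanny.db and __db.nanny.db from db_dir.

    Returns the sorted list of file names that were removed.
    """
    removed = []
    for name in os.listdir(db_dir):
        # fullmatch so that e.g. "nanny.db.bak" is left alone
        if STALE_DB_RE.fullmatch(name):
            os.unlink(os.path.join(db_dir, name))
            removed.append(name)
    return sorted(removed)


if __name__ == "__main__":
    # Demonstrate on a scratch directory rather than /var/lib/amavis/db.
    with tempfile.TemporaryDirectory() as db_dir:
        for name in ("nanny.db", "__db.nanny.db", "snmp.db"):
            open(os.path.join(db_dir, name), "w").close()
        print(cleanup_db_dir(db_dir))  # both nanny files are removed
        print(os.listdir(db_dir))      # snmp.db is left untouched
```

With a pattern that matches only nanny.db, the __db.nanny.db file in this example would be left behind, which is the stuck state described above.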

I have also found numerous mailing list posts from other people experiencing the same issue (and applying the same fix of rm /var/lib/amavis/db/*).

The attached patch cleans up both files on restart.  We could optionally apply the same regex to snmp.db, but I have not yet seen that issue in production, so I have left it alone for now.
It’s also not clear to me which database __db.NNN applies to, or whether that check should be || instead of &&.  I tried to find documentation on this in bdb but had no luck; hopefully this patch will do the job for now :)



Thanks,

Trent Lloyd





p +61 8 9481 0366 | e [email protected] | w www.webinabox.net.au

PO Box 328, LEEDERVILLE WA 6903

Attachment: 0001-Cleanup-__db.nanny.db-in-addition-to-nanny.db-during.patch
Description: Binary data
