https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6127





--- Comment #36 from John Hardin <[email protected]>  2009-06-11 09:09:39 PST ---
(In reply to comment #33)
>
> I don't believe I have db_verify installed.

db_verify should have been installed with the BDB libraries, along with other
BDB utilities like db_dump and db_load.

> Do you know of docs that cover the maintenance of db_file(s)?
> What do I need to add to cron to prevent future problems?

I have no idea, sorry. I've never had to deal with a corrupt BDB file myself.
I've been researching this from scratch to offer you advice here... :)

It's also entirely possible that the file _isn't_ corrupt. I'm suggesting
db_verify mainly as a data point for troubleshooting the segfault, and I'm
making mitigation suggestions on the assumption that the file is corrupt,
while Karsten investigates the SA code itself.

If the file _is_ corrupt, and repairing it (I don't know exactly how; perhaps
db_dump followed by db_load) makes the segfault go away, then that suggests
the BDB library has a bug. The library shouldn't segfault on corrupt data; it
should gracefully return a failure code.
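
For illustration, here is a minimal Python sketch of the db_dump/db_load round
trip, assuming both utilities are on PATH. The function name rebuild_bdb and
its parameters are hypothetical, and this has not been tried against a damaged
AWL file, so point it at a copy rather than the live database.

```python
import shutil
import subprocess


def rebuild_bdb(src, dst, dump_path, salvage=False):
    """Dump the Berkeley DB file `src` to `dump_path`, then load it into `dst`.

    Returns True if db_load reports success, False if a tool is missing
    or a step fails. (Hypothetical helper, for illustration only.)
    """
    for tool in ("db_dump", "db_load"):
        if shutil.which(tool) is None:
            return False

    dump_cmd = ["db_dump"]
    if salvage:
        # -r asks db_dump to salvage data from a possibly corrupt file.
        dump_cmd.append("-r")
    dump_cmd += ["-f", dump_path, src]  # -f writes the dump to a file
    dump = subprocess.run(dump_cmd, capture_output=True)

    # Assumption: in salvage mode a non-zero exit may still leave usable
    # output behind, so only the db_load result is checked in that case.
    if dump.returncode != 0 and not salvage:
        return False

    load = subprocess.run(["db_load", "-f", dump_path, dst], capture_output=True)
    return load.returncode == 0
```

Writing the dump to a file instead of a pipe keeps the intermediate text
around for inspection if the load step fails.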

So: run db_verify to check whether the AWL database file is corrupted; if it
is, check whether there's an update for your BDB libraries available, and if
so, see whether installing that update changes the behavior of the system. If
it still crashes, report a bug to the BDB devs. Keep the original file in case
they need a repro.
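
The first step above could be scripted roughly like this. It is a sketch that
relies only on db_verify's exit status (0 on success, non-zero otherwise). The
function name bdb_status and the tool parameter are hypothetical; the parameter
is there because some distributions may install the utility under a versioned
name.

```python
import shutil
import subprocess


def bdb_status(path, tool="db_verify"):
    """Return 'ok', 'failed', or 'tool-missing' for the file at `path`."""
    if shutil.which(tool) is None:
        return "tool-missing"
    result = subprocess.run([tool, path], capture_output=True, text=True)
    # A non-zero exit means verification did not succeed. That includes
    # "could not open the file", so 'failed' is a hint, not proof of corruption.
    return "ok" if result.returncode == 0 else "failed"
```

Run it against a copy of the AWL file so the original stays untouched as a
possible repro.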

Still, it would be best if SA responded gracefully to DB_File blowing up and
logged meaningful error messages while continuing to process the message.
Karsten: We know AWL and Bayes will get big; perhaps it would be prudent to
wrap their DB_File calls in signal trapping and recover gracefully from major
failures?
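
To make the idea concrete, here is a small Python sketch (SA itself is Perl,
so this is only an analogy, not a patch). It swaps signal trapping for process
isolation: resuming after a genuine SIGSEGV inside the faulting process is
generally not reliable, so the risky call runs in a child process and the
parent treats "killed by a signal" as a soft failure. The name run_isolated is
hypothetical.

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=10.0):
    """Run Python source `code` in a child interpreter.

    Returns (True, stdout) if the child exits 0, otherwise (False, reason).
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"

    if proc.returncode == 0:
        return True, proc.stdout.strip()
    if proc.returncode < 0:
        # On POSIX, a negative return code means death by that signal number.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = "signal %d" % -proc.returncode
        return False, "killed by " + name
    return False, "exit status %d" % proc.returncode
```

The caller can log the reason and carry on without the lookup result, which
is the "continue processing the message" behavior described above.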

As it is, I suspect you have four possibilities regardless of any fixes in SA:

(1) Stop using AWL completely;

(2) Attempt to recover the bulk of your AWL database using db_dump and db_load
or the AWL maintenance utility script mentioned in comment 32, and resume using
AWL with your current config;

(3) Completely wipe your AWL database file, and resume using AWL with your
current config;

(4) Reconfigure SA to use a MySQL-backed AWL, which will probably mean losing
your current AWL data (there may be a data migration procedure; I don't know).

If you continue using AWL you probably should add a periodic process to manage
the database size using the maintenance script or corresponding MySQL queries.
A 4.4TB AWL database is ridiculous, even if it is a sparse file.
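
A periodic size check needs to tell apparent size from allocated size, since
a sparse file can report terabytes while occupying far less disk. A minimal
POSIX-only sketch; the function name file_sizes is hypothetical, and the
512-byte unit for st_blocks is an assumption that holds on Linux and most
other Unix systems.

```python
import os


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for `path`."""
    st = os.stat(path)
    # st_size is the logical length; st_blocks counts blocks actually
    # allocated on disk, assumed here to be 512-byte units.
    return st.st_size, st.st_blocks * 512
```

A cron job could log both numbers and alert when either crosses a threshold
you choose.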

