Greetings. 

We have had several applications crashing (resultsdb,
resultsdb_ci_listener) or being slow (bodhi) of late.

I did some digging today and discovered that db01 is pretty saturated on
I/O. This means all the apps that use db01 are fighting i/o and
returning things slower than they should.

On looking more, it was mailman that was using the vast amount of the
i/o. I of course thought at first that it was crawlers, but it is not.

Instead it seems to be the bounce processor.
This processor wakes up every few minutes and does a query for any
bounces in the bounceevent table that are processed = 'false'.
If it finds any, it processes them. 

However, that table is now 50GB and contains 152167015 rows
(all of them pretty much processed = 'True').

From the logs (which logs slow queries), an example:

2025-04-08 21:32:40.510 GMT [7073] LOG:  duration: 267423.928 ms  plan:
        Query Text: SELECT bounceevent.id AS bounceevent_id, 
bounceevent.list_id AS bounceevent_
list_id, bounceevent.email AS bounceevent_email, bounceevent.timestamp AS 
bounceevent_timestamp,
 bounceevent.message_id AS bounceevent_message_id, bounceevent.context AS 
bounceevent_context, b
ounceevent.processed AS bounceevent_processed 
        FROM bounceevent 
        WHERE bounceevent.processed = false
        Gather  (cost=1000.00..7441540.83 rows=1 width=137)
          Workers Planned: 2
          ->  Parallel Seq Scan on bounceevent  (cost=0.00..7440540.73 rows=1 
width=137)
                Filter: (NOT processed)

Yes, thats 267seconds to process that query, all the time hammering I/O
because the table is too large to cache well.

This all pointed me to find this 7 year old bug report:
https://gitlab.com/mailman/mailman/-/issues/343
Hopefully abompard finds it a fun blast from the past. :)

Anyhow, a quick fix I think would be:

* Save a copy of the latest database dump that should have that table
backed up.
* 'truncate bounceevent' to wipe it

Thoughts? +1s? counter proposals?

I'd like to do this so the other db01 users stop having problems.

kevin
-- 
_______________________________________________
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue

Reply via email to