On Tue, Apr 8, 2025 at 5:42 PM Kevin Fenzi via infrastructure
<infrastructure@lists.fedoraproject.org> wrote:
>
> Greetings.
>
> We have had several applications crashing (resultsdb,
> resultsdb_ci_listener) or being slow (bodhi) of late.
>
> I did some digging today and discovered that db01 is pretty saturated on
> I/O. This means all the apps that use db01 are fighting for I/O and
> returning things slower than they should.
>
> On looking more, it was mailman that was using the vast majority of the
> I/O. I of course thought at first that it was crawlers, but it is not.
>
> Instead it seems to be the bounce processor.
> This processor wakes up every few minutes and runs a query for any
> bounces in the bounceevent table that have processed = 'false'.
> If it finds any, it processes them.
>
> However, that table is now 50GB and contains 152167015 rows
> (pretty much all of them processed = 'True').
>
> From the logs (which log slow queries), an example:
>
> 2025-04-08 21:32:40.510 GMT [7073] LOG:  duration: 267423.928 ms  plan:
>     Query Text: SELECT bounceevent.id AS bounceevent_id,
>         bounceevent.list_id AS bounceevent_list_id,
>         bounceevent.email AS bounceevent_email,
>         bounceevent.timestamp AS bounceevent_timestamp,
>         bounceevent.message_id AS bounceevent_message_id,
>         bounceevent.context AS bounceevent_context,
>         bounceevent.processed AS bounceevent_processed
>     FROM bounceevent
>     WHERE bounceevent.processed = false
>     Gather  (cost=1000.00..7441540.83 rows=1 width=137)
>       Workers Planned: 2
>       ->  Parallel Seq Scan on bounceevent  (cost=0.00..7440540.73 rows=1 width=137)
>             Filter: (NOT processed)
>
> Yes, that's 267 seconds to run that query, all the while hammering I/O,
> because the table is too large to cache well.
>
> This all pointed me to this 7-year-old bug report:
> https://gitlab.com/mailman/mailman/-/issues/343
> Hopefully abompard finds it a fun blast from the past.
:)

> Anyhow, a quick fix I think would be:
>
> * Save a copy of the latest database dump, which should have that table
>   backed up.
> * 'truncate bounceevent' to wipe it.
>
> Thoughts? +1s? Counter-proposals?
>
> I'd like to do this so the other db01 users stop having problems.
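A minimal sketch of the quick fix above, run against db01's mailman database (the dump path and database name here are hypothetical, and the real run would presumably pause mailman's runners first):

```sql
-- Keep a table-level copy first, in case anyone needs the history back:
--   pg_dump -Fc -t bounceevent mailman > /backups/bounceevent.dump
-- Then wipe the table. TRUNCATE is effectively instant regardless of row
-- count and returns the ~50GB to the filesystem, unlike a DELETE, which
-- would have to touch all 152 million rows:
TRUNCATE bounceevent;
```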
+1

(Maybe this should be automated to run once a year?)

--
真実はいつも一つ!/ Always, there's only one truth!
--
_______________________________________________
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
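The two longer-term ideas in the thread — make the polling query stop sequentially scanning 152 million mostly-processed rows, and periodically purge rows that are already processed — can be sketched stand-alone. This demo uses SQLite via Python's sqlite3 purely as a stand-in for the real PostgreSQL db01; the table and column names follow the thread, everything else is illustrative. Both engines support partial indexes, so the same shape of fix would apply upstream.

```python
# Hypothetical demo of two mitigations for the bounceevent problem above:
# 1) a partial index so "processed = false" lookups stop scanning the table,
# 2) a periodic purge of already-processed rows (the "once a year" idea).
# SQLite stands in for PostgreSQL; schema is abbreviated from the thread.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bounceevent (
        id INTEGER PRIMARY KEY,
        list_id TEXT, email TEXT, timestamp TEXT,
        message_id TEXT, context TEXT,
        processed BOOLEAN NOT NULL DEFAULT 0
    )
""")
# Simulate the real situation: almost every row is already processed.
conn.executemany(
    "INSERT INTO bounceevent (email, processed) VALUES (?, ?)",
    [(f"user{i}@example.org", 1) for i in range(10000)]
    + [("fresh@example.org", 0)],
)

# Mitigation 1: a partial index covering only unprocessed rows, so the
# bounce runner's polling query becomes an index lookup instead of a
# full scan over the whole 50GB table.
conn.execute(
    "CREATE INDEX ix_unprocessed ON bounceevent (processed) WHERE processed = 0"
)
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM bounceevent WHERE processed = 0"
).fetchall()
print(plan[0][3])  # plan detail; should name ix_unprocessed, not a table scan

# Mitigation 2: the periodic cleanup suggested in the reply — drop rows
# that have already been processed instead of keeping them forever.
conn.execute("DELETE FROM bounceevent WHERE processed = 1")
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM bounceevent").fetchone()[0])  # prints 1
```

On the real server the partial index would be something like `CREATE INDEX ... ON bounceevent (id) WHERE processed = false;` — that alone would turn the 267-second query into a near-instant one even before any rows are purged.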