Every now and again, I will encounter an unexplained long-running query. It’s a head scratcher moment, because this query that is still running for 20 minutes (not blocking) can be run independently in about 500ms
I can only assume that the problem query ran against the table(s) at a time when it was perhaps in need of a vacuum analyze... I’m guessing here, that the table had seen some amount of change and simply had out of date statistics. How can I avoid this? The auto-vacuum daemon is doing it’s thing, but there is always going to be an opportunity for a query to sneak in against a table that has recently seen large change, but not yet been analysed. On the application side, we can explicitly issue a VACUUM ANALYZE after each bulk operation - and often that is precisely what happens.. But - I am keenly aware that this cannot be performed within a transaction. That means there is always a (small) window in which a query can still execute in this scenario. Are there any other best practices that can mitigate this kind of problem? It’s rare, sure - but I don’t like sweeping these under the rug. I’m on PG 9.6.. perhaps there are planner improvements since then that might reduce the incidence of these (rare) issues. Any advice appreciated, thanks.