On 2026-03-18 03:01, Merlin Moncure wrote:
I've been maintaining an airflow style orchestrator in pl/pgsql, and
it's revealed a performance issue I just can't solve. There is a
table, task, which may normally contain billions of rows, but only a
tiny portion is interesting for specific reasons—a common pattern in
task-type systems.
...
I'm wondering if there are other tricks that might apply here, for
example, multi column index statistics...curious if anyone has
thoughts on that.
Any suggestions?
merlin
Hello. I think planner doesn't have information about distribution of
*async.task_execution_state(task)* unless it's part of any full index. I
would try to give that with extended statistics (postgresql 14+):
create statistics (mcv) task_task_execution_state_stat on
((async.task_execution_state(task))) from async.task;
analyze async.task;
If that won't help - please show distribution from pg_stats_ext view for
extended statistic above.
--
Alexey Ermakov