Since the optimizer is choosing a seq scan over index scan when it seems
like it has good row estimates in both cases, to me that may mean costs of
scanning index are expected to be high. Is this workload on SSD? Has the
random_page_cost config been decreased from default 4 (compared with cost
of 1 unit for sequential scan)?

Your buffer hits aren't great. What is shared_buffers set to? How much ram
on this cluster?

With this table being insert only, one assumes correlation is very high on
the data in this column as shown in pg_stats, but have your confirmed?

To me, distinct ON is often a bad code smell and probably can be re-written
to be much more efficient with GROUP BY, lateral & order by, or some other
tool. Same with the window function. It is a powerful tool, but sometimes
not the right one.

Is "source" a function that is called on field1? What is it doing/how is it
defined?

Reply via email to