Re: [RFC] speed up count(*)

Tomas Vondra Wed, 20 Oct 2021 11:23:39 -0700

On 10/20/21 19:57, Tom Lane wrote:

John Naylor <john.nay...@enterprisedb.com> writes:

Perennially our users have complaints about slow count(*) when coming from
some other systems. Index-only scans help, but I think we can do better. I
recently wondered if a BRIN index could be used to answer min/max aggregate
queries over the whole table, and it turns out it doesn't. However, then it
occurred to me that if we had an opclass that keeps track of the count in
each page range, that would be a way to do a fast count(*) by creating the
right index. That would require planner support and other work, but it
seems doable. Any opinions on whether this is worth the effort?


The core reason why this is hard is that we insist on giving the right
answer.  In particular, count(*) is supposed to count the rows that
satisfy the asker's snapshot.  So I don't see a good way to answer it
from an index only, given that we don't track visibility accurately
in indexes.

Couldn't we simply inspect the visibility map, use the index data onlyfor fully visible/summarized ranges, and inspect the heap for theremaining pages? That'd still be a huge improvement for tables with mostonly a few pages modified recently, which is a pretty common case.

I think the bigger issue is that people rarely do COUNT(*) on the wholetable. There are usually other conditions and/or GROUP BY, and I'm notsure how would that work.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [RFC] speed up count(*)

Reply via email to