On Wed, Apr 12, 2017 at 12:40 AM, Alexander Korotkov < a.korot...@postgrespro.ru> wrote:
> On Tue, Apr 11, 2017 at 8:24 PM, Alexander Kuzmenkov < > a.kuzmen...@postgrespro.ru> wrote: > >> I would like to propose a patch that speeds up the queries of the form >> 'select >> count(*) ... where ...', where the restriction clauses can be satisfied >> by some >> indexes. At the moment, such queries use index-only scans followed by >> aggregation. Index-only scans are only possible for indexes that are >> capable of >> returning indexed tuples, that is, support the 'amgettuple' access >> method. They >> are not applicable to indexes such as GIN and RUM. However, it is >> possible to >> improve count(*) performance for indexes that support bitmap scans. Having >> performed a bitmap index scan or a combination of such, the bits in >> bitmap can >> be counted to obtain the final result. Of course, the bitmap pages that >> are >> lossy or not visible to all existing transactions will still require heap >> access. >> > > That's a cool feature for FTS users! Please, register this patch to the > next commitfest. > > This patch has some important limitations: >> * It only supports targetlist consisting of a single expression that can >> be >> projected from count(*). >> * count(expr) is not supported. We could support it for cases where the >> "expr is not null" restriction can be satisfied with an index. >> * The current implementation does not support parallel execution. It >> could be >> implemented during the PostgreSQL 11 release cycle. >> * For some indexes, the bitmap index scan will always require rechecking >> all >> the tuples. Bitmap count plans should not be used in such cases. For now, >> this >> check is not implemented. >> > > Does this limitation cause a performance drawback? When bitmap index scan > returns all rechecks, alternative to Bitmap Count is still Aggregate + > Bitmap Heap Scan. Thus, I think Bitmap Count would have the same > performance or even slightly faster. That's worth testing. > > Also, what is planning overhead of this patch? That's worth testing too. > Another thing catch my eye. The estimated number of rows in Bitmap Count node is the same as in Bitmap Index Scan node. Should it be 1 because it always returns single row? test1=# explain analyze select count(*) from pglist where fts @@ > to_tsquery( 'tom & lane' ); > QUERY > PLAN > > -------------------------------------------------------------------------------------------------------------------------------------------- > Bitmap Count on pglist (cost=550.65..1095.68 rows=54503 width=8) (actual > time=1120.281..1120.281 rows=1 loops=1) > Recheck Cond: (fts @@ to_tsquery('tom & lane'::text)) > Heap Fetches: 0 > Heap Blocks: exact=105992 > -> Bitmap Index Scan on idx_pglist_rum_fts (cost=0.00..537.02 > rows=54503 width=0) (actual time=1056.060..1056.060 rows=222813 loops=1) > Index Cond: (fts @@ to_tsquery('tom & lane'::text)) > Planning time: 119.568 ms > Execution time: 1121.409 ms > (8 rows) ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company