> I find those results moderately unconvincing, primarily because they > are based on choosing the least efficient possible data representation > (viz char(n)), and thus the btree indexes suffer severe and quite > artificial bloat. A database schema chosen with even minimal attention The specific data was chosen in presentation, because it comes from TPC-H definition (data that everybody understands) and they were the only columns where bitmap index will be more beneficial (i.e. the columns were low cardinality ones).
> to PG-specific tuning would probably have btree indexes half the size. > That wouldn't completely eliminate the performance differential shown, > but it would bring it down into the ballpark where you have to question > whether it makes sense to support another index AM. Comparison of index creation time and index size for similar cardinalities for *integer* and *numeric* data-types for b-tree and bitmap index. Benefits are seen in all the cases, index creation time, space taken by index as well as querying: Total rows in the table: 2 million Size of table in kbytes: 431976 Table definition and column cardinalities: Column | Type | Cardinality --------+-------------------+----------- c1 | integer | i10 | integer | 10 i50 | integer | 50 n10 | numeric | 10 n50 | numeric | 50 ctext | character varying | Note: Time in seconds and size in kbytes (as seen from select pg_relation_size()): ------------------------------------------------------------------------- Column B-tree size Bitmap size B-tree time Bitmap time ------------------------------------------------------------------------- i10 35056 2392 11.8 6.0 i50 35056 4328 10.8 6.4 n10 52264 2656 34.8 9.6 n50 52616 4272 34.3 11.7 ------------------------------------------------------------------------- Query timings (in seconds): ------------------------------------------------------------------------- Query B-tree Bitmap ------------------------------------------------------------------------- select count(*) from btree_test where 4.08 1.61 i10=5 and i50=2 and n10=0.1 and n50=0.05; ------------------------------------------------------------------------- This test case fits in memory, the benefits seen will be more if we take bigger test case. I will like to re-iterate the benefits of bitmap index: 1. Size and time to create bitmap index is less than b-tree index for low cardinality column of any data-type. 2. The gain in query performance can be attributed to Bitmap And¹ and Bitmap Or¹ operations being more efficient for bitmap indexes as compared to b-trees. Note that both bitmap and b-tree indexes use the bitmap index scan access method, however the behavior is different for each. With a b-tree index, the b-tree indexes are scanned to create a temporary in-memory bitmap index. With the Bizgres on-disk bitmap index, the bitmap scan retrieves several on-disk bitmap pages at once and provides them to the Bitmap And¹ and Bitmap Or¹ operators. The smaller size of the bitmap indexes combined with the efficiency of the And and Or operators are the reasons for the increase in query speed. Ayush ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org