Tom Lane wrote:


And your point is?  Assuming a reasonable datatype like int4, a btree
index on this table would occupy say 20GB (16 bytes per entry plus
fillfactor overhead).  The bitmap version would require 10,000 bitmaps
of 1G bits apiece, or 1250GB (not even counting overhead).  You need
some wildly optimistic assumptions about the compressibility of the
bitmaps to get even within hailing distance of the btree, let alone
fit in RAM.  A realistic assumption for the numbers you mention is
that the bitmaps have 1-bits about 100 bits apart, which means you
could get maybe 3-to-1 compression using the runlength scheme that's
in there ... leaving the bitmap a factor of 20 bigger than the btree.

AFAICS "low cardinality" has to mean just that, a few dozen distinct
values at most, for this scheme to have any hope.


I did a little testing of this when Jie first submitted the patch - for
a basically Zipfian distribution of int4's the bitmap is still clearly
smaller even at 1000 distinct values (see below). It is twice as big at
10000, so the crossover point is somewhere in the 1000-10000 range (for
this test - however the results seem to be reasonably typical).

In DSS distinct values < 1000 is very common (days of week, months of
year, lineitems of order, sex of animal...) so the applicability of this
range is perhaps larger than it seems at first sight.

Cheers

Mark
------------------------------------------------------------------------

bitmap=# \d bitmaptest
  Table "public.bitmaptest"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 val0   | integer |
 val1   | integer |
 val2   | integer |
 val3   | integer |
 fil    | text    |


bitmap=# select count(distinct val0),
                count(distinct val1),
                count(distinct val2),
                count(distinct val3)
         from bitmaptest;
 count | count | count | count
-------+-------+-------+-------
    10 |   100 |  1000 | 10000


bitmap=# \i relsizes.sql  (BTREE)
     relname     | relpages
-----------------+----------
 bitmaptest      |    93458
 bitmaptest_val0 |    21899
 bitmaptest_val1 |    21899
 bitmaptest_val2 |    21899
 bitmaptest_val3 |    21899


bitmap=# \i relsizes.sql (BITMAP)
     relname     | relpages
-----------------+----------
 bitmaptest      |    93458
 bitmaptest_val0 |     1286
 bitmaptest_val1 |     2313
 bitmaptest_val2 |     5726
 bitmaptest_val3 |    41292


For the interested, the data generator, schema, index files are here:
http://homepages.paradise.net.nz/markir/download/bizgres/bitmaptest.tar.gz


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Reply via email to