Hello Jonah, Simon, and the hackers, I am going to implement a simple kind of "encoded bitmap indexes" (EBI). That is an index type where the bitmap columns may not only contain only a single '1' in the set of bits belonging to a tuple. Instead, an additional mapping table translates the distinct values of the table column into a unique encoding. To select for a given value all bitmap columns must be compared instead of only one. Queries that match multiple different values (like IN lists or range queries) simplify to less than the full set of bitmaps that needs to be compared because of boolean logic. The total number of bitmaps required to represent unique encodings for all different values is ceil(ld(n)), where n is the number of distinct values. Compared to normal bitmap indexes this solves the problem of high-cardinality columns. It is targetet at data warehousing scenarios with insert only data.
The respective scientific paper can be found at http://www.dvs.tu-darmstadt.de/publications/pdf/ebi_a4.pdf I thought, it could be a good idea to base my work on the long proposed on-disk bitmap index implementation. Regarding to the wiki, you, Jonah and Simon, were the last devs that touched this thing. Unfortunately I could not find the patch representing your state of that work. I could only capture the development history up to Gianni Ciolli & Gabriele Bartolini from the old pgsql-patches archives. Other people involved were Jie Zhang, Gavin Sherry, Heikki Linnakangas, and Leonardo F. Are you and the others still interested in getting this into PG? A rebase of the most current bitmap index implementation onto master HEAD will be the first 'byproduct' that I am going to deliver back to you. 1. Is anyone working on this currently? 2. Who has got the most current source code? 3. Is there a git of that or will I need to reconstruct the history from the patches I collected? Disclaimer: Although I read and manually processed most of the executor's code during my last work, I would still call myself a newbie and therefore will need some assistance most probably. Regards, Daniel -- Daniel Bausch Wissenschaftlicher Mitarbeiter Technische Universität Darmstadt Fachbereich Informatik Fachgebiet Datenbanken und Verteilte Systeme Hochschulstraße 10 64289 Darmstadt Germany Tel.: +49 6151 16 6706 Fax: +49 6151 16 6229 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers