Playing around with my hashtable implementation  and using it for
bitmapscans/tidbitmap.c I noticed something curious. When using small
work_mem (4MB in my case) for large-ish tables (~5GB), the performance
tanks. That's primarily caused by some issues in the hashtable code
(since fixed), but it made me look more at the relevant tidbitmap
code. Namely tbm_lossify():
while ((page = pagetable_iterate(tbm->pagetable, &i)) != NULL)
continue; /* already a chunk
* If the page would become a chunk header, we won't save
* converting it to lossy, so skip it.
if ((page->blockno % PAGES_PER_CHUNK) == 0)
/* This does the dirty work ... */
if (tbm->nentries <= tbm->maxentries / 2)
/* we have done enough */
tbm_mark_page_lossy() in turn deletes the individual page entry, and
adds one for the chunk (spanning several pages).
The relevant part is that the loop stops when enough page ranges have
been lossified. This leads to some problems:
Because iteration (both in my implementation and dynahash) is
essentially in bucket order, the front of the hashtable will be mostly
empty, whereas later parts of the hashtable will be full (i.e. a lot of
collisions). The reason for that is that we'll find all parts of the
hashtable that are uncompressed and "early" in the hashspace, and
they'll possibly hash to something later in the table. As the number of
entries in the hashtable doesn't increase, we'll never grow the
hashtable to decrease the number of collisions. I've seen that cause
quite noticeable number of conflicts (which of course are worse in open
addressing table, rather than a separately chained one).
Secondly it'll lead to pretty random parts of the bitmap being
compressed. For performance it'd be good to compress "heavily used"
areas of the bitmap, instead of just whatever happens to be early in the
I'm not entirely sure how to resolve that best, but it does seem worth
addressing. One thing that might be nice is to record the last N
insertions once at 90% full or so, and then lossify in a more targeted
manner using those. E.g. lossify block ranges (256 long atm) that
contain a lot of individual entries or such.
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription: