Hi, On 2021-07-07 20:46:38 +0900, Masahiko Sawada wrote: > Currently, the TIDs of dead tuples are stored in an array that is > collectively allocated at the start of lazy vacuum and TID lookup uses > bsearch(). There are the following challenges and limitations:
> So I prototyped a new data structure dedicated to storing dead tuples > during lazy vacuum while borrowing the idea from Roaring Bitmap[2]. > The authors provide an implementation of Roaring Bitmap[3] (Apache > 2.0 license). But I've implemented this idea from scratch because we > need to integrate it with Dynamic Shared Memory/Area to support > parallel vacuum and need to support ItemPointerData, 6-bytes integer > in total, whereas the implementation supports only 4-bytes integers. > Also, when it comes to vacuum, we neither need to compute the > intersection, the union, nor the difference between sets, but need > only an existence check. > > The data structure is somewhat similar to TIDBitmap. It consists of > the hash table and the container area; the hash table has entries per > block and each block entry allocates its memory space, called a > container, in the container area to store its offset numbers. The > container area is actually an array of bytes and can be enlarged as > needed. In the container area, the data representation of offset > numbers varies depending on their cardinality. It has three container > types: array, bitmap, and run. How are you thinking of implementing iteration efficiently for rtbm? The second heap pass needs that obviously... I think the only option would be to qsort the whole thing? Greetings, Andres Freund