On Tue, Sep 1, 2015 at 12:33 PM Alexander Korotkov <[email protected]> wrote: > > Hi, Tomas! > > On Mon, Aug 31, 2015 at 6:26 PM, Tomas Vondra <[email protected]> > wrote: >> >> On 08/31/2015 09:41 AM, Anastasia Lubennikova wrote: >>> >>> I'm going to begin work on effective storage of duplicate keys in B-tree >>> index. >>> The main idea is to implement posting lists and posting trees for B-tree >>> index pages as it's already done for GIN. >>> >>> In a nutshell, effective storing of duplicates in GIN is organised as >>> follows. >>> Index stores single index tuple for each unique key. That index tuple >>> points to posting list which contains pointers to heap tuples (TIDs). If >>> too many rows having the same key, multiple pages are allocated for the >>> TIDs and these constitute so called posting tree. >>> You can find wonderful detailed descriptions in gin readme >>> <https://github.com/postgres/postgres/blob/master/src/backend/access/gin/README> >>> and articles <http://www.cybertec.at/gin-just-an-index-type/>. >>> It also makes possible to apply compression algorithm to posting >>> list/tree and significantly decrease index size. Read more in >>> presentation (part 1) >>> <http://www.pgcon.org/2014/schedule/attachments/329_PGCon2014-GIN.pdf>. >>> >>> Now new B-tree index tuple must be inserted for each table row that we >>> index. >>> It can possibly cause page split. Because of MVCC even unique index >>> could contain duplicates. >>> Storing duplicates in posting list/tree helps to avoid superfluous splits. >>> >>> So it seems to be very useful improvement. Of course it requires a lot >>> of changes in B-tree implementation, so I need approval from community. >> >> >> In general, index size is often a serious issue - cases where indexes need >> more space than tables are not quite uncommon in my experience. So I think >> the efforts to lower space requirements for indexes are good. >> >> But if we introduce posting lists into btree indexes, how different are they >> from GIN? It seems to me that if I create a GIN index (using btree_gin), I >> do get mostly the same thing you propose, no? > > > Yes, In general GIN is a btree with effective duplicates handling + support > of splitting single datums into multiple keys. > This proposal is mostly porting duplicates handling from GIN to btree.
Is it worth to make a provision to add an ability to control how duplicates are sorted ? If we speak about GIN, why not take into account our experiments with RUM (https://github.com/postgrespro/rum) ? > >> Sure, there are differences - GIN indexes don't handle UNIQUE indexes, > > > The difference between btree_gin and btree is not only UNIQUE feature. > 1) There is no gingettuple in GIN. GIN supports only bitmap scans. And it's > not feasible to add gingettuple to GIN. At least with same semantics as it is > in btree. > 2) GIN doesn't support multicolumn indexes in the way btree does. Multicolumn > GIN is more like set of separate singlecolumn GINs: it doesn't have composite > keys. > 3) btree_gin can't effectively handle range searches. "a < x < b" would be > hangle as "a < x" intersect "x < b". That is extremely inefficient. It is > possible to fix. However, there is no clear proposal how to fit this case > into GIN interface, yet. > >> >> but the compression can only be effective when there are duplicate rows. So >> either the index is not UNIQUE (so the b-tree feature is not needed), or >> there are many updates. > > > From my observations users can use btree_gin only in some cases. They like > compression, but can't use btree_gin mostly because of #1. > >> Which brings me to the other benefit of btree indexes - they are designed >> for high concurrency. How much is this going to be affected by introducing >> the posting lists? > > > I'd notice that current duplicates handling in PostgreSQL is hack over > original btree. It is designed so in btree access method in PostgreSQL, not > btree in general. > Posting lists shouldn't change concurrency much. Currently, in btree you have > to lock one page exclusively when you're inserting new value. > When posting list is small and fits one page you have to do similar thing: > exclusive lock of one page to insert new value. > When you have posting tree, you have to do exclusive lock on one page of > posting tree. > > One can say that concurrency would became worse because index would become > smaller and number of pages would became smaller too. Since number of pages > would be smaller, backends are more likely concur for the same page. But this > argument can be user against any compression and for any bloat. > > ------ > Alexander Korotkov > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company -- Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
