Re: [HACKERS] [PROPOSAL] Effective storage of duplicates in B-tree index.

Oleg Bartunov Sun, 15 Sep 2019 03:48:18 -0700

On Tue, Sep 1, 2015 at 12:33 PM Alexander Korotkov
<[email protected]> wrote:
>
> Hi, Tomas!
>
> On Mon, Aug 31, 2015 at 6:26 PM, Tomas Vondra <[email protected]> 
> wrote:
>>
>> On 08/31/2015 09:41 AM, Anastasia Lubennikova wrote:
>>>
>>> I'm going to begin work on effective storage of duplicate keys in B-tree
>>> index.
>>> The main idea is to implement posting lists and posting trees for B-tree
>>> index pages as it's already done for GIN.
>>>
>>> In a nutshell, effective storing of duplicates in GIN is organised as
>>> follows.
>>> Index stores single index tuple for each unique key. That index tuple
>>> points to posting list which contains pointers to heap tuples (TIDs). If
>>> too many rows having the same key, multiple pages are allocated for the
>>> TIDs and these constitute so called posting tree.
>>> You can find wonderful detailed descriptions in gin readme
>>> <https://github.com/postgres/postgres/blob/master/src/backend/access/gin/README>
>>> and articles <http://www.cybertec.at/gin-just-an-index-type/>.
>>> It also makes possible to apply compression algorithm to posting
>>> list/tree and significantly decrease index size. Read more in
>>> presentation (part 1)
>>> <http://www.pgcon.org/2014/schedule/attachments/329_PGCon2014-GIN.pdf>.
>>>
>>> Now new B-tree index tuple must be inserted for each table row that we
>>> index.
>>> It can possibly cause page split. Because of MVCC even unique index
>>> could contain duplicates.
>>> Storing duplicates in posting list/tree helps to avoid superfluous splits.
>>>
>>> So it seems to be very useful improvement. Of course it requires a lot
>>> of changes in B-tree implementation, so I need approval from community.
>>
>>
>> In general, index size is often a serious issue - cases where indexes need 
>> more space than tables are not quite uncommon in my experience. So I think 
>> the efforts to lower space requirements for indexes are good.
>>
>> But if we introduce posting lists into btree indexes, how different are they 
>> from GIN? It seems to me that if I create a GIN index (using btree_gin), I 
>> do get mostly the same thing you propose, no?
>
>
> Yes, In general GIN is a btree with effective duplicates handling + support 
> of splitting single datums into multiple keys.
> This proposal is mostly porting duplicates handling from GIN to btree.


Is it worth to make a provision to add an ability to control how
duplicates are sorted ?  If we speak about GIN, why not take into
account our experiments with RUM (https://github.com/postgrespro/rum)
?

>
>> Sure, there are differences - GIN indexes don't handle UNIQUE indexes,
>
>
> The difference between btree_gin and btree is not only UNIQUE feature.
> 1) There is no gingettuple in GIN. GIN supports only bitmap scans. And it's 
> not feasible to add gingettuple to GIN. At least with same semantics as it is 
> in btree.
> 2) GIN doesn't support multicolumn indexes in the way btree does. Multicolumn 
> GIN is more like set of separate singlecolumn GINs: it doesn't have composite 
> keys.
> 3) btree_gin can't effectively handle range searches. "a < x < b" would be 
> hangle as "a < x" intersect "x < b". That is extremely inefficient. It is 
> possible to fix. However, there is no clear proposal how to fit this case 
> into GIN interface, yet.
>
>>
>> but the compression can only be effective when there are duplicate rows. So 
>> either the index is not UNIQUE (so the b-tree feature is not needed), or 
>> there are many updates.
>
>
> From my observations users can use btree_gin only in some cases. They like 
> compression, but can't use btree_gin mostly because of #1.
>
>> Which brings me to the other benefit of btree indexes - they are designed 
>> for high concurrency. How much is this going to be affected by introducing 
>> the posting lists?
>
>
> I'd notice that current duplicates handling in PostgreSQL is hack over 
> original btree. It is designed so in btree access method in PostgreSQL, not 
> btree in general.
> Posting lists shouldn't change concurrency much. Currently, in btree you have 
> to lock one page exclusively when you're inserting new value.
> When posting list is small and fits one page you have to do similar thing: 
> exclusive lock of one page to insert new value.
> When you have posting tree, you have to do exclusive lock on one page of 
> posting tree.
>
> One can say that concurrency would became worse because index would become 
> smaller and number of pages would became smaller too. Since number of pages 
> would be smaller, backends are more likely concur for the same page. But this 
> argument can be user against any compression and for any bloat.
>
> ------
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company



-- 
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] [PROPOSAL] Effective storage of duplicates in B-tree index.

Reply via email to