[jira] [Updated] (IGNITE-17320) Implement B+Tree based sorted index storage

Kirill Tkalenko (Jira) Tue, 13 Sep 2022 11:26:04 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-17320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Kirill Tkalenko updated IGNITE-17320:
-------------------------------------
    Fix Version/s: 3.0.0-alpha6

> Implement B+Tree based sorted index storage
> -------------------------------------------
>
>                 Key: IGNITE-17320
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17320
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Assignee: Kirill Tkalenko
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-alpha6
>
>
> As in IGNITE-17318, the binary tuple format is the main goal here, because
> that's what we're going to pass into the API.
> Implementing a tree for indexes isn't hard by itself. I expect it to be
> stored in the same partition files as the raw storage; everything should be
> colocated in a single file.
> What's complicated is the inlining of data into tree pages. I see it as
> follows:
>  * If the tuple has no offset table, we can always store the entire payload
> in the tree. This is the best-case scenario: the size is fixed and known
> a priori, so we don't even need to store it before the payload.
>  * If the tuple has an offset table, the inline size immediately gets bigger.
> In my view, we will have to:
>  ** store the size of the inlined payload, that's 4 bytes
>  ** store the null table if it's there, that's a known number of bytes
>  ** store the header and the offset table:
>  *** if there are non-fixed-length columns, then a single entry in the offset
> table can be up to 4 bytes
>  *** if there are only fixed-length columns, like ints, floats or even
> bitsets, the number of bytes per entry can be estimated accurately, with an
> upper bound
>  ** then store a good amount of the actual column data. How much? I'd be
> generous, but then we would probably waste too much space, so all of this is
> debatable:
>  *** for columns with a fixed size, allocate room for the entire value
>  *** for strings and numbers (is there something else?) we have to
> pre-allocate a reasonable number of bytes, say 8. I don't know, there are
> defaults somewhere in the code of Ignite 2.x, we can use them.
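The sizing rules in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Ignite code: the class InlineSizeSketch, the Column descriptor, the one-byte header, the one-bit-per-column null table and the 1/2/4-byte offset entry widths are all hypothetical, and the 8-byte budget for variable-length columns is just the example figure from the text.

```java
import java.util.List;

/** Hypothetical sketch; none of these names come from the Ignite code base. */
final class InlineSizeSketch {
    /** Hypothetical column descriptor: a negative fixedSize marks a variable-length column. */
    static final class Column {
        final int fixedSize;
        final boolean nullable;

        Column(int fixedSize, boolean nullable) {
            this.fixedSize = fixedSize;
            this.nullable = nullable;
        }

        boolean isFixed() {
            return fixedSize >= 0;
        }
    }

    static final int SIZE_PREFIX_BYTES = 4;      // size of the inlined payload, per the description
    static final int HEADER_BYTES = 1;           // assumption: one-byte tuple header
    static final int MAX_OFFSET_ENTRY_BYTES = 4; // worst case named in the description
    static final int VARLEN_INLINE_BYTES = 8;    // the "like, 8" example, not a real default

    /** Room for column data: the full value for fixed-size columns, a fixed budget otherwise. */
    static int dataBytes(List<Column> columns) {
        int total = 0;
        for (Column c : columns) {
            total += c.isFixed() ? c.fixedSize : VARLEN_INLINE_BYTES;
        }
        return total;
    }

    /** Width of one offset table entry. */
    static int offsetEntryBytes(List<Column> columns) {
        for (Column c : columns) {
            if (!c.isFixed()) {
                return MAX_OFFSET_ENTRY_BYTES; // payload size unknown, assume the widest entry
            }
        }
        // Assumption: entries are 1, 2 or 4 bytes wide, whichever fits the largest offset.
        int payload = dataBytes(columns);
        if (payload <= 0xFF) {
            return 1;
        }
        return payload <= 0xFFFF ? 2 : 4;
    }

    /** Estimated inline size in bytes for one index tuple. */
    static int inlineSize(List<Column> columns, boolean hasOffsetTable) {
        if (!hasOffsetTable) {
            // Best case: fixed size known a priori, so no size prefix is stored.
            return dataBytes(columns);
        }
        boolean anyNullable = false;
        for (Column c : columns) {
            anyNullable |= c.nullable;
        }
        // Assumption: the null table, when present, takes one bit per column.
        int nullTableBytes = anyNullable ? (columns.size() + 7) / 8 : 0;
        int offsetTableBytes = columns.size() * offsetEntryBytes(columns);
        return SIZE_PREFIX_BYTES + nullTableBytes + HEADER_BYTES
                + offsetTableBytes + dataBytes(columns);
    }
}
```

For example, under these assumptions a non-null int column plus a nullable string column with an offset table comes to 4 + 1 + 1 + 8 + 12 bytes. Room for the row id is not counted in this sketch.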
> So, my point is, there's no new data format for the inlined section of the
> tuple; we should reuse the existing one and thus avoid many possible errors.
> And if a record fits into a tree page, there's no need to insert it into a
> free list. Good!
> And yes, of course, there has to be room for the row id in the inlined
> section as well.
> Last, the meta tree is still a thing. But we shouldn't identify indexes by
> their names, because there's a UUID id or even an integer id (see IGNITE-17318).
> h3. Other ideas
> I don't like how durable background tasks work in Ignite 2.x, there are
> always some issues. I would prefer having a general-purpose "recycle bin" in
> the partition and a background cleaner process that would clean it. Maybe
> this queue should contain other entities in the future.
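One possible shape for the "recycle bin" idea, as a minimal sketch only. It assumes an in-memory queue and a scheduled daemon thread; RecycleBinSketch, Garbage, discard, cleanOnce and startCleaner are hypothetical names, and the durability that a real per-partition bin would need is not modelled here.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a per-partition recycle bin with a background cleaner. */
final class RecycleBinSketch {
    /** Anything that can be destroyed later, e.g. a dropped index tree (assumption). */
    interface Garbage {
        void destroy();
    }

    // In-memory only; a real implementation would have to persist this queue.
    private final ConcurrentLinkedQueue<Garbage> queue = new ConcurrentLinkedQueue<>();

    /** Puts an entity into the bin instead of destroying it synchronously. */
    void discard(Garbage garbage) {
        queue.add(garbage);
    }

    /** Number of entities still waiting to be cleaned. */
    int pending() {
        return queue.size();
    }

    /** Destroys everything currently in the bin and returns how many entities were cleaned. */
    int cleanOnce() {
        int cleaned = 0;
        Garbage garbage;
        while ((garbage = queue.poll()) != null) {
            garbage.destroy();
            cleaned++;
        }
        return cleaned;
    }

    /** Starts a daemon thread that empties the bin periodically; the caller shuts it down. */
    ScheduledExecutorService startCleaner(long periodMillis) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread thread = new Thread(r, "recycle-bin-cleaner");
            thread.setDaemon(true);
            return thread;
        });
        executor.scheduleWithFixedDelay(
                this::cleanOnce, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

Keeping the Garbage interface generic is what would let the same queue hold other kinds of entities later on.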



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
