[
https://issues.apache.org/jira/browse/IGNITE-17320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirill Tkalenko updated IGNITE-17320:
-------------------------------------
Fix Version/s: 3.0.0-alpha6
> Implement B+Tree based sorted index storage
> -------------------------------------------
>
> Key: IGNITE-17320
> URL: https://issues.apache.org/jira/browse/IGNITE-17320
> Project: Ignite
> Issue Type: Improvement
> Reporter: Ivan Bessonov
> Assignee: Kirill Tkalenko
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-alpha6
>
>
> Like in IGNITE-17318, binary tuple format is a main goal here, because that's
> what we're gonna pass into API.
> Implementing a tree for indexes isn't hard by itself. I expect it to be
> stored in same partition files as raw storage, everything should be colocated
> in a single file.
> What's complicated is the inlining of data into trees pages. I see it as the
> following:
> * If tuple has no offset table, we can always store the entire payload in
> the tree. This is the best case scenario, because the size is fixed and known
> a priori, we don't even need to store it before the payload.
> * If tuple has an offset table, the inline size will immediately get bigger.
> In my view, we will have to:
> ** store the size of inlined payload, that's 4 bytes
> ** store null table if it's there, that's a known amount of bytes
> ** store header and offset table:
> *** if there are non-fixed-length columns, then a single entry in offset
> table can be up to 4 bytes
> *** if there are only fixed-length columns, like ints, floats or even
> bitsets, the amount of bytes per single entry can be accurately estimated
> with the upper bound
> ** then store a good amount of the actual columns data. How much? I'd be
> generous, but then we would probably have too much space, so all of this is
> debatable:
> *** for columns with fixed size, allocate room for entire value
> *** for strings and numbers (is there something else?) we have to
> pre-allocate a reasonable amount of bytes. Like, 8, for example. I don't
> know, there are defaults somewhere in the code of Ignite 2.x, we can use them.
> So, my point is, there's no new data format for inlined section of the tuple,
> we should reuse it and thus avoid many possible errors. And, if record fits
> into a tree page, there's no need in inserting it into a free list. Good!
> And yes, of course, there has to be a room for row id in inlined section as
> well.
> Last, meta tree is still a thing. But, we shouldn't identify indexes by their
> names, cause there's UUID id or even integer id (see IGNITE-17318).
> h3. Other ideas
> I don't like how durable background tasks work in Ignite 2.x, there are
> always some issues. I would prefer having a general-purposed "recycle bin" in
> partition and a background cleaner process that would clean it. Maybe this
> queue should contain other entities in the future.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)