[
https://issues.apache.org/jira/browse/IGNITE-13063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nikolay Izhikov updated IGNITE-13063:
-------------------------------------
Labels: iep-22 index_improvement (was: iep-22)
> Bottom-up index rebuild
> -----------------------
>
> Key: IGNITE-13063
> URL: https://issues.apache.org/jira/browse/IGNITE-13063
> Project: Ignite
> Issue Type: Improvement
> Reporter: Maxim Muzafarov
> Assignee: Maxim Muzafarov
> Priority: Major
> Labels: iep-22, index_improvement
>
> As part of [IEP-22: Direct Data
> Load|https://cwiki.apache.org/confluence/display/IGNITE/IEP-22%3A+Direct+Data+Load]
> a PoC needs to be implemented for the new index rebuild algorithm.
> Compare the bottom-up index rebuild approach with the default
> implementation (from the root).
> See IEP-22 for details.
> h4. High-level overview
> We will not update PK and secondary indexes during the data load, so they
> must be rebuilt at the end. The most efficient way to build indexes is the
> bottom-up approach, in which the lowest level of the B-tree is built first
> and the root is built last. We will need a buffer in which indexed values and
> their respective links are sorted in index order. If the buffer is big enough
> to hold all the data, the index will be created in one pass. Otherwise the
> indexed values must be sorted in several runs using an external sort. Users
> must be able to configure the sort parameters: the buffer size (ideally in
> bytes) and the file system path where temp files will be stored. The
> latter is critical: users typically want to keep temp files on a
> separate disk so that WAL and checkpoint operations are not affected.
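
For illustration only, the bottom-up build described above could be sketched roughly as follows. This is a minimal in-memory sketch, not Ignite code: the class `BottomUpBuildSketch`, its `Node` type, and the `fanout` parameter are hypothetical names, keys are plain `long` values, and the links stored next to each key are omitted. It assumes the keys have already been sorted (in the buffer, or by merging external sort runs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a bottom-up tree build; none of these names are Ignite APIs. */
public class BottomUpBuildSketch {
    /** A node is either a leaf (keys != null) or an inner node (children != null). */
    static final class Node {
        final long minKey;          // smallest key stored under this node
        final long[] keys;          // set for leaves only
        final List<Node> children;  // set for inner nodes only

        Node(long[] keys) {
            this.keys = keys;
            this.children = null;
            this.minKey = keys[0];
        }

        Node(List<Node> children) {
            this.keys = null;
            this.children = children;
            this.minKey = children.get(0).minKey;
        }
    }

    /** Builds the tree level by level from sorted keys: leaves first, root last. */
    static Node build(long[] sortedKeys, int fanout) {
        if (sortedKeys.length == 0 || fanout < 2)
            throw new IllegalArgumentException("need at least one key and fanout >= 2");

        // Lowest level: cut the sorted keys into leaves of at most 'fanout' keys.
        List<Node> level = new ArrayList<>();
        for (int i = 0; i < sortedKeys.length; i += fanout) {
            int end = Math.min(i + fanout, sortedKeys.length);
            level.add(new Node(Arrays.copyOfRange(sortedKeys, i, end)));
        }

        // Upper levels: group the previous level until a single root remains.
        while (level.size() > 1) {
            List<Node> parents = new ArrayList<>();
            for (int i = 0; i < level.size(); i += fanout) {
                int end = Math.min(i + fanout, level.size());
                parents.add(new Node(new ArrayList<>(level.subList(i, end))));
            }
            level = parents;
        }
        return level.get(0);
    }

    /** Number of levels from the root down to the leaves. */
    static int height(Node n) {
        int h = 1;
        while (n.children != null) {
            n = n.children.get(0);
            h++;
        }
        return h;
    }

    /** Descends to the leaf whose key range may hold 'key' and searches it. */
    static boolean contains(Node n, long key) {
        while (n.children != null) {
            Node next = n.children.get(0);
            for (Node c : n.children) {
                if (c.minKey <= key)
                    next = c;
                else
                    break;
            }
            n = next;
        }
        return Arrays.binarySearch(n.keys, key) >= 0;
    }
}
```

Because the input is sorted, every node is written exactly once and no page splits occur, which is the expected advantage over inserting entries one by one from the root. A real implementation would work on pages rather than heap objects and would likely leave some free space in each page.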