[
https://issues.apache.org/jira/browse/IGNITE-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981517#comment-16981517
]
Sergey Kalashnikov edited comment on IGNITE-11075 at 11/25/19 12:41 PM:
Implemented the following solution (PR
https://github.com/apache/ignite/pull/7070):
Goals:
- Restart index rebuilds that were interrupted by a node crash.
- Limit recovery rebuilds to the caches and partitions whose rebuild had not
completed before the crash.
- Provide the ability (and an API) to rebuild the indexes of arbitrarily
selected cache partitions.
Design:
1) A set of partition rebuild markers is kept inside the {{index.bin}} file
(i.e. it is persisted).
For that purpose, a new {{"IndexRebuildMarkers"}} tree is introduced.
Item size for a shared cache group: 6 bytes (4 for cacheId and 2 for partition).
Item size for a single cache: 2 bytes (just the partition).
So, for a node with 2000 local partitions, it takes 6 (shared group) or
1 (single cache) additional page(s) per cache.
For an extreme case of 65500 local partitions per node, it takes 194 or 64
pages per cache.
However, this tree is normally empty (it requires only 1 page) and takes
space only while an index rebuild is in progress.
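To illustrate the item layout described above, here is a minimal sketch of how a marker could be packed into 6 or 2 bytes. The class and method names ({{RebuildMarkerCodec}}, {{encodeShared}}, {{encodeSingle}} and so on) are assumptions made for illustration and are not taken from the PR. The sketch computes only the raw payload size; the page counts quoted above also depend on page size and tree overhead, which are not modeled here.

```java
import java.nio.ByteBuffer;

/** Illustrative only: packs rebuild markers using the item sizes from the comment. */
final class RebuildMarkerCodec {
    /** 4 bytes of cacheId + 2 bytes of partition id. */
    static final int SHARED_GROUP_ITEM_SIZE = 6;

    /** 2 bytes of partition id. */
    static final int SINGLE_CACHE_ITEM_SIZE = 2;

    /** Encodes a marker for a cache that lives in a shared cache group. */
    static byte[] encodeShared(int cacheId, int partId) {
        checkPartition(partId);
        return ByteBuffer.allocate(SHARED_GROUP_ITEM_SIZE)
            .putInt(cacheId)
            .putShort((short) partId)
            .array();
    }

    /** Encodes a marker for a single (non-shared) cache. */
    static byte[] encodeSingle(int partId) {
        checkPartition(partId);
        return ByteBuffer.allocate(SINGLE_CACHE_ITEM_SIZE)
            .putShort((short) partId)
            .array();
    }

    /** Reads the partition id back, treating the 2 bytes as an unsigned value. */
    static int decodePartition(byte[] item) {
        ByteBuffer buf = ByteBuffer.wrap(item);
        if (item.length == SHARED_GROUP_ITEM_SIZE)
            buf.getInt(); // Skip the cacheId prefix.
        return Short.toUnsignedInt(buf.getShort());
    }

    /** Reads the cacheId from a shared-group item. */
    static int decodeCacheId(byte[] item) {
        if (item.length != SHARED_GROUP_ITEM_SIZE)
            throw new IllegalArgumentException("Not a shared-group item");
        return ByteBuffer.wrap(item).getInt();
    }

    /** Raw item bytes for the given number of partitions, excluding any page or tree overhead. */
    static long rawItemBytes(int partitions, boolean sharedGroup) {
        return (long) partitions * (sharedGroup ? SHARED_GROUP_ITEM_SIZE : SINGLE_CACHE_ITEM_SIZE);
    }

    /** A partition id must fit into 2 unsigned bytes. */
    private static void checkPartition(int partId) {
        if (partId < 0 || partId > 0xFFFF)
            throw new IllegalArgumentException("Partition id does not fit into 2 bytes: " + partId);
    }
}
```

The partition id is read back as an unsigned value, so ids above 32767 (such as the 65500-partition case mentioned above) survive the round trip.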
2) Before the index rebuild starts:
- Store the ids of the partitions that will be rebuilt into {{index.bin}}.
- Log a new WAL record {{START_BUILD_INDEX_RECORD}} to protect the new
information against a crash before the first checkpoint.
3) After successful completion of each partition rebuild:
- Remove the partition id from the {{"IndexRebuildMarkers"}} tree.
4) On memory recovery:
- If a {{START_BUILD_INDEX_RECORD}} is encountered during logical record
recovery, store the partitions from the record into {{index.bin}}, unless the
file was removed.
5) On cache start:
- Check whether {{index.bin}} exists for the cache group and, if so, retrieve
the partition rebuild markers from the {{"IndexRebuildMarkers"}} tree.
- Start the index rebuild for the marked partitions.
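Steps 2 to 5 above can be sketched as a toy in-memory model. This is not Ignite code: the class {{RebuildMarkerLifecycleSketch}}, its methods, and the simplified checkpoint and crash semantics are all assumptions made for illustration. In particular, the model assumes that marker removals made after the last checkpoint are lost on a crash, and that replaying a start record re-adds every partition it lists, so recovery may rebuild more partitions than strictly necessary but never fewer. The "unless the file was removed" case is not modeled.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the rebuild marker lifecycle (hypothetical, not the PR's implementation). */
final class RebuildMarkerLifecycleSketch {
    /** Markers as of the last checkpoint; stands in for the tree persisted in index.bin. */
    private Set<Integer> checkpointed = new TreeSet<>();

    /** Current in-memory view of the markers; lost on crash. */
    private Set<Integer> inMemory = new TreeSet<>();

    /** Payloads of start records logged since the last checkpoint; stands in for the WAL. */
    private final List<Set<Integer>> walSinceCheckpoint = new ArrayList<>();

    /** Step 2: mark the partitions and log a start record before rebuilding. */
    void beforeRebuild(Collection<Integer> partIds) {
        inMemory.addAll(partIds);
        walSinceCheckpoint.add(new TreeSet<>(partIds));
    }

    /** Step 3: drop the marker once a partition has been rebuilt successfully. */
    void onPartitionRebuilt(int partId) {
        inMemory.remove(partId);
    }

    /** Simplified checkpoint: persist the current markers and forget older records. */
    void checkpoint() {
        checkpointed = new TreeSet<>(inMemory);
        walSinceCheckpoint.clear();
    }

    /** Step 4: after a crash, restore the checkpointed markers and replay start records. */
    void crashAndRecover() {
        inMemory = new TreeSet<>(checkpointed);
        for (Set<Integer> rec : walSinceCheckpoint)
            inMemory.addAll(rec);
    }

    /** Step 5: on cache start, these partitions still need an index rebuild. */
    Set<Integer> partitionsToRebuildOnStart() {
        return new TreeSet<>(inMemory);
    }
}
```

For example, if a rebuild of partitions 7 and 8 is started and the node crashes before the first checkpoint, the replayed start record restores both markers, which is the case the WAL record is meant to cover.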
6) A new API is provided for use by P2P rebalance:
{{public IgniteInternalFuture rebuildIndexesByPartition(CacheGroupContext
grp, int partId);}}
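A rough sketch of how a caller such as a rebalance routine might use a per-partition rebuild call is shown below. Everything here is a stand-in: the {{PartitionIndexRebuilder}} interface, the use of a group name instead of {{CacheGroupContext}}, and {{CompletableFuture}} in place of {{IgniteInternalFuture}} are assumptions made so that the example compiles without Ignite internals. It shows only the pattern of requesting a rebuild per received partition and waiting for all of them to finish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Usage pattern sketch; the types below are stand-ins, not Ignite classes. */
final class PartitionRebuildUsageSketch {
    /** Hypothetical stand-in for the new per-partition rebuild API. */
    interface PartitionIndexRebuilder {
        CompletableFuture<Void> rebuildIndexesByPartition(String grpName, int partId);
    }

    /**
     * Requests an index rebuild for every given partition.
     * The returned future completes with the same partition ids once all rebuilds are done.
     */
    static CompletableFuture<List<Integer>> rebuildAll(
        PartitionIndexRebuilder rebuilder,
        String grpName,
        List<Integer> partIds
    ) {
        List<CompletableFuture<Void>> futs = new ArrayList<>();

        for (int partId : partIds)
            futs.add(rebuilder.rebuildIndexesByPartition(grpName, partId));

        return CompletableFuture
            .allOf(futs.toArray(new CompletableFuture[0]))
            .thenApply(ignored -> partIds);
    }
}
```

A caller would act on the listed partitions only after the combined future completes, which matches the requirement in the issue description that cache indexes be ready before a node owns a partition.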
> Index rebuild procedure over cache partition file
> -------------------------------------------------
>
> Key: IGNITE-11075
> URL: https://issues.apache.org/jira/browse/IGNITE-11075
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Maxim Muzafarov
> Assignee: Sergey Kalashnikov
> Priority: Major
> Labels: iep-28
> Fix For: 2.9
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> A node can own a partition once the partition data is rebalanced and the
> cache indexes are ready. For the message-based cluster rebalancing approach,
> indexes are rebuilt simultaneously with cache data loading. For the
> file-based rebalancing approach, the index rebuild