[
https://issues.apache.org/jira/browse/IGNITE-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavel Kovalenko updated IGNITE-10799:
-------------------------------------
Affects Version/s: 2.4 (was: 2.1)
> Optimize affinity initialization/re-calculation
> -----------------------------------------------
>
> Key: IGNITE-10799
> URL: https://issues.apache.org/jira/browse/IGNITE-10799
> Project: Ignite
> Issue Type: Improvement
> Components: cache
> Affects Versions: 2.4
> Reporter: Pavel Kovalenko
> Assignee: Pavel Kovalenko
> Priority: Major
> Fix For: 2.8
>
>
> When persistence is enabled and a baseline is set, there are two main
> code paths that recalculate affinity:
> {noformat}
> org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager#onServerJoinWithExchangeMergeProtocol
> org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager#onServerLeftWithExchangeMergeProtocol
> {noformat}
> Both of them follow the same recalculation approach:
> 1) Take a current baseline (ideal assignment).
> 2) Filter out offline nodes from it.
> 3) Choose new primary nodes if the previous ones went away.
> 4) Place the temporary primary nodes into the late affinity assignment set.
> Looking at the implementation details, we can see that it performs many
> unnecessary lookups in the online nodes cache and many array list copies.
> Recalculation becomes too slow for replicated caches: it takes on the order
> of P * N operations on each node, where P is the partition count and N is
> the number of nodes in the cluster. With a large partition count or a large
> cluster it may take a few seconds, which is unacceptable because this
> process happens during PME (partition map exchange) and freezes ongoing
> cluster operations.
> We should investigate possible bottlenecks and improve the performance of
> affinity recalculation.
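
For illustration only, the four recalculation steps quoted above can be
sketched as below. This is not the CacheAffinitySharedManager code: the class
AffinityRecalcSketch, its Result type, and the use of plain String node IDs
are all hypothetical, and the "late affinity assignment set" is modeled simply
as the set of partition numbers whose primary is temporary. The sketch does one
hash lookup per (partition, owner) pair and one list allocation per partition,
so for a replicated cache, where every node owns every partition, the loop
still does about P * N lookups.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the recalculation steps; not Ignite's implementation. */
public class AffinityRecalcSketch {
    /** Recalculated assignment plus the partitions whose primary is temporary. */
    public static final class Result {
        public final List<List<String>> assignment;
        public final Set<Integer> latePartitions;

        Result(List<List<String>> assignment, Set<Integer> latePartitions) {
            this.assignment = assignment;
            this.latePartitions = latePartitions;
        }
    }

    /**
     * @param ideal  Step 1: ideal (baseline) assignment, one owner list per
     *               partition, with the primary node first.
     * @param online IDs of the nodes that are currently online.
     */
    public static Result recalculate(List<List<String>> ideal, Set<String> online) {
        List<List<String>> assignment = new ArrayList<>(ideal.size());
        Set<Integer> late = new HashSet<>();

        for (int p = 0; p < ideal.size(); p++) {
            List<String> idealOwners = ideal.get(p);
            List<String> owners = new ArrayList<>(idealOwners.size());

            // Step 2: filter out offline nodes, one set lookup per owner.
            for (String node : idealOwners) {
                if (online.contains(node))
                    owners.add(node);
            }

            // Step 3: if the ideal primary went away, the first surviving
            // owner acts as the new primary.
            // Step 4: remember such partitions, since their primary is only
            // temporary until the ideal primary returns.
            if (!owners.isEmpty() && !owners.get(0).equals(idealOwners.get(0)))
                late.add(p);

            assignment.add(owners);
        }

        return new Result(assignment, late);
    }
}
```

With an ideal assignment of [A, B], [B, C], [C, A] and node B offline, only
partition 1 changes its primary (from B to C), so it is the only partition
recorded as having a temporary primary.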