[
https://issues.apache.org/jira/browse/IGNITE-324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Semen Boikov resolved IGNITE-324.
---------------------------------
Resolution: Fixed
Implemented late affinity assignment mode (enabled by default, can be disabled
using new property IgniteConfiguration.lateAffinityAssignment).
Implementation details:
- the coordinator should maintain affinity information about all caches (even
for caches not started on the coordinator)
- a joining node should always request affinity from some other node (since the
current assignment can differ from the one calculated by the affinity function)
- when a new server joins, all existing nodes are able to calculate affinity
locally (if the affinity function assigns the joined node as primary, it is
temporarily assigned as backup)
The coordinator knows about all primaries waiting for rebalancing. When the
coordinator receives a partition state update, it checks whether all new
primaries have rebalanced the required partitions and, if so, sends a special
discovery message (CacheAffinityChangeMessage). This message initiates a
partition exchange on all nodes; during this exchange the new affinity
assignment is applied.
- when a server node fails, new affinity is calculated on the coordinator after
it receives GridDhtPartitionsSingleMessage from the other nodes.
If affinity assigns a primary node which does not own the partition, the
coordinator tries to find an existing owner; if an owner node is found, it is
temporarily assigned as primary.
The coordinator should then pass the calculated affinity to the other nodes. If
the exchange were completed by GridDhtPartitionsFullMessage, the following
scenario would be possible:
the coordinator sends GridDhtPartitionsFullMessage to some nodes and fails; the
nodes that received this message complete the exchange, while the new
coordinator for the exchange can compute a different affinity (since it can
have different local information about the current partition owners).
To avoid this issue, an exchange started for a server node left event is
completed by a discovery message which contains the new affinity assignments
(the same CacheAffinityChangeMessage is used).
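
To illustrate the rule used in both the join and the failure case (a node is
primary only if it already owns the partition), here is a minimal sketch for a
single partition. It is not Ignite code: the class LateAffinitySketch, the
methods temporaryAssignment and readyForIdeal, and the node names are
hypothetical, and the fallback to "any known owner" is an assumption of the
sketch rather than the documented selection order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of late affinity assignment for one partition; not Ignite code. */
final class LateAffinitySketch {
    /**
     * @param ideal  nodes as ordered by the affinity function: primary first, then backups
     * @param owners nodes that already hold an up-to-date copy of the partition
     * @return assignment to use until the ideal primary has rebalanced
     */
    static List<String> temporaryAssignment(List<String> ideal, Set<String> owners) {
        // Ideal primary already owns the partition: nothing to postpone.
        if (ideal.isEmpty() || owners.contains(ideal.get(0)))
            return new ArrayList<>(ideal);

        // Prefer an owner that is already part of the ideal assignment (a backup).
        String tmpPrimary = null;
        for (String node : ideal) {
            if (owners.contains(node)) {
                tmpPrimary = node;
                break;
            }
        }

        // Otherwise fall back to any known owner (assumption of this sketch).
        if (tmpPrimary == null && !owners.isEmpty())
            tmpPrimary = owners.iterator().next();

        // No owner at all: the ideal assignment is the only option.
        if (tmpPrimary == null)
            return new ArrayList<>(ideal);

        // Owner goes first; the ideal primary is demoted to backup while it rebalances.
        List<String> res = new ArrayList<>();
        res.add(tmpPrimary);
        for (String node : ideal) {
            if (!node.equals(tmpPrimary))
                res.add(node);
        }
        return res;
    }

    /** True once the ideal primary owns the partition, so the ideal assignment can be applied. */
    static boolean readyForIdeal(List<String> ideal, Set<String> owners) {
        return ideal.isEmpty() || owners.contains(ideal.get(0));
    }

    public static void main(String[] args) {
        List<String> ideal = Arrays.asList("C", "A"); // C just joined and is the ideal primary
        Set<String> owners = new LinkedHashSet<>(Arrays.asList("A", "B"));

        System.out.println(temporaryAssignment(ideal, owners)); // A stays primary, C is backup
        System.out.println(readyForIdeal(ideal, owners));

        owners.add("C"); // C finished rebalancing
        System.out.println(temporaryAssignment(ideal, owners)); // ideal assignment applies
        System.out.println(readyForIdeal(ideal, owners));
    }
}
```

In this model, readyForIdeal plays the role of the coordinator's check before it
sends CacheAffinityChangeMessage; the real check covers all partitions and
caches rather than a single one.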
> Partition exchange: node should be assigned as primary only after preloading
> is finished
> ----------------------------------------------------------------------------------------
>
> Key: IGNITE-324
> URL: https://issues.apache.org/jira/browse/IGNITE-324
> Project: Ignite
> Issue Type: Task
> Components: cache
> Affects Versions: sprint-2
> Reporter: Alexey Goncharuk
> Assignee: Semen Boikov
> Priority: Critical
> Fix For: 1.6
>
>
> After node joins topology, affinity assignment should not be changed
> immediately. New node is assigned as a backup node even for those partitions
> that are supposed to be primary. Node becomes primary only when all
> partitions are loaded.