[ https://issues.apache.org/jira/browse/IGNITE-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069394#comment-15069394 ]
Semen Boikov edited comment on IGNITE-647 at 12/23/15 10:26 AM:
----------------------------------------------------------------
There is one more race here:
- node 1 starts, exchange is finished
- node 2 starts, exchange is finished
- node 2 starts a cache with the fair affinity function, starts an exchange and
sends GridDhtAffinityAssignmentRequest to node 1. At this point node 1 has
started the exchange process, but has not started the cache yet and has not
registered the required message handler, so the message will be ignored.
So any dynamic cache start with fair affinity initiated from a non-oldest node
can easily hang.
This behaviour is caused by wrong logic in
GridDhtPartitionsExchangeFuture#canCalculateAffinity: when a cache is started,
all nodes can calculate affinity and there is no need to send
GridDhtAffinityAssignmentRequest.
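
The ordering problem described above can be sketched in isolation. This is only a simplified, single-threaded model and not Ignite code: the Node class, startCache, onAffinityRequest and the cache name "fairCache" are hypothetical names invented for illustration. It assumes that a request arriving before the handler is registered is silently dropped, so the requester's future never completes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

public class AffinityRequestRaceSketch {
    /** Hypothetical stand-in for node 1; not an Ignite class. */
    static final class Node {
        // Handlers keyed by cache name; registered only when the cache starts.
        private final Map<String, Supplier<String>> handlers = new HashMap<>();

        /** Starting the cache registers the handler for affinity requests. */
        void startCache(String cacheName) {
            handlers.put(cacheName, () -> "assignment for " + cacheName);
        }

        /** Completes the reply if a handler exists; otherwise the request is dropped. */
        void onAffinityRequest(String cacheName, CompletableFuture<String> reply) {
            Supplier<String> handler = handlers.get(cacheName);
            if (handler != null)
                reply.complete(handler.get());
            // No handler yet: the message is ignored and nobody completes the reply.
        }
    }

    /** Returns true if the requester (node 2 in the comment) got an answer. */
    static boolean requestAnswered(boolean cacheStartedBeforeRequest) {
        Node node1 = new Node();
        CompletableFuture<String> reply = new CompletableFuture<>();

        if (cacheStartedBeforeRequest)
            node1.startCache("fairCache");

        node1.onAffinityRequest("fairCache", reply);

        // Registering the handler afterwards is too late: the request is already gone.
        if (!cacheStartedBeforeRequest)
            node1.startCache("fairCache");

        return reply.isDone();
    }

    public static void main(String[] args) {
        System.out.println("handler registered first: answered=" + requestAnswered(true));
        System.out.println("request arrived first:    answered=" + requestAnswered(false));
    }
}
```

In the second ordering the reply future stays incomplete forever, which in this model corresponds to the exchange on node 2 waiting for an answer that never comes.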
> org.apache.ignite.IgniteCacheAffinitySelfTest.testAffinity() hangs
> ------------------------------------------------------------------
>
> Key: IGNITE-647
> URL: https://issues.apache.org/jira/browse/IGNITE-647
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Yakov Zhdanov
> Assignee: Semen Boikov
> Priority: Blocker
> Labels: Muted_test
> Attachments: FairAffinityDynamicCacheSelfTest.testStartStopCache.txt,
> threaddump.txt
>
>
> 1-2 out of ~10 local runs hung for me