[ 
https://issues.apache.org/jira/browse/IGNITE-8400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456600#comment-16456600
 ] 

Aleksey Plekhanov commented on IGNITE-8400:
-------------------------------------------

The node is dropped out of the topology because, in some cases, the previous 
node in the ring can't send a message to it and receive a reply within the 
configured failure detection timeout. To solve this I set the reconnect count 
to 2 (this change also disables the failure detection timeout and applies 
separate timeouts to each IO method invocation). I also removed {{sleep()}} 
from {{checkSegmented}}, since it doesn't affect the test logic but adds extra 
delay (with the failure detection timeout disabled, the test runs longer).
Looped test runs on TC [1] after this fix no longer contain the {{Grid is in 
invalid state}} error. However, the {{Test has been timed out}} error still 
occurs sometimes (it also occurs with the current implementation). I think a 
separate ticket should be filed for the {{Test has been timed out}} error once 
this ticket is merged and new test failure statistics have been collected.
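
For illustration, the reconnect count change described above could look 
roughly like the sketch below. It is not the actual patch: it assumes a plain 
{{TcpDiscoverySpi}} instead of the test's {{SplitTcpDiscoverySpi}}, the class 
and method names are hypothetical, and it needs ignite-core on the classpath.

{code:java}
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

/** Hypothetical helper, not part of the Ignite code base. */
public class DiscoveryReconnectConfigSketch {
    /** Builds a node configuration with an explicit discovery reconnect count. */
    static IgniteConfiguration configure(String igniteInstanceName) {
        TcpDiscoverySpi disco = new TcpDiscoverySpi();

        // Setting the reconnect count explicitly makes the discovery SPI ignore
        // IgniteConfiguration.getFailureDetectionTimeout(); the SPI's own socket
        // and ack timeouts are then applied to each network operation.
        disco.setReconnectCount(2);

        IgniteConfiguration cfg = new IgniteConfiguration();

        cfg.setIgniteInstanceName(igniteInstanceName);
        cfg.setDiscoverySpi(disco);

        return cfg;
    }
}
{code}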

[1] 
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Cache3&branch_IgniteTests24Java8=pull%2F3930%2Fhead&tab=buildTypeStatusDiv


> Flaky failure of 
> IgniteTopologyValidatorGridSplitCacheTest.testTopologyValidatorWithCacheGroup 
> (Grid is in invalid state)
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8400
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8400
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksey Plekhanov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
>
> The test sometimes fails on TeamCity with the exception:
> {noformat}
> java.lang.IllegalStateException: Grid is in invalid state to perform this 
> operation. It either not started yet or has already being or have stopped 
> [igniteInstanceName=cache.IgniteTopologyValidatorGridSplitCacheTest6, 
> state=STOPPED]
> {noformat}
> Before this exception, the node is dropped out of the topology by the coordinator:
> {noformat}
> [tcp-disco-msg-worker-#7831%cache.IgniteTopologyValidatorGridSplitCacheTest6%][IgniteCacheTopologySplitAbstractTest$SplitTcpDiscoverySpi]
>  Node is out of topology (probably, due to short-time network problems).
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
