[ 
https://issues.apache.org/jira/browse/IGNITE-11121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755842#comment-16755842
 ] 

Ivan Pavlukhin commented on IGNITE-11121:
-----------------------------------------

I was not able to reproduce the issue. The fix in the patch technically 
addresses the problem by adding one more check. Still, it would be good to 
reproduce the issue and find the root cause, because the failure currently 
looks impossible, which might mean that we make wrong assumptions during 
_mvcc coordinator_ object initialization.
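
To illustrate the kind of check meant here, below is a minimal, self-contained 
sketch. It is not the actual patch and does not use Ignite classes: 
DiscoverySketch and RecoverySketch are hypothetical stand-ins, and the sketch 
assumes the assertion fires because a null coordinator node ID reaches the 
discovery lookup.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a discovery lookup; not the real GridDiscoveryManager. */
final class DiscoverySketch {
    private final Map<UUID, String> nodes = new ConcurrentHashMap<>();

    void addNode(UUID id, String name) {
        nodes.put(id, name);
    }

    /** Assumed contract: passing a null ID is a programming error. */
    String node(UUID nodeId) {
        if (nodeId == null)
            throw new AssertionError("nodeId must not be null");

        return nodes.get(nodeId); // null if the node has left the topology
    }
}

/** Hypothetical stand-in for the recovery path that contacts the coordinator. */
final class RecoverySketch {
    private final DiscoverySketch disco;

    /** Stays null until a coordinator has been assigned. */
    private volatile UUID crdNodeId;

    RecoverySketch(DiscoverySketch disco) {
        this.disco = disco;
    }

    void assignCoordinator(UUID id) {
        crdNodeId = id;
    }

    /** Returns true only if a request could be addressed to a live coordinator. */
    boolean sendRecoveryRequest() {
        UUID crd = crdNodeId; // read the volatile field once

        if (crd == null)
            return false; // the extra check: coordinator is not initialized yet

        return disco.node(crd) != null;
    }
}
```

Such a guard avoids the AssertionError, but it does not explain how the 
recovery path is reached with an uninitialized coordinator in the first place.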

> MVCC TX: AssertionError in discovery manager on BLT change.
> -----------------------------------------------------------
>
>                 Key: IGNITE-11121
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11121
>             Project: Ignite
>          Issue Type: Bug
>          Components: mvcc
>            Reporter: Igor Seliverstov
>            Assignee: Ivan Pavlukhin
>            Priority: Critical
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following exception appeared in the logs on a BLT change.
> {noformat}
> [12:11:36,912][SEVERE][sys-#87][GridClosureProcessor] Closure execution failed with error.
> java.lang.AssertionError
>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.node(GridDiscoveryManager.java:1794)
>         at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1693)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.lambda$onTimeout0$16553d7$1(IgniteTxManager.java:2592)
>         at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
>         at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:354)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.onTimeout0(IgniteTxManager.java:2588)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject.access$3300(IgniteTxManager.java:2505)
>         at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$NodeFailureTimeoutObject$1.run(IgniteTxManager.java:2623)
>         at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6874)
>         at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}
> From the stack trace, a node failure triggers transaction recovery while the 
> MVCC coordinator is uninitialized (this means there are no server nodes, or 
> there is a coordinatorAssignClosure which returns no result, or the 
> recovering node was not activated).
> Scenario in which the exception may be observed:
> # Start a cluster
> # Load some data (from client node, the client node is shut down after that)
> # Calculate hash
> # Add new server node
> # Change BLT
> # Wait for rebalance
> # Calculate new hash and check it is the same as previously calculated



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
