[
https://issues.apache.org/jira/browse/IGNITE-19136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vyacheslav Koptilin updated IGNITE-19136:
-----------------------------------------
Fix Version/s: 3.0.0-beta2
> Handling timeout on waiting for replica readiness
> -------------------------------------------------
>
> Key: IGNITE-19136
> URL: https://issues.apache.org/jira/browse/IGNITE-19136
> Project: Ignite
> Issue Type: Bug
> Reporter: Vladislav Pyatkov
> Assignee: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> *Motivation*
> There are several reasons why a replica can respond with
> _ReplicaNotReadyException_ (storage recovery has not completed yet, indexes
> have not been created). In this case, the sender is required to send an
> AwaitReplicaRequest and must not retry the request until the
> AwaitReplicaResponse is received. However, when waiting for replica
> readiness times out, the reason is not obvious. The result is an exception
> that is easy to mistake for a failure to handle
> _ReplicaNotReadyException_ at all:
> {noformat}
> Replica is not ready
> [replicationGroupId=474283c9-a39e-431a-895f-751003052d7a_part_10,
> nodeName=irott_n_1]
> at app//org.apache.ignite.internal.replicator.ReplicaManager.sendReplicaUnavailableErrorResponse(ReplicaManager.java:385)
> at app//org.apache.ignite.internal.replicator.ReplicaManager.onReplicaMessageReceived(ReplicaManager.java:167)
> at app//org.apache.ignite.network.DefaultMessagingService.onMessage(DefaultMessagingService.java:358)
> at app//org.apache.ignite.network.DefaultMessagingService.lambda$onMessage$3(DefaultMessagingService.java:314)
> at [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at [email protected]/java.lang.Thread.run(Thread.java:834)
> {noformat}
> *Definition of Done*
> A message that clearly describes the situation where waiting for the replica
> to become ready timed out:
> {noformat}
> Could not wait for the replica become ready for the timeout
> [replicationGroupId=474283c9-a39e-431a-895f-751003052d7a_part_10,
> nodeName=irott_n_1, timeout=3000]
> {noformat}
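
The behaviour requested above could be sketched roughly as follows. This is only an illustration, not the actual Ignite code: the class ReplicaAwaitSketch, the exception ReplicaAwaitTimeoutException, and the methods formatMessage and awaitReplica are all hypothetical names, and a plain CompletableFuture stands in for the pending AwaitReplicaResponse. The point is that a timeout while waiting is reported with its own message, including the timeout value, rather than the generic "Replica is not ready" text.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ReplicaAwaitSketch {
    /** Hypothetical exception type: waiting for readiness timed out (not "replica is not ready"). */
    public static class ReplicaAwaitTimeoutException extends RuntimeException {
        public ReplicaAwaitTimeoutException(String groupId, String nodeName, long timeoutMs, Throwable cause) {
            super(formatMessage(groupId, nodeName, timeoutMs), cause);
        }
    }

    /** Builds the message in the shape proposed in the Definition of Done. */
    public static String formatMessage(String groupId, String nodeName, long timeoutMs) {
        return "Could not wait for the replica become ready for the timeout [replicationGroupId="
                + groupId + ", nodeName=" + nodeName + ", timeout=" + timeoutMs + ']';
    }

    /**
     * Blocks until the future standing in for AwaitReplicaResponse completes.
     * No further requests are issued while waiting; a timeout is translated
     * into the dedicated exception so the cause is visible in the message.
     */
    public static <T> T awaitReplica(CompletableFuture<T> awaitResponse, String groupId,
            String nodeName, long timeoutMs) {
        try {
            return awaitResponse.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new ReplicaAwaitTimeoutException(groupId, nodeName, timeoutMs, e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before propagating.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the replica", e);
        } catch (ExecutionException e) {
            throw new CompletionException(e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that never completes simulates a replica that never becomes ready.
        CompletableFuture<Void> neverReady = new CompletableFuture<>();
        try {
            awaitReplica(neverReady, "474283c9-a39e-431a-895f-751003052d7a_part_10", "irott_n_1", 50);
        } catch (ReplicaAwaitTimeoutException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The short 50 ms timeout in main is chosen only to keep the demonstration fast; the issue's example uses timeout=3000.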