sanpwc commented on code in PR #5179:
URL: https://github.com/apache/ignite-3/pull/5179#discussion_r1949162986
##########
modules/partition-replicator/src/main/java/org/apache/ignite/internal/partition/replicator/PartitionReplicaLifecycleManager.java:
##########
@@ -491,27 +512,43 @@ private CompletableFuture<?> createZonePartitionReplicationNode(
);
Supplier<CompletableFuture<Boolean>> startReplicaSupplier = () -> {
- try {
- return replicaMgr.startReplica(
- zonePartitionId,
- raftClient -> new ZonePartitionReplicaListener(
- new ExecutorInclinedRaftCommandRunner(raftClient, partitionOperationsExecutor)),
- new FailFastSnapshotStorageFactory(),
- stablePeersAndLearners,
- raftGroupListener,
- raftGroupEventsListener,
- busyLock
- ).thenCompose(replica -> executeUnderZoneWriteLock(zonePartitionId.zoneId(), () -> {
- replicationGroupIds.add(zonePartitionId);
-
- var eventParams = new LocalPartitionReplicaEventParameters(zonePartitionId, revision);
-
- return fireEvent(LocalPartitionReplicaEvent.AFTER_REPLICA_STARTED, eventParams);
- }))
- .thenApply(unused -> false);
- } catch (NodeStoppingException e) {
- return failedFuture(e);
- }
+ var eventParams = new LocalPartitionReplicaEventParameters(zonePartitionId, revision);
+
+ return fireEvent(LocalPartitionReplicaEvent.BEFORE_REPLICA_STARTED, eventParams)
+ .thenCompose(v -> {
+ try {
+ return replicaMgr.startReplica(
+ zonePartitionId,
+ raftClient -> {
+ var runner = new ExecutorInclinedRaftCommandRunner(raftClient, partitionOperationsExecutor);
+
+ var replicaListener = new ZonePartitionReplicaListener(runner);
+
+ listeners.replicaListenerFuture.complete(replicaListener);
+
+ return replicaListener;
+ },
+ new FailFastSnapshotStorageFactory(),
+ stablePeersAndLearners,
+ raftGroupListener,
+ raftGroupEventsListener,
+ busyLock
+ );
+ } catch (NodeStoppingException e) {
+ return failedFuture(e);
+ }
+ })
+ .thenCompose(replica -> executeUnderZoneWriteLock(zonePartitionId.zoneId(), () -> {
+ replicationGroupIds.add(zonePartitionId);
+
+ return fireEvent(LocalPartitionReplicaEvent.AFTER_REPLICA_STARTED, eventParams);
+ }))
+ .whenComplete((v, e) -> {
+ if (e != null) {
+ listenersByZonePartitionId.remove(zonePartitionId);
Review Comment:
> What is the expected behavior?
I believe the failure handler should be called. In turn, since we only have
a log-only failure handler (FH), the node will have to continue processing
requests. Obviously, if there is no replica, all user requests will fail with
a tx timeout while waiting for replica readiness.
From our perspective, it seems we may treat a missing future in the map as
having been removed after an exception. In that case, we can write an [error]
message to the log and skip further table loading.
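
To make the suggestion concrete, here is a rough sketch of what I have in mind. It is not the actual manager code: `ListenerLookupSketch`, `startReplica`, `loadTable`, and the string keys are hypothetical stand-ins, with the map playing the role of `listenersByZonePartitionId`. The only point is that a missing entry is read as "replica start failed earlier", so we log an error and skip table loading.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

public class ListenerLookupSketch {
    private static final Logger LOG = Logger.getLogger(ListenerLookupSketch.class.getName());

    // Hypothetical stand-in for listenersByZonePartitionId; keys and values are simplified to strings.
    final Map<String, CompletableFuture<String>> listenersByPartition = new ConcurrentHashMap<>();

    // Mirrors the whenComplete branch in the diff: the entry is dropped if replica start failed.
    CompletableFuture<Void> startReplica(String partitionId, CompletableFuture<Void> startFuture) {
        listenersByPartition.put(partitionId, new CompletableFuture<>());

        return startFuture.whenComplete((v, e) -> {
            if (e != null) {
                listenersByPartition.remove(partitionId);
            }
        });
    }

    // Returns true if table loading may proceed, false if it was skipped.
    boolean loadTable(String partitionId) {
        CompletableFuture<String> listenerFuture = listenersByPartition.get(partitionId);

        if (listenerFuture == null) {
            // Assumption: absence means the future was removed after a failed replica start.
            LOG.severe("Replica listener is missing, skipping table loading [partitionId=" + partitionId + ']');

            return false;
        }

        // Real table loading would chain on listenerFuture here.
        return true;
    }
}
```

With this shape, the node stays up after the failure handler has logged the problem, and table loading for the affected partition becomes a no-op instead of waiting on a future that will never complete.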