vldpyatkov commented on code in PR #7030:
URL: https://github.com/apache/ignite-3/pull/7030#discussion_r2564821550


##########
modules/distribution-zones/src/main/java/org/apache/ignite/internal/distributionzones/rebalance/ZoneRebalanceRaftGroupEventsListener.java:
##########
@@ -254,32 +257,13 @@ public void onLeaderElected(
 
                             PeersAndLearners peersAndLearners = PeersAndLearners.fromConsistentIds(peers, learners);
 
-                            partitionMover.movePartition(peersAndLearners, term, entry.revision())
-                                    .whenComplete((unused, ex) -> {
-                                        // TODO https://issues.apache.org/jira/browse/IGNITE-23633 remove !hasCause(ex, TimeoutException.class)
-                                        if (ex != null && !hasCause(ex, NodeStoppingException.class) && !hasCause(ex,
-                                                TimeoutException.class)) {
-                                            String errorMessage = String.format(
-                                                    "Unable to start rebalance [zonePartitionId=%s, term=%s]",
-                                                    zonePartitionId,
-                                                    term
-                                            );
-                                            failureProcessor.process(new FailureContext(ex, errorMessage));
-                                        }
-                                    });
+                            changePeersAndLearnersWithRetry.executeOnLeader(peersAndLearners, term, entry.revision())
+                                    .whenComplete((unused, ex) -> maybeRunFailHandler(ex, term));

Review Comment:
   We could rewrite it using the _exceptionally_ method, but it is your choice.
   ```
   changePeersAndLearnersWithRetry.executeOnLeader(peersAndLearners, term, entry.revision())
           .exceptionally(ex -> {
               maybeRunFailHandler(ex, term);

               return null;
           });
   ```
   I don't insist.
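
   For reference, the two styles are not fully equivalent. The sketch below uses only `java.util.concurrent.CompletableFuture`; the class name `HandlerStyles` and the simplified one-argument `maybeRunFailHandler` are hypothetical stand-ins, not the PR's code. With `whenComplete` the callback also runs on success (with a null error) and the returned future keeps the original outcome, while `exceptionally` runs only on failure and turns it into a normal completion with `null`.

   ```java
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.atomic.AtomicInteger;

   public class HandlerStyles {
       /** Counts how often the stand-in handler saw a real (non-null) error. */
       static final AtomicInteger handled = new AtomicInteger();

       /** Hypothetical stand-in for maybeRunFailHandler; tolerates a null error. */
       static void maybeRunFailHandler(Throwable ex) {
           if (ex != null) {
               handled.incrementAndGet();
           }
       }

       /** Callback runs on success and failure; the result keeps the original outcome. */
       static CompletableFuture<Void> viaWhenComplete(CompletableFuture<Void> source) {
           return source.whenComplete((unused, ex) -> maybeRunFailHandler(ex));
       }

       /** Callback runs only on failure; the result then completes normally with null. */
       static CompletableFuture<Void> viaExceptionally(CompletableFuture<Void> source) {
           return source.exceptionally(ex -> {
               maybeRunFailHandler(ex);

               return null;
           });
       }
   }
   ```

   If the returned future is discarded, as in the listener above, the difference does not matter; it only matters when a caller inspects the result.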



##########
modules/raft-api/src/main/java/org/apache/ignite/internal/raft/rebalance/RaftWithTerm.java:
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.raft.rebalance;
+
+import org.apache.ignite.internal.raft.service.RaftGroupService;
+
+/**
+ * Wrapper for RaftGroupService and term.
+ */
+public class RaftWithTerm {

Review Comment:
   I am not sure how widely this construct would be used, because we already 
have IgniteBiTuple. 



##########
modules/raft-api/src/main/java/org/apache/ignite/internal/raft/rebalance/RaftCommandWithRetry.java:
##########
@@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.raft.rebalance;
+
+import static java.util.concurrent.CompletableFuture.failedFuture;
+import static java.util.concurrent.TimeUnit.MILLISECONDS;
+import static org.apache.ignite.internal.raft.rebalance.ExceptionUtils.recoverable;
+import static org.apache.ignite.internal.util.CompletableFutures.copyStateTo;
+import static org.apache.ignite.lang.ErrorGroups.Common.NODE_STOPPING_ERR;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.function.Function;
+import org.apache.ignite.internal.lang.IgniteInternalException;
+import org.apache.ignite.internal.lang.NodeStoppingException;
+import org.apache.ignite.internal.logger.IgniteLogger;
+import org.apache.ignite.internal.logger.Loggers;
+import org.apache.ignite.internal.util.CompletableFutures;
+import org.apache.ignite.internal.util.IgniteBusyLock;
+
+/**
+ * Helper class that executes raft command with retries.
+ */
+public class RaftCommandWithRetry {
+    private static final IgniteLogger LOG = Loggers.forClass(RaftCommandWithRetry.class);
+
+    private static final long MOVE_RESCHEDULE_DELAY_MILLIS = 100;

Review Comment:
   We should add a TODO here.
   I'm not sure exactly what should be done with this number. Either remove it 
once the RAFT client can wait indefinitely, or move it to the 
configuration.



##########
modules/raft-api/src/main/java/org/apache/ignite/internal/raft/rebalance/ChangePeersAndLearnersWithRetry.java:
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.raft.rebalance;
+
+import static org.apache.ignite.internal.util.CompletableFutures.nullCompletedFuture;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.function.Function;
+import java.util.function.Supplier;
+import org.apache.ignite.internal.raft.PeersAndLearners;
+import org.apache.ignite.internal.raft.service.RaftGroupService;
+import org.apache.ignite.internal.util.IgniteBusyLock;
+import org.jetbrains.annotations.Nullable;
+
+/**
+ * Helper class that executes change peers and learners async with retries.
+ */
+public class ChangePeersAndLearnersWithRetry {
+
+    private final RaftCommandWithRetry raftCommand;
+
+    private final Supplier<CompletableFuture<RaftGroupService>> raftGroupServiceSupplier;
+
+    /**
+     * Creates a new instance of ChangePeersAndLearnersWithRetry.
+     *
+     * @param busyLock The busy lock.
+     * @param rebalanceScheduler The scheduler for rebalance tasks.
+     * @param raftGroupServiceSupplier The supplier of raft group service.
+     */
+    public ChangePeersAndLearnersWithRetry(
+            IgniteBusyLock busyLock,
+            ScheduledExecutorService rebalanceScheduler,
+            Supplier<CompletableFuture<RaftGroupService>> raftGroupServiceSupplier
+    ) {
+        this.raftGroupServiceSupplier = raftGroupServiceSupplier;
+
+        raftCommand = new RaftCommandWithRetry(busyLock, rebalanceScheduler);
+    }
+
+    /**
+     * Performs {@link RaftGroupService#changePeersAndLearnersAsync} on a provided raft group service of a partition, so nodes of the
+     * corresponding raft group can be reconfigured. Retry mechanism is applied to repeat
+     * {@link RaftGroupService#changePeersAndLearnersAsync} if previous one failed with some exception.
+     *
+     * @return Function which performs {@link RaftGroupService#changePeersAndLearnersAsync}.
+     */
+    public CompletableFuture<Void> execute(
+            PeersAndLearners peersAndLearners,
+            long sequenceToken,
+            Function<RaftGroupService, CompletableFuture<@Nullable RaftWithTerm>> leaderFilter) {
+
+        return raftCommand.execute(() ->
+                raftGroupServiceSupplier
+                        .get()
+                        .thenCompose(leaderFilter)
+                        .thenCompose(raftWithTerm -> {
+                            if (raftWithTerm == null) {
+                                return nullCompletedFuture();
+                            }
+
+                            return raftWithTerm.raftClient()
+                                    .changePeersAndLearnersAsync(peersAndLearners, raftWithTerm.term(), sequenceToken);
+                        }));
+    }
+
+    /**
+     * Performs {@link RaftGroupService#changePeersAndLearnersAsync} on a provided raft group service of a partition, so nodes of the
+     * corresponding raft group can be reconfigured. Retry mechanism is applied to repeat
+     * {@link RaftGroupService#changePeersAndLearnersAsync} if previous one failed with some exception.
+     *
+     * @return Function which performs {@link RaftGroupService#changePeersAndLearnersAsync}.
+     */
+    public CompletableFuture<Void> executeOnLeader(PeersAndLearners peersAndLearners, long term, long sequenceToken) {

Review Comment:
   This method is very similar to the previous one. What do you think about 
reusing that one, like this:
   ```
   execute(peersAndLearners, sequenceToken, raftClient -> completedFuture(new RaftWithTerm(raftClient, term)));
   ```
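
   To illustrate the delegation, here is a minimal self-contained sketch. `Client`, `ClientWithTerm`, `changePeers` and the `String` peers argument are hypothetical stand-ins for `RaftGroupService`, `RaftWithTerm`, `changePeersAndLearnersAsync` and `PeersAndLearners`; the retry wrapper and the service supplier are left out to keep it short.

   ```java
   import java.util.concurrent.CompletableFuture;
   import java.util.function.Function;

   public class DelegationSketch {
       /** Hypothetical stand-in for RaftGroupService. */
       interface Client {
           CompletableFuture<Void> changePeers(String peers, long term, long sequenceToken);
       }

       /** Hypothetical stand-in for RaftWithTerm: a client paired with a term. */
       static final class ClientWithTerm {
           final Client client;
           final long term;

           ClientWithTerm(Client client, long term) {
               this.client = client;
               this.term = term;
           }
       }

       /** General form: the provider decides which client and term to use, or yields null to skip. */
       static CompletableFuture<Void> execute(
               Client client,
               String peers,
               long sequenceToken,
               Function<Client, CompletableFuture<ClientWithTerm>> provider
       ) {
           return provider.apply(client).thenCompose(withTerm -> {
               if (withTerm == null) {
                   return CompletableFuture.<Void>completedFuture(null);
               }

               return withTerm.client.changePeers(peers, withTerm.term, sequenceToken);
           });
       }

       /** Special case: the term is already known, so the provider just wraps the given client. */
       static CompletableFuture<Void> executeOnLeader(Client client, String peers, long term, long sequenceToken) {
           return execute(client, peers, sequenceToken,
                   c -> CompletableFuture.completedFuture(new ClientWithTerm(c, term)));
       }
   }
   ```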



##########
modules/raft-api/src/main/java/org/apache/ignite/internal/raft/rebalance/ChangePeersAndLearnersWithRetry.java:
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.raft.rebalance;
+
+import static org.apache.ignite.internal.util.CompletableFutures.nullCompletedFuture;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.function.Function;
+import java.util.function.Supplier;
+import org.apache.ignite.internal.raft.PeersAndLearners;
+import org.apache.ignite.internal.raft.service.RaftGroupService;
+import org.apache.ignite.internal.util.IgniteBusyLock;
+import org.jetbrains.annotations.Nullable;
+
+/**
+ * Helper class that executes change peers and learners async with retries.
+ */
+public class ChangePeersAndLearnersWithRetry {
+
+    private final RaftCommandWithRetry raftCommand;
+
+    private final Supplier<CompletableFuture<RaftGroupService>> raftGroupServiceSupplier;
+
+    /**
+     * Creates a new instance of ChangePeersAndLearnersWithRetry.
+     *
+     * @param busyLock The busy lock.
+     * @param rebalanceScheduler The scheduler for rebalance tasks.
+     * @param raftGroupServiceSupplier The supplier of raft group service.
+     */
+    public ChangePeersAndLearnersWithRetry(
+            IgniteBusyLock busyLock,
+            ScheduledExecutorService rebalanceScheduler,
+            Supplier<CompletableFuture<RaftGroupService>> raftGroupServiceSupplier
+    ) {
+        this.raftGroupServiceSupplier = raftGroupServiceSupplier;
+
+        raftCommand = new RaftCommandWithRetry(busyLock, rebalanceScheduler);
+    }
+
+    /**
+     * Performs {@link RaftGroupService#changePeersAndLearnersAsync} on a provided raft group service of a partition, so nodes of the
+     * corresponding raft group can be reconfigured. Retry mechanism is applied to repeat
+     * {@link RaftGroupService#changePeersAndLearnersAsync} if previous one failed with some exception.
+     *
+     * @return Function which performs {@link RaftGroupService#changePeersAndLearnersAsync}.
+     */
+    public CompletableFuture<Void> execute(
+            PeersAndLearners peersAndLearners,
+            long sequenceToken,
+            Function<RaftGroupService, CompletableFuture<@Nullable RaftWithTerm>> leaderFilter) {

Review Comment:
   Why is this parameter called filter? It looks like a RAFT client provider.



##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/replicator/PartitionReplicaListener.java:
##########
@@ -656,6 +656,7 @@ private CompletableFuture<Void> processChangePeersAndLearnersReplicaRequest(Chan
                             peersAndLearners
                     );
 
+                    // TODO: Intentionally did not change as colocation disabled mode is officially not supported.

Review Comment:
   TODO without a ticket.



##########
modules/raft-api/src/main/java/org/apache/ignite/internal/raft/rebalance/ChangePeersAndLearnersWithRetry.java:
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.raft.rebalance;
+
+import static org.apache.ignite.internal.util.CompletableFutures.nullCompletedFuture;
+
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.function.Function;
+import java.util.function.Supplier;
+import org.apache.ignite.internal.raft.PeersAndLearners;
+import org.apache.ignite.internal.raft.service.RaftGroupService;
+import org.apache.ignite.internal.util.IgniteBusyLock;
+import org.jetbrains.annotations.Nullable;
+
+/**
+ * Helper class that executes change peers and learners async with retries.
+ */
+public class ChangePeersAndLearnersWithRetry {

Review Comment:
   All the class descriptions suggest it retries a change peers command, but I 
do not see any retry logic here.
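
   In the quoted diff the class only delegates to `raftCommand.execute(...)`, so the retry loop presumably lives in `RaftCommandWithRetry`, whose body is not quoted here. As a point of comparison, below is a generic sketch of a delayed retry over a `ScheduledExecutorService`. It is not the PR's implementation: `RetrySketch`, `withRetry`, the bounded `attemptsLeft` counter and the retry-on-any-failure policy are all assumptions made for the illustration.

   ```java
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;
   import java.util.function.Supplier;

   public class RetrySketch {
       /** Runs the command and re-runs it after delayMillis on failure, up to attemptsLeft attempts in total. */
       static <T> CompletableFuture<T> withRetry(
               Supplier<CompletableFuture<T>> command,
               ScheduledExecutorService scheduler,
               long delayMillis,
               int attemptsLeft
       ) {
           CompletableFuture<T> result = new CompletableFuture<>();

           attempt(command, scheduler, delayMillis, attemptsLeft, result);

           return result;
       }

       private static <T> void attempt(
               Supplier<CompletableFuture<T>> command,
               ScheduledExecutorService scheduler,
               long delayMillis,
               int attemptsLeft,
               CompletableFuture<T> result
       ) {
           command.get().whenComplete((value, ex) -> {
               if (ex == null) {
                   result.complete(value);
               } else if (attemptsLeft <= 1) {
                   // Out of attempts: surface the last failure to the caller.
                   result.completeExceptionally(ex);
               } else {
                   // Assumption: every failure is retried; a real policy would check whether it is recoverable.
                   scheduler.schedule(
                           () -> attempt(command, scheduler, delayMillis, attemptsLeft - 1, result),
                           delayMillis,
                           TimeUnit.MILLISECONDS
                   );
               }
           });
       }
   }
   ```

   Passing the delay as a parameter, instead of a constant, is one way such a value could later be wired to configuration.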



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
