sanpwc commented on code in PR #3467:
URL: https://github.com/apache/ignite-3/pull/3467#discussion_r1541221886
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/Replica.java:
##########
@@ -196,7 +221,8 @@ private CompletableFuture<LeaseGrantedMessageResponse>
processLeaseGrantedMessag
// Replica must wait till storage index reaches the current
leader's index to make sure that all updates made on the
// group leader are received.
- return
waitForActualState(msg.leaseExpirationTime().getPhysical())
+ return
sendPrimaryReplicaChangeToReplicationGroup(msg.leaseStartTime().longValue())
+ .thenCompose(v ->
waitForActualState(msg.leaseExpirationTime().getPhysical()))
Review Comment:
I expect it to be reversed: initially we waitForAcualState and then notify
raft group about new leasholder. By the way I'd actually add new method for
wait + notify in order to call it here and within collocation logic.
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/Replica.java:
##########
@@ -168,7 +175,25 @@ private CompletableFuture<ClusterNode> leaderFuture() {
*/
public CompletableFuture<? extends NetworkMessage>
processPlacementDriverMessage(PlacementDriverReplicaMessage msg) {
if (msg instanceof LeaseGrantedMessage) {
- return processLeaseGrantedMessage((LeaseGrantedMessage) msg);
+ return processLeaseGrantedMessage((LeaseGrantedMessage) msg)
+ .handle((v, e) -> {
Review Comment:
What's the point of this handle? Seems that the only goal is to send
non-accepted response in case of NodeStoppingException that generally won't be
send because network was already disabled.
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/Replica.java:
##########
@@ -219,6 +246,14 @@ private CompletableFuture<LeaseGrantedMessageResponse>
processLeaseGrantedMessag
}));
}
+ private CompletableFuture<Void>
sendPrimaryReplicaChangeToReplicationGroup(long leaseStartTime) {
Review Comment:
How do we handle raft exceptions within lease granting process?
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/exception/TransactionRetryAllowingException.java:
##########
@@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.replicator.exception;
+
+/**
+ * Marker interface for exception that allow transaction retries.
+ */
+public interface TransactionRetryAllowingException {
Review Comment:
- First of all I don't think that we need such exception for the given
ticket. Please check the comment above.
- I believe it should be a part of tx module instead of `package
org.apache.ignite.internal.replicator.exception;`.
- Despite the fact that it's a tiny class itself the idea is rather big and
thus should be both broadcasted to colleagues. Meaning that after merging we
should highlight that new marker interface was introduced.
- BTW I'd rename it to TransactionRetriableException.
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/exception/FullTransactionPrimaryReplicaMissException.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.replicator.exception;
+
+import static org.apache.ignite.lang.ErrorGroups.Replicator.REPLICA_MISS_ERR;
+
+import org.apache.ignite.internal.lang.IgniteInternalException;
+
+/**
+ * This exception is used when a full transaction misses primary replica.
+ */
+public class FullTransactionPrimaryReplicaMissException extends
IgniteInternalException implements TransactionRetryAllowingException {
Review Comment:
Same as below, it looks confusing to have FullTransaction... in `package
org.apache.ignite.internal.replicator.`. Does it really matters that it's the
transaction, why not to throw general replication PrimaryReplicaMissException
and if it was handled within full transaction retry an entire one? May discuss
this in person if you want.
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/command/UpdateAllCommand.java:
##########
@@ -38,6 +40,9 @@ public interface UpdateAllCommand extends PartitionCommand {
String txCoordinatorId();
+ /** Lease start time, hybrid timestamp as long, see {@link
HybridTimestamp#longValue()}. Should be non-null for the full transactions.*/
+ @Nullable Long leaseStartTime();
Review Comment:
Same as above, enlistmentConsistencyToken it is. And we definitely need
proper javadoc explaining why we need it, when it'll be null, etc.
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/ReplicaService.java:
##########
@@ -48,8 +49,11 @@
/** The service is intended to execute requests on replicas. */
public class ReplicaService {
+ /** RPC timeout system property name. */
+ public static final String REPLICA_SERVICE_RPC_TIMEOUT =
"REPLICA_SERVICE_RPC_TIMEOUT";
Review Comment:
Should we have the constant in IgniteSystemProperties itself?
##########
modules/table/src/main/java/org/apache/ignite/internal/table/distributed/command/UpdateCommand.java:
##########
@@ -39,6 +39,9 @@ public interface UpdateCommand extends PartitionCommand {
String txCoordinatorId();
+ /** Lease start time, hybrid timestamp as long, see {@link
HybridTimestamp#longValue()}. Should be non-null for the full transactions.*/
+ @Nullable Long leaseStartTime();
Review Comment:
I actually agree that it's better to duplicate the field instead of
introducing new interface or abstract class or similar.
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/message/PrimaryReplicaChangeCommand.java:
##########
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.ignite.internal.replicator.message;
+
+import static
org.apache.ignite.internal.replicator.message.ReplicaMessageGroup.PRIMARY_REPLICA_CHANGE_COMMAND;
+
+import org.apache.ignite.internal.hlc.HybridTimestamp;
+import org.apache.ignite.internal.network.annotations.Transferable;
+import org.apache.ignite.internal.raft.WriteCommand;
+
+/**
+ * Command to write the primary replica change to the replication group.
+ */
+@Transferable(PRIMARY_REPLICA_CHANGE_COMMAND)
+public interface PrimaryReplicaChangeCommand extends WriteCommand {
+ /** Lease start time, hybrid timestamp as long, see {@link
HybridTimestamp#longValue()}. */
+ long leaseStartTime();
Review Comment:
I believe that an enlistmentConsistencyToken naming fits better here.
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/ReplicaService.java:
##########
@@ -48,8 +49,11 @@
/** The service is intended to execute requests on replicas. */
public class ReplicaService {
+ /** RPC timeout system property name. */
+ public static final String REPLICA_SERVICE_RPC_TIMEOUT =
"REPLICA_SERVICE_RPC_TIMEOUT";
+
/** Network timeout. */
- private static final long RPC_TIMEOUT = 3000;
+ private final long rpcTimeout =
IgniteSystemProperties.getInteger(REPLICA_SERVICE_RPC_TIMEOUT, 3000);
Review Comment:
Why it's a system property and not the configuration? Just curious.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]