vldpyatkov commented on code in PR #4329:
URL: https://github.com/apache/ignite-3/pull/4329#discussion_r1751899387
##########
modules/placement-driver/src/main/java/org/apache/ignite/internal/placementdriver/LeaseUpdater.java:
##########
@@ -230,23 +253,60 @@ private CompletableFuture<Boolean>
denyLease(ReplicationGroupId grpId, Lease lea
Collection<Lease> leases = leasesCurrent.leaseByGroupId().values();
+ return denyLeaseImMetaStorage(deniedLease, leases,
leasesCurrent.leasesBytes());
+ }
+
+ private CompletableFuture<Boolean> denyLeaseImMetaStorage(
+ Lease deniedLease,
+ Collection<Lease> currentLeases,
+ byte[] currentLeasesBytes
+ ) {
+ ByteArray key = PLACEMENTDRIVER_LEASES_KEY;
+
+ IgniteBiTuple<List<Lease>, Boolean> renewedLeasesTup =
replaceLeaseInCollection(currentLeases, deniedLease);
+
+ if (!renewedLeasesTup.get2()) {
+ return falseCompletedFuture();
+ } else {
+ return msManager.invoke(
+ or(notExists(key), value(key).eq(currentLeasesBytes)),
+ put(key, new LeaseBatch(renewedLeasesTup.get1()).bytes()),
+ noop()
+ ).thenCompose(res -> {
+ if (res) {
+ return trueCompletedFuture();
+ } else {
+ return refreshLeasesFromMetaStorage()
Review Comment:
I am concerned about thread starvation with this approach. We could probably
return to the base approach and not retry the invoke on the server side.
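
A minimal sketch of the single-attempt variant, assuming a hypothetical in-memory store in place of the meta storage (`SingleAttemptDeny`, `invoke` and `denyLeaseOnce` are illustrative names, not the Ignite API). A lost race is reported to the caller as `false`, so no refresh-and-retry chain is built on the server side:

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;

/** Hypothetical in-memory stand-in for the meta storage; not the Ignite API. */
final class SingleAttemptDeny {
    /** Current serialized lease batch; null means the key does not exist yet. */
    private byte[] stored;

    SingleAttemptDeny(byte[] initial) {
        this.stored = initial;
    }

    /** Conditional put: applies the update only if the key is absent or holds the expected bytes. */
    synchronized CompletableFuture<Boolean> invoke(byte[] expected, byte[] update) {
        boolean matches = stored == null || Arrays.equals(stored, expected);

        if (matches) {
            stored = update;
        }

        return CompletableFuture.completedFuture(matches);
    }

    /** One attempt only: the result of the conditional put is returned as is, with no retry. */
    CompletableFuture<Boolean> denyLeaseOnce(byte[] expected, byte[] update) {
        return invoke(expected, update);
    }
}
```

With this shape, the decision to retry (and on which thread) is left to the caller.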
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/ReplicaManager.java:
##########
@@ -548,21 +552,39 @@ private void
onPlacementDriverMessageReceived(NetworkMessage msg0, ClusterNode s
* @param redirectNodeId Node consistent id to redirect.
*/
private void stopLeaseProlongation(ReplicationGroupId groupId, @Nullable
String redirectNodeId) {
- LOG.info("The replica does not meet the requirements for the
leaseholder [groupId={}, redirectNodeId={}]", groupId, redirectNodeId);
+ stopLeaseProlongation(groupId, redirectNodeId, false);
+ }
+
+ /**
+ * Sends stop lease prolongation message to all participants of placement
driver group.
+ *
+ * @param groupId Replication group id.
+ * @param redirectNodeId Node consistent id to redirect.
+ * @param waitForPrimary Whether wait for primary to appear.
+ * @return Future that is completed when {@link
StopLeaseProlongationMessage} is sent.
+ */
+ private CompletableFuture<Void> stopLeaseProlongation(
+ ReplicationGroupId groupId,
+ @Nullable String redirectNodeId,
+ boolean waitForPrimary
+ ) {
+ CompletableFuture<ReplicaMeta> primaryReplicaFuture = waitForPrimary
+ ? placementDriver.awaitPrimaryReplica(groupId,
clockService.now(), 120, TimeUnit.SECONDS)
Review Comment:
How do you choose the timeout for waiting for the primary here?
After a short discussion, we decided not to wait for the primary here. Instead,
we will send the stop lease prolongation message at any time, even if
the node is not the primary in MC.
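
A minimal sketch of sending the message without waiting for a primary, under stated assumptions: `StopProlongationSketch`, `StopMessage` and the `send` callback are hypothetical stand-ins for the messaging layer, not the actual `ReplicaManager` code. The message goes to every placement driver node immediately, and the returned future completes once all sends complete:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;

/** Illustrative only: the types and the send callback are assumptions. */
final class StopProlongationSketch {
    /** Hypothetical message payload. */
    static final class StopMessage {
        final String groupId;
        final String redirectNodeId;

        StopMessage(String groupId, String redirectNodeId) {
            this.groupId = groupId;
            this.redirectNodeId = redirectNodeId;
        }
    }

    /** Sends the stop message to all placement driver nodes right away, with no await on a primary. */
    static CompletableFuture<Void> stopLeaseProlongation(
            List<String> placementDriverNodes,
            String groupId,
            String redirectNodeId,
            BiFunction<String, StopMessage, CompletableFuture<Void>> send
    ) {
        StopMessage msg = new StopMessage(groupId, redirectNodeId);

        CompletableFuture<?>[] sends = placementDriverNodes.stream()
                .map(node -> send.apply(node, msg))
                .toArray(CompletableFuture[]::new);

        return CompletableFuture.allOf(sends);
    }
}
```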
##########
modules/replicator/src/main/java/org/apache/ignite/internal/replicator/ReplicaManager.java:
##########
@@ -548,21 +552,39 @@ private void
onPlacementDriverMessageReceived(NetworkMessage msg0, ClusterNode s
* @param redirectNodeId Node consistent id to redirect.
*/
private void stopLeaseProlongation(ReplicationGroupId groupId, @Nullable
String redirectNodeId) {
- LOG.info("The replica does not meet the requirements for the
leaseholder [groupId={}, redirectNodeId={}]", groupId, redirectNodeId);
+ stopLeaseProlongation(groupId, redirectNodeId, false);
+ }
+
+ /**
+ * Sends stop lease prolongation message to all participants of placement
driver group.
+ *
+ * @param groupId Replication group id.
+ * @param redirectNodeId Node consistent id to redirect.
+ * @param waitForPrimary Whether wait for primary to appear.
+ * @return Future that is completed when {@link
StopLeaseProlongationMessage} is sent.
+ */
+ private CompletableFuture<Void> stopLeaseProlongation(
+ ReplicationGroupId groupId,
+ @Nullable String redirectNodeId,
+ boolean waitForPrimary
+ ) {
+ CompletableFuture<ReplicaMeta> primaryReplicaFuture = waitForPrimary
+ ? placementDriver.awaitPrimaryReplica(groupId,
clockService.now(), 120, TimeUnit.SECONDS)
Review Comment:
Additionally, we are going to improve the protocol for stopping lease
prolongation. In the proposal, the server (PD) responds with the right
border of the current lease. This border lets the node remove primary
reservations from the replica.
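
A rough sketch of how the node might use such a response. It assumes that the "right border" is the lease expiration timestamp and that the reservation can be dropped once the local clock has passed it. `LeaseBorderSketch`, `StopProlongationResponse` and `leaseExpirationTime` are hypothetical names, since the proposed protocol is not specified here:

```java
/** Illustrative only: the response shape and the comparison rule are assumptions. */
final class LeaseBorderSketch {
    /** Hypothetical response from the placement driver. */
    static final class StopProlongationResponse {
        /** Assumed meaning of the right border: the time after which the current lease is no longer valid. */
        final long leaseExpirationTime;

        StopProlongationResponse(long leaseExpirationTime) {
            this.leaseExpirationTime = leaseExpirationTime;
        }
    }

    /** The primary reservation is kept until the local time is strictly past the right border. */
    static boolean canRemoveReservation(StopProlongationResponse response, long now) {
        return now > response.leaseExpirationTime;
    }
}
```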