This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
     new 9184dd5a99 When doing a host replacement, we need to check that the node is a live node before failing with "Cannot replace a live node..."
9184dd5a99 is described below

commit 9184dd5a998366dc2b5c18d4954b13b033efcf80
Author: Francisco Guerrero <[email protected]>
AuthorDate: Mon Aug 15 09:23:56 2022 -0700

    When doing a host replacement, we need to check that the node is a live node before failing with "Cannot replace a live node..."
    
    patch by Francisco Guerrero; reviewed by Brandon Williams, David Capwell for CASSANDRA-17805
---
 CHANGES.txt                                               |  1 +
 src/java/org/apache/cassandra/service/StorageService.java | 13 +++++++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index dd09b25c56..7dce2b90bd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.2
+ * When doing a host replacement, we need to check that the node is a live node before failing with "Cannot replace a live node..." (CASSANDRA-17805)
  * Add support to generate a One-Shot heap dump on unhandled exceptions (CASSANDRA-17795)
  * Rate-limit new client connection auth setup to avoid overwhelming bcrypt (CASSANDRA-17812)
  * DataOutputBuffer#scratchBuffer can use off-heap or on-heap memory as a means to control memory allocations (CASSANDRA-16471)
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index ba04ccec91..b7756c0c21 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -1855,14 +1855,23 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
                 }
 
                 // check for operator errors...
+                long nanoDelay = MILLISECONDS.toNanos(ringTimeoutMillis);
                 for (Token token : bootstrapTokens)
                 {
                     InetAddressAndPort existing = tokenMetadata.getEndpoint(token);
                     if (existing != null)
                     {
-                        long nanoDelay = ringTimeoutMillis * 1000000L;
-                        if (Gossiper.instance.getEndpointStateForEndpoint(existing).getUpdateTimestamp() > (nanoTime() - nanoDelay))
+                        EndpointState endpointStateForExisting = Gossiper.instance.getEndpointStateForEndpoint(existing);
+                        long updateTimestamp = endpointStateForExisting.getUpdateTimestamp();
+                        long allowedDelay = nanoTime() - nanoDelay;
+
+                        // if the node was updated within the ring delay or the node is alive, we should fail
+                        if (updateTimestamp > allowedDelay || endpointStateForExisting.isAlive())
+                        {
+                            logger.error("Unable to replace node for token={}. The node is reporting as {}alive with updateTimestamp={}, allowedDelay={}",
+                                         token, endpointStateForExisting.isAlive() ? "" : "not ", updateTimestamp, allowedDelay);
                             throw new UnsupportedOperationException("Cannot replace a live node... ");
+                        }
                         collisions.add(existing);
                     }
                     else
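
For readers who want to reason about the new condition in isolation, the following is a minimal stand-alone sketch of the check the patch introduces. It is not the StorageService code: the class name ReplaceGuardSketch, the method shouldRejectReplacement, and the explicit nowNanos parameter are assumptions made for illustration, and the real code reads the update timestamp and liveness from the gossip EndpointState instead of taking them as arguments.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the guard added by this patch; names are illustrative only.
final class ReplaceGuardSketch
{
    /**
     * Returns true when a host replacement should be refused: either the existing
     * node's gossip state was updated within the ring timeout, or the node is
     * currently reported as alive.
     */
    static boolean shouldRejectReplacement(long updateTimestampNanos,
                                           boolean alive,
                                           long nowNanos,
                                           long ringTimeoutMillis)
    {
        // Same conversion as the patch: milliseconds to nanoseconds via TimeUnit.
        long nanoDelay = TimeUnit.MILLISECONDS.toNanos(ringTimeoutMillis);
        // Oldest update timestamp that still counts as "recent".
        long allowedDelay = nowNanos - nanoDelay;
        return updateTimestampNanos > allowedDelay || alive;
    }

    public static void main(String[] args)
    {
        long now = TimeUnit.SECONDS.toNanos(100);
        long ringTimeoutMillis = 30_000L;
        long staleUpdate = TimeUnit.SECONDS.toNanos(10);
        long recentUpdate = TimeUnit.SECONDS.toNanos(90);

        // Stale update and not alive: replacement may proceed.
        System.out.println(shouldRejectReplacement(staleUpdate, false, now, ringTimeoutMillis));
        // Stale update but alive: rejected (the case this patch adds).
        System.out.println(shouldRejectReplacement(staleUpdate, true, now, ringTimeoutMillis));
        // Recent update and not alive: rejected (the pre-existing case).
        System.out.println(shouldRejectReplacement(recentUpdate, false, now, ringTimeoutMillis));
    }
}
```

Passing the current time in as a parameter is only a convenience that keeps the sketch deterministic; the patch itself calls nanoTime() directly.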

