Re: [PR] JAVA-3051: Memory leak [cassandra-java-driver]

via GitHub Mon, 05 Feb 2024 18:50:08 -0800


aratno commented on code in PR #1743:
URL: https://github.com/apache/cassandra-java-driver/pull/1743#discussion_r1479160472



##########
core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java:
##########
@@ -96,14 +99,38 @@ public class DefaultLoadBalancingPolicy extends BasicLoadBalancingPolicy impleme
   private static final int MAX_IN_FLIGHT_THRESHOLD = 10;
   private static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = MILLISECONDS.toNanos(200);
 
-  protected final Map<Node, AtomicLongArray> responseTimes = new ConcurrentHashMap<>();
+  protected final LoadingCache<Node, AtomicLongArray> responseTimes;
   protected final Map<Node, Long> upTimes = new ConcurrentHashMap<>();
   private final boolean avoidSlowReplicas;
 
   public DefaultLoadBalancingPolicy(@NonNull DriverContext context, @NonNull String profileName) {
     super(context, profileName);
     this.avoidSlowReplicas =
         profile.getBoolean(DefaultDriverOption.LOAD_BALANCING_POLICY_SLOW_AVOIDANCE, true);
+    CacheLoader<Node, AtomicLongArray> cacheLoader =
+        new CacheLoader<Node, AtomicLongArray>() {
+          @Override
+          public AtomicLongArray load(Node key) throws Exception {
+            // The array stores at most two timestamps, since we don't need more;
+            // the first one is always the least recent one, and hence the one to inspect.
+            long now = nanoTime();
+            AtomicLongArray array = responseTimes.getIfPresent(key);
+            if (array == null) {
+              array = new AtomicLongArray(1);
+              array.set(0, now);
+            } else if (array.length() == 1) {
+              long previous = array.get(0);
+              array = new AtomicLongArray(2);
+              array.set(0, previous);
+              array.set(1, now);
+            } else {
+              array.set(0, array.get(1));
+              array.set(1, now);
+            }
+            return array;
+          }
+        };
+    this.responseTimes = CacheBuilder.newBuilder().weakKeys().build(cacheLoader);

Review Comment:
   I think we should add a [RemovalListener](https://guava.dev/releases/21.0/api/docs/com/google/common/cache/RemovalListener.html) here.
   
   If a GC happens and the response times for a Node are purged, `isResponseRateInsufficient` will treat the missing entry as "insufficient responses", which can lead us to mark the node as unhealthy. I recognize that this is a somewhat pathological case, but the behavior depends on GC timing and would be a pain to track down, so logging the removal could make someone's life easier down the line.
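
   To make the failure mode concrete, here is a minimal sketch using only the JDK. It is not the driver's actual code: `ResponseRateSketch`, the string node keys, and the exact rule inside `isResponseRateInsufficient` are simplifying assumptions, based on the diff's comment that the array holds at most two timestamps and the first is the one to inspect. It shows that an entry that has been purged looks the same to the check as a node that never responded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical, simplified stand-in for the policy's health check (not the driver's code).
final class ResponseRateSketch {
  // Same 200 ms window as RESPONSE_COUNT_RESET_INTERVAL_NANOS in the diff above.
  static final long INTERVAL_NANOS = TimeUnit.MILLISECONDS.toNanos(200);

  // Keyed by a node name for simplicity; the real structure is keyed by Node.
  static final Map<String, AtomicLongArray> responseTimes = new ConcurrentHashMap<>();

  // Assumed rule: the rate is sufficient only if two timestamps are recorded
  // and the older one falls within the last INTERVAL_NANOS.
  static boolean isResponseRateInsufficient(String node, long now) {
    AtomicLongArray array = responseTimes.get(node);
    if (array == null || array.length() < 2) {
      // A purged entry lands here, indistinguishable from a node with no responses.
      return true;
    }
    long oldest = array.get(0);
    return now - oldest > INTERVAL_NANOS;
  }

  public static void main(String[] args) {
    long now = System.nanoTime();
    AtomicLongArray recent = new AtomicLongArray(2);
    recent.set(0, now - TimeUnit.MILLISECONDS.toNanos(50));
    recent.set(1, now - TimeUnit.MILLISECONDS.toNanos(10));
    responseTimes.put("node1", recent);

    System.out.println(isResponseRateInsufficient("node1", now)); // prints false

    responseTimes.remove("node1"); // simulates the entry being purged
    System.out.println(isResponseRateInsufficient("node1", now)); // prints true
  }
}
```

   Nothing in the check itself can tell the two cases apart, which is why a log line at removal time would be the only trace of what happened.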



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

