aratno commented on code in PR #1743:
URL: https://github.com/apache/cassandra-java-driver/pull/1743#discussion_r1479160472
##########
core/src/main/java/com/datastax/oss/driver/internal/core/loadbalancing/DefaultLoadBalancingPolicy.java:
##########
@@ -96,14 +99,38 @@ public class DefaultLoadBalancingPolicy extends BasicLoadBalancingPolicy impleme
private static final int MAX_IN_FLIGHT_THRESHOLD = 10;
private static final long RESPONSE_COUNT_RESET_INTERVAL_NANOS = MILLISECONDS.toNanos(200);
- protected final Map<Node, AtomicLongArray> responseTimes = new ConcurrentHashMap<>();
+ protected final LoadingCache<Node, AtomicLongArray> responseTimes;
protected final Map<Node, Long> upTimes = new ConcurrentHashMap<>();
private final boolean avoidSlowReplicas;
public DefaultLoadBalancingPolicy(@NonNull DriverContext context, @NonNull String profileName) {
super(context, profileName);
this.avoidSlowReplicas =
profile.getBoolean(DefaultDriverOption.LOAD_BALANCING_POLICY_SLOW_AVOIDANCE, true);
+ CacheLoader<Node, AtomicLongArray> cacheLoader =
+ new CacheLoader<Node, AtomicLongArray>() {
+ @Override
+ public AtomicLongArray load(Node key) throws Exception {
+ // The array stores at most two timestamps, since we don't need more;
+ // the first one is always the least recent one, and hence the one to inspect.
+ long now = nanoTime();
+ AtomicLongArray array = responseTimes.getIfPresent(key);
+ if (array == null) {
+ array = new AtomicLongArray(1);
+ array.set(0, now);
+ } else if (array.length() == 1) {
+ long previous = array.get(0);
+ array = new AtomicLongArray(2);
+ array.set(0, previous);
+ array.set(1, now);
+ } else {
+ array.set(0, array.get(1));
+ array.set(1, now);
+ }
+ return array;
+ }
+ };
+ this.responseTimes = CacheBuilder.newBuilder().weakKeys().build(cacheLoader);
Review Comment:
I think we should add a
[RemovalListener](https://guava.dev/releases/21.0/api/docs/com/google/common/cache/RemovalListener.html)
here that logs removals.
If a GC purges the response times for a Node, `isResponseRateInsufficient`
will treat the missing entry as "insufficient responses", which can lead us to
mark the node as unhealthy. I recognize that this is a bit of a pathological
example, but the behavior depends on GC timing and would be a pain to track
down, so logging the removal could make someone's life easier down the line.
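
Something along these lines is what I have in mind. This is only a rough sketch, not a drop-in patch: `ResponseTimesCacheSketch`, `newCache`, and `removalCauses` are hypothetical names, it imports plain Guava rather than the driver's shaded package, it uses `Object` keys in place of `Node`, and it writes to `System.err` where the real code would use the policy's logger.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalCause;
import com.google.common.cache.RemovalListener;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch: a weak-keyed cache that reports every removal.
final class ResponseTimesCacheSketch {

  // Hypothetical record of removal causes, kept only so the behavior can be observed.
  static final List<RemovalCause> removalCauses = new CopyOnWriteArrayList<>();

  static LoadingCache<Object, AtomicLongArray> newCache() {
    RemovalListener<Object, AtomicLongArray> listener =
        notification -> {
          // With weakKeys(), an entry whose key was garbage-collected is reported
          // with cause COLLECTED, and getKey() may return null at that point.
          removalCauses.add(notification.getCause());
          System.err.printf(
              "response times removed: cause=%s key=%s%n",
              notification.getCause(), notification.getKey());
        };
    return CacheBuilder.newBuilder()
        .weakKeys()
        .removalListener(listener)
        .build(
            new CacheLoader<Object, AtomicLongArray>() {
              @Override
              public AtomicLongArray load(Object key) {
                // Placeholder loader; the PR's loader would go here instead.
                return new AtomicLongArray(2);
              }
            });
  }
}
```

One caveat: Guava runs removal listeners as part of routine cache maintenance on the calling thread, so a GC-driven removal would show up in the log on a later cache operation, not at the moment of collection.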
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]