dcapwell commented on code in PR #3656:
URL: https://github.com/apache/cassandra/pull/3656#discussion_r1830083304
##########
src/java/org/apache/cassandra/service/accord/AccordService.java:
##########
@@ -568,6 +570,46 @@ public static List<ClusterMetadata> tcmLoadRange(long min, long max)
return afterLoad;
}
+ /**
+ * This method exists due to the fact that we define a retry policy for TCM to follow, and then TCM ignores it and does no retries...
+ */
+ private static List<ClusterMetadata> reconstruct(long min, long max)
+ {
+ Epoch start = Epoch.create(min);
+ Epoch end = Epoch.create(max);
+ Retry.Deadline retryPolicyThatGetsIgnored = Retry.Deadline.retryIndefinitely(DatabaseDescriptor.getCmsAwaitTimeout().to(NANOSECONDS),
+ TCMMetrics.instance.fetchLogRetries);
+ Throwable lastError = null;
+ Backoff backoff = new Backoff.ExponentialBackoff(42, 200, SECONDS.toMillis(1), ThreadLocalRandom.current()::nextDouble);
Review Comment:
> There's already a Retry#Backoff class that handles backoff that is used in conjunction with TCM retry functions.
I reviewed these and would personally still keep `Backoff.ExponentialBackoff`.
* `Deadline` waits at most a fixed time, which is not the behavior I want here. It does compose, which would be a plus if I did want to bound the total time.
* `Backoff` uses a fixed wait, which I don't think is good behavior: if things are failing we should actually *back off*. Other callers may also be retrying at the same time, so you *should* have jitter to avoid a DDoS, but `Jitter` in TCM doesn't compose.
* `Jitter` is just random sleep times.
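
To illustrate the behavior argued for above (waits that grow on each failure, are capped, and are randomized so concurrent retriers spread out), here is a minimal sketch. It is not the `Backoff.ExponentialBackoff` used in the PR: the class name `JitteredBackoffSketch`, its methods, and the reading of the constructor arguments as a base wait and a maximum wait in milliseconds are all assumptions made for illustration.

```java
import java.util.function.DoubleSupplier;

/**
 * Hypothetical sketch of capped exponential backoff with full jitter.
 * Not Cassandra's Backoff.ExponentialBackoff; names and semantics are assumed.
 */
final class JitteredBackoffSketch
{
    private final long baseMillis;
    private final long maxMillis;
    private final DoubleSupplier random; // expected to return values in [0, 1)

    JitteredBackoffSketch(long baseMillis, long maxMillis, DoubleSupplier random)
    {
        if (baseMillis <= 0 || maxMillis < baseMillis)
            throw new IllegalArgumentException("require 0 < baseMillis <= maxMillis");
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
        this.random = random;
    }

    /** Upper bound for a 1-based attempt: baseMillis * 2^(attempt - 1), capped at maxMillis. */
    long ceilingMillis(int attempt)
    {
        if (attempt < 1)
            throw new IllegalArgumentException("attempt is 1-based");
        int shift = Math.min(attempt - 1, 62);
        // Compare before shifting so the doubling can never overflow a long.
        if (baseMillis > (maxMillis >> shift))
            return maxMillis;
        return baseMillis << shift;
    }

    /** Full jitter: a wait drawn uniformly from [0, ceilingMillis(attempt)). */
    long waitMillis(int attempt)
    {
        return (long) (random.getAsDouble() * ceilingMillis(attempt));
    }
}
```

With a base of 200 ms and a cap of 1 s, the ceilings for attempts 1 through 5 are 200, 400, 800, 1000, and 1000 ms. Passing `ThreadLocalRandom.current()::nextDouble` as the random source gives each caller an independent wait below that ceiling, which is what keeps simultaneous retriers from hitting the service in lockstep.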
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]