[
https://issues.apache.org/jira/browse/CASSANDRA-20059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898380#comment-17898380
]
David Capwell commented on CASSANDRA-20059:
-------------------------------------------
Sorry, didn't context switch to download... here are the results
||Repo||Branch||Parent Branch||SHA||Status||
|https://github.com/dcapwell/cassandra.git|CASSANDRA-20059|trunk|3fa63cf81ce03bfa45c2b312c1c2846a1d84eee5|Unstable|
Failed Builds:
||Build||Result||Reason||
| jvm11-utests | fail | Test org.apache.cassandra.cql3.ViewComplexTTLTest::testUnselectedColumnsTTLWithFlush[2]-_jdk11 faile |
| jvm17-dtests | fail | Test org.apache.cassandra.distributed.test.HintedHandoffAddRemoveNodesTest::shouldAvoidHintTransferO |
| jvm17-dtests-fuzz | fail | Test org.apache.cassandra.fuzz.ring.ConsistentBootstrapTest::coordinatorIsBehindTest-_jdk17 had an e |
| jvm17-utests | fail | Test org.apache.cassandra.cql3.PstmtPersistenceTest::testPstmtInvalidation-_jdk17 had an error |
| python-upgrade-dtests | fail | Test upgrade_tests.upgrade_through_versions_test.TestUpgrade_indev_5_0_x_To_indev_trunk::upgrade_tes |
> TCM's Retry.Deadline#retryIndefinitely is dangerous if used with
> RemoteProcessor as the deadline does not impact message retries
> --------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-20059
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20059
> Project: Cassandra
> Issue Type: Bug
> Components: Transactional Cluster Metadata
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 5.x
>
> Attachments:
> ci_summary-trunk-3fa63cf81ce03bfa45c2b312c1c2846a1d84eee5.html,
> result_details-trunk-3fa63cf81ce03bfa45c2b312c1c2846a1d84eee5.tar.gz
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {code}
> public static Deadline retryIndefinitely(long timeoutNanos, Meter retryMeter)
> {
>     return new Deadline(Clock.Global.nanoTime() + timeoutNanos,
>                         new Retry.Jitter(Integer.MAX_VALUE, DEFAULT_BACKOFF_MS, new Random(), retryMeter))
>     {
>         @Override
>         public boolean reachedMax()
>         {
>             return false;
>         }
>
>         @Override
>         public long remainingNanos()
>         {
>             return timeoutNanos;
>         }
>
>         public String toString()
>         {
>             return String.format("RetryIndefinitely{tries=%d}", currentTries());
>         }
>     };
> }
> {code}
> Sample usage pattern (example is in Accord, but same pattern exists in
> RemoteProcessor.commit)
> {code}
> Promise<LogState> request = new AsyncPromise<>();
> List<InetAddressAndPort> candidates = new ArrayList<>(log.metadata().fullCMSMembers());
> sendWithCallbackAsync(request,
>                       Verb.TCM_RECONSTRUCT_EPOCH_REQ,
>                       new ReconstructLogState(lowEpoch, highEpoch, includeSnapshot),
>                       new CandidateIterator(candidates),
>                       retryPolicy);
> return request.get(retryPolicy.remainingNanos(), TimeUnit.NANOSECONDS);
> {code}
> The issue here is that the networking retry has no clue that we gave up
> waiting on the request, so it will keep retrying until success! The reason
> is that “reachedMax” is used to check whether it's safe to run again, and
> it always returns false here, so retries continue even though the deadline
> has passed!
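
The failure mode described above can be reproduced in isolation. The following is only a minimal sketch under stated assumptions: `DeadlineRetrySketch`, its nested `Deadline` class, the `expired()` helper, the fake clock, and the safety cap are all hypothetical stand-ins written for illustration, not the actual Cassandra `Retry.Deadline` or `RemoteProcessor` code. It compares a retry loop that consults only `reachedMax()` with one that also checks the absolute deadline.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Hypothetical stand-ins for illustration only; not the Cassandra classes.
final class DeadlineRetrySketch
{
    static final int SAFETY_CAP = 1000;      // keeps the demo loop finite
    static final long TIMEOUT_NANOS = 100;   // how long the caller is willing to wait
    static final long ATTEMPT_COST_NANOS = 10; // fake time consumed per failed attempt

    /** Simplified retry policy with an absolute deadline and an injectable clock. */
    static class Deadline
    {
        final long deadlineNanos;
        final LongSupplier clock;

        Deadline(long deadlineNanos, LongSupplier clock)
        {
            this.deadlineNanos = deadlineNanos;
            this.clock = clock;
        }

        boolean reachedMax()
        {
            return expired();
        }

        /** Assumed helper: true once the absolute deadline has passed. Never overridden. */
        final boolean expired()
        {
            return clock.getAsLong() >= deadlineNanos;
        }
    }

    /** Same shape as the method quoted in the ticket: reachedMax() is always false. */
    static Deadline retryIndefinitely(long timeoutNanos, LongSupplier clock)
    {
        return new Deadline(clock.getAsLong() + timeoutNanos, clock)
        {
            @Override
            boolean reachedMax()
            {
                return false;
            }
        };
    }

    /** Runs a retry loop whose request always fails; returns how many attempts were made. */
    static int run(boolean checkDeadline)
    {
        long[] now = { 0 }; // fake clock so the result is deterministic
        Deadline policy = retryIndefinitely(TIMEOUT_NANOS, () -> now[0]);
        BooleanSupplier stop = checkDeadline
                               ? () -> policy.reachedMax() || policy.expired()
                               : policy::reachedMax;
        int attempts = 0;
        while (attempts < SAFETY_CAP)
        {
            attempts++;
            now[0] += ATTEMPT_COST_NANOS; // the attempt fails after consuming some time
            if (stop.getAsBoolean())
                break;
        }
        return attempts;
    }

    public static void main(String[] args)
    {
        System.out.println("reachedMax only:        " + run(false));
        System.out.println("reachedMax + deadline:  " + run(true));
    }
}
```

In this sketch the loop that checks only `reachedMax()` runs until the artificial safety cap (1000 attempts), while the loop that also checks the deadline stops after 10 attempts, once the 100 ns of fake time has elapsed. Whether the real fix belongs in the retry policy or in the code driving the retries is a separate design choice that this sketch does not settle.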