[
https://issues.apache.org/jira/browse/CASSANDRA-18816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767663#comment-17767663
]
David Capwell edited comment on CASSANDRA-18816 at 9/21/23 6:16 PM:
--------------------------------------------------------------------
bq. the new tests all seem to fail
They are failing when compatibility mode is disabled, which causes us to use
the latest messaging version (13). There was a code change that fails
serialization if the time value is too large. I asked about this in Slack
(see https://the-asf.slack.com/archives/CK23JSY2K/p1695314528885979) as this
change feels dangerous (we have vint, so why try to truncate when we don't need
to?). The tests were failing in those builds because there was a bug in the
simulated clock: it works in nanoseconds but was given a seed time in
milliseconds, and when it created a new millisecond value it was actually
returning nanoseconds! The clock is fixed so its units are now consistent, and
the tests are passing locally. I will check CI in a few hours to make sure it's
stable there as well.
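
To illustrate the kind of unit mismatch described above, here is a minimal
sketch in Python. It is not the Cassandra simulator code: the class and method
names are hypothetical, and it only shows one way a clock that counts in
nanoseconds can go wrong when it is seeded with milliseconds and asked for
milliseconds back.

```python
NANOS_PER_MILLI = 1_000_000


class BuggySimulatedClock:
    """Hypothetical clock that mixes up milliseconds and nanoseconds."""

    def __init__(self, seed_millis: int) -> None:
        # Bug: the seed is in milliseconds but is stored as if it were nanoseconds.
        self._nanos = seed_millis

    def advance_nanos(self, delta_nanos: int) -> None:
        self._nanos += delta_nanos

    def current_time_millis(self) -> int:
        # Bug: returns the raw nanosecond counter without converting.
        return self._nanos


class FixedSimulatedClock:
    """Hypothetical clock that converts units at both boundaries."""

    def __init__(self, seed_millis: int) -> None:
        # Convert the millisecond seed into the internal nanosecond unit.
        self._nanos = seed_millis * NANOS_PER_MILLI

    def advance_nanos(self, delta_nanos: int) -> None:
        self._nanos += delta_nanos

    def current_time_millis(self) -> int:
        # Convert back to milliseconds before handing the value out.
        return self._nanos // NANOS_PER_MILLI
```

With a seed of 1_695_000_000_000 ms and an advance of five seconds
(5_000_000_000 ns), the fixed clock reports the seed plus 5_000 ms, while the
buggy clock reports a value that is off by orders of magnitude in elapsed time.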
> Add support for repair coordinator to retry messages that timeout
> -----------------------------------------------------------------
>
> Key: CASSANDRA-18816
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18816
> Project: Cassandra
> Issue Type: Improvement
> Components: Consistency/Repair
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 10h 40m
> Remaining Estimate: 0h
>
> Now that CASSANDRA-15399 is in, most of the repair messages have a state that
> they can check against to make message delivery idempotent, allowing the
> coordinator to retry such messages; a few of the most critical messages to
> retry are: PREPARE_MSG, VALIDATION_REQ, VALIDATION_RSP, SYNC_REQ, and
> SYNC_RSP.
> With this I propose making the coordinator able to retry these key messages
> to try and make repair more resilient to ephemeral issues.
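
The idea in the description can be sketched as follows. This is a toy model in
Python, not the proposed Cassandra implementation: `Participant`,
`FlakyNetwork`, and `send_with_retry` are hypothetical names, the message is
reduced to a (job id, verb) pair, and there is no backoff. It shows why
per-message state on the receiver makes a retry after a timeout safe: a
duplicate delivery returns the recorded result instead of running the work
again.

```python
class Participant:
    """Hypothetical receiver that tracks state per (job id, verb)."""

    def __init__(self) -> None:
        self.state = {}       # (job_id, verb) -> recorded reply
        self.executions = 0   # how many times real work was performed

    def handle(self, job_id: str, verb: str) -> str:
        key = (job_id, verb)
        if key in self.state:
            # Duplicate delivery: answer from state, do not redo the work.
            return self.state[key]
        self.executions += 1
        self.state[key] = "ACK"
        return "ACK"


class FlakyNetwork:
    """Delivers every message but loses the reply for the first few attempts."""

    def __init__(self, participant: Participant, drop_first: int) -> None:
        self.participant = participant
        self.remaining_drops = drop_first

    def send(self, job_id: str, verb: str) -> str:
        reply = self.participant.handle(job_id, verb)
        if self.remaining_drops > 0:
            self.remaining_drops -= 1
            raise TimeoutError("no reply received in time")
        return reply


def send_with_retry(network: FlakyNetwork, job_id: str, verb: str,
                    max_attempts: int = 3):
    """Resend on timeout; return (reply, attempts used)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return network.send(job_id, verb), attempt
        except TimeoutError:
            if attempt == max_attempts:
                raise
```

In this model, two lost replies followed by a successful third attempt still
leave the participant having executed the request only once.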
--
This message was sent by Atlassian Jira
(v8.20.10#820010)