[
https://issues.apache.org/jira/browse/CASSANDRA-9649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601016#comment-14601016
]
Sylvain Lebresne commented on CASSANDRA-9649:
---------------------------------------------
I believe you're right. We need ballots to be globally unique, which given we
provide our own timestamp to {{UUIDGen.getTimeUUID()}} means that we must
ensure said timestamp is locally unique. And that has been broken by
CASSANDRA-5667 (before that we were letting {{UUIDGen}} pick the timestamp
which does the right thing).
And dividing {{state.getTimestamp()}} is only one of the problems, we have 2
others: the fact that {{state.getTimestamp()}} is currently only unique to a
connection not a VM, and the fact that 2 racy operations that do get a
{{summary}} might see the same {{mostRecentInProgressCommit}} and thus pick the
same timestamp.
So anyway, I've pushed [here|https://github.com/pcmanus/cassandra/commits/9649]
a commit to fix that. I'm also going to bump the priority since unless I'm
forgetting a reason why this is not a problem, it's kind of a big deal.
I do want to note one point: due to the conjonction of CASSANDRA-7801 and
CASSANDRA-5667, if a node has its clock in the future, other nodes might
"synchronize" themselves on that clock through Paxos operations. While this is
probably ok for small drift (this may even be considered a good thing), this
could be rather problematic if a node ends up with a clock far in the future
(even temporarily, due to an operator error for instance). This is a risk for
our Paxos by design, but CASSANDRA-7801 makes the consequences potentially
affect other operation as well. This makes me nervous and while I still think
CASSANDRA-7801 is theoreticaly a good idea, I'm starting to think that it might
not be worth the risk and so I wanted to float the idea of reversing it. I've
pushed a branch
[here|https://github.com/pcmanus/cassandra/commits/9649-without-7801] with an
additional (simple) commit to show what I have in mind.
The patch is against 2.1. C* 2.0 is affected too but the patch won't apply
because it builds on top of CASSANDRA-7801 which isn't on 2.0. I'm a bit
reluctant to mess too much with 2.0 at this point so I would suggest to
basically revert CASSANDRA-5667 by simply relying on {{UUIGen.getTimeUUID()}}
without providing our own timestamp.
> Paxos ballot in StorageProxy could clash
> ----------------------------------------
>
> Key: CASSANDRA-9649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9649
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefania
> Assignee: Stefania
> Priority: Minor
>
> This code in {{StorageProxy.beginAndRepairPaxos()}} takes a timestamp in
> microseconds but divides it by 1000 before adding one. So if the summary is
> null, ballotMillis would be the same for up to 1000 possible state timestamp
> values:
> {code}
> long currentTime = (state.getTimestamp() / 1000) + 1;
> long ballotMillis = summary == null
> ? currentTime
> : Math.max(currentTime, 1 +
> UUIDGen.unixTimestamp(summary.mostRecentInProgressCommit.ballot));
> UUID ballot = UUIDGen.getTimeUUID(ballotMillis);
> {code}
> {{state.getTimestamp()}} returns the time in micro seconds and it ensures to
> add one microsecond to any previously used timestamp if the client sends the
> same or an older timestamp.
> Initially I used this code in {{ModificationStatement.casInternal()}},
> introduced by CASSANDRA-9160 to support cas unit tests, but occasionally
> these tests were failing. It was only when I ensured uniqueness of the ballot
> that the tests started to pass reliably.
> I wonder if we could ever have the same issue in StorageProxy?
> cc [~jbellis] and [~slebresne] for CASSANDRA-7801
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)