[
https://issues.apache.org/jira/browse/CASSANDRA-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061831#comment-15061831
]
Sylvain Lebresne commented on CASSANDRA-10423:
----------------------------------------------
bq. Why will there be more contention when there are more participants?
That's actually not what I'm saying. What I'm saying is that for a fixed amount
of contention, having more participants makes it more likely for Paxos rounds to
interleave, which forces new rounds to be started and so makes any particular
update slower. Or to put it another way, I'm just saying that conditional
update performance is generally negatively impacted by a bigger RF (all other
things being equal), and a move happens to temporarily bump the RF as far as
Paxos is concerned.
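To make the RF bump concrete, here is a rough sketch rather than Cassandra's
actual code. It assumes a Paxos round needs a simple majority of all
participants, and that a move adds pending replicas to the natural ones; the
function name paxos_quorum is made up for illustration.

```python
def paxos_quorum(natural_replicas: int, pending_replicas: int = 0) -> int:
    """Majority of all participants (assumption: natural + pending replicas)."""
    participants = natural_replicas + pending_replicas
    return participants // 2 + 1


# Compare the steady state with a move that adds one pending replica.
for rf in (3, 5):
    steady = paxos_quorum(rf)
    moving = paxos_quorum(rf, pending_replicas=1)
    print(f"RF={rf}: quorum {steady} normally, {moving} with one pending replica")
```

Under these assumptions each round has to hear back from more nodes during a
move, which leaves more room for a competing round to interleave with it.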
But again, I'm not pretending that this explains the problem here at all, as the
difference in performance observed by Roger during a move is more dramatic than
what I would expect this explanation to account for. I just don't have any other
idea of why a move would impact Paxos that badly, so we'd need to be able to
reproduce this easily to find the problem.
> Paxos/LWT failures when moving node
> -----------------------------------
>
> Key: CASSANDRA-10423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10423
> Project: Cassandra
> Issue Type: Bug
> Environment: Cassandra version: 2.0.14
> Java-driver version: 2.0.11
> Reporter: Roger Schildmeijer
> Fix For: 2.1.x
>
>
> While moving a node (nodetool move <newtoken>) we noticed that LWT started
> failing for some (~50%) requests. The java-driver (version 2.0.11) returned
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout
> during write query at consistency SERIAL (7 replica were required but only 0
> acknowledged the write). The cluster was not under heavy load.
> I noticed that the failed LWT requests all took just above 1s. That
> information and the WriteTimeoutException could indicate that this happens:
> https://github.com/apache/cassandra/blob/cassandra-2.0.14/src/java/org/apache/cassandra/service/StorageProxy.java#L268
> I can't explain why though. Why would there be more CAS contention just
> because a node is moving?