[ 
https://issues.apache.org/jira/browse/CASSANDRA-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-11302.
------------------------------------------
       Resolution: Fixed
    Fix Version/s: 3.5
                   3.0.5
                   2.2.6
                   2.1.14
    Reproduced In: 2.2.5, 2.1.5  (was: 2.1.5, 2.2.5)

The re-run on 3.0 looked much better, so I've committed the patch. Thanks.

I'll note that this bug will likely make us drop all droppable messages once 
{{expireMessages}} runs, though that method only kicks in when we have 
1024 outstanding messages in the queue, which is why this shouldn't affect a 
"healthy" cluster. It could still be pretty bad during a short burst of activity 
or when a node gets very slightly behind.
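
To illustrate why a bad unit conversion would make every droppable message look expired, here is a minimal sketch. It is not the actual Cassandra code: the class name, the method names and the 2000 ms timeout are all hypothetical, chosen only to show the effect of comparing an elapsed time in nanoseconds against a timeout that is still expressed in milliseconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the Cassandra code base.
public class TimeoutUnits {
    // Buggy variant: the timeout is given in milliseconds but is compared
    // directly against an elapsed time measured in nanoseconds.
    public static boolean isExpiredBuggy(long enqueuedNanos, long nowNanos, long timeoutMillis) {
        return nowNanos - enqueuedNanos > timeoutMillis;
    }

    // Fixed variant: convert the timeout to nanoseconds before comparing.
    public static boolean isExpiredFixed(long enqueuedNanos, long nowNanos, long timeoutMillis) {
        return nowNanos - enqueuedNanos > TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    public static void main(String[] args) {
        long timeoutMillis = 2000;                        // assumed timeout, for illustration
        long enqueuedNanos = 0L;
        long nowNanos = TimeUnit.MILLISECONDS.toNanos(1); // message has waited 1 ms

        // 1,000,000 ns > 2000, so the buggy check treats the message as expired.
        System.out.println(isExpiredBuggy(enqueuedNanos, nowNanos, timeoutMillis)); // true
        // 1,000,000 ns < 2,000,000,000 ns, so the fixed check keeps the message.
        System.out.println(isExpiredFixed(enqueuedNanos, nowNanos, timeoutMillis)); // false
    }
}
```

Under these assumptions the buggy check gives a message only a couple of microseconds instead of two seconds, so practically anything still sitting in the queue when the expiry pass runs would be dropped.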

> Invalid time unit conversion causing write timeouts
> ---------------------------------------------------
>
>                 Key: CASSANDRA-11302
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11302
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Mike Heffner
>            Assignee: Sylvain Lebresne
>             Fix For: 2.1.14, 2.2.6, 3.0.5, 3.5
>
>         Attachments: nanosec.patch
>
>
> We've been debugging a write timeout that we saw after upgrading from the 
> 2.0.x release line, with our particular workload. Details of that process can 
> be found in this thread:
> https://www.mail-archive.com/user@cassandra.apache.org/msg46064.html
> After bisecting various patch release versions, and then commits, on the 
> 2.1.x release line we've identified version 2.1.5 and this commit as the 
> point where the timeouts first start appearing:
> https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9
> After examining the commit we believe this line was a typo:
> https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9#diff-c7ef124561c4cde1c906f28ad3883a88L467
> as it doesn't properly convert the timeout value from milliseconds to 
> nanoseconds.
> After testing with the attached patch applied, we do not see timeouts on 
> version 2.1.5, or on 2.2.5 when we bring the patch forward. While we've 
> tested our workload against this and are fairly confident in the patch, we 
> are not experts in the code base, so we would prefer additional review.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
