[jira] [Commented] (CASSANDRA-11302) Invalid time unit conversion causing write timeouts

2016-03-08 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185095#comment-15185095
 ] 

Ariel Weisberg commented on CASSANDRA-11302:


+1

> Invalid time unit conversion causing write timeouts
> ---
>
> Key: CASSANDRA-11302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11302
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Mike Heffner
> Assignee: Sylvain Lebresne
> Attachments: nanosec.patch
>
>
> We've been debugging a write timeout that we saw after upgrading from the 
> 2.0.x release line, with our particular workload. Details of that process can 
> be found in this thread:
> https://www.mail-archive.com/user@cassandra.apache.org/msg46064.html
> After bisecting various patch release versions, and then commits, on the 
> 2.1.x release line we've identified version 2.1.5 and this commit as the 
> point where the timeouts first start appearing:
> https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9
> After examining the commit we believe this line was a typo:
> https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9#diff-c7ef124561c4cde1c906f28ad3883a88L467
> as it doesn't properly convert the timeout value from milliseconds to 
> nanoseconds.
> After testing with the attached patch applied, we do not see timeouts on 
> version 2.1.5 nor against 2.2.5 when we bring the patch forward. While we've 
> tested our workload against this and we are fairly confident in the patch, we 
> are not experts with the code base so we would prefer additional review.
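
To illustrate the class of bug described in the report, here is a minimal sketch. It is not the actual Cassandra code: the class and method names are hypothetical, and it only assumes that the enqueue time comes from a nanosecond clock while the configured timeout is in milliseconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not taken from the Cassandra code base.
public class TimeoutUnitDemo {

    // Buggy variant: the elapsed time is in nanoseconds, but the timeout
    // is compared as-is even though it is expressed in milliseconds.
    public static boolean isExpiredBuggy(long enqueuedNanos, long nowNanos, long timeoutMillis) {
        return nowNanos - enqueuedNanos > timeoutMillis;
    }

    // Fixed variant: convert the timeout to nanoseconds before comparing.
    public static boolean isExpiredFixed(long enqueuedNanos, long nowNanos, long timeoutMillis) {
        return nowNanos - enqueuedNanos > TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    public static void main(String[] args) {
        long enqueued = 0L;
        long now = TimeUnit.MILLISECONDS.toNanos(1); // 1 ms after enqueue
        long timeoutMillis = 2000L;                  // assumed 2 s timeout

        System.out.println(isExpiredBuggy(enqueued, now, timeoutMillis)); // true
        System.out.println(isExpiredFixed(enqueued, now, timeoutMillis)); // false
    }
}
```

With the missing conversion, a 2000 ms timeout is effectively treated as 2000 ns, so a message that has waited only 1 ms already looks expired.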



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11302) Invalid time unit conversion causing write timeouts

2016-03-08 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184979#comment-15184979
 ] 

Sylvain Lebresne commented on CASSANDRA-11302:
--

There is no reason not to get this fixed quickly, so I pushed a fix that uses 
the {{isTimedOut}} method instead:
|| patch || utests || dtests ||
| [2.1|https://github.com/pcmanus/cassandra/commits/11302-2.1] | [utests|http://cassci.datastax.com/job/pcmanus-11302-2.1-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-11302-2.1-dtest] |
| [2.2|https://github.com/pcmanus/cassandra/commits/11302-2.2] | [utests|http://cassci.datastax.com/job/pcmanus-11302-2.2-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-11302-2.2-dtest] |
| [3.0|https://github.com/pcmanus/cassandra/commits/11302-3.0] | [utests|http://cassci.datastax.com/job/pcmanus-11302-3.0-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-11302-3.0-dtest] |

The tests on 2.1 and 2.2 are on par with their main branches. The 3.0 runs had 
a few additional failures, which are probably unrelated, but I've restarted 
them to check. [~aweisberg], mind having a look? All the versions are the same; 
they merge up from 2.1 without conflict.




[jira] [Commented] (CASSANDRA-11302) Invalid time unit conversion causing write timeouts

2016-03-04 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179988#comment-15179988
 ] 

Ariel Weisberg commented on CASSANDRA-11302:


Yes, that is the issue. There is an isTimedOut method on QueuedMessage that 
could probably be made to do the conversion and be called from all the 
locations that want to check the timeout.
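
A rough sketch of that idea, under stated assumptions: the class below is a hypothetical stand-in, not the real QueuedMessage, and its field names and constructor are invented for illustration. The point is only that the millisecond-to-nanosecond conversion lives in one place, so callers cannot get the units wrong.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a queued message; not the real Cassandra class.
public class QueuedMessageSketch {
    private final long enqueuedAtNanos; // taken from a nanosecond clock such as System.nanoTime()
    private final long timeoutMillis;   // timeout as configured, in milliseconds

    public QueuedMessageSketch(long enqueuedAtNanos, long timeoutMillis) {
        this.enqueuedAtNanos = enqueuedAtNanos;
        this.timeoutMillis = timeoutMillis;
    }

    // Single place where the unit conversion happens. Callers pass the
    // current nanosecond clock reading and never handle the timeout value.
    public boolean isTimedOut(long nowNanos) {
        long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        // Subtract first so the comparison stays valid if the clock value wraps.
        return nowNanos - enqueuedAtNanos > timeoutNanos;
    }
}
```

Each call site would then reduce to something like `message.isTimedOut(System.nanoTime())`, with no unit arithmetic of its own.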



[jira] [Commented] (CASSANDRA-11302) Invalid time unit conversion causing write timeouts

2016-03-03 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179524#comment-15179524
 ] 

Sylvain Lebresne commented on CASSANDRA-11302:
--

Definitely looks fishy, but since you're the author, can you have a quick look, 
[~aweisberg]?
