[
https://issues.apache.org/jira/browse/CASSANDRA-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947038#comment-15947038
]
Alex Petrov edited comment on CASSANDRA-13216 at 3/29/17 12:42 PM:
-------------------------------------------------------------------
Found the problem. I didn't initially anticipate that this test is
time-dependent. The initial fix is still applicable. The failure is easy to
reproduce by adding a {{sleep}} of as little as 100 milliseconds around
[here|https://github.com/apache/cassandra/blob/732d1af866b91e5ba63e7e2a467d99d4cb90e11f/test/unit/org/apache/cassandra/net/MessagingServiceTest.java#L112].
YMMV with the exact sleep duration.
However, I do not think there's any way we can reliably fetch latency numbers,
since dropwizard metrics reservoirs (used within
[timers|https://github.com/dropwizard/metrics/blob/15dde825de1843927898a7ad3c3bb11b2913a931/metrics-core/src/main/java/com/codahale/metrics/Timer.java#L64])
track real time, and the snapshots we take (however precise) won't ever
be perfect. I've mocked the clock:
||[3.11|https://github.com/ifesdjeen/cassandra/tree/13216-followup-3.11]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-3.11-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-3.11-dtest/]|
||[trunk|https://github.com/ifesdjeen/cassandra/tree/13216-followup-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-trunk-dtest/]|
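A minimal sketch of the idea, not the code in the branches above: if the metric reads time from an injected clock, the test can advance that clock by hand and the asserted mean no longer depends on how long the test actually ran. All names here ({{TimeSource}}, {{ManualClock}}, {{WindowedMean}}) are hypothetical, and the windowed mean is a simplified stand-in for a time-aware reservoir, not the dropwizard API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; none of these types exist in Cassandra or dropwizard.
final class MockedClockSketch {

    /** Time source the metric reads instead of the system clock. */
    interface TimeSource {
        long nowMillis();
    }

    /** Test clock that only moves when the test advances it. */
    static final class ManualClock implements TimeSource {
        private long now;

        @Override
        public long nowMillis() {
            return now;
        }

        void advance(long millis) {
            now += millis;
        }
    }

    /** Mean of the values recorded within the last windowMillis, per the injected clock. */
    static final class WindowedMean {
        private static final class Sample {
            final long at;
            final long value;

            Sample(long at, long value) {
                this.at = at;
                this.value = value;
            }
        }

        private final Deque<Sample> samples = new ArrayDeque<>();
        private final TimeSource clock;
        private final long windowMillis;

        WindowedMean(TimeSource clock, long windowMillis) {
            this.clock = clock;
            this.windowMillis = windowMillis;
        }

        void record(long value) {
            samples.addLast(new Sample(clock.nowMillis(), value));
        }

        double mean() {
            long cutoff = clock.nowMillis() - windowMillis;
            // Drop samples that have aged out of the window.
            while (!samples.isEmpty() && samples.peekFirst().at < cutoff) {
                samples.removeFirst();
            }
            if (samples.isEmpty()) {
                return 0.0;
            }
            long sum = 0;
            for (Sample s : samples) {
                sum += s.value;
            }
            return (double) sum / samples.size();
        }
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        WindowedMean latency = new WindowedMean(clock, 1000);

        latency.record(2730);
        clock.advance(500);
        latency.record(2732);
        System.out.println(latency.mean()); // 2731.0: both samples are inside the window

        clock.advance(600);
        System.out.println(latency.mean()); // 2732.0: the first sample has aged out
    }
}
```

With a real clock, a pause between the two {{record}} calls could push the first sample out of the window and change the mean; with the manual clock the result is the same on every run.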
The 3.0 branch is not susceptible to this problem, since it uses a time-independent
[Meter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/metrics/DroppedMessageMetrics.java#L31]
instead of a timer.
Let's wait 24 hours; I've put the utest on retry.
> testall failure in
> org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages
> ------------------------------------------------------------------------------------
>
> Key: CASSANDRA-13216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13216
> Project: Cassandra
> Issue Type: Bug
> Components: Testing
> Reporter: Sean McCarthy
> Assignee: Alex Petrov
> Labels: test-failure, testall
> Fix For: 3.0.13, 3.11.0, 4.0
>
> Attachments: TEST-org.apache.cassandra.net.MessagingServiceTest.log,
> TEST-org.apache.cassandra.net.MessagingServiceTest.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.11_testall/81/testReport/org.apache.cassandra.net/MessagingServiceTest/testDroppedMessages
> {code}
> Error Message
> expected:<... dropped latency: 27[30 ms and Mean cross-node dropped latency:
> 2731] ms> but was:<... dropped latency: 27[28 ms and Mean cross-node dropped
> latency: 2730] ms>
> {code}{code}
> Stacktrace
> junit.framework.AssertionFailedError: expected:<... dropped latency: 27[30 ms
> and Mean cross-node dropped latency: 2731] ms> but was:<... dropped latency:
> 27[28 ms and Mean cross-node dropped latency: 2730] ms>
> at
> org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages(MessagingServiceTest.java:83)
> {code}