[
https://issues.apache.org/jira/browse/CASSANDRA-6823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924285#comment-13924285
]
Benedict commented on CASSANDRA-6823:
-------------------------------------
I would guess you're seeing latency spikes induced by flushing. If the
"messages dropped" are sandwiched between an "Enqueuing flush" and "Completed
flushing", that's the culprit.
Does the same *new* stress not report these errors against 2.0?
> TimedOutException/dropped mutations running stress on 2.1
> ----------------------------------------------------------
>
> Key: CASSANDRA-6823
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6823
> Project: Cassandra
> Issue Type: Bug
> Reporter: dan jatnieks
> Priority: Minor
> Labels: stress
> Attachments: stress.log, system.log
>
>
> While testing CASSANDRA-6357, I am seeing TimedOutException errors running
> stress on both 2.1 and trunk, and system log is showing dropped mutation
> messages.
> {noformat}
> $ ant -Dversion=2.1.0-SNAPSHOT jar
> $ ./bin/cassandra
> $ ./cassandra-2.1/tools/bin/cassandra-stress write n=10000000
> Created keyspaces. Sleeping 1s for propagation.
> Warming up WRITE with 50000 iterations...
> Connected to cluster: Test Cluster
> Datatacenter: datacenter1; Host: localhost/127.0.0.1; Rack: rack1
> Sleeping 2s...
> Running WRITE with 50 threads for 10000000 iterations
> ops , op/s,adj op/s, key/s, mean, med, .95, .99,
> .999, max, time, stderr
> 74597 , 74590, 74590, 74590, 0.7, 0.3, 1.7, 7.8,
> 39.4, 156.0, 1.0, 0.00000
> 175807 , 100469, 111362, 100469, 0.5, 0.3, 1.0, 2.2,
> 16.4, 105.2, 2.0, 0.00000
> 278037 , 100483, 110412, 100483, 0.5, 0.4, 0.9, 2.2,
> 15.9, 95.4, 3.0, 0.13983
> 366806 , 86301, 86301, 86301, 0.6, 0.4, 0.9, 2.4,
> 97.6, 107.0, 4.1, 0.10002
> 473244 , 105209, 115906, 105209, 0.5, 0.3, 1.0, 2.2,
> 10.2, 99.6, 5.1, 0.08246
> 574363 , 99939, 112606, 99939, 0.5, 0.3, 1.0, 2.2,
> 8.4, 115.3, 6.1, 0.07297
> 665162 , 89343, 89343, 89343, 0.6, 0.3, 1.1, 2.3,
> 12.5, 116.4, 7.1, 0.06256
> 768575 , 102028, 102028, 102028, 0.5, 0.3, 1.0, 2.1,
> 10.7, 116.0, 8.1, 0.05703
> 870318 , 100383, 112278, 100383, 0.5, 0.4, 1.0, 2.1,
> 8.2, 109.1, 9.1, 0.04984
> 972584 , 100496, 111616, 100496, 0.5, 0.3, 1.0, 2.3,
> 10.3, 109.1, 10.1, 0.04542
> 1063466 , 88566, 88566, 88566, 0.6, 0.3, 1.1, 2.5,
> 107.3, 116.9, 11.2, 0.04152
> 1163218 , 98512, 107549, 98512, 0.5, 0.3, 1.2, 3.4,
> 17.9, 92.9, 12.2, 0.04007
> 1257989 , 93578, 103808, 93578, 0.5, 0.3, 1.4, 3.8,
> 12.6, 105.6, 13.2, 0.03687
> 1349628 , 90205, 99257, 90205, 0.6, 0.3, 1.2, 2.9,
> 20.3, 99.6, 14.2, 0.03401
> 1448125 , 97133, 106429, 97133, 0.5, 0.3, 1.2, 2.9,
> 11.9, 102.2, 15.2, 0.03170
> 1536662 , 87137, 95464, 87137, 0.6, 0.4, 1.1, 2.9,
> 83.7, 94.0, 16.2, 0.02964
> 1632373 , 94446, 102735, 94446, 0.5, 0.4, 1.1, 2.6,
> 11.7, 85.5, 17.2, 0.02818
> 1717028 , 83533, 83533, 83533, 0.6, 0.4, 1.1, 2.7,
> 87.4, 101.8, 18.3, 0.02651
> 1817081 , 97807, 108004, 97807, 0.5, 0.3, 1.1, 2.5,
> 14.5, 99.1, 19.3, 0.02712
> 1904103 , 85634, 94846, 85634, 0.6, 0.3, 1.2, 3.0,
> 92.4, 105.3, 20.3, 0.02585
> 2001438 , 95991, 104822, 95991, 0.5, 0.3, 1.2, 2.7,
> 13.5, 95.3, 21.3, 0.02482
> 2086571 , 89121, 99429, 89121, 0.6, 0.3, 1.2, 3.2,
> 30.9, 103.3, 22.3, 0.02367
> 2184096 , 88718, 97020, 88718, 0.6, 0.3, 1.3, 3.2,
> 85.6, 98.0, 23.4, 0.02262
> 2276823 , 91795, 91795, 91795, 0.5, 0.3, 1.3, 3.5,
> 81.1, 102.1, 24.4, 0.02174
> 2381493 , 101074, 101074, 101074, 0.5, 0.3, 1.3, 3.3,
> 12.9, 99.1, 25.4, 0.02123
> 2466415 , 83368, 92292, 83368, 0.6, 0.4, 1.2, 3.0,
> 14.3, 188.5, 26.4, 0.02037
> 2567406 , 100099, 109267, 100099, 0.5, 0.3, 1.4, 3.3,
> 10.9, 94.2, 27.4, 0.01989
> 2653040 , 84476, 91922, 84476, 0.6, 0.3, 1.4, 3.2,
> 77.0, 100.3, 28.5, 0.01937
> TimedOutException(acknowledged_by:0)
> TimedOutException(acknowledged_by:0)
> TimedOutException(acknowledged_by:0)
> TimedOutException(acknowledged_by:0)
> TimedOutException(acknowledged_by:0)
> TimedOutException(acknowledged_by:0)
> TimedOutException(acknowledged_by:0)
> TimedOutException(acknowledged_by:0)
> ...
> 9825371 , 84636, 91716, 84636, 0.6, 0.3, 1.4, 4.5,
> 23.4, 86.4, 125.7, 0.00894
> 9915317 , 87803, 93938, 87803, 0.6, 0.3, 1.3, 4.2,
> 62.4, 87.2, 126.7, 0.00888
> 10000000 , 93564, 101405, 93564, 0.5, 0.3, 1.4, 4.2,
> 16.2, 83.1, 127.6, 0.00880
> Results:
> real op rate : 78378
> adjusted op rate : 78378
> adjusted op rate stderr : 0
> key rate : 78378
> latency mean : 0.6
> latency median : 0.3
> latency 95th percentile : 1.1
> latency 99th percentile : 3.2
> latency 99.9th percentile : 20.0
> latency max : 11141.1
> Total operation time : 00:02:07
> END
> {noformat}
> And in from the system log:
> {noformat}
> [jenkins@blade-2-1a ~]$ grep -i "messages dropped"
> /var/log/cassandra/system.log
> INFO [ScheduledTasks:1] 2014-03-07 11:35:27,872 MessagingService.java:818 -
> 18 MUTATION messages dropped in last 5000ms
> INFO [ScheduledTasks:1] 2014-03-07 11:35:53,037 MessagingService.java:818 -
> 18 MUTATION messages dropped in last 5000ms
> INFO [ScheduledTasks:1] 2014-03-07 11:36:43,210 MessagingService.java:818 -
> 218 MUTATION messages dropped in last 5000ms
> {noformat}
> I am seeing the same results using {{\-mode thrift}} as well the default
> using the java-driver.
--
This message was sent by Atlassian JIRA
(v6.2#6252)