[jira] [Commented] (CASSANDRA-7567) when the commit_log disk for a single node is overwhelmed the entire cluster slows down
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081059#comment-14081059 ]

Brandon Williams commented on CASSANDRA-7567:
----------------------------------------------

Closing in favor of CASSANDRA-7658.

> Key: CASSANDRA-7567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7567
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: debian 7.5, bare metal, 14 nodes, 64 CPUs, 64GB RAM, commit_log disk SATA, data disk SSD, vnodes, leveled compaction strategy
> Reporter: David O'Dell
> Assignee: Brandon Williams
> Attachments: 7567.logs.bz2, write_request_latency.png
>
> We've run into a situation where a single node out of 14 is experiencing high disk IO. This can happen when a node is being decommissioned, or after it joins the ring and runs into the bug CASSANDRA-6621.
> When this occurs, the write latency for the entire cluster spikes, from 0.3ms to 170ms.
> To simulate this, simply run dd on the commit_log disk (dd if=/dev/zero of=/tmp/foo bs=1024) and you will see that all nodes in the cluster instantly slow down.
> BTW, overwhelming the data disk does not have this same effect.
> Also, I've tried this where the overwhelmed node isn't connected to directly by the client, and it still has the same effect.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081060#comment-14081060 ]

Benedict commented on CASSANDRA-7567:
--------------------------------------

So, the Java driver then (cql3 native prepared is the default). There isn't anything we can really do about this without Java Driver support.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081049#comment-14081049 ]

Brandon Williams commented on CASSANDRA-7567:
----------------------------------------------

Whichever mode is the default. My command line is pretty simple:

{noformat}
tools/bin/cassandra-stress write n=1000 -node bw-1,bw-2 -schema 'replication(factor=3)'
{noformat}
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081045#comment-14081045 ]

Benedict commented on CASSANDRA-7567:
--------------------------------------

Which mode of connectivity? "Smart thrift" and cql native3 both use token-aware routing from the Java driver (smart thrift does its own fairly dumb round-robin for a given token range), so they will go directly to a random node in the cluster. With the Java driver I don't think we have any easy API control over which nodes we connect to, and I'm not sure there's much point making "smart thrift" too smart, since it's only there to compare fairly against cql native3's token-aware routing. Regular thrift mode won't do this.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081040#comment-14081040 ]

Brandon Williams commented on CASSANDRA-7567:
----------------------------------------------

I applied CASSANDRA-7644 and finally got meaningful traces, but none were fruitful: everything always completed quickly and exactly as it should. So I went for the sure-fire method of reproducing, and suspended the JVM on the 'dd' node so it couldn't respond at all. There was still nothing suspicious in the trace; however, now I _did_ see stress report 12s latencies. Suspecting that stress wasn't actually doing what I told it to (only connect to node1 and node2, not node3, which I'm trying to beat up with dd), I discovered it was in fact connecting to all three nodes, and that's why suspending the 'unused' node caused the latencies. I no longer think anything is wrong in Cassandra itself here, but something is wrong with stress, which is probably doing ring discovery and connecting to everything when it shouldn't.
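(For reference, a minimal sketch of that suspend-and-check procedure. The pgrep pattern, the ports, and the ss invocation are illustrative assumptions, not from this ticket:)

{noformat}
# On the node being hammered by dd: freeze the JVM so it cannot respond at all
kill -STOP $(pgrep -f CassandraDaemon)

# On the stress client: list established connections to the native (9042) and
# thrift (9160) ports to see which nodes stress is actually talking to
ss -tn state established '( dport = :9042 or dport = :9160 )'

# Resume the suspended node when done
kill -CONT $(pgrep -f CassandraDaemon)
{noformat}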
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078788#comment-14078788 ]

Brandon Williams commented on CASSANDRA-7567:
----------------------------------------------

Half-win, then. Unfortunately, between CASSANDRA-7629 and CASSANDRA-7644 I can't tell what's going on in the traces :( If you do try to get traces, though, you'll probably want to apply CASSANDRA-7626.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078786#comment-14078786 ]

Ryan McGuire commented on CASSANDRA-7567:
------------------------------------------

Sorry for the confusion, I don't think I tested this on 2.0 after all; I was unable to see the same latency spikes on either 2.0.8 or 2.0 HEAD. I do see them on 2.1 HEAD, and I'm preparing logs and traces for that run.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078451#comment-14078451 ]

Ryan McGuire commented on CASSANDRA-7567:
------------------------------------------

I'll rerun on both 2.1 and 2.0 and capture both a trace and matching logs. The logs were from a different cluster, matching the results in [this comment|https://issues.apache.org/jira/browse/CASSANDRA-7567?focusedCommentId=14072373&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14072373], but I started a new cluster (EC2, so not the same IPs) for the comment with the trace.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078424#comment-14078424 ]

Benedict commented on CASSANDRA-7567:
--------------------------------------

Ryan, there's something fishy about those logs: the trace gives a source of 10.180.41.228, but there's no node with that IP address in the log files you attached...?
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078401#comment-14078401 ]

Ryan McGuire commented on CASSANDRA-7567:
------------------------------------------

Both do it. All the above results are from 2.1, though.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078393#comment-14078393 ]

Jonathan Ellis commented on CASSANDRA-7567:
--------------------------------------------

Ryan, was your reproduction on 2.0 or 2.1?
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074829#comment-14074829 ]

Ryan McGuire commented on CASSANDRA-7567:
------------------------------------------

{code}
cqlsh> select activity, source, source_elapsed, thread from system_traces.events where session_id = f0c14390-1434-11e4-bbc2-1511583f4f50;

 activity                                  | source         | source_elapsed | thread
-------------------------------------------+----------------+----------------+-----------------------
 Message received from /10.109.137.124     | 10.111.173.249 |             50 | Thread-11
 Determining replicas for mutation         | 10.109.137.124 |            158 | Thrift:116
 Sending message to /10.111.173.249        | 10.109.137.124 |            303 | WRITE-/10.111.173.249
 Sending message to /10.180.41.228         | 10.109.137.124 |            303 | WRITE-/10.180.41.228
 Message received from /10.109.137.124     | 10.180.41.228  |             36 | Thread-28
 Appending to commitlog                    | 10.111.173.249 |          94484 | SharedPool-Worker-29
 Adding to Standard1 memtable              | 10.111.173.249 |          94569 | SharedPool-Worker-29
 Enqueuing response to /10.109.137.124     | 10.111.173.249 |          94636 | SharedPool-Worker-29
 Sending message to /10.109.137.124        | 10.111.173.249 |          94752 | WRITE-/10.109.137.124
 Message received from /10.111.173.249     | 10.109.137.124 |          95898 | Thread-4
 Processing response from /10.111.173.249  | 10.109.137.124 |          96001 | SharedPool-Worker-11

(11 rows)
{code}
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072406#comment-14072406 ]

Brandon Williams commented on CASSANDRA-7567:
----------------------------------------------

I ran it a pretty long time, but I might just have more IO. Can you try the cqlsh-with-tracing trick I did (see the sketch below) and see if you can catch a slow one?
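(A sketch of that trick, mirroring the dummy insert traced later in this thread; the USE of stress's "Keyspace1" and the follow-up query against system_traces.sessions for spotting slow requests are illustrative assumptions, not from this ticket:)

{code}
cqlsh> USE "Keyspace1";
cqlsh> TRACING ON;
cqlsh> INSERT INTO "Standard1" (key, "C0") VALUES ( 0xf0, 0xf0);
-- the trace prints inline after each statement; to hunt for slow sessions afterwards:
cqlsh> SELECT session_id, duration, started_at FROM system_traces.sessions LIMIT 20;
{code}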
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072373#comment-14072373 ]

Ryan McGuire commented on CASSANDRA-7567:
------------------------------------------

I don't know what it is about 12s specifically, but I'm seeing that same intermittent 12s latency spike on a different cluster, this time 3 c3.8xlarge. How long did you run the test? It definitely gets progressively worse the longer it runs. This kind of activity loops:

{code}
20352718 ,     25,     25,     25, 4029.3,  1.4, 12089.2, 12091.3, 12092.0, 12092.0, 333.0, 0.01946
20639361 ,  24827,  24827,  24827,    4.0,  0.9,     2.3,     6.7,   143.2, 10555.2, 344.5, 0.01964
20656481 ,  17580,  17580,  17580,    4.8,  1.4,    14.4,    56.1,   198.3,   206.8, 345.5, 0.01978
20672190 ,  14344,  14344,  14344,    6.6,  0.9,    15.6,    42.0,   320.1,   328.2, 346.6, 0.01997
20682762 ,      1,      1,      1,    7.8,  0.8,    16.0,    97.3,   444.3,   447.5, 347.3, 0.02018
20683062 ,     25,     25,     25, 4017.3,  1.0, 12054.9, 12056.2, 12057.6, 12057.6, 359.3, 0.02037
21256796 ,  43999,  43999,  43999,    2.3,  0.9,     1.4,     3.5,    80.8, 12100.9, 372.3, 0.02049
21356015 ,  90655,  90655,  90655,    1.1,  0.9,     1.7,     3.2,    80.1,    86.7, 373.4, 0.02051
21426407 ,  69401,  69401,  69401,    1.4,  0.9,     2.0,     2.7,   111.2,   114.4, 374.4, 0.02044
21500494 ,  73883,  73883,  73883,    1.3,  1.0,     2.1,     9.6,    21.0,    23.1, 375.4, 0.02036
21517270 ,  16108,  16108,  16108,    6.1,  0.9,    16.4,    79.7,   135.1,   136.2, 376.5, 0.02029
21523818 ,   4002,   4002,   4002,   22.0,  1.4,    84.4,   493.9,   612.6,   672.2, 378.1, 0.02046
21524118 ,     25,     25,     25, 4024.3,  0.8, 12081.4, 12086.2, 12097.4, 12097.4, 390.2, 0.02074
21975735 ,  35025,  35025,  35025,    2.9,  0.8,     1.1,     1.7,    90.4, 12102.2, 403.1, 0.02104
22064231 ,  87534,  87534,  87534,    1.1,  0.9,     1.3,     2.3,   100.4,   102.8, 404.1, 0.02109
22166146 , 105540, 105540, 105540,    0.9,  0.9,     1.2,     2.1,     5.1,    15.3, 405.1, 0.02102
22244592 ,  72442,  72442,  72442,    1.4,  0.9,     1.9,     3.6,    98.0,   100.8, 406.1, 0.02096
22313323 ,  67902,  67902,  67902,    1.5,  0.9,     2.0,     3.3,   213.0,   223.1, 407.2, 0.02088
22332959 ,  19299,  19299,  19299,    5.1,  0.9,    15.5,    21.5,   119.6,   122.1, 408.2, 0.02081
22342749 ,   9854,   9854,   9854,    6.3,  0.9,    14.9,    23.6,   547.6,   566.8, 409.2, 0.02094
22347270 ,   5399,   5399,   5399,   22.4,  1.6,    22.5,   759.5,   816.5,   830.5, 410.0, 0.02115
22347570 ,     25,     25,     25, 4020.3,  0.8, 12083.5, 12089.3, 12099.5, 12099.5, 422.1, 0.02139
22697985 ,  26963,  26963,  26963,    3.8,  0.8,     1.1,     2.1,    65.5, 12100.1, 435.1, 0.02166
22788909 ,  90108,  90108,  90108,    1.1,  0.9,     1.3,     3.2,    96.2,    99.2, 436.1, 0.02174
22885016 ,  95809,  95809,  95809,    1.0,  0.9,     1.3,     2.2,    78.5,    82.4, 437.1, 0.02167
22975898 ,  81816,  81816,  81816,    1.2,  0.9,     1.6,     3.5,   100.1,   102.3, 438.2, 0.02161
23049501 ,  72615,  72615,  72615,    1.4,  1.0,     1.9,     2.5,   147.2,   151.1, 439.2, 0.02153
23074719 ,  24475,  24475,  24475,    4.0,  0.8,    14.4,    19.6,    34.9,    46.3, 440.3, 0.02145
23094356 ,  18263,  18263,  18263,    5.5,  1.6,    15.0,    54.3,    99.6,   106.9, 441.3, 0.02155
23103664 ,   9878,   9878,   9878,    6.5,  1.4,    16.1,   122.5,   142.2,   349.2, 442.3, 0.02168
23104059 ,     33,     33,     33, 3083.8,  0.9, 12068.0, 12071.5, 12071.6, 12071.6, 454.4, 0.02187
23378761 ,  21693,  21693,  21693,    4.7,  0.8,     1.5,     3.2,    69.5, 12101.6, 467.1, 0.02212
23473613 ,  88575,  88575,  88575,    1.1,  0.9,     1.2,     3.0,    95.9,    99.0, 468.1, 0.0
23552129 ,  77476,  77476,  77476,    1.3,  0.9,     1.4,     3.3,   159.1,   168.8, 469.1, 0.02215
23639310 ,  86154,  86154,  86154,    1.1,  0.9,     1.3,     3.1,    95.5,    98.4, 470.1, 0.02207
23731401 ,  92987,  92987,  92987,    1.1,  0.9,     1.3,     2.9,    59.4,    62.7, 471.1, 0.02200
23804418 ,  63119,  63119,  63119,    1.6,  1.0,     1.9,    11.9,   147.2,   149.7, 472.3, 0.02193
23825655 ,  20577,  20577,  20577,    4.9,  0.8,    16.1,    22.2,    37.6,    44.7, 473.3, 0.02186
23845878 ,  19932,  19932,  19932,    5.0,  1.2,    14.8,    23.4,    79.5,    87.0, 474.3, 0.02197
23854629 ,  16059,  16059,  16059,    5.7,  2.0,    13.0,    98.1,   114.8,   115.3, 474.9, 0.02207
23854929 ,
{code}
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066495#comment-14066495 ]

Brandon Williams commented on CASSANDRA-7567:
----------------------------------------------

I wasn't able to reproduce this, even using ionice to downgrade java and promote dd. I also ran a 'dummy' insert in cqlsh with tracing on while this was running. If we examine a trace:

{noformat}
 activity                                                           | timestamp    | source        | source_elapsed
--------------------------------------------------------------------+--------------+---------------+----------------
 execute_cql3_query                                                 | 16:26:14,860 | 10.208.8.123  |              0
 Parsing INSERT INTO "Standard1" (key, "C0") VALUES ( 0xf0, 0xf0);  | 16:26:14,860 | 10.208.8.123  |            100
 Message received from /10.208.8.123                                | 16:26:14,860 | 10.208.35.225 |             54
 Preparing statement                                                | 16:26:14,860 | 10.208.8.123  |            326
 Acquiring switchLock read lock                                     | 16:26:14,860 | 10.208.35.225 |            170
 Determining replicas for mutation                                  | 16:26:14,860 | 10.208.8.123  |            491
 Appending to commitlog                                             | 16:26:14,860 | 10.208.35.225 |            199
 Acquiring switchLock read lock                                     | 16:26:14,860 | 10.208.8.123  |            674
 Appending to commitlog                                             | 16:26:14,860 | 10.208.8.123  |            701
 Sending message to /10.208.35.225                                  | 16:26:14,860 | 10.208.8.123  |            705
 Adding to Standard1 memtable                                       | 16:26:14,860 | 10.208.8.123  |            752
 Adding to Standard1 memtable                                       | 16:26:14,861 | 10.208.35.225 |            275
 Enqueuing response to /10.208.8.123                                | 16:26:14,861 | 10.208.35.225 |            373
 Sending message to bw-1/10.208.8.123                               | 16:26:14,861 | 10.208.35.225 |            538
 Message received from /10.208.35.225                               | 16:26:14,862 | 10.208.8.123  |           null
 Processing response from /10.208.35.225                            | 16:26:14,862 | 10.208.8.123  |           null
 Sending message to /10.208.8.63                                    | 16:26:14,863 | 10.208.8.123  |           null
 Message received from /10.208.8.123                                | 16:26:14,867 | 10.208.8.63   |             51
 Acquiring switchLock read lock                                     | 16:26:14,867 | 10.208.8.63   |            169
 Message received from /10.208.8.63                                 | 16:26:15,345 | 10.208.8.123  |           null
 Appending to commitlog                                             | 16:26:15,345 | 10.208.8.63   |         478510
 Processing response from /10.208.8.63                              | 16:26:15,345 | 10.208.8.123  |           null
 Adding to Standard1 memtable                                       | 16:26:15,345 | 10.208.8.63   |         478552
 Enqueuing response to /10.208.8.123                                | 16:26:15,345 | 10.208.8.63   |         478755
 Sending message to bw-1/10.208.8.123                               | 16:26:15,345 | 10.208.8.63   |         478967
 Request complete                                                   | 16:26:14,860 | 10.208.8.123  |            907
{noformat}

But isolated to just the coordinator used, with unnecessary network messages removed:

{noformat}
 activity                                                           | timestamp    | source       | source_elapsed
--------------------------------------------------------------------+--------------+--------------+----------------
 execute_cql3_query                                                 | 16:26:14,860 | 10.208.8.123 |              0
 Parsing INSERT INTO "Standard1" (key, "C0") VALUES ( 0xf0, 0xf0);  | 16:26:14,860 | 10.208.8.123 |            100
 Preparing statement                                                | 16:26:14,860 | 10.208.8.123 |            326
 Determining replicas for mutation                                  | 16:26:14,860 | 10.208.8.123 |            491
 Acquiring switchLock read lock                                     | 16:26:14,860 | 10.208.8.123 |            674
 Appending to commitlog                                             | 16:26:14,860 | 10.208.8.123 |            701
 Adding to Standard1 memtable                                       | 16:26:14,860 | 10.208.8.123 |            752
 Request complete                                                   | 16:26:14,860 | 10.208.8.123 |            907
{noformat}
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065844#comment-14065844 ]

Ryan McGuire commented on CASSANDRA-7567:
------------------------------------------

Forgive my previous armchair reasoning. It's actually getting progressively worse as it goes along. I'm attaching the logs so others can analyze this. I introduced the dd command 60s into the write.
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065819#comment-14065819 ]

Ryan McGuire commented on CASSANDRA-7567:
------------------------------------------

I have reproduced this on a 10-node m3.xlarge cluster on EC2 (a concrete sketch of these steps follows below):
* Run stress against nodes 1-5.
* On node 9, run the dd command.
* Watch stress completely time out.
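(A sketch of those steps, assembled from the stress command line and dd invocation quoted elsewhere in this thread. The node host names are illustrative, and /tmp/foo is assumed to live on node 9's commit_log disk:)

{noformat}
# From the client box: drive writes through nodes 1-5 only
tools/bin/cassandra-stress write n=1000 -node node1,node2,node3,node4,node5 -schema 'replication(factor=3)'

# Meanwhile, on node 9: saturate the commit_log disk
dd if=/dev/zero of=/tmp/foo bs=1024
{noformat}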
[ https://issues.apache.org/jira/browse/CASSANDRA-7567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065263#comment-14065263 ]

David O'Dell commented on CASSANDRA-7567:
------------------------------------------

BTW, the replication_factor is 3 and the consistency level is ONE.
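(For reference, a stress invocation matching that setup might look like the following. This is a sketch assuming the 2.1-era cassandra-stress syntax used earlier in this thread, where the cl= option sets the consistency level; the node names are illustrative:)

{noformat}
tools/bin/cassandra-stress write n=1000 cl=ONE -node node1,node2 -schema 'replication(factor=3)'
{noformat}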