[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116006#comment-13116006 ]

Jason Harvey commented on CASSANDRA-2170:

The stack trace in that example was from a 0.8.1 node. The same problem has occurred on my 0.8.5 nodes. I checked the stack trace that I had, and I couldn't find any threads executing in any ExpiringMap stuff.

We switched to HSHA about a week ago, and it appears to have resolved the load spikes. Is the stack trace I gave interesting enough to pursue further investigation into the issue, or should I just leave the answer as 'HSHA'?

Load spikes
Key: CASSANDRA-2170
URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.6.11
Reporter: Jonathan Ellis
Assignee: Brandon Williams

as reported on CASSANDRA-2058, some users are still seeing load spikes on 0.6.11, even with fairly low-volume read workloads.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107233#comment-13107233 ]

Peter Schuller commented on CASSANDRA-2170:

Wow, interesting. Are you sure it's 0.8.5 though? The stack trace does not match what I see in the 0.8.5 tag (mismatched line number for MessagingService.addCallback()).

We've been seeing load spikes on 0.7, but haven't reported them because it's such an old version. However, we were never able to grab stacks because no JMX query would ever succeed during this condition.

The stack trace indicates it's stuck doing resize operations on the NBHM (NonBlockingHashMap), where each thread tries to help the resize along by performing potentially duplicate work (done so that forward progress is always made).

Do you have a list of all stacks? Do you find any threads (there should be 0 or 1) that are executing in ExpiringMap.CacheMonitor.run() at the time of the load spikes?

I guess we're seeing some kind of fallen-and-can't-get-up scenario having to do with the resize. Maybe dogpiling the resize makes it slow enough overall that it never gets unstuck without a temporary stop in incoming requests. Or some such. That's gut-feel speculation without having actually looked at it carefully, so take it with a grain of salt :)

[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106828#comment-13106828 ]

Jason Harvey commented on CASSANDRA-2170:

I did a couple more thread dumps on spiking nodes. One interesting thing I'm seeing is that there is a high number of NBHM threads in the runnable state during the load spikes. In the dumps I analyzed, there were often 200-300 of these threads in RUNNABLE.

Here is an example of the threads: https://gist.github.com/ef215227b85bdff5f033

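For anyone wanting to repeat this kind of count on their own dumps, here is a minimal sketch in Python. It assumes the dump is in the usual HotSpot jstack text layout (a quoted thread name on the header line, a `java.lang.Thread.State:` line, then `at ...` frames). The sample dump, the thread names, and the helper names `parse_thread_dump` and `count_by_state` are invented for illustration and are not taken from the gist above.

```python
import re
from collections import Counter

# Header line of one thread in a jstack-style dump: starts with the quoted name.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# State line, e.g. "   java.lang.Thread.State: TIMED_WAITING (sleeping)".
THREAD_STATE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')

# Synthetic sample, made up for this sketch (not real output from any node).
SAMPLE_DUMP = '''\
"Thrift:1" prio=10 tid=0x01 nid=0x10 runnable [0x0]
   java.lang.Thread.State: RUNNABLE
    at org.cliffc.high_scale_lib.NonBlockingHashMap.putIfMatch(Unknown Source)
    at org.apache.cassandra.net.MessagingService.addCallback(Unknown Source)

"Thrift:2" prio=10 tid=0x02 nid=0x11 runnable [0x0]
   java.lang.Thread.State: RUNNABLE
    at org.cliffc.high_scale_lib.NonBlockingHashMap.putIfMatch(Unknown Source)

"Timer-0" prio=10 tid=0x03 nid=0x12 waiting on condition [0x0]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
'''


def parse_thread_dump(text):
    """Split a jstack-style dump into dicts with name, state, and frames."""
    threads = []
    current = None
    for line in text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = {"name": header.group("name"), "state": None, "frames": []}
            threads.append(current)
            continue
        if current is None:
            continue  # preamble before the first thread header
        state = THREAD_STATE.match(line)
        if state:
            current["state"] = state.group("state")
        elif line.strip().startswith("at "):
            current["frames"].append(line.strip()[3:])
    return threads


def count_by_state(threads, frame_substring):
    """Count threads per state among those with a frame containing the substring."""
    counts = Counter()
    for thread in threads:
        if any(frame_substring in frame for frame in thread["frames"]):
            counts[thread["state"]] += 1
    return counts


if __name__ == "__main__":
    parsed = parse_thread_dump(SAMPLE_DUMP)
    print("NBHM threads by state:", dict(count_by_state(parsed, "NonBlockingHashMap")))
    print("ExpiringMap threads by state:", dict(count_by_state(parsed, "ExpiringMap")))
```

Passing "ExpiringMap" as the substring gives a quick way to check whether any thread is inside the ExpiringMap code at the time of a spike; on a real dump, replace `SAMPLE_DUMP` with the contents of the saved dump file.
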
[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105627#comment-13105627 ]

Jason Harvey commented on CASSANDRA-2170:

Tried without jsvc, same result. This problem is also consistent on 0.8.5 on Debian. I replicated the issue on a coordinator which only owned 1 token.

[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105629#comment-13105629 ]

Jason Harvey commented on CASSANDRA-2170:

Clarification: Debian Squeeze

[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105790#comment-13105790 ]

Jason Harvey commented on CASSANDRA-2170:

I should note, the symptoms I'm seeing are basically identical to #2054.

[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082138#comment-13082138 ]

Jason Harvey commented on CASSANDRA-2170:

Re-opening per request of driftx.

So, still seeing this problem ever since our upgrade from 0.6.7. It is 100% consistent on 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.8.0, 0.8.1. I've tried Sun JRE and OpenJDK. Tried with JNA and without. Tried Ubuntu 08.04/10.04/10.10/11.04, as well as RHEL5.1.

It *only* happens on coordinator nodes.

For the 0.8 ring, I created a brand new ring and added data from our app one CF at a time. As soon as I added a busy CF, the problem popped up again. The load on the boxes in the new ring is under 1 all the time, except for when the load spike occurs.

Load spikes
Key: CASSANDRA-2170
URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.6.11
Reporter: Jonathan Ellis

as reported on CASSANDRA-2058, some users are still seeing load spikes on 0.6.11, even with fairly low-volume read workloads.

[jira] [Commented] (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035816#comment-13035816 ]

Brandon Williams commented on CASSANDRA-2170:

I never saw anyone reliably report this on any platform except EC2, so I strongly suspect the cause was what is covered in this link: https://silverline.librato.com/blog/main/EC2_Users_Should_be_Cautious_When_Booting_Ubuntu_10_04_AMIs

[jira] Commented: (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004665#comment-13004665 ]

Jonathan Ellis commented on CASSANDRA-2170:

(I.e. memory pressure caused by key cache preheating being too aggressive.)

Load spikes
Key: CASSANDRA-2170
URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.6.11
Reporter: Jonathan Ellis
Fix For: 0.6.13

as reported on CASSANDRA-2058, some users are still seeing load spikes on 0.6.11, even with fairly low-volume read workloads.

[jira] Commented: (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004663#comment-13004663 ]

Jonathan Ellis commented on CASSANDRA-2170:

I wonder if this could have been CASSANDRA-2175.

[jira] Commented: (CASSANDRA-2170) Load spikes
[ https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994963#comment-12994963 ]

Jonathan Ellis commented on CASSANDRA-2170:

AFAIK nobody has seen this on 0.7.1.

Load spikes
Key: CASSANDRA-2170
URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.6.11
Reporter: Jonathan Ellis
Fix For: 0.6.12

as reported on CASSANDRA-2058, some users are still seeing load spikes on 0.6.11, even with fairly low-volume read workloads.