[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-09-27 Thread Jason Harvey (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116006#comment-13116006
 ] 

Jason Harvey commented on CASSANDRA-2170:
-

The stack trace in that example was from a 0.8.1 node. The same problem has 
occurred on my 0.8.5 nodes.

I checked the stack trace that I had, and I couldn't find any threads executing 
in any ExpiringMap code.

We switched to HSHA about a week ago and it appears to have resolved the load 
spikes. Is the stack trace I gave interesting enough to pursue further 
investigation into the issue, or should I just leave the answer as 'HSHA'?

 Load spikes
 -----------

              Key: CASSANDRA-2170
              URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
          Project: Cassandra
       Issue Type: Bug
 Affects Versions: 0.6.11
         Reporter: Jonathan Ellis
         Assignee: Brandon Williams

 as reported on CASSANDRA-2058, some users are still seeing load spikes on 
 0.6.11, even with fairly low-volume read workloads.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-09-17 Thread Peter Schuller (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107233#comment-13107233
 ] 

Peter Schuller commented on CASSANDRA-2170:
---

Wow, interesting. Are you sure it's 0.8.5 though? The stack trace is not 
matching what I see in the 0.8.5 tag (mismatched line number for 
MessagingService.addCallback()).

We've been seeing load spikes on 0.7, but haven't reported it because it's such 
an old version. However, we were never able to grab stacks because no JMX query 
would ever succeed during this condition.

The stack trace indicates it's stuck doing resize operations on the NBHM, where 
each thread tries to help the resize along by performing potentially duplicate 
work (duplicated so that forward progress is guaranteed).
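
To make the "helping" idea concrete, here is a toy sketch. It is not 
NonBlockingHashMap's code; the class and method names (ResizeHelpDemo, 
cooperativeCopy) are made up for illustration. Every helper thread walks all 
slots and tries to copy each one with a CAS, so the copy finishes even if some 
helpers stall, at the cost of redundant attempts that grow with the number of 
helpers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Toy model of cooperative "help the copy along" work; NOT the real NBHM algorithm.
class ResizeHelpDemo {
    /**
     * Lets `helpers` threads copy `old` into a fresh array, each trying every slot.
     * Values in `old` must be non-null (null marks "not yet copied").
     * Returns {total CAS attempts, successful copies}.
     */
    static int[] cooperativeCopy(String[] old, int helpers) {
        AtomicReferenceArray<String> target = new AtomicReferenceArray<>(old.length);
        AtomicInteger attempts = new AtomicInteger();
        AtomicInteger wins = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[helpers];
        for (int t = 0; t < helpers; t++) {
            threads[t] = new Thread(() -> {
                try {
                    start.await(); // release all helpers at once to create contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < old.length; i++) {
                    attempts.incrementAndGet();
                    // Only the first helper to reach a slot wins; the rest did duplicate work.
                    if (target.compareAndSet(i, null, old[i])) {
                        wins.incrementAndGet();
                    }
                }
            });
            threads[t].start();
        }
        start.countDown();
        try {
            for (Thread th : threads) {
                th.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for helpers", e);
        }
        return new int[] { attempts.get(), wins.get() };
    }

    public static void main(String[] args) {
        String[] old = new String[1000];
        for (int i = 0; i < old.length; i++) {
            old[i] = "v" + i;
        }
        int[] r = cooperativeCopy(old, 8);
        System.out.println("attempts=" + r[0] + " wins=" + r[1]);
    }
}
```

In this toy, useful work stays constant while attempted work scales with the 
helper count, which is the shape of the dogpile being described. The real map 
is more careful about how much each helper copies.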

Do you have a list of all stacks? Do you find any threads (there should be 0 
or 1) executing in ExpiringMap.CacheMonitor.run() at the time of the load 
spikes?
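
One way to answer that question from inside a JVM is sketched below. This is 
only an illustration: FrameFinder and its methods are hypothetical names, the 
code has to run in the same process as the threads it inspects, and it assumes 
CacheMonitor is a nested class so that its binary name contains 
"ExpiringMap$CacheMonitor". It uses the standard Thread.getAllStackTraces() 
call rather than JMX.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: lists live threads that have a given class/method on their stack.
class FrameFinder {
    /** True if any frame's class name contains classFragment and its method equals method. */
    static boolean isExecutingIn(StackTraceElement[] stack, String classFragment, String method) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().contains(classFragment)
                    && frame.getMethodName().equals(method)) {
                return true;
            }
        }
        return false;
    }

    /** Names of all live threads in this JVM with a matching frame anywhere on the stack. */
    static List<String> threadsExecutingIn(String classFragment, String method) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (isExecutingIn(e.getValue(), classFragment, method)) {
                names.add(e.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed binary name fragment for the nested class mentioned in the thread.
        List<String> hits = threadsExecutingIn("ExpiringMap$CacheMonitor", "run");
        System.out.println(hits.size() + " thread(s): " + hits);
    }
}
```

For a dump captured from outside the process, the same check amounts to 
searching the dump text for that frame.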

I guess we're seeing some kind of fallen-and-can't-get-up scenario having to do 
with the resize. Maybe dogpiling the resize is making it slow enough overall 
that it never gets unstuck without a temporary stop in incoming requests. Or 
some such. That's gut-feel speculation without having actually looked at it 
carefully, so take it with a grain of salt :)






[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-09-16 Thread Jason Harvey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106828#comment-13106828
 ] 

Jason Harvey commented on CASSANDRA-2170:
-

I did a couple more thread dumps on spiking nodes. One interesting thing I'm 
seeing is that there is a high number of NBHM threads in the runnable state 
during the load spikes. In the dumps I analyzed, there were often 200-300 of 
these threads in RUNNABLE.

Here is an example of the threads: https://gist.github.com/ef215227b85bdff5f033
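
For anyone repeating this count on their own dumps, a rough sketch follows. It 
assumes jstack-style text (a quoted thread name on the header line, a 
"java.lang.Thread.State:" line, then "at ..." frames) and counts RUNNABLE 
threads that have a frame containing a given fragment. DumpCounter and the 
command-line argument are hypothetical, not part of any Cassandra tooling.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: counts RUNNABLE threads in a jstack-style dump that touch a class.
class DumpCounter {
    static int countRunnableWith(List<String> lines, String frameFragment) {
        int count = 0;
        boolean inThread = false;
        boolean runnable = false;
        boolean matched = false;
        for (String line : lines) {
            if (line.startsWith("\"")) {
                // A new thread header: settle the previous thread, then reset.
                if (inThread && runnable && matched) {
                    count++;
                }
                inThread = true;
                runnable = false;
                matched = false;
            } else if (line.contains("java.lang.Thread.State: RUNNABLE")) {
                runnable = true;
            } else if (line.trim().startsWith("at ") && line.contains(frameFragment)) {
                matched = true;
            }
        }
        // Settle the last thread in the dump.
        if (inThread && runnable && matched) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a saved thread dump (assumed to be jstack output).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(countRunnableWith(lines, "NonBlockingHashMap"));
    }
}
```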





[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-09-15 Thread Jason Harvey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105627#comment-13105627
 ] 

Jason Harvey commented on CASSANDRA-2170:
-

Tried without jsvc, same result.

This problem is also consistent on 0.8.5 on Debian. I replicated the issue on a 
coordinator which only owned 1 token.





[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-09-15 Thread Jason Harvey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105629#comment-13105629
 ] 

Jason Harvey commented on CASSANDRA-2170:
-

Clarification: Debian Squeeze





[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-09-15 Thread Jason Harvey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105790#comment-13105790
 ] 

Jason Harvey commented on CASSANDRA-2170:
-

I should note, the symptoms I'm seeing are basically identical to #2054





[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-08-09 Thread Jason Harvey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082138#comment-13082138
 ] 

Jason Harvey commented on CASSANDRA-2170:
-

Re-opening per request of driftx.

So, still seeing this problem ever since our upgrade from 0.6.7.

It is 100% consistent on 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.8.0, 0.8.1. I've tried 
Sun JRE and OpenJDK. Tried with JNA and without. Tried Ubuntu 
08.04/10.04/10.10/11.04, as well as RHEL5.1. It *only* happens on coordinator 
nodes.

For the 0.8 ring, I created a brand new ring and added data from our app one CF 
at a time. As soon as I added a busy CF, the problem popped up again. The load 
on the boxes in the new ring is under 1 all the time, except for when the load 
spike occurs.
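
A minimal sketch of watching for that condition is below, assuming a threshold 
of 1.0 taken from the "under 1 all the time" observation. LoadSpikeWatch, the 
sample count, and the 5-second interval are illustrative choices. The code 
reads the load average of the host it runs on via the standard 
OperatingSystemMXBean, which reports a negative value when the figure is 
unavailable.

```java
import java.lang.management.ManagementFactory;

// Hypothetical watcher: prints a line whenever the host load average exceeds a threshold.
class LoadSpikeWatch {
    /** True when the sample is a valid (non-negative) load average above the threshold. */
    static boolean isSpike(double loadAverage, double threshold) {
        return loadAverage >= 0 && loadAverage > threshold;
    }

    public static void main(String[] args) throws InterruptedException {
        double threshold = 1.0; // assumption: normal load stays under 1
        for (int i = 0; i < 12; i++) {
            double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
            if (isSpike(load, threshold)) {
                System.out.println("load spike: " + load);
            }
            Thread.sleep(5000); // sample every 5 seconds (arbitrary interval)
        }
    }
}
```

A watcher like this could be the trigger for taking a thread dump while the 
spike is still in progress.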

 Load spikes
 -----------

              Key: CASSANDRA-2170
              URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
          Project: Cassandra
       Issue Type: Bug
 Affects Versions: 0.6.11
         Reporter: Jonathan Ellis

 as reported on CASSANDRA-2058, some users are still seeing load spikes on 
 0.6.11, even with fairly low-volume read workloads.





[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-05-18 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035816#comment-13035816
 ] 

Brandon Williams commented on CASSANDRA-2170:
-

I never saw anyone reliably report this on any platform except ec2, so I 
strongly suspect the cause was what is covered in this link:

https://silverline.librato.com/blog/main/EC2_Users_Should_be_Cautious_When_Booting_Ubuntu_10_04_AMIs



[jira] Commented: (CASSANDRA-2170) Load spikes

2011-03-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004665#comment-13004665
 ] 

Jonathan Ellis commented on CASSANDRA-2170:
---

(I.e. memory pressure caused by key cache preheating being too aggressive.)

 Load spikes
 -----------

              Key: CASSANDRA-2170
              URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
          Project: Cassandra
       Issue Type: Bug
 Affects Versions: 0.6.11
         Reporter: Jonathan Ellis
          Fix For: 0.6.13

 as reported on CASSANDRA-2058, some users are still seeing load spikes on 
 0.6.11, even with fairly low-volume read workloads.



[jira] Commented: (CASSANDRA-2170) Load spikes

2011-03-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004663#comment-13004663
 ] 

Jonathan Ellis commented on CASSANDRA-2170:
---

I wonder if this could have been CASSANDRA-2175.



[jira] Commented: (CASSANDRA-2170) Load spikes

2011-02-15 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994963#comment-12994963
 ] 

Jonathan Ellis commented on CASSANDRA-2170:
---

AFAIK nobody has seen this on 0.7.1.

 Load spikes
 -----------

              Key: CASSANDRA-2170
              URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
          Project: Cassandra
       Issue Type: Bug
 Affects Versions: 0.6.11
         Reporter: Jonathan Ellis
          Fix For: 0.6.12

 as reported on CASSANDRA-2058, some users are still seeing load spikes on 
 0.6.11, even with fairly low-volume read workloads.
