Offtopic: ksoftirqd takes more CPU after DDoS? As a result, Cassandra latency is very high
Hello,

We were under a DDoS attack, and as a result we saw high ksoftirqd activity, which made Cassandra respond very slowly. But after the DDoS ended, the high ksoftirqd activity persisted. It disappears when I stop the Cassandra daemon and returns when I start it again; the only complete fix is a full reboot of the server.

What could this be? Why does ksoftirqd work so intensively while Cassandra is running? We disabled all production traffic to the cluster, but that did not help, so it cannot be due to heavy load. And how can we solve this?

PS: OS ubuntu 10.0.4 (2.6.32.41), cassandra 1.0.10, java 1.6.32 (from oracle)
Re: Offtopic: ksoftirqd takes more CPU after DDoS? As a result, Cassandra latency is very high
Hello,

This is not related to Cassandra or the DDoS. It is a kernel problem caused by the leap second. See http://serverfault.com/questions/403732/anyone-else-experiencing-high-rates-of-linux-server-crashes-during-a-leap-second

On Sun, Jul 1, 2012 at 1:05 PM, ruslan usifov ruslan.usi...@gmail.com wrote:

Hello We were under a DDoS attack, and as a result we saw high ksoftirqd activity, which made Cassandra respond very slowly. But after the DDoS ended, the high ksoftirqd activity persisted. It disappears when I stop the Cassandra daemon and returns when I start it again; the only complete fix is a full reboot of the server. What could this be? Why does ksoftirqd work so intensively while Cassandra is running? We disabled all production traffic to the cluster, but that did not help, so it cannot be due to heavy load. And how can we solve this? PS: OS ubuntu 10.0.4 (2.6.32.41), cassandra 1.0.10, java 1.6.32 (from oracle)
Re: Offtopic: ksoftirqd takes more CPU after DDoS? As a result, Cassandra latency is very high
Good afternoon,

This again looks like it could be the leap second issue:

This looks like the problem a bunch of us were having yesterday, which does not clear without a reboot or a date command. It seems to be related to the leap second that was added between the 30th of June and the 1st of July. See the mailing list thread with the subject "High CPU usage as of 8pm eastern time".

If you are still seeing high CPU usage and a stall after restarting Cassandra, and you are on Linux, try the following in a terminal and see if everything starts working again:

    date; date `date +%m%d%H%M%C%y.%S`; date;

I hope this helps. Please spread the word if you see others having issues with unresponsive kernels or high CPU.

-- David Daeschler
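The `date` one-liner above sets the system clock to the time it already shows. A minimal shell sketch of the same idea follows, wrapped so that it only prints the command unless `DRY_RUN=0` is set. `DRY_RUN`, `current_time_arg`, and `reset_clock` are illustrative names and not from the thread, and actually setting the clock needs root.

```shell
#!/bin/sh
# Sketch of the workaround from this thread: set the clock to its current value.
# DRY_RUN and the function names are assumptions made for illustration.

# Build the MMDDhhmmCCYY.ss string that `date` accepts as a set-time argument.
current_time_arg() {
    date +%m%d%H%M%C%y.%S
}

# Print the command by default; run it only when DRY_RUN=0 (requires root).
reset_clock() {
    stamp=$(current_time_arg)
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: date $stamp"
    else
        date "$stamp"
    fi
}

reset_clock
```

Running it as-is only shows what would be executed; `DRY_RUN=0 sh script.sh` as root would apply the reset.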
Re: Offtopic: ksoftirqd takes more CPU after DDoS? As a result, Cassandra latency is very high
2012/7/1 David Daeschler david.daesch...@gmail.com:

Good afternoon, This again looks like it could be the leap second issue: This looks like the problem a bunch of us were having yesterday, which does not clear without a reboot or a date command. It seems to be related to the leap second that was added between the 30th of June and the 1st of July. See the mailing list thread with the subject "High CPU usage as of 8pm eastern time". If you are still seeing high CPU usage and a stall after restarting Cassandra, and you are on Linux, try: date; date `date +%m%d%H%M%C%y.%S`; date; in a terminal and see if everything starts working again. I hope this helps. Please spread the word if you see others having issues with unresponsive kernels or high CPU.

Hello, this really helps. In our case two problems overlapped :-(( and we had not considered that it might be a kernel problem. On one data cluster we simply rebooted, and on the second we applied the date solution, and everything is fine. Thanks.