Hi Romain,

No, I don't think we upgraded the Cassandra version or changed any of those schema elements. After I noticed the high load, I found that some tables had a shorter gc_grace_seconds (1 day) than the rest. Because that seemed to be causing constant compaction cycles, I changed them to 10 days, but again, that was after the load had already reached this level. Some nodes eased a little after I changed the gc_grace_seconds values and repaired the nodes, but for the past few days all nodes have been constantly reporting a load of 15-20.
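For reference, a minimal sketch of the kind of statement involved in such a change. The keyspace and table names are hypothetical placeholders, not the real schema; the only point is the day-to-second conversion, since gc_grace_seconds is expressed in seconds (10 days = 864000, which is also the Cassandra default).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def gc_grace_statement(keyspace, table, days):
    """Build a CQL statement that sets gc_grace_seconds to `days` days.

    `keyspace` and `table` are illustrative placeholders supplied by the
    caller; nothing here connects to a cluster.
    """
    seconds = days * SECONDS_PER_DAY
    return f"ALTER TABLE {keyspace}.{table} WITH gc_grace_seconds = {seconds};"


if __name__ == "__main__":
    # Hypothetical names, used only to show the generated statement.
    print(gc_grace_statement("my_keyspace", "my_table", 10))
```

The generated statement would still have to be run through cqlsh, and a repair should complete within the gc_grace_seconds window on every node.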
Thank you for the suggestion about logging; let me try changing the log level to see what I can get from it.

Thanks,
Aoi

2016-07-13 13:28 GMT-07:00 Romain Hardouin <romainh...@yahoo.fr>:
> Did you upgrade from a previous version? Did you make some schema changes
> like compaction strategy, compression, bloom filter, etc.?
> What about the R/W requests?
> SharedPool Workers are... shared ;-) Put logs in debug to see some examples
> of what services are using this pool (many actually).
>
> Best,
>
> Romain
>
>
> On Wednesday, July 13, 2016 at 18:15, Patrick McFadin <pmcfa...@gmail.com>
> wrote:
>
>
> Might be more clear looking at nodetool tpstats
>
> From there you can see all the thread pools and if there are any blocks.
> Could be something subtle like network.
>
> On Tue, Jul 12, 2016 at 3:23 PM, Aoi Kadoya <cadyan....@gmail.com> wrote:
>
> Hi,
>
> I am running a 6-node vnode cluster with DSE 4.8.1, and for the past few
> weeks all of the cluster nodes have been hitting an average CPU load of 15-20.
> These nodes are running on VMs (VMware vSphere) that have 8vcpu
> (1core/socket)-16 vRAM. (JVM options: -Xms8G -Xmx8G -Xmn800M)
>
> At first I thought this was because of CPU iowait; however, iowait is
> constantly low (in fact it's 0 almost all the time), and CPU steal time is
> also 0%.
>
> When I took a thread dump, I found some of the "SharedPool-Worker" threads
> are consuming CPU, and those threads seem to be waiting for something,
> so I assume this is the cause of the CPU load.
>
> "SharedPool-Worker-1" #240 daemon prio=5 os_prio=0
> tid=0x00007fabf459e000 nid=0x39b3 waiting on condition
> [0x00007faad7f02000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:85)
> at java.lang.Thread.run(Thread.java:745)
>
> The thread dump looks like this, but I am not sure what this
> SharedPool-Worker is waiting for.
> Would you please help me with further troubleshooting?
> I am also reading the thread posted by Yuan, as the situation is very
> similar to mine, but I didn't get any blocked, dropped or pending counts
> in my tpstats result.
>
> Thanks,
> Aoi
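
One way to narrow down which SharedPool-Worker threads are actually on CPU is to parse the thread dump and keep only the RUNNABLE ones, since a thread in WAITING (parking), like the one quoted above, is idle at that instant. The sketch below is only an illustration under stated assumptions: it expects raw jstack-style text (each thread header on a single line, not re-wrapped by a mail client), and the function names are made up for this example. On HotSpot/Linux the `nid` field is the native thread id in hex, so converting it to decimal gives the id shown by `top -H` (for example, nid=0x39b3 is 14771).

```python
import re

# Thread header line, e.g. "SharedPool-Worker-1" #240 daemon ... nid=0x39b3 ...
HEADER = re.compile(r'^"(?P<name>[^"]+)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)')
# State line that follows the header for Java threads.
STATE = re.compile(r'java\.lang\.Thread\.State:\s+(?P<state>[A-Z_]+)')


def parse_threads(dump_text):
    """Return one dict per thread: name, native thread id (decimal), state.

    Threads without a Thread.State line (e.g. some VM threads) keep
    state=None.
    """
    threads = []
    for line in dump_text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            threads.append({
                "name": header.group("name"),
                "lwp": int(header.group("nid"), 16),  # hex nid -> decimal id
                "state": None,
            })
            continue
        state = STATE.search(line)
        if state and threads and threads[-1]["state"] is None:
            threads[-1]["state"] = state.group("state")
    return threads


def runnable_threads(dump_text, prefix="SharedPool-Worker"):
    """Threads whose name starts with `prefix` and that are RUNNABLE."""
    return [t for t in parse_threads(dump_text)
            if t["name"].startswith(prefix) and t["state"] == "RUNNABLE"]
```

Taking a few dumps some seconds apart and comparing the decimal ids returned by `runnable_threads` against the busiest entries in `top -H` would show whether the same worker threads stay hot.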