Hi Patrick,

In fact, I couldn't see any thread pool named "shared". Here is the result of tpstats from one of my nodes.
Pool Name                  Active   Pending    Completed   Blocked  All time blocked
MutationStage                   0         0    173237609         0                 0
ReadStage                       0         0     71266557         0                 0
RequestResponseStage            0         0     87617557         0                 0
ReadRepairStage                 0         0        51822         0                 0
CounterMutationStage            0         0            0         0                 0
MiscStage                       0         0            0         0                 0
AntiEntropySessions             0         0         3828         0                 0
HintedHandoff                   0         0           23         0                 0
GossipStage                     0         0      2169599         0                 0
CacheCleanupExecutor            0         0            0         0                 0
InternalResponseStage           0         0            0         0                 0
CommitLogArchiver               0         0            0         0                 0
CompactionExecutor              0         0      1353194         0                 0
ValidationExecutor              0         0      3337647         0                 0
MigrationStage                  0         0            5         0                 0
AntiEntropyStage                0         0      7527026         0                 0
PendingRangeCalculator          0         0           24         0                 0
Sampler                         0         0            0         0                 0
MemtableFlushWriter             0         0       118019         0                 0
MemtablePostFlush               0         0      3398738         0                 0
MemtableReclaimMemory           0         0       122249         0                 0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
MUTATION                     0
COUNTER_MUTATION             0
BINARY                       0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0

I have enabled an auto-repair service in OpsCenter and it's running behind, but I also realized that my cluster isn't well balanced. Other than the system/OpsCenter keyspaces, I only have one keyspace, and its replication factor is 3 (NetworkTopologyStrategy).

Datacenter: xxx
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load      Tokens  Owns  Host ID                               Rack
UN  xxxxxx   10.19 GB  256     ?     6bf8db87-d4cc-4a75-86a5-bc1b27ced32c  RAC1
UN  xxxxxx   10.59 GB  256     ?     2d407831-e10d-4a6b-86c0-26c7a60e613d  RAC1
UN  xxxxxx   7.99 GB   256     ?     1e05d70e-502e-4ac4-a6ed-bf912c332062  RAC1
UN  xxxxxx   7.67 GB   256     ?     41a8e12a-c8e8-42ff-b681-b74f493a2407  RAC1
UN  xxxxxx   11.13 GB  256     ?     67572986-99b8-4a78-9039-aaa0aca8c236  RAC1
UN  xxxxxx   9.54 GB   256     ?     3f22001b-f03d-4bd0-8608-dd467cbc17f0  RAC1

Thanks,
Aoi

2016-07-13 9:15 GMT-07:00 Patrick McFadin <pmcfa...@gmail.com>:
> Might be more clear looking at nodetool tpstats
>
> From there you can see all the thread pools and if there are any blocks.
> Could be something subtle like network.
>
> On Tue, Jul 12, 2016 at 3:23 PM, Aoi Kadoya <cadyan....@gmail.com> wrote:
>>
>> Hi,
>>
>> I am running a 6-node vnode cluster with DSE 4.8.1, and since a few
>> weeks ago, all of the cluster nodes have been hitting an avg. 15-20 CPU
>> load. These nodes are running on VMs (VMware vSphere) that have 8 vCPU
>> (1 core/socket) and 16 GB vRAM. (JVM options: -Xms8G -Xmx8G -Xmn800M)
>>
>> At first I thought this was because of CPU iowait; however, iowait is
>> constantly low (in fact it's 0 almost all the time), and CPU steal time
>> is also 0%.
>>
>> When I took a thread dump, I found some of the "SharedPool-Worker"
>> threads are consuming CPU, and those threads seem to be waiting for
>> something, so I assume this is the cause of the CPU load.
>>
>> "SharedPool-Worker-1" #240 daemon prio=5 os_prio=0
>> tid=0x00007fabf459e000 nid=0x39b3 waiting on condition
>> [0x00007faad7f02000]
>>    java.lang.Thread.State: WAITING (parking)
>>         at sun.misc.Unsafe.park(Native Method)
>>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:85)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>> The thread dump looks like this, but I am not sure what this
>> SharedPool-Worker is waiting for.
>> Would you please help me with further troubleshooting?
>> I am also reading the thread posted by Yuan, as the situation is very
>> similar to mine, but I didn't get any blocked, dropped, or pending
>> counts in my tpstats result.
>>
>> Thanks,
>> Aoi
>
>
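The tpstats check Patrick suggests can be run across all nodes at once, printing only pools with non-zero Active or Pending counts so a stuck pool stands out. A minimal sketch, assuming `nodetool` is on the PATH and JMX is reachable from this host; the hostnames are placeholders:

```shell
# Scan tpstats on each node and keep the header plus any pool whose
# Active ($2) or Pending ($3) column is non-zero.
for h in node1 node2 node3; do
  echo "== $h =="
  nodetool -h "$h" tpstats |
    awk 'NR==1 || ($2+0 > 0) || ($3+0 > 0)'
done
```

If this prints only the header for every node (as in the output above, where all Active/Pending/Blocked counts are zero), the thread pools themselves are idle and the CPU load is more likely coming from elsewhere.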
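One way to tell which SharedPool-Worker is actually burning CPU (rather than parked, as in the dump above) is to match hot OS threads against the dump: the `nid=` field in a Java thread dump is the Linux thread id in hex. A minimal sketch; the `pgrep` pattern for finding the DSE/Cassandra JVM is an assumption, adjust for your setup:

```shell
# Find the Cassandra JVM pid (pattern is an assumption for this setup).
PID=$(pgrep -f CassandraDaemon | head -n 1)

# Show the JVM's busiest native threads; in top's batch output,
# column 9 is %CPU. Skip top's preamble lines before sorting.
[ -n "$PID" ] && top -b -H -n 1 -p "$PID" | tail -n +8 | sort -k9 -rn | head -n 5

# Convert a decimal TID from top into the hex nid used in the dump,
# e.g. TID 14771 corresponds to nid=0x39b3 in the dump above.
printf 'nid=0x%x\n' 14771
```

Grepping the resulting `nid=0x...` in the jstack output then shows the exact stack of the thread that `top -H` reports as busy, rather than one that happens to be parked at dump time.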