How many CPUs are you using for interrupts? http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux
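One way to check this is to sum the per-CPU counts in /proc/interrupts for the NIC's IRQ lines. Below is a minimal sketch, assuming the interface is named eth0 (a guess, substitute your own). The sample text and its numbers are made up for illustration. On a real host you would pass the contents of /proc/interrupts instead.

```python
# Made-up excerpt in the layout of /proc/interrupts (illustrative numbers only).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 24:     900000          0          0          0   PCI-MSI 524288-edge      eth0-TxRx-0
 25:     850000          0          0          0   PCI-MSI 524289-edge      eth0-TxRx-1
 30:       1200       3400       2100       1800   PCI-MSI 1048576-edge      nvme0q1
"""


def per_cpu_interrupts(text, device):
    """Sum interrupt counts per CPU over IRQ lines that mention `device`."""
    lines = text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        if device not in line:
            continue
        # Skip the leading "NN:" label, then take one count per CPU column.
        fields = line.split()[1:1 + len(cpus)]
        for i, value in enumerate(fields):
            if value.isdigit():
                totals[i] += int(value)
    return dict(zip(cpus, totals))


counts = per_cpu_interrupts(SAMPLE, "eth0")  # "eth0" is an assumed NIC name
busy = [cpu for cpu, n in counts.items() if n > 0]
print(f"{len(busy)} of {len(counts)} CPUs handled interrupts for eth0: {busy}")
```

If only one or two CPUs show non-zero counts on a 48-core box, interrupt handling could be worth a closer look before tuning Cassandra itself.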
Have you tried making a flame graph to see where Cassandra is spending its time? http://www.brendangregg.com/blog/2014-06-12/java-flame-graphs.html

Are you tracking GC pauses?

Jon

On Mon, May 22, 2017 at 2:03 PM Eric Pederson <eric...@gmail.com> wrote:

> Hi all:
>
> I'm new to Cassandra and I'm doing some performance testing. One of the
> things that I'm testing is ingestion throughput. My server setup is:
>
> - 3 node cluster
> - SSD data (both commit log and sstables are on the same disk)
> - 64 GB RAM per server
> - 48 cores per server
> - Cassandra 3.0.11
> - 48 GB heap using G1GC
> - 1 Gbps NICs
>
> Since I'm using SSD I've tried tuning the following (one at a time) but
> none seemed to make a lot of difference:
>
> - concurrent_writes=384
> - memtable_flush_writers=8
> - concurrent_compactors=8
>
> I am currently doing ingestion tests sending data from 3 clients on the
> same subnet. I am using cassandra-stress to do some ingestion testing.
> The tests are using CL=ONE and RF=2.
>
> Using cassandra-stress (3.10) I am able to saturate the disk using a large
> enough column size and the standard five column cassandra-stress schema.
> For example, -col size=fixed(400) will saturate the disk and compactions
> will start falling behind.
>
> One of our main tables has a row size that is approximately 200 bytes, across
> 64 columns. When ingesting this table I don't see any resource
> saturation. Disk utilization is around 10-15% per iostat. Incoming
> network traffic on the servers is around 100-300 Mbps. CPU utilization is
> around 20-70%. nodetool tpstats shows mostly zeros with occasional
> spikes around 500 in MutationStage.
>
> The stress run does 10,000,000 inserts per client, each with a separate
> range of partition IDs. The run with 200 byte rows takes about 4 minutes,
> with mean Latency 4.5ms, Total GC time of 21 secs, Avg GC time 173 ms.
>
> The overall performance is good - around 120k rows/sec ingested. But I'm
> curious to know where the bottleneck is. There's no resource saturation and
> nodetool tpstats shows only occasional brief queueing. Is the rest just
> expected latency inside of Cassandra?
>
> Thanks,
>
> -- Eric
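
As a rough sanity check on the numbers quoted above, here is a back-of-envelope sketch. It assumes the run lasts exactly 240 seconds, that the 200-byte row size is the only payload (no protocol or per-column overhead), and that replica writes spread evenly over the 3 nodes. The function name and its parameters are just for illustration.

```python
def ingest_estimate(clients=3, inserts_per_client=10_000_000, run_secs=240,
                    row_bytes=200, rf=2, nodes=3,
                    total_gc_secs=21, avg_gc_secs=0.173):
    """Back-of-envelope figures derived from the numbers in the quoted message."""
    rows_per_sec = clients * inserts_per_client / run_secs
    # Raw payload sent by all clients combined, in megabits per second.
    client_mbps = rows_per_sec * row_bytes * 8 / 1e6
    # Each row is written to `rf` replicas; assume an even spread over nodes.
    per_node_rows = rows_per_sec * rf / nodes
    per_node_mbps = per_node_rows * row_bytes * 8 / 1e6
    return {
        "rows_per_sec": rows_per_sec,
        "client_mbps": client_mbps,
        "per_node_rows_per_sec": per_node_rows,
        "per_node_mbps": per_node_mbps,
        "gc_fraction": total_gc_secs / run_secs,
        "gc_pauses": total_gc_secs / avg_gc_secs,
    }


for name, value in ingest_estimate().items():
    print(f"{name}: {value:,.2f}")
```

Under those assumptions the run works out to 125k rows/sec, in line with the reported figure of around 120k. The raw payload per node comes to well under the 1 Gbps NIC limit, and roughly 9% of the wall-clock time is accounted for by the reported GC total.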