Chris Burroughs created CASSANDRA-8035:
------------------------------------------

             Summary: 2.0.x repair causes large increase in client latency even 
for small datasets
                 Key: CASSANDRA-8035
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8035
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: c-2.0.10, 3 nodes in each of 2 DCs.  Load < 50 MB
            Reporter: Chris Burroughs


Running repair causes a significant increase in client latency even when the 
total amount of data per node is very small.

Each node handles 900 req/s, and during normal operations the 99p Client Request 
Latency is less than 4 ms and usually less than 1 ms.  During repair the latency 
increases to 4-10 ms on all nodes.  I am unable to find any resource-based 
explanation for this.  Several graphs are attached to summarize.

 * Client Request Latency goes up significantly.
 * Local keyspace read latency is flat.  I interpret this to mean that it's 
purely coordinator overhead that's causing the slowdown.
 * Row cache hit rate is unaffected (and is very high).  Between these two 
metrics I don't think there is any doubt that virtually all reads are being 
satisfied in memory.
 * There is plenty of available CPU.  Aggregate CPU used (mostly nic) did go up 
during this.

Having more/larger keyspaces seems to make it worse.  Having two keyspaces on 
this cluster (still with total size << RAM) caused larger increases in latency, 
which would have made for better graphs, but it pushed the cluster well outside 
of SLAs and we needed to move the second keyspace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)