Tom van der Woerdt created CASSANDRA-12940:
----------------------------------------------

             Summary: Large compaction backlogs should slow down repairs
                 Key: CASSANDRA-12940
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12940
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Tom van der Woerdt


Repairs cause a flood of small sstables. In some situations the small sstables 
arrive so fast that committing the compaction transaction takes longer than 
streaming in the sstables. This causes a buildup of sstables, and the buildup 
makes compaction go even slower (see CASSANDRA-12764).

For a cluster of mine this means nodes with a loadavg above 100 and tables 
that have 10k sstables. After the repair finishes the nodes go back to 
normal, but it takes a while and affects query latency a lot.

The compaction paths could probably be faster, though I'm more interested in 
making repairs wait for compaction. When we have an L0 with 10000+ sstables, the 
repair path should probably wait a minute.
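
A minimal sketch of the kind of gate proposed above, written in Python for brevity rather than in Cassandra's Java. Everything named here (`wait_for_compaction`, `get_l0_count`, `max_l0_sstables`, and the default values) is hypothetical and not an existing Cassandra API; it only illustrates the idea of holding the repair path until the L0 sstable count drops below a threshold or a deadline passes.

```python
import time
from typing import Callable


def wait_for_compaction(
    get_l0_count: Callable[[], int],        # hypothetical probe for the current L0 sstable count
    max_l0_sstables: int = 10_000,          # assumed threshold, taken from the "10000+" figure above
    poll_interval_s: float = 1.0,
    timeout_s: float = 60.0,                # "wait a minute"
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Block while the L0 backlog is above the threshold.

    Returns True once the backlog is at or below the threshold,
    or False if the timeout elapses first.
    """
    deadline = clock() + timeout_s
    while get_l0_count() > max_l0_sstables:
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval_s, remaining))
    return True
```

Returning a boolean rather than raising leaves the caller free to decide what a timeout means: proceed anyway, or fail the repair session.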

All I did was run 'nodetool repair':
{noformat}
                SSTable count: 11755
                SSTables in each level: [11709/4, 23/10, 50, 0, 0, 0, 0, 0, 0]
{noformat}
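
To watch for this condition, the per-level line above can be parsed directly. The sketch below assumes the format shown in this output, where an entry such as `11709/4` appears to mean that the level holds 11709 sstables against an expected maximum of 4; the function name `parse_sstable_levels` is hypothetical.

```python
import re
from typing import List, Optional, Tuple


def parse_sstable_levels(line: str) -> List[Tuple[int, Optional[int]]]:
    """Parse a 'SSTables in each level: [...]' line.

    Returns one (count, limit) pair per level; limit is None when the
    entry has no '/limit' suffix.
    """
    match = re.search(r"\[([^\]]*)\]", line)
    if match is None:
        raise ValueError(f"no level list found in: {line!r}")
    levels: List[Tuple[int, Optional[int]]] = []
    for entry in match.group(1).split(","):
        entry = entry.strip()
        if not entry:
            continue
        # "11709/4" -> count="11709", limit="4"; "50" -> count="50", limit=""
        count, _, limit = entry.partition("/")
        levels.append((int(count), int(limit) if limit else None))
    return levels
```

For the line shown above, the first element of the result is `(11709, 4)`, so the L0 backlog is the first count in the list.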

'nodetool compactionstats' shows 17 pending tasks (seems a bit low) and 
'nodetool netstats' shows 1861 lines of text over 138 stream sessions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
