Jeremy Hanna updated CASSANDRA-14654:
    Component/s: Compaction

> Reduce heap pressure during compactions
> ---------------------------------------
>                 Key: CASSANDRA-14654
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14654
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Major
> Small partition compactions are painfully slow, with a lot of overhead per
> partition. They also tend to create an excess of objects (i.e.
> 200-700 MB/s) per compaction thread.
> EncodingStats walks through all the partitions, and with mergeWith it
> creates a new instance per partition as it walks potentially millions of
> partitions. In a test scenario with about 600-byte partitions and a couple
> hundred MB of data, this accounted for ~16% of the heap pressure. Changing
> this to mutably track the min values and create a single instance in an
> EncodingStats.Collector brought this down considerably (but not 100%, since
> UnfilteredRowIterator.stats() still creates one per partition).
> KeyCacheKey makes a full copy of the underlying byte array in
> ByteBufferUtil.getArray in its constructor. This becomes the dominant source
> of heap pressure as the number of sstables grows. Changing it to keep the
> original instead completely eliminates the current dominator of compaction
> heap pressure and also improves read performance.
> Also included is a minor tweak for operators whose compactions are behind
> on low-read clusters: the preemptive opening setting becomes a hot property
> (changeable at runtime).
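To illustrate the EncodingStats change described in the quoted text, here is a minimal sketch of the difference between merging immutable stats objects per partition and tracking the minimums in a mutable collector. The Stats and Collector classes below are simplified, hypothetical stand-ins and not Cassandra's actual EncodingStats API; only the general pattern is taken from the description.

```java
public class StatsCollectorSketch {
    // Hypothetical immutable stats value holding two "min" fields.
    static final class Stats {
        final long minTimestamp;
        final int minTTL;

        Stats(long minTimestamp, int minTTL) {
            this.minTimestamp = minTimestamp;
            this.minTTL = minTTL;
        }

        // Allocating merge: every call returns a brand-new Stats object.
        Stats mergeWith(Stats other) {
            return new Stats(Math.min(minTimestamp, other.minTimestamp),
                             Math.min(minTTL, other.minTTL));
        }
    }

    // Hypothetical mutable accumulator: updates primitives in place and
    // allocates a single Stats only when get() is called.
    static final class Collector {
        private long minTimestamp = Long.MAX_VALUE;
        private int minTTL = Integer.MAX_VALUE;

        void update(long timestamp, int ttl) {
            minTimestamp = Math.min(minTimestamp, timestamp);
            minTTL = Math.min(minTTL, ttl);
        }

        Stats get() {
            return new Stats(minTimestamp, minTTL);
        }
    }

    // One merged object (plus one input object) allocated per partition.
    static Stats mergeAllocating(long[] timestamps, int[] ttls) {
        Stats acc = new Stats(Long.MAX_VALUE, Integer.MAX_VALUE);
        for (int i = 0; i < timestamps.length; i++) {
            acc = acc.mergeWith(new Stats(timestamps[i], ttls[i]));
        }
        return acc;
    }

    // No per-partition allocation; one Stats object at the end.
    static Stats mergeWithCollector(long[] timestamps, int[] ttls) {
        Collector collector = new Collector();
        for (int i = 0; i < timestamps.length; i++) {
            collector.update(timestamps[i], ttls[i]);
        }
        return collector.get();
    }

    public static void main(String[] args) {
        long[] timestamps = {30L, 5L, 12L};
        int[] ttls = {100, 60, 3600};
        Stats s = mergeWithCollector(timestamps, ttls);
        System.out.println(s.minTimestamp + " " + s.minTTL); // prints "5 60"
    }
}
```

Both methods return the same minimums; the collector variant simply avoids creating an intermediate object for each partition, which is where the allocation savings would come from when millions of partitions are walked.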

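The KeyCacheKey point in the quoted text can be sketched in a similar way. The helper names copyKey and retainKey are hypothetical and this is not the actual patch; it only contrasts copying a buffer's bytes for every key with reusing the backing array when that array is exactly the buffer's content, using standard java.nio.ByteBuffer calls.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class KeyArraySketch {
    // Copying approach: allocates a new byte[] for every key.
    static byte[] copyKey(ByteBuffer buffer) {
        byte[] copy = new byte[buffer.remaining()];
        // duplicate() so the caller's position is left untouched.
        buffer.duplicate().get(copy);
        return copy;
    }

    // Retaining approach (assumption: callers never mutate the array later).
    // Reuses the backing array only when it matches the buffer's content
    // exactly; otherwise falls back to a copy.
    static byte[] retainKey(ByteBuffer buffer) {
        if (buffer.hasArray()
                && buffer.arrayOffset() == 0
                && buffer.position() == 0
                && buffer.remaining() == buffer.array().length) {
            return buffer.array();
        }
        return copyKey(buffer);
    }

    public static void main(String[] args) {
        byte[] raw = {1, 2, 3};
        ByteBuffer whole = ByteBuffer.wrap(raw);
        System.out.println(retainKey(whole) == raw); // prints "true": no copy made
        System.out.println(copyKey(whole) == raw);   // prints "false": fresh array
        System.out.println(Arrays.equals(copyKey(whole), raw)); // prints "true"
    }
}
```

The trade-off is that a retained array is shared with whoever produced the buffer, so this is only safe when the bytes are treated as immutable after the key is built.
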
This message was sent by Atlassian JIRA

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org