Cameron Zemek created CASSANDRA-18773:
-----------------------------------------

             Summary: Compactions are slow
                 Key: CASSANDRA-18773
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Cameron Zemek
         Attachments: stress.yaml

I have noticed that compactions involving a large number of sstables (for 
example, major compactions) are very slow. I have attached a cassandra-stress 
profile that can generate such a dataset under ccm. In my local test I have 
2567 sstables of about 4 MB each.

I added code to track the wall clock time of various parts of the code. One 
problematic part is the ManyToOne constructor. Tracing through the code, a 
ManyToOne is created for every partition, over all of the sstable iterators. 
In my local test I get a measly 60 KB/s read speed, bottlenecked on a single 
CPU core (since this code is single threaded), with 85% of the wall clock time 
spent in the ManyToOne constructor.
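
To illustrate the kind of instrumentation described above, here is a minimal sketch of a per-section wall clock accumulator. SectionTimer and its methods are hypothetical names made up for this example; this is not the code that was actually added to Cassandra for the measurements in this report.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper (not part of Cassandra): accumulates wall clock time per named section.
final class SectionTimer
{
    private final Map<String, Long> nanosBySection = new LinkedHashMap<>();
    private final Map<String, Long> callsBySection = new LinkedHashMap<>();

    // Runs the body, adds its elapsed wall clock time to the named section, and returns its result.
    <T> T time(String section, Supplier<T> body)
    {
        long start = System.nanoTime();
        try
        {
            return body.get();
        }
        finally
        {
            long elapsed = System.nanoTime() - start;
            nanosBySection.merge(section, elapsed, Long::sum);
            callsBySection.merge(section, 1L, Long::sum);
        }
    }

    // Total nanoseconds recorded for the section, or 0 if it never ran.
    long totalNanos(String section)
    {
        return nanosBySection.getOrDefault(section, 0L);
    }

    // Number of times the section was timed.
    long calls(String section)
    {
        return callsBySection.getOrDefault(section, 0L);
    }

    // Fraction of the supplied wall clock total that was spent in the section.
    double share(String section, long wallClockNanos)
    {
        return wallClockNanos <= 0 ? 0.0 : (double) totalNanos(section) / wallClockNanos;
    }
}
```

Wrapping the suspect call in time("ManyToOne.<init>", ...) and comparing share(...) against the total run time gives a percentage of the kind quoted above. It is single threaded by design, which matches the code path being measured.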

As another data point showing that the merge iterator part of the code is the 
bottleneck: cfstats from [https://github.com/instaclustr/cassandra-sstable-tools/], 
which reads all the sstables but does no merging, gets a 26 MB/s read speed.

Tracing back from the ManyToOne call, I see this in 
UnfilteredPartitionIterators::merge:
{code:java}
                for (int i = 0; i < toMerge.size(); i++)
                {
                    if (toMerge.get(i) == null)
                    {
                        if (null == empty)
                            empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
                        toMerge.set(i, empty);
                    }
                }
{code}
I am not sure what the purpose of creating these empty row iterators is. On a 
whim I removed all of them before passing the list to ManyToOne; all of the 
wall clock time then shifted to CompactionIterator::hasNext() and read speed 
increased to 1.5 MB/s.
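
The idea behind that experiment can be sketched in isolation as follows. This is a generic k-way merge over plain java.util iterators, not Cassandra's ManyToOne or MergeIterator; MergeSketch, dropAbsent and merge are hypothetical names, and null stands in for "this sstable does not contain the partition". If downstream code depends on the position of each source in the list, dropping entries changes those positions, which this sketch does not handle.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only: filter absent sources before building a k-way merge.
final class MergeSketch
{
    // Keeps only the non-null sources; null marks a source that has nothing for this partition.
    static <T> List<Iterator<T>> dropAbsent(List<Iterator<T>> sources)
    {
        List<Iterator<T>> present = new ArrayList<>();
        for (Iterator<T> source : sources)
            if (source != null)
                present.add(source);
        return present;
    }

    // One heap entry: the current head element of a source, plus the source it came from.
    private static final class Head<T extends Comparable<T>> implements Comparable<Head<T>>
    {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source)
        {
            this.value = value;
            this.source = source;
        }

        public int compareTo(Head<T> other)
        {
            return value.compareTo(other.value);
        }
    }

    // Plain k-way merge of sorted sources. The set-up loop visits every source once,
    // so its cost grows with sources.size() even if most sources are empty.
    static <T extends Comparable<T>> List<T> merge(List<Iterator<T>> sources)
    {
        PriorityQueue<Head<T>> heap = new PriorityQueue<>();
        for (Iterator<T> source : sources)
            if (source.hasNext())
                heap.add(new Head<>(source.next(), source));

        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty())
        {
            Head<T> head = heap.poll();
            merged.add(head.value);
            if (head.source.hasNext())
                heap.add(new Head<>(head.source.next(), head.source));
        }
        return merged;
    }
}
```

Calling merge(dropAbsent(sources)) sizes the merge structure by the sources that actually hold the partition rather than by the total number of sstables, which is the effect the experiment above was after.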

So it seems there are further bottlenecks in this code path, but the first is 
this ManyToOne and having to build it for every partition read.


