[
https://issues.apache.org/jira/browse/CASSANDRA-12464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei Deng updated CASSANDRA-12464:
---------------------------------
Labels: lcs lhf performance (was: lcs performance)
> Investigate the potential improvement of parallelism on higher level
> compactions in LCS
> ---------------------------------------------------------------------------------------
>
> Key: CASSANDRA-12464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12464
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Wei Deng
> Labels: lcs, lhf, performance
>
> According to LevelDB's design doc
> [here|https://github.com/google/leveldb/blob/master/doc/impl.html#L115-L116],
> "A compaction merges the contents of the picked files to produce a sequence
> of level-(L+1) files". It will "switch to producing a new level-(L+1) file
> after the current output file has reached the target file size" (in our case
> 160MB), and it will also "switch to a new output file when the key range of the
> current output file has grown enough to overlap more than ten level-(L+2)
> files". This is to ensure "that a later compaction of a level-(L+1) file will
> not pick up too much data from level-(L+2)."
> Our current code in LeveledCompactionStrategy doesn't implement this last
> rule, but we might be able to implement it quickly and see how much of a
> compaction throughput improvement it delivers. For example, we could create a
> scenario where a number of large L0 SSTables are present (e.g. 200GB after
> switching from STCS), let LCS overflow L1 with thousands of SSTables,
> and see how fast it can digest this much data from L1 and promote it to
> higher levels until compaction completes.
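
The two output-switching rules quoted above can be illustrated with a small
simulation. This is only a sketch under stated assumptions, not Cassandra or
LevelDB code: the names split_outputs and count_overlaps, the integer keys,
and the representation of level-(L+2) files as sorted, non-overlapping
(first_key, last_key) pairs are all hypothetical and chosen for illustration.
The sketch follows the wording of the design doc and counts overlapping
files.

```python
from bisect import bisect_left, bisect_right


def count_overlaps(lo, hi, firsts, lasts):
    """Count level-(L+2) files whose key range intersects [lo, hi].

    firsts/lasts hold the first and last key of each level-(L+2) file
    (hypothetical representation); the files are assumed to be sorted
    and non-overlapping, so both lists are sorted.
    """
    start = bisect_left(lasts, lo)    # index of first file ending at or after lo
    end = bisect_right(firsts, hi)    # number of files starting at or before hi
    return max(0, end - start)


def split_outputs(entries, grandparents, target_size, max_overlap=10):
    """Split a sorted stream of (key, size) entries into output files.

    A new output file is started when either rule from the design doc fires:
      1. the current output has reached target_size, or
      2. adding the next key would make the output's key range overlap
         more than max_overlap level-(L+2) files.
    Returns a list of outputs, each a list of keys.
    """
    firsts = [first for first, _ in grandparents]
    lasts = [last for _, last in grandparents]
    outputs, current, size = [], [], 0
    for key, nbytes in entries:
        if current:
            too_big = size >= target_size
            too_wide = count_overlaps(current[0], key, firsts, lasts) > max_overlap
            if too_big or too_wide:
                outputs.append(current)
                current, size = [], 0
        current.append(key)
        size += nbytes
    if current:
        outputs.append(current)
    return outputs
```

With 100 level-(L+2) files that each cover ten consecutive keys, and a target
size large enough that the size rule never fires, the overlap rule alone cuts
a run of 1000 keys into outputs that each span at most ten level-(L+2) files.
A later compaction of any one of those outputs then has a bounded amount of
level-(L+2) data to read, which is the property the design doc describes.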