Hello. I'm in the process of migrating my old 60-node cluster to a new 72-node cluster running 2.2.6. I fired up BulkLoader on the old cluster to stream all data from every old node into the new cluster, and I'm now watching the new cluster work through its compactions. What I'd like is to understand the LeveledCompactionStrategy behaviour in more detail.
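For context, this is roughly what I ran on each old node, once per table directory (the hostnames, keyspace and table names below are just placeholders, not my real ones):

    sstableloader -d newnode1,newnode2,newnode3 /var/lib/cassandra/data/my_keyspace/my_table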
I'm taking one node as an example, but all the other nodes are in much the same situation. There are 53 live SSTables in a big table. This can be seen both by looking at the la-*Data.db files and with nodetool cfstats: "SSTables in each level: [31/4, 10, 12, 0, 0, 0, 0, 0, 0]".

If I look at the SSTable files on disk I see some huge SSTables, e.g. 37 GiB, 57 GiB and 74 GiB, all of which are on level 0 (I used sstablemetadata to check this; see the PS below for the exact loop). The total size of all live SSTables is about 920 GiB.

Then there are tmp-la-*Data.db and tmplink-la-*Data.db files (the tmplink files are hardlinks to the tmp files due to CASSANDRA-6916). I guess these come from the single active compaction. The total size of these files is around ~65 GiB.

On the compaction side, compactionstats shows that there's just one compaction running, which is heavily CPU bound (I've reformatted the output here):

    pending tasks: 5390
    bytes done: 673623792733 (673 GB)
    bytes left: 3325656896682 (3325 GB)
    Active compaction remaining time: 2h44m39s

Why are bytes done and especially bytes left so big? I don't have that much data on this node. Also, how does Cassandra calculate the pending tasks with LCS?

Why are there a few such big SSTables in the active SSTable list? Is it because LCS falls back to STCS if L0 is too full? Should I use the stcs_in_l0:false option? What will happen to these big SSTables in the future?

I'm currently just waiting for the compactions to eventually finish, but I'm hoping to learn in more detail what the system does, and possibly to help with similar migrations in the future.

Thanks,
- Garo
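PS: For reference, this is roughly the loop I used to check which level each SSTable is on (the data path is a placeholder, and I'm assuming the 2.2 sstablemetadata output always contains an "SSTable Level" line):

    for f in /var/lib/cassandra/data/my_keyspace/my_table-*/la-*-Data.db; do
        echo "$f: $(sstablemetadata "$f" | grep 'SSTable Level')"
    done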