Hi, I have a five-node C* cluster suffering from a large number of pending compaction tasks: 1) 571; 2) 91; 3) 367; 4) 22; 5) 232.
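(For context: nodetool compactionstats reports this number as "pending tasks", so a quick way to collect it from every node is something like the loop below, where the hostnames are placeholders for my actual nodes.)

for h in node1 node2 node3 node4 node5; do    # placeholder hostnames
    echo -n "$h: "
    ssh "$h" nodetool compactionstats | grep 'pending tasks'
done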
Initially the cluster held one big table (table_a). With Spark I read that table, extended its data, and stored the result in a second table, table_b. After this copy/extend job the number of pending compaction tasks in the cluster grew sharply. From nodetool cfstats (see the output at the bottom): table_a has 20 SSTables, while table_b has 18219. As I understand it, table_b has such a large SSTable count because the data was written to it within a short time, and eventually these SSTables will be compacted. But now I have to read all of the data from table_b and send it to Elasticsearch, and when Spark reads this table some Cassandra nodes die with OOM. I expect that once compaction is complete, the Spark read job will work fine.

The question is: how can I speed up the compaction process? (The levers I assume are relevant are listed after the cfstats output below.) If I add another two nodes to the cluster, will compaction finish faster? Or will the data just be copied to the new nodes while compaction continues over the original set of SSTables?

Nodetool cfstats output:

        Table: table_a
        SSTable count: 20
        Space used (live): 1064889308052
        Space used (total): 1064889308052
        Space used by snapshots (total): 0
        Off heap memory used (total): 1118106937
        SSTable Compression Ratio: 0.12564594959566894
        Number of keys (estimate): 56238959
        Memtable cell count: 76824
        Memtable data size: 115531402
        Memtable off heap memory used: 0
        Memtable switch count: 17
        Local read count: 0
        Local read latency: NaN ms
        Local write count: 77308
        Local write latency: 0.045 ms
        Pending flushes: 0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 120230328
        Bloom filter off heap memory used: 120230168
        Index summary off heap memory used: 2837249
        Compression metadata off heap memory used: 995039520
        Compacted partition minimum bytes: 1110
        Compacted partition maximum bytes: 52066354
        Compacted partition mean bytes: 133152
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0

nodetool cfstats table_b
Keyspace: dump_es
    Read Count: 0
    Read Latency: NaN ms.
    Write Count: 0
    Write Latency: NaN ms.
    Pending Flushes: 0
        Table: table_b
        SSTable count: 18219
        Space used (live): 1316641151665
        Space used (total): 1316641151665
        Space used by snapshots (total): 0
        Off heap memory used (total): 3863604976
        SSTable Compression Ratio: 0.20387645535477916
        Number of keys (estimate): 712032622
        Memtable cell count: 0
        Memtable data size: 0
        Memtable off heap memory used: 0
        Memtable switch count: 0
        Local read count: 0
        Local read latency: NaN ms
        Local write count: 0
        Local write latency: NaN ms
        Pending flushes: 0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 2382971488
        Bloom filter off heap memory used: 2742320056
        Index summary off heap memory used: 371500752
        Compression metadata off heap memory used: 749784168
        Compacted partition minimum bytes: 771
        Compacted partition maximum bytes: 1629722
        Compacted partition mean bytes: 3555
        Average live cells per slice (last five minutes): 132.375
        Maximum live cells per slice (last five minutes): 149
        Average tombstones per slice (last five minutes): 1.0
        Maximum tombstones per slice (last five minutes): 1

------------------

I logged the CQL requests coming from Spark and checked how one such request performs: it fetches 8075 rows (about 59 MB) in 155 s, i.e. roughly 0.4 MB/s (see the check below).

$ date; echo 'SELECT "scan_id", "snapshot_id", "scan_doc", "snapshot_doc" FROM "dump_es"."table_b" WHERE token("scan_id") > 946122293981930504 AND token("scan_id") <= 946132293981930504 ALLOW FILTERING;' | cqlsh --request-timeout=3600 | wc ; date
Fri Apr 27 13:32:55 UTC 2018
   8076   61191 59009831
Fri Apr 27 13:35:30 UTC 2018
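In case answers involve tuning: I assume the relevant levers are the compaction throughput throttle and concurrent_compactors in cassandra.yaml, but I am not sure they are enough. Example commands only (the values are illustrative, not my current settings):

$ nodetool getcompactionthroughput      # show the current throttle in MB/s
$ nodetool setcompactionthroughput 0    # example: 0 removes the throttle entirely

concurrent_compactors in cassandra.yaml controls how many compactions run in parallel; as far as I know, changing it requires a node restart.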