[
https://issues.apache.org/jira/browse/CASSANDRA-8641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484693#comment-14484693
]
Anuj commented on CASSANDRA-8641:
---------------------------------
We faced the same issue in 2.0.3 when we ran repair under write load. We had a
substantial amount of data to repair. We are planning to upgrade to 2.0.13
soon. Is the bug fixed in 2.0.13?
> Repair causes a large number of tiny SSTables
> ---------------------------------------------
>
> Key: CASSANDRA-8641
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8641
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Ubuntu 14.04
> Reporter: Flavien Charlon
> Fix For: 2.1.3
>
>
> I have a 3-node cluster with RF = 3, quad core and 32 GB of RAM. I am
> running 2.1.2 with all the default settings. I'm seeing some strange
> behaviors during incremental repair (under write load).
> Taking the example of one particular column family, before running an
> incremental repair, I have about 13 SSTables. After finishing the incremental
> repair, I have over 114000 SSTables.
> {noformat}
> Table: customers
> SSTable count: 114688
> Space used (live): 97203707290
> Space used (total): 99175455072
> Space used by snapshots (total): 0
> SSTable Compression Ratio: 0.28281112416526505
> Memtable cell count: 0
> Memtable data size: 0
> Memtable switch count: 1069
> Local read count: 0
> Local read latency: NaN ms
> Local write count: 11548705
> Local write latency: 0.030 ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.00000
> Bloom filter space used: 144145152
> Compacted partition minimum bytes: 311
> Compacted partition maximum bytes: 1996099046
> Compacted partition mean bytes: 3419
> Average live cells per slice (last five minutes): 0.0
> Maximum live cells per slice (last five minutes): 0.0
> Average tombstones per slice (last five minutes): 0.0
> Maximum tombstones per slice (last five minutes): 0.0
> {noformat}
> Looking at the logs during the repair, it seems Cassandra is struggling to
> compact minuscule SSTables (often just a few kilobytes):
> {noformat}
> INFO [CompactionExecutor:337] 2015-01-17 01:44:27,011
> CompactionTask.java:251 - Compacted 32 sstables to
> [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-228341,].
> 8,332 bytes to 6,547 (~78% of original) in 80,476ms = 0.000078MB/s. 32
> total partitions merged to 32. Partition merge counts were {1:32, }
> INFO [CompactionExecutor:337] 2015-01-17 01:45:35,519
> CompactionTask.java:251 - Compacted 32 sstables to
> [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-229348,].
> 8,384 bytes to 6,563 (~78% of original) in 6,880ms = 0.000910MB/s. 32
> total partitions merged to 32. Partition merge counts were {1:32, }
> INFO [CompactionExecutor:339] 2015-01-17 01:47:46,475
> CompactionTask.java:251 - Compacted 32 sstables to
> [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-229351,].
> 8,423 bytes to 6,401 (~75% of original) in 10,416ms = 0.000586MB/s. 32
> total partitions merged to 32. Partition merge counts were {1:32, }
> {noformat}
>
> Here is an excerpt of the system logs showing the abnormal flushing:
> {noformat}
> INFO [AntiEntropyStage:1] 2015-01-17 15:28:43,807 ColumnFamilyStore.java:840
> - Enqueuing flush of customers: 634484 (0%) on-heap, 2599489 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:06,823 ColumnFamilyStore.java:840
> - Enqueuing flush of levels: 129504 (0%) on-heap, 222168 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:07,940 ColumnFamilyStore.java:840
> - Enqueuing flush of chain: 4508 (0%) on-heap, 6880 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:08,124 ColumnFamilyStore.java:840
> - Enqueuing flush of invoices: 1469772 (0%) on-heap, 2542675 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:09,471 ColumnFamilyStore.java:840
> - Enqueuing flush of customers: 809844 (0%) on-heap, 3364728 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:24,368 ColumnFamilyStore.java:840
> - Enqueuing flush of levels: 28212 (0%) on-heap, 44220 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:24,822 ColumnFamilyStore.java:840
> - Enqueuing flush of chain: 860 (0%) on-heap, 1130 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:24,985 ColumnFamilyStore.java:840
> - Enqueuing flush of invoices: 334480 (0%) on-heap, 568959 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:27,375 ColumnFamilyStore.java:840
> - Enqueuing flush of customers: 221568 (0%) on-heap, 929962 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:35,755 ColumnFamilyStore.java:840
> - Enqueuing flush of invoices: 7916 (0%) on-heap, 11080 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:36,239 ColumnFamilyStore.java:840
> - Enqueuing flush of customers: 9968 (0%) on-heap, 33041 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:37,935 ColumnFamilyStore.java:840
> - Enqueuing flush of invoices: 42108 (0%) on-heap, 69494 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:41,182 ColumnFamilyStore.java:840
> - Enqueuing flush of customers: 40936 (0%) on-heap, 159099 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:49,573 ColumnFamilyStore.java:840
> - Enqueuing flush of levels: 17236 (0%) on-heap, 27048 (0%) off-heap
> INFO [AntiEntropyStage:1] 2015-01-17 15:29:50,440 ColumnFamilyStore.java:840
> - Enqueuing flush of chain: 548 (0%) on-heap, 630 (0%) off-heap
> {noformat}
> At the end of the repair, the cluster has become unusable.
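
To put numbers on the flushing pattern in the excerpt above, here is a minimal
sketch that tallies the "Enqueuing flush" entries per table. It assumes the
log format is exactly as shown in the 2.1.2 excerpt; the names FLUSH_RE,
summarize_flushes and SAMPLE are hypothetical and not part of Cassandra. The
sample input is copied from the three "chain" entries in the excerpt.

```python
import re
from collections import defaultdict

# Assumption: flush entries look like the ones quoted in this report, e.g.
# "Enqueuing flush of customers: 634484 (0%) on-heap, 2599489 (0%) off-heap".
# Other Cassandra versions may log this differently.
FLUSH_RE = re.compile(
    r"Enqueuing flush of (\w+): (\d+) \(\d+%\) on-heap, (\d+) \(\d+%\) off-heap"
)


def summarize_flushes(log_text):
    """Return {table: (flush_count, total_on_heap_bytes, total_off_heap_bytes)}."""
    totals = defaultdict(lambda: [0, 0, 0])
    for table, on_heap, off_heap in FLUSH_RE.findall(log_text):
        entry = totals[table]
        entry[0] += 1              # number of flushes enqueued for this table
        entry[1] += int(on_heap)   # on-heap bytes reported in the log entry
        entry[2] += int(off_heap)  # off-heap bytes reported in the log entry
    return {table: tuple(values) for table, values in totals.items()}


# Three entries copied from the system log excerpt in this report.
SAMPLE = """\
INFO [AntiEntropyStage:1] 2015-01-17 15:29:07,940 ColumnFamilyStore.java:840
- Enqueuing flush of chain: 4508 (0%) on-heap, 6880 (0%) off-heap
INFO [AntiEntropyStage:1] 2015-01-17 15:29:24,822 ColumnFamilyStore.java:840
- Enqueuing flush of chain: 860 (0%) on-heap, 1130 (0%) off-heap
INFO [AntiEntropyStage:1] 2015-01-17 15:29:50,440 ColumnFamilyStore.java:840
- Enqueuing flush of chain: 548 (0%) on-heap, 630 (0%) off-heap
"""

if __name__ == "__main__":
    for table, (count, on_heap, off_heap) in summarize_flushes(SAMPLE).items():
        print(f"{table}: {count} flushes, {on_heap} on-heap bytes, "
              f"{off_heap} off-heap bytes")
```

Run against a full system.log, a summary like this would show how many
flushes the repair triggered per table and how small each one was. The regex
only matches the text after the "- " prefix, so it works whether or not the
log entry is wrapped across two lines as it is in this message.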