Thom Bartold created CASSANDRA-12713:
----------------------------------------

             Summary: Not enough space for compaction LCS
                 Key: CASSANDRA-12713
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12713
             Project: Cassandra
          Issue Type: Bug
          Components: Compaction
         Environment: Cassandra 2.2.5 running on Amazon Linux, 12-node cluster 
on i2.xlarge instances
            Reporter: Thom Bartold


Disk space usage is increasing on the nodes even though the amount of data is 
not increasing. No rows are ever deleted, but there are many rows being updated 
daily. The increase is about 0.5% of the disk space per day. There are also 
rows being added to the table, but not more than 0.005% per day. It appears 
that when rows are updated, the duplicates are not being merged.

Our current worst case node is using 73% of disk space (218GB free space). All 
sstables are limited to 160MB, so we would expect LCS compaction to require at 
most 1.6GB.

Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/xvdb      781029612 562807548 218222064  73% /data
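
The 1.6GB expectation can be reproduced with a quick calculation. This is only a sketch: the factor of 10 sstables per compaction is an assumption chosen to match the figure quoted above, not a value taken from the Cassandra source, and units are treated loosely (1000-based), as in the text.

```python
# Rough check of the expected LCS compaction headroom against free disk space.
# ASSUMPTION: one LCS compaction rewrites about 10 sstables of 160MB each;
# the factor of 10 is illustrative only.
SSTABLE_SIZE_MB = 160
SSTABLES_PER_COMPACTION = 10  # assumed

# "Available" column from the df output above, in 1K blocks.
AVAILABLE_KB = 218222064

expected_gb = SSTABLE_SIZE_MB * SSTABLES_PER_COMPACTION / 1000
available_gb = AVAILABLE_KB / 1_000_000

print(f"expected compaction headroom: {expected_gb:.1f} GB")
print(f"free space on /data:          {available_gb:.1f} GB")
print(f"free space / expected need:   {available_gb / expected_gb:.0f}x")
```

Under that assumption, the free space on the node exceeds the expected requirement by about two orders of magnitude.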

We guessed that compactions were not actually running, but could not find any 
indication of why they would not be, so we tried forcing a major compaction with 
'nodetool compact overlordpreprod document'. This produced the following error 
messages in the log:

INFO  [RMI TCP Connection(4243)-10.241.55.214] 2016-09-26 16:14:38,796 CompactionManager.java:610 - Cannot perform a full major compaction as repaired and unrepaired sstables cannot be compacted together. These two set of sstables will be compacted separately.
ERROR [CompactionExecutor:32811] 2016-09-26 16:14:38,804 CassandraDaemon.java:185 - Exception in thread Thread[CompactionExecutor:32811,1,main]
java.lang.RuntimeException: Not enough space for compaction, estimated sstables = 2734, expected write size = 458700024959
        at org.apache.cassandra.db.compaction.CompactionTask.checkAvailableDiskSpace(CompactionTask.java:275) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:118) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:74) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.db.compaction.CompactionManager$8.runMayThrow(CompactionManager.java:599) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.2.6.jar:2.2.6]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_74]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_74]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_74]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_74]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]
INFO  [CompactionExecutor:32804] 2016-09-26 16:14:38,810 CompactionManager.java:1502 - Compaction interrupted: Compaction@cf0c43b0-28e8-11e6-8f2b-1b695a8d5b2c(overlordpreprod, document, 2785792378/4821540826)bytes

Clearly 458,700,024,959 bytes is more than the disk space we have free at this 
point, but it's also close to the size of the entire table.
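
To make the mismatch concrete, the figures from the exception message can be compared with the free space reported by df. This is a sketch under one stated assumption: that the 160MB sstable limit means 160 MiB. It is not meant to describe how Cassandra computes these values internally.

```python
# Figures copied from the exception message and the df output above.
EXPECTED_WRITE_BYTES = 458700024959   # "expected write size"
ESTIMATED_SSTABLES = 2734             # "estimated sstables"
AVAILABLE_KB = 218222064              # df "Available", in 1K blocks

# ASSUMPTION: the 160MB sstable limit is 160 MiB (160 * 1024 * 1024 bytes).
SSTABLE_LIMIT_BYTES = 160 * 1024 * 1024

available_bytes = AVAILABLE_KB * 1024
shortfall_bytes = EXPECTED_WRITE_BYTES - available_bytes

# How many full 160 MiB sstables the expected write size corresponds to.
sstables_from_size = EXPECTED_WRITE_BYTES // SSTABLE_LIMIT_BYTES

print(f"free space:          {available_bytes:,} bytes")
print(f"expected write size: {EXPECTED_WRITE_BYTES:,} bytes")
print(f"shortfall:           {shortfall_bytes:,} bytes")
print(f"sstables implied by write size: {sstables_from_size} "
      f"(log reports {ESTIMATED_SSTABLES})")
```

The requested write is roughly twice the free space. Dividing it by the assumed 160 MiB limit also gives the same sstable count as the log, which suggests the estimate covers nearly the whole table, not a single level-to-level compaction.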

$ nodetool cfstats overlordpreprod.document
Keyspace: overlordpreprod
        Read Count: 32275
        Read Latency: 12.637312718822619 ms.
        Write Count: 6479
        Write Latency: 0.07886602870813397 ms.
        Pending Flushes: 0
                Table: document
                SSTable count: 3592
                SSTables in each level: [2, 21/10, 215/100, 1525/1000, 1826, 0, 0, 0, 0]
                Space used (live): 548039713018
                Space used (total): 548039713018
                Space used by snapshots (total): 0
                Off heap memory used (total): 435358352
                SSTable Compression Ratio: 0.22833978714133665
                Number of keys (estimate): 121419286
                Memtable cell count: 16464
                Memtable data size: 315692802
                Memtable off heap memory used: 0
                Memtable switch count: 1
                Local read count: 32291
                Local read latency: 13.867 ms
                Local write count: 6480
                Local write latency: 0.085 ms
                Pending flushes: 0
                Bloom filter false positives: 14
                Bloom filter false ratio: 0.00020
                Bloom filter space used: 119263408
                Bloom filter off heap memory used: 119234672
                Index summary off heap memory used: 24386640
                Compression metadata off heap memory used: 291737040
                Compacted partition minimum bytes: 87
                Compacted partition maximum bytes: 12108970
                Compacted partition mean bytes: 16807
                Average live cells per slice (last five minutes): 1.0
                Maximum live cells per slice (last five minutes): 1
                Average tombstones per slice (last five minutes): 1.0
                Maximum tombstones per slice (last five minutes): 1
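
The cfstats figures above can be summarized with a small script. It is a sketch only: the helper name parse_levels is hypothetical, and it assumes that an entry written as N/M means a level holding N sstables against a target of M, which is our understanding of the nodetool output format.

```python
# Values copied from the cfstats output and the exception message above.
LEVELS_LINE = "[2, 21/10, 215/100, 1525/1000, 1826, 0, 0, 0, 0]"
SPACE_USED_LIVE = 548039713018        # "Space used (live)", bytes
EXPECTED_WRITE_BYTES = 458700024959   # from the RuntimeException


def parse_levels(text):
    """Return a (count, target) pair per level; target is None if not shown.

    ASSUMPTION: "N/M" means the level holds N sstables and its target is M.
    """
    levels = []
    for item in text.strip("[] \n").split(","):
        item = item.strip()
        if "/" in item:
            count, target = item.split("/")
            levels.append((int(count), int(target)))
        else:
            levels.append((int(item), None))
    return levels


levels = parse_levels(LEVELS_LINE)

# Levels whose sstable count exceeds the displayed target.
over_target = [
    level for level, (count, target) in enumerate(levels)
    if target is not None and count > target
]
total_sstables = sum(count for count, _ in levels)
write_to_live_ratio = EXPECTED_WRITE_BYTES / SPACE_USED_LIVE

print("levels over target:", over_target)
print("sstables across levels:", total_sstables)
print(f"expected write size / live space: {write_to_live_ratio:.2f}")
```

With these inputs, levels 1 through 3 are reported as over target, and the expected write size of the forced compaction is a little over 80% of the live space used by the table.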

We selected LCS because it was supposed to limit the amount of free disk space 
needed. It appears here that LCS requires a much larger amount of free space 
than expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
