Hi Patrick, and thanks for your reply.

We are monitoring disk usage (among other things) and we do not seem to be running out of space at the moment. We have separate partitions/disks for the commitlog and the data directories. Which one do you suspect, and why?

Regards,
Vasilis
On 25 Aug 2016 4:01 pm, "Patrick McFadin" <pmcfa...@gmail.com> wrote:

This looks like you've run out of disk. What are your hardware specs?

Patrick

On Thursday, August 25, 2016, Benedict Elliott Smith <bened...@apache.org> wrote:

> You should upgrade from 2.0 to avoid this behaviour; that is the simple answer.
> You are correct that when the commit log gets full the memtables are
> flushed to make room. 2.0 has several interrelated problems here, though:
>
> There is a maximum flush queue length property (I cannot recall its name),
> and once there are this many memtables flushing, no more writes can take
> place on the box whatsoever. You cannot simply increase this length,
> though, because that shrinks the maximum size of any single memtable (it
> is, iirc, total_memtable_space / (1 + flush_writers + max_queue_length)),
> which worsens write amplification from compaction.
>
> This is because the memory management for memtables in 2.0 was really
> terrible, and this queue length was used to try to ensure the space
> allocated was not exceeded.
>
> Compounding this, when clearing the commit log 2.0 will flush all
> memtables with data in them regardless of whether it is useful to do so,
> meaning that having more tables (that are actively written to) than your
> max queue length will necessarily cause stalls every time you run out of
> commit log space.
>
> In 2.1, none of these concerns apply.
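As a rough illustration of the sizing arithmetic described above (a sketch only: the property Benedict could not recall is, I believe, memtable_flush_queue_size, and the other names used here are the corresponding 2.0 cassandra.yaml settings memtable_total_space_in_mb and memtable_flush_writers; the figures assume the 8G heap described further down):

def max_memtable_size_mb(total_memtable_space_mb, flush_writers, max_queue_length):
    # Formula as quoted ("iirc") above: total / (1 + flush_writers + max_queue_length)
    return total_memtable_space_mb / (1 + flush_writers + max_queue_length)

# 8G heap => roughly 2048 MB of total memtable space (heap/4), with the
# assumed defaults memtable_flush_writers=1 and memtable_flush_queue_size=4:
print(max_memtable_size_mb(2048, 1, 4))   # ~341 MB per memtable
# Raising the queue length to 8 shrinks every individual memtable further:
print(max_memtable_size_mb(2048, 1, 8))   # ~205 MB per memtable

This is the trade-off described above: a longer queue postpones the write stall, but makes each memtable smaller and compaction work harder.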
> On 24 August 2016 at 23:40, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:
>
>> Hello,
>>
>> We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We
>> run C* 2.0.17 on Ubuntu 12.04 at the moment.
>>
>> Our C# application often throws errors, which correlate with dropped
>> messages (counter mutations, usually) in our logs. We think that if a
>> specific mutation stays in the queue for more than 5 seconds, Cassandra
>> drops it. This is also suggested by these lines in system.log:
>>
>> ERROR [ScheduledTasks:1] 2016-08-23 13:29:51,454 MessagingService.java (line 912) MUTATION messages were dropped in last 5000 ms: 317 for internal timeout and 0 for cross node timeout
>> ERROR [ScheduledTasks:1] 2016-08-23 13:29:51,454 MessagingService.java (line 912) COUNTER_MUTATION messages were dropped in last 5000 ms: 6 for internal timeout and 0 for cross node timeout
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 55) Pool Name                    Active   Pending      Completed   Blocked  All Time Blocked
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 70) ReadStage                         0         0      245177190         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,455 StatusLogger.java (line 70) RequestResponseStage              0         0     3530334509         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) ReadRepairStage                   0         0        1549567         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) MutationStage                    48      1380     2540965500         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,456 StatusLogger.java (line 70) ReplicateOnWriteStage             0         0      189615571         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) GossipStage                       0         0       20586077         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) CacheCleanupExecutor              0         0              0         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) MigrationStage                    0         0            106         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,457 StatusLogger.java (line 70) MemoryMeter                       0         0         303029         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) ValidationExecutor                0         0              0         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) FlushWriter                       1         5         322604         1              8227
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,458 StatusLogger.java (line 70) InternalResponseStage             0         0             35         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) AntiEntropyStage                  0         0              0         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) MemtablePostFlusher               1         5         424104         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,459 StatusLogger.java (line 70) MiscStage                         0         0              0         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,460 StatusLogger.java (line 70) PendingRangeCalculator            0         0             37         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,460 StatusLogger.java (line 70) commitlog_archiver                0         0              0         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,461 StatusLogger.java (line 70) CompactionExecutor                4         4        5144499         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,461 StatusLogger.java (line 70) HintedHandoff                     0         0           3194         0                 0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,461 StatusLogger.java (line 79) CompactionManager                 1         4
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,461 StatusLogger.java (line 81) Commitlog                       n/a         0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,461 StatusLogger.java (line 93) MessagingService                n/a       0/0
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,462 StatusLogger.java (line 103) Cache Type                     Size                 Capacity               KeysToSave
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,462 StatusLogger.java (line 105) KeyCache                  104828280                104857600                      all
>> INFO [ScheduledTasks:1] 2016-08-23 13:29:51,462 StatusLogger.java (line 111) RowCache                          0                        0                      all
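A minimal sketch, in Python, for totalling up the dropped-message counts from lines like the two ERROR lines above; the path /var/log/cassandra/system.log is an assumption (the default packaged location), and the pattern only targets the 2.0-style wording shown here:

import re

# Sum dropped-message counts per verb (MUTATION, COUNTER_MUTATION, ...)
# from "messages were dropped in last ... ms" lines in system.log.
LOG = "/var/log/cassandra/system.log"   # assumed default location
pattern = re.compile(
    r"(\w+) messages were dropped in last \d+ ms: (\d+) for internal "
    r"timeout and (\d+) for cross node timeout")

totals = {}
with open(LOG) as f:
    for line in f:
        m = pattern.search(line)
        if m:
            verb, internal, cross = m.group(1), int(m.group(2)), int(m.group(3))
            totals[verb] = totals.get(verb, 0) + internal + cross

for verb, count in sorted(totals.items()):
    print(verb, count)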
>> So far we have noticed that when our application throws these errors,
>> Cassandra will have FlushWriter Active=1 and Pending=4+ tasks (as shown
>> above). Just before the tpstats-like output that appears in system.log we
>> seem to have consecutive flushes which take anywhere between 7 and 20
>> seconds, as shown below:
>>
>> INFO [FlushWriter:26567] 2016-08-22 21:15:30,550 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/LIVEStats/statsperdealer/LIVEStats-statsperdealer-jb-50127-Data.db (640031 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=225)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:30,678 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/LIVEStats/statsperdealerpervrm/LIVEStats-statsperdealerpervrm-jb-33131-Data.db (893342 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=225)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:30,706 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/LIVEStats/statsperserver/LIVEStats-statsperserver-jb-48158-Data.db (4862 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=225)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:33,881 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/Organami/sessionstate/Organami-sessionstate-jb-55718-Data.db (76781692 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=225)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:35,314 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/LIVEStock/stock/LIVEStock-stock-jb-36747-Data.db (9302046 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=8363)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:35,341 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-jb-1018208-Data.db (161 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=226441)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:38,719 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/DataMining/quotebyquotereference/DataMining-quotebyquotereference-jb-579845-Data.db (18754778 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=325555)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:38,770 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-jb-1018209-Data.db (167 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=429568)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:38,793 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-jb-1018210-Data.db (161 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=430537)
>> INFO [FlushWriter:26567] 2016-08-22 21:15:38,813 Memtable.java (line 402) Completed flushing /var/lib/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-jb-1018211-Data.db (158 bytes) for commitlog position ReplayPosition(segmentId=1465909831301, position=433145)
>>
>> Even though no individual memtable seems to be taking a very long time,
>> the commitlog segment id for the flushes shown above is the same.
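One way to check how many flushes reference each commit log segment is to pull the segmentId out of the "Completed flushing" lines above; a rough sketch (again assuming the default system.log location):

import re
from collections import Counter

# Count "Completed flushing" lines per commit log segmentId, using the
# ReplayPosition(segmentId=..., position=...) text shown above.
LOG = "/var/log/cassandra/system.log"   # assumed default location
seg = re.compile(r"Completed flushing .* ReplayPosition\(segmentId=(\d+), position=\d+\)")

counts = Counter()
with open(LOG) as f:
    for line in f:
        m = seg.search(line)
        if m:
            counts[m.group(1)] += 1

# If one segmentId dominates a burst of flushes, those flushes were most
# likely triggered together when that commit log segment was recycled.
for segment_id, n in counts.most_common(5):
    print(segment_id, n)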
>> We think that this happens because the commitlog reached its limit and is
>> rotating this particular segment, which triggers flush activity for all
>> memtables referenced in this segment.
>>
>> Our questions:
>>
>> 1. Will writes be blocked for the tables listed in the above FlushWriter
>> output while this commitlog segment is being rotated? If so, will the
>> blocking last for the entire duration or just for the time taken for the
>> respective memtable to be flushed to disk?
>>
>> 2. We have also noticed that flushes can be triggered by MemoryMeter
>> (probably when the occupancy fraction reaches the specified limit, at
>> which point the largest memtable is flushed to disk). Would it be wise to
>> increase the commitlog total size, so that MemoryMeter is the main reason
>> for flush activity? When flushes are triggered due to "memory pressure" we
>> never seem to see these errors in our application.
>>
>> 3. Is there a recommended value for the ratio (total size of Commitlog
>> segments)/(total size of Memtables)? We use the defaults for both values,
>> which makes the total size for Commitlog segments 1G (32 files of 32M
>> each) and the total size used for Memtables 2G, since we have allocated 8G
>> in total for the heap. (A quick sketch of this arithmetic follows at the
>> end of this message.)
>>
>> 4. The total space for all memtables seems to be Heap/4; this is 2G for
>> us. Is there a maximum size for an individual memtable?
>>
>> 5. What is the maximum queue size for MemtablePostFlusher, and what
>> exactly happens after a flush (which I presume relates to
>> MemtablePostFlusher)? Is it concerning that we also have five pending
>> tasks there?
>>
>> Many thanks,
>>
>> Vasilis
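A quick sketch of the sizing arithmetic referred to in questions 3 and 4, using only the figures described above (the setting names in the comments are the cassandra.yaml properties these defaults appear to correspond to; treat them as illustrative, not authoritative):

# Figures as described above: 8G heap, 32 commit log segments of 32 MB each
# (commitlog_segment_size_in_mb), memtables capped at roughly heap/4
# (memtable_total_space_in_mb).
heap_mb = 8 * 1024
commitlog_total_mb = 32 * 32        # 1024 MB, i.e. the ~1G of commit log
memtable_total_mb = heap_mb / 4     # 2048 MB, i.e. the ~2G of memtables

print("commit log total:", commitlog_total_mb, "MB")
print("memtable total:  ", memtable_total_mb, "MB")
print("commitlog/memtable ratio:", commitlog_total_mb / memtable_total_mb)  # 0.5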