[ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069070#comment-13069070 ]

Sylvain Lebresne commented on CASSANDRA-2829:
---------------------------------------------

I think this kind of works, in that we won't keep commit logs forever, but it
still keeps commit logs for much longer than necessary because:
# it relies on forceFlush being called, which, unless client triggered, will
only happen after the memtable expires, and quite a bunch of commit log could
pile up during that time. Quite potentially enough to be a problem (if the
commit logs fill up your hard drive, it doesn't matter much that "they would
have been deleted in 5 hours"). I think we can do much better with not too
much effort.
# when we do flush the expired memtable, we'll call maybeSwitchMemtable() on
potentially clean memtables. This doesn't sound like a good use of resources:
we'll grab the write lock, create a latch, create a new memtable, increment
the memtable switch count, and push an almost no-op job on the flush executor.

I think we should fix the real problem. The problem is that when we discard
segments, we always keep the current segment dirty because we don't know
whether there have been any writes since we grabbed the context. Let's add
that information and fix that. This would get commit logs deleted much
quicker, even if we don't consider the corner case of a column family that
suddenly receives no writes anymore, because CFs like the system ones, which
have a very low update volume, can retain the logs longer than is really
needed.

As for the fix, because the CL executor is single-threaded, this is fairly
easy: let's have an in-memory map of cfId -> lastPositionWritten, and compare
that to the context position in discardCompletedSegmentsInternal (we could
probably even just use a set of cfIds, which would mean: dirty since the last
getContext).
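
To illustrate that bookkeeping, here's a rough self-contained sketch
(hypothetical names, and a plain long counter standing in for the real replay
position; not the actual CommitLog code):

{noformat}
import java.util.HashMap;
import java.util.Map;

// Simplified model: the single-threaded CL executor records, per cfId, the
// position of the last write it appended. discardCompletedSegments() can then
// tell whether anything was written for that CF after the flush context, and
// mark even the current segment clean when nothing was.
final class CommitLogBookkeeping
{
    private final Map<Integer, Long> lastWrittenPosition = new HashMap<Integer, Long>();
    private long currentPosition = 0;

    // called for every mutation appended to the log
    void add(int cfId)
    {
        currentPosition++;
        lastWrittenPosition.put(cfId, currentPosition);
    }

    // snapshot taken when a flush starts, analogous to getContext()
    long getContext()
    {
        return currentPosition;
    }

    // called once the flush of cfId has made everything up to 'context' durable
    void discardCompletedSegments(int cfId, long context)
    {
        Long last = lastWrittenPosition.get(cfId);
        if (last == null || last <= context)
            lastWrittenPosition.remove(cfId); // no write since getContext(): clean
        // otherwise a write raced in after the context was grabbed,
        // so the current segment legitimately stays dirty for cfId
    }

    boolean isDirty(int cfId)
    {
        return lastWrittenPosition.containsKey(cfId);
    }
}
{noformat}

With that information, a CF that stops getting writes no longer pins the
current segment: as soon as its flush context covers its last written
position, the segment is clean for that CF and can be deleted once every
other CF is clean too.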

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 
> 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch
>
>
> Only dirty Memtables are flushed, and so only dirty memtables are used to
> discard obsolete commit log segments. This can result in log segments not
> being deleted even though the data has been flushed.
> Was using a 3 node 0.7.6-2 AWS cluster (DataStax AMIs) with pre-0.7 data
> loaded and a running application working against the cluster. Did a rolling
> restart and then kicked off a repair; one node filled up the commit log
> volume with 7GB+ of log data, about 20 hours of log files.
> {noformat}
> $ sudo ls -lah commitlog/
> total 6.9G
> drwx------ 2 cassandra cassandra  12K 2011-06-24 20:38 .
> drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 ..
> -rw------- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log
> -rw------- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308876643288.log.header
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log
> -rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308877711517.log.header
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log
> -rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308879395824.log.header
> ...
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log
> -rw-r--r-- 1 cassandra cassandra   36 2011-06-24 20:47 CommitLog-1308946745380.log.header
> -rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log
> -rw-r--r-- 1 cassandra cassandra   44 2011-06-24 20:47 CommitLog-1308947888397.log.header
> {noformat}
> The user KS has 2 CFs with 60 minute flush times. The system KS had the
> default setting, which is 24 hours. Will create another ticket to see if
> these can be reduced or if it's something users should do; in this case it
> would not have mattered.
> I grabbed the log headers and used the tool in CASSANDRA-2828, and most of
> the segments had the system CFs marked as dirty.
> {noformat}
> $ bin/logtool dirty /tmp/logs/commitlog/
> Not connected to a server, Keyspace and Column Family names are not available.
> /tmp/logs/commitlog/CommitLog-1308876643288.log.header
> Keyspace Unknown:
>       Cf id 0: 444
> /tmp/logs/commitlog/CommitLog-1308877711517.log.header
> Keyspace Unknown:
>       Cf id 1: 68848763
> ...
> /tmp/logs/commitlog/CommitLog-1308944451460.log.header
> Keyspace Unknown:
>       Cf id 1: 61074
> /tmp/logs/commitlog/CommitLog-1308945597471.log.header
> Keyspace Unknown:
>       Cf id 1000: 43175492
>       Cf id 1: 108483
> /tmp/logs/commitlog/CommitLog-1308946745380.log.header
> Keyspace Unknown:
>       Cf id 1000: 239223
>       Cf id 1: 172211
> /tmp/logs/commitlog/CommitLog-1308947888397.log.header
> Keyspace Unknown:
>       Cf id 1001: 57595560
>       Cf id 1: 816960
>       Cf id 1000: 0
> {noformat}
> CF 0 is the Status / LocationInfo CF and 1 is the HintedHandoff CF. I don't
> have it now, but IIRC CFStats showed the LocationInfo CF with dirty ops.
> I was able to repro a case where flushing the CFs did not mark the log
> segments as obsolete (attached unit-test patch). Steps are:
> 1. Write to cf1 and flush.
> 2. Current log segment is marked as dirty at the CL position when the flush 
> started, CommitLog.discardCompletedSegmentsInternal()
> 3. Do not write to cf1 again.
> 4. Roll the log, my test does this manually. 
> 5. Write to CF2 and flush.
> 6. Only CF2 is flushed because it is the only dirty CF. 
> cfs.maybeSwitchMemtable() is not called for cf1 and so log segment 1 is still 
> marked as dirty from cf1.
> Step 5 is not essential; it just matches what I thought was happening. I
> thought SystemTable.updateToken() was called, which does not flush, and that
> this was the last thing that happened.
> The expired memtable thread created by Table uses the same cfs.forceFlush(),
> which is a no-op if the CF or its secondary indexes are clean.
>     
> I think the same problem would exist in 0.8. 
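
To make the sequence above concrete, here is a minimal stand-alone model of
the behaviour described (hypothetical names, not actual Cassandra classes): a
write marks both the memtable and the current segment dirty, a flush is a
no-op for a clean memtable, and a flush only clears the CF's dirty marker in
segments older than the current one, so once the log rolls a quiet cf1 never
clears its marker in segment 1.

{noformat}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class DirtySegmentRepro
{
    // per-segment set of CFs still marked dirty in that segment's header
    static final List<Set<String>> segments = new ArrayList<Set<String>>();
    static final Set<String> dirtyMemtables = new HashSet<String>();

    static void write(String cf)
    {
        segments.get(segments.size() - 1).add(cf); // current segment is dirty for cf
        dirtyMemtables.add(cf);
    }

    static void flush(String cf)
    {
        if (!dirtyMemtables.remove(cf))
            return; // forceFlush() is a no-op for a clean memtable
        // discard: older segments are cleaned for cf, the current one stays dirty
        for (int i = 0; i < segments.size() - 1; i++)
            segments.get(i).remove(cf);
    }

    public static void main(String[] args)
    {
        segments.add(new HashSet<String>()); // segment 1
        write("cf1");                        // step 1: write to cf1...
        flush("cf1");                        // ...and flush; segment 1 stays dirty (step 2)
        segments.add(new HashSet<String>()); // step 4: roll the log -> segment 2
        write("cf2");                        // step 5: write to cf2
        flush("cf2");                        // step 6: only cf2 is dirty, so only cf2 flushes
        flush("cf1");                        // the expiry path's forceFlush() is a no-op now
        System.out.println("segment 1 dirty CFs: " + segments.get(0)); // prints [cf1]
    }
}
{noformat}

In this model segment 1 can only be freed by another write to cf1 followed by
a flush, which is exactly the situation the bookkeeping sketched earlier
avoids.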

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
