[jira] [Commented] (CASSANDRA-8496) Remove MemtablePostFlusher

Branimir Lambov (JIRA) Fri, 22 Jul 2016 06:56:11 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389531#comment-15389531
 ]


Branimir Lambov commented on CASSANDRA-8496:
--------------------------------------------

I don't think so. Correctness is currently broken (i.e. a failed flush can 
cause data loss on the node). That breakage came from one of the iterations 
made in response to multiple concurrent requests to fix failed flushes causing 
unlimited commitlog growth and node death. If we roll back the change that did 
that, we will continue being asked repeatedly to fix it, and I believe we will 
eventually break correctness again.

On the other hand, most of the code is already written. Point 6 is the only 
pain point left, but I think the risk it introduces is minor and Aleksey is 
planning changes to metadata storage mechanisms (for unrelated reasons) that 
should make it trivial to take care of it too.

> Remove MemtablePostFlusher
> --------------------------
>
>                 Key: CASSANDRA-8496
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8496
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>
> To improve clearing of the CL, prevent infinite growth, and ensure the prompt 
> completion of tasks waiting on flush in the case of transient errors, large 
> flushes or slow disks, in 2.1 we could eliminate the post flusher altogether. 
> Since we now enforce that Memtables track contiguous ranges, a relatively 
> small change would permit Memtables to know the exact minimum as well as the 
> currently known exact maximum. The CL could easily track the total dirty 
> range, knowing that it must be contiguous, by using an AtomicLong instead of 
> an AtomicInteger, and tracking both the min and max seen, not just the max. 
> The only slight complexity is in tracking the _clean_ range, as this can now 
> be non-contiguous, e.g. if there are 3 memtable flushes covering the same 
> CL segment and one of them completes later. To solve this we can use an 
> interval tree since these operations are infrequent, so the extra overhead is 
> nominal. Once the interval tree completely overlaps the dirty range, we mark 
> the entire dirty range clean.
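
For readers following the description above, here is a minimal Java sketch of the idea, not the actual Cassandra code. The class and method names (DirtyRangeSketch, markDirty, markClean) are hypothetical, positions are assumed to be non-negative ints within one segment, and a sorted map of merged intervals stands in for the interval tree the description mentions.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of dirty/clean range tracking for one commit log segment. */
final class DirtyRangeSketch {
    // One long holds both bounds: high 32 bits = min position, low 32 bits = max.
    private static final long EMPTY = pack(Integer.MAX_VALUE, 0);
    private final AtomicLong dirty = new AtomicLong(EMPTY);

    // Clean intervals as start -> end, kept merged so they never overlap.
    // Assumption: a TreeMap replaces the interval tree from the description.
    private final TreeMap<Integer, Integer> clean = new TreeMap<>();

    static long pack(int min, int max) { return ((long) min << 32) | (max & 0xFFFFFFFFL); }
    static int min(long packed) { return (int) (packed >>> 32); }
    static int max(long packed) { return (int) packed; }

    /** Records a write covering [start, end); widens the contiguous dirty range. */
    void markDirty(int start, int end) {
        dirty.updateAndGet(cur -> pack(Math.min(min(cur), start), Math.max(max(cur), end)));
    }

    /**
     * Records a completed flush covering [start, end).
     * Returns true if the clean intervals now cover the whole dirty range,
     * in which case the dirty range is reset.
     */
    synchronized boolean markClean(int start, int end) {
        // Absorb an interval that begins at or before start and reaches it.
        Map.Entry<Integer, Integer> before = clean.floorEntry(start);
        if (before != null && before.getValue() >= start) {
            start = before.getKey();
            end = Math.max(end, before.getValue());
        }
        // Absorb every interval that begins inside the (growing) merged interval.
        Map.Entry<Integer, Integer> next;
        while ((next = clean.ceilingEntry(start)) != null && next.getKey() <= end) {
            end = Math.max(end, next.getValue());
            clean.remove(next.getKey());
        }
        clean.put(start, end);

        long d = dirty.get();
        boolean covered = start <= min(d) && end >= max(d);
        // The CAS fails if a concurrent write widened the dirty range meanwhile.
        if (covered && dirty.compareAndSet(d, EMPTY)) {
            clean.clear();
            return true;
        }
        return false;
    }
}
```

With three flushes covering [0,100), [100,200) and [200,300), completing the first and third leaves the clean range non-contiguous; only when the remaining one completes do the merged intervals cover the dirty range. A real implementation would also have to handle writes racing with the final reset, which this sketch only guards with a single compare-and-set.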



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
