[ 
https://issues.apache.org/jira/browse/CASSANDRA-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293060#comment-15293060
 ] 

Branimir Lambov commented on CASSANDRA-8496:
--------------------------------------------

I did not fully appreciate how much correctness relies on {{PostFlush}}. There 
are several aspects to letting flushes continue after a stuck/failed one:
- The commit log must track unflushed intervals so that it knows when there is 
a flushing "hole" and does not discard the segments that cover it. (1)
- Replay must be able to see such unflushed intervals, even after later flushes 
and compaction. (2)
- Snapshotting should be able to tell if there are unflushed intervals, i.e. 
holes in the snapshot. Should we error out if this is the case? (3)
- Truncation should be able to deal with a flush of old data completing after 
truncation. (4)
- We should be able to re-attempt flushes to solve transient failures (e.g. 
space ran out but was then made available). (5)

Bonus complication:
- There is a potential replay-order problem involving table metadata: if an 
incompatible change to a table's metadata was flushed but older data remained 
unflushed, commit log replay will attempt to apply the old data to the new 
format, which could fail and result in data loss on the node. (6)

Anything I'm missing?

CASSANDRA-9669 made sstables track intervals of replay positions and added 
replay machinery to replay unflushed intervals that may be earlier than the 
latest flush position. CASSANDRA-11828 expands on this to track sets of flushed 
intervals in commit log segments and sstables, which finishes (1) and (2). I am 
currently looking to address (5), which would let (3) and (4) treat a 
successful flush round as a guarantee that there are no holes.
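
To make the notion of a "hole" concrete, here is a minimal sketch of tracking a 
set of flushed intervals and asking whether a given range is fully covered. It 
is not the code from CASSANDRA-9669 or CASSANDRA-11828: the class name 
{{FlushedIntervals}}, its methods, and the use of plain {{long}} positions in 
place of real replay positions are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a set of flushed, half-open [start, end) position intervals. */
final class FlushedIntervals {
    // start -> end of disjoint, non-touching intervals, ordered by start
    private final TreeMap<Long, Long> intervals = new TreeMap<>();

    /** Records [start, end) as flushed, merging it with overlapping or touching intervals. */
    public void add(long start, long end) {
        if (start >= end) return;
        // Extend to the left if an earlier interval reaches our start.
        Map.Entry<Long, Long> prev = intervals.floorEntry(start);
        if (prev != null && prev.getValue() >= start) {
            start = prev.getKey();
            end = Math.max(end, prev.getValue());
        }
        // Absorb every interval that begins inside (or right at the end of) the new one.
        Map.Entry<Long, Long> next = intervals.ceilingEntry(start);
        while (next != null && next.getKey() <= end) {
            end = Math.max(end, next.getValue());
            intervals.remove(next.getKey());
            next = intervals.ceilingEntry(start);
        }
        intervals.put(start, end);
    }

    /** Returns the unflushed holes inside [start, end) as {from, to} pairs. */
    public List<long[]> holes(long start, long end) {
        List<long[]> holes = new ArrayList<>();
        long cursor = start;
        // An interval beginning at or before start may already cover a prefix.
        Map.Entry<Long, Long> first = intervals.floorEntry(start);
        if (first != null && first.getValue() > cursor) cursor = first.getValue();
        for (Map.Entry<Long, Long> in : intervals.tailMap(start, false).entrySet()) {
            if (in.getKey() >= end) break;
            if (in.getKey() > cursor) holes.add(new long[] { cursor, in.getKey() });
            cursor = Math.max(cursor, in.getValue());
        }
        if (cursor < end) holes.add(new long[] { cursor, end });
        return holes;
    }

    /** True if [start, end) contains no hole, i.e. everything in it was flushed. */
    public boolean covers(long start, long end) {
        return holes(start, end).isEmpty();
    }

    public static void main(String[] args) {
        FlushedIntervals flushed = new FlushedIntervals();
        flushed.add(0, 10);   // first flush completed
        flushed.add(20, 30);  // third flush completed; the second is stuck
        System.out.println(flushed.covers(0, 30)); // false: [10, 20) is a hole
        flushed.add(10, 20);  // the stuck flush is retried and succeeds
        System.out.println(flushed.covers(0, 30)); // true
    }
}
```

In this sketch, a segment whose dirty range is [0, 30) could only be discarded 
once {{covers(0, 30)}} holds, and replay would need to revisit whatever 
{{holes}} reports, even when later positions have already been flushed.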

I am a little worried about (6). Is there anything I may be missing that could 
help with this scenario?


> Remove MemtablePostFlusher
> --------------------------
>
>                 Key: CASSANDRA-8496
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8496
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>
> To improve clearing of the CL, prevent infinite growth, and ensure the prompt 
> completion of tasks waiting on flush in the case of transient errors, large 
> flushes or slow disks, in 2.1 we could eliminate the post flusher altogether. 
> Since we now enforce that Memtables track contiguous ranges, a relatively 
> small change would permit Memtables to know the exact minimum as well as the 
> currently known exact maximum. The CL could easily track the total dirty 
> range, knowing that it must be contiguous, by using an AtomicLong instead of 
> an AtomicInteger, and tracking both the min/max seen, not just the max. The 
> only slight complexity will come in for tracking the _clean_ range as this 
> can now be non-contiguous, if there are 3 memtable flushes covering the same 
> CL segment, and one of them completes later. To solve this we can use an 
> interval tree since these operations are infrequent, so the extra overhead is 
> nominal. Once the interval tree completely overlaps the dirty range, we mark 
> the entire dirty range clean.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
