[jira] [Comment Edited] (CASSANDRA-12956) CL is not replayed on custom 2i exception

Alex Petrov (JIRA) Thu, 01 Dec 2016 12:03:33 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-12956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15708453#comment-15708453
 ]


Alex Petrov edited comment on CASSANDRA-12956 at 12/1/16 8:02 PM:
------------------------------------------------------------------

The patch for {{3.0}} is quite different and much bigger. The main problem is
that {{3.0}} has no transactionality at the same level as {{3.X}}: in {{3.0}},
memtables are flushed, the sstables are renamed to non-tmp names, and readers
are returned. We need somewhat finer granularity, since afterwards we may have
to abort all the flushed sstables if the 2i flush failed. I've changed it a bit
in the {{3.X}} fashion, although since we flush to just one sstable, I thought
that extracting {{txn}} to the top level would not give us anything.
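
To illustrate the granularity point, here is a minimal, self-contained sketch of "write under a tmp name, then commit or abort depending on the 2i flush". It is not code from the patch: {{SSTableTxn}}, {{flush}} and the file names are hypothetical, and the real lifecycle transaction in Cassandra is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushTxnSketch {
    /** Hypothetical stand-in for a flush transaction; not a Cassandra class. */
    static final class SSTableTxn {
        private final List<String> tmpFiles = new ArrayList<>();
        private final List<String> liveFiles = new ArrayList<>();

        void write(String name) {
            tmpFiles.add("tmp-" + name);           // data is written under a tmp name first
        }

        void commit() {
            for (String f : tmpFiles) {
                liveFiles.add(f.substring("tmp-".length())); // "rename" to the non-tmp name
            }
            tmpFiles.clear();
        }

        void abort() {
            tmpFiles.clear();                      // drop everything that was flushed
        }

        List<String> live() {
            return liveFiles;
        }
    }

    /**
     * Flushes one sstable and makes it live only if the custom 2i flush succeeded.
     * The exception is swallowed here to keep the sketch short; real code
     * would propagate it.
     */
    public static List<String> flush(Runnable customIndexFlush) {
        SSTableTxn txn = new SSTableTxn();
        txn.write("data-1");
        try {
            customIndexFlush.run();
            txn.commit();
        } catch (RuntimeException e) {
            txn.abort();
        }
        return txn.live();
    }

    public static void main(String[] args) {
        System.out.println(flush(() -> { }));
        System.out.println(flush(() -> { throw new RuntimeException("2i failed"); }));
    }
}
```

The point of the sketch is only that the rename to a non-tmp name must happen after the outcome of the 2i flush is known; otherwise there is nothing left to abort.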

Both patches introduce a second latch. I'm usually not the biggest fan of two
threads that have to wait for one another, but here ordering is the issue.
The problem is that the post-flush executor is single-threaded (for ordering)
and the flush executor is multi-threaded, so we can't return a future backed
by the multi-threaded executor, as that would break the ordering. On the other
hand, if we move the 2i flush to the flush executor, we'll have to wait
sequentially for the 2i and then for all the memtables. The current approach
keeps these actions parallel.
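
The two-latch arrangement can be sketched roughly as follows. This is an illustration built on plain {{java.util.concurrent}} primitives, not the patch itself: the class name, the latch names and the event strings are made up, and which side performs which step in the actual patch is an assumption here.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FlushLatchSketch {
    public static List<String> run() {
        // Multi-threaded: memtable flushes may run in parallel.
        ExecutorService flushExecutor = Executors.newFixedThreadPool(2);
        // Single-threaded: post-flush tasks complete in submission order.
        ExecutorService postFlushExecutor = Executors.newSingleThreadExecutor();

        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch customIndexFlushed = new CountDownLatch(1); // the "second" latch
        CountDownLatch memtableFlushed = new CountDownLatch(1);

        try {
            // The returned future is backed by the single-threaded executor,
            // so callers waiting on it observe flushes in order.
            Future<?> postFlush = postFlushExecutor.submit(() -> {
                events.add("2i flushed");            // assumed: custom 2i flush runs here
                customIndexFlushed.countDown();
                memtableFlushed.await();             // wait for the data memtable
                events.add("post-flush done");
                return null;
            });

            flushExecutor.submit(() -> {
                // Writing the sstable could overlap with the 2i flush above;
                // only the final commit step has to wait for the 2i outcome.
                customIndexFlushed.await();
                events.add("memtable committed");
                memtableFlushed.countDown();
                return null;
            });

            postFlush.get(10, TimeUnit.SECONDS);
            return events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            flushExecutor.shutdownNow();
            postFlushExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each side blocks on the other exactly once, which is the mutual waiting mentioned above; in exchange, the caller still gets a future from the ordered executor while the memtable write and the 2i flush can proceed concurrently.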

We only need to synchronise the non-cf 2i flush with the memtable holding the
data for the current cf. All the cf-index memtables will be in sync with the
data memtable anyway, since they're combined in the same transaction.

|[3.X|https://github.com/ifesdjeen/cassandra/tree/12956-3.X]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12956-3.X-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12956-3.X-dtest/]|
|[3.0|https://github.com/ifesdjeen/cassandra/tree/12956-3.0-v2]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12956-3.0-v2-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12956-3.0-v2-dtest/]|
|[trunk|https://github.com/ifesdjeen/cassandra/tree/12956-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12956-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12956-trunk-dtest/]|




> CL is not replayed on custom 2i exception
> -----------------------------------------
>
>                 Key: CASSANDRA-12956
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12956
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alex Petrov
>            Assignee: Alex Petrov
>            Priority: Critical
>
> If the custom (non-cf) 2i throws an exception during node shutdown / drain, 
> the CommitLog is correctly preserved (segments won't get discarded because 
> segment tracking is correct). 
> However, on node startup we decide whether or not to replay the commit 
> log. The CL segment starts getting replayed, since there are non-discarded 
> segments, and during this process we check whether every [individual 
> mutation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L215]
>  in the commit log is already committed or not. Information about the 
> sstables is taken from [live sstables on 
> disk|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L250-L256].
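
The replay decision described in the issue can be pictured with a much simplified sketch. It assumes commit log positions are plain {{long}} values and that each live sstable records the highest position it covers; the real {{CommitLogReplayer}} works with per-table position ranges, so the names and types below are hypothetical.

```java
import java.util.Collection;
import java.util.List;

public class ReplayFilterSketch {
    /** Highest commit log position covered by any live sstable, or -1 if there are none. */
    static long persistedUpTo(Collection<Long> liveSSTablePositions) {
        long max = -1L;
        for (long position : liveSSTablePositions) {
            max = Math.max(max, position);
        }
        return max;
    }

    /** A mutation is replayed only if no live sstable already covers its position. */
    static boolean shouldReplay(long mutationPosition, Collection<Long> liveSSTablePositions) {
        return mutationPosition > persistedUpTo(liveSSTablePositions);
    }

    public static void main(String[] args) {
        // The data sstable became live although the custom 2i flush failed:
        System.out.println(shouldReplay(42L, List.of(42L)));
        // The flush was aborted, so no live sstable covers the mutation:
        System.out.println(shouldReplay(42L, List.of()));
    }
}
```

Under these assumptions, a mutation whose data sstable is already live is skipped during replay even though the custom 2i never persisted it, which is one way to read the failure described in the issue.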



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
