[ 
https://issues.apache.org/jira/browse/CASSANDRA-10230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745849#comment-14745849
 ] 

T Jake Luciani commented on CASSANDRA-10230:
--------------------------------------------

One issue that becomes more likely with the removal of the coordinator batchlog 
is the possibility of inconsistency in the face of data loss on the base table.

For example:  

   * Write to base table RF=3 and CL.ONE
   * View is updated with value
   * Base replica is also the coordinator
   * Respond to the client as success
   * Machine dies forever

In this case we would have data in the view that could never be repaired.  This 
isn't a likely scenario, but it *could* happen.  And it is more likely without 
the coordinator batchlog, since the batchlog entry is not removed until the 
write is ack'd by a QUORUM of nodes.
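
To make the sequence above concrete, here is a toy Python model of it. This is 
only a sketch, not Cassandra code: Node, write_cl_one and orphaned_view_rows 
are made-up names, and it assumes the view replica lives on a different node 
than the base replica that coordinated the write.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Cassandra node (not a real API)."""
    name: str
    alive: bool = True
    base: dict = field(default_factory=dict)  # base-table rows on this node
    view: dict = field(default_factory=dict)  # view rows on this node


def write_cl_one(coordinator, view_replica, key, value):
    """Toy CL.ONE write: only the coordinator, itself a base replica, applies it."""
    coordinator.base[key] = value
    # The base replica pushes the derived row to its view replica.
    view_replica.view[value] = key
    # The client is acked before any other base replica has the write.
    return "success"


def surviving_rows(nodes, table):
    """Union of the rows of one table across nodes that are still alive."""
    rows = {}
    for node in nodes:
        if node.alive:
            rows.update(getattr(node, table))
    return rows


def orphaned_view_rows(nodes):
    """View rows whose base row exists on no live node."""
    base = surviving_rows(nodes, "base")
    view = surviving_rows(nodes, "view")
    return {vk: bk for vk, bk in view.items() if bk not in base}


# a, b, c are the base replicas (RF=3); d holds the view row in this toy setup.
a, b, c, d = (Node(n) for n in "abcd")
ack = write_cl_one(a, d, key="k1", value="v1")
a.alive = False  # machine dies forever, before b or c received the write
print(ack, orphaned_view_rows([a, b, c, d]))  # success {'v1': 'k1'}
```

The view row survives on d, but no live base replica has the row it was 
derived from, so a repair of the base table has nothing to rebuild it from.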

This could also happen at QUORUM if you lose any quorum of nodes or all 
replicas.  
I'll also note this situation can happen today with manual materialized views 
using the batchlog under CL.ONE.
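
For the QUORUM case, a small Python sketch of the arithmetic, assuming the 
write had reached only the replicas that ack'd it when the nodes were lost. 
The helper names are made up for illustration.

```python
from itertools import combinations


def quorum(rf):
    """Quorum size for a replication factor: a majority of the replicas."""
    return rf // 2 + 1


def losses_that_drop_write(replicas, acked):
    """Quorum-sized sets of lost replicas that include every replica holding the write."""
    q = quorum(len(replicas))
    return [lost for lost in combinations(replicas, q) if set(acked) <= set(lost)]


replicas = ["a", "b", "c"]  # RF=3
print(quorum(len(replicas)))  # 2
print(losses_that_drop_write(replicas, acked=["a", "b"]))  # [('a', 'b')]
```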

I created CASSANDRA-10346 to address this in the future (a way to identify and 
ideally fix inconsistencies between the base and view tables).



> Remove coordinator batchlog from materialized views
> ---------------------------------------------------
>
>                 Key: CASSANDRA-10230
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10230
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: T Jake Luciani
>            Assignee: Joel Knighton
>             Fix For: 3.0.0 rc1
>
>
> We are considering removing or making optional the coordinator batchlog.  
> The batchlog primarily serves as a way to quickly reach consistency between 
> base and view, since we don't have any kind of read repair between base and 
> view. But we do have repair, so as long as you don't lose nodes while writing 
> at CL.ONE you will be eventually consistent.
> I've committed to the 3.0 branch a way to disable the coordinator batchlog with 
> {{-Dcassandra.mv_disable_coordinator_batchlog=true}}
> The majority of the performance hit to throughput is currently the batchlog 
> as shown by this chart.
> http://cstar.datastax.com/graph?stats=f794245a-4d9d-11e5-9def-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=498.52&ymin=0&ymax=50142.4
> I'd like to have tests run with and without this flag to validate how quickly 
> we achieve quorum consistency without repair when writing at CL.ONE.  Once we 
> can see there is little/no impact we can permanently remove the coordinator 
> batchlog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
