[
https://issues.apache.org/jira/browse/CASSANDRA-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeremy Hanna updated CASSANDRA-13162:
-------------------------------------
Component/s: Materialized Views
> Batchlog replay is throttled during bootstrap, creating conditions for
> incorrect query results on materialized views
> --------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-13162
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13162
> Project: Cassandra
> Issue Type: Bug
> Components: Materialized Views
> Reporter: Wei Deng
> Priority: Critical
> Labels: bootstrap, materializedviews
>
> I've tested this in a C* 3.0 cluster with a couple of Materialized Views
> defined (one base table and two MVs on that base table). The data volume is
> not very high per node (about 80GB of data per node total, and that
> particular base table has about 25GB of data uncompressed with one MV taking
> 18GB compressed and the other MV taking 3GB), and the cluster is using decent
> hardware (EC2 C4.8XL with 18 cores + 60GB RAM + 18K IOPS RAID0 from two 3TB
> gp2 EBS volumes).
> This was originally a 9-node cluster. It appears that after adding 3 more
> nodes to the DC, the system.batches table accumulated a lot of data on the 3
> new nodes (each having around 20GB under the system.batches directory), and
> over the following week the batchlog on the 3 new nodes was slowly replayed
> back to the rest of the nodes in the cluster. The bottleneck seems to be the
> throttling defined in this cassandra.yaml setting:
> batchlog_replay_throttle_in_kb, which by default is set to 1MB/s.
> Given that it is taking almost a week (and still hasn't finished) for the
> batchlog (from MV) to be replayed after the bootstrap finishes, it seems only
> reasonable to unthrottle (or at least give it a much higher throttle rate)
> during the initial bootstrap, and hence I'd consider this a bug for our
> current MV implementation.
> Also as far as I understand, the bootstrap logic won't wait for the
> backlogged batchlog to be fully replayed before changing the new
> bootstrapping node to "UN" state, and if the batchlog for the MVs stays stuck
> in this state for a long time, we basically will get wrong answers on the MVs
> during that whole duration (until the batchlog is fully replayed to the cluster),
> which adds even more criticality to this bug.
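
To get a rough sense of how the throttle bounds replay time, here is a minimal back-of-the-envelope sketch in Python. It takes the roughly 20GB backlog per new node and the 1MB/s default from the description above. The function name and the node_count parameter are hypothetical. Dividing the rate by the number of nodes is an assumption, based on the cassandra.yaml comment saying the throttle is reduced in proportion to cluster size. The sketch also ignores any batches written while replay is in progress.

```python
def replay_seconds(backlog_gb, throttle_kb_per_s=1024, node_count=1):
    """Estimate the seconds needed to drain a batchlog backlog at a fixed throttle.

    Assumption: the configured throttle is shared evenly across node_count
    nodes, so the effective rate is throttle_kb_per_s / node_count.
    """
    effective_kb_per_s = throttle_kb_per_s / node_count
    backlog_kb = backlog_gb * 1024 * 1024  # GB -> KB
    return backlog_kb / effective_kb_per_s


# ~20GB under system.batches on each new node, default throttle of 1024 KB/s.
full = replay_seconds(20)
# Same backlog if the rate is split across the 12 nodes (9 original + 3 new).
shared = replay_seconds(20, node_count=12)

print(f"{full / 3600:.1f} hours at the full 1 MB/s")              # 5.7 hours
print(f"{shared / 86400:.1f} days if split across 12 nodes")      # 2.8 days
```

Under these assumptions both figures are lower bounds. Even so, they suggest the default throttle alone can account for a replay that lasts days rather than minutes.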