[ https://issues.apache.org/jira/browse/CASSANDRA-12268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455243#comment-15455243 ]
Carl Yeksigian commented on CASSANDRA-12268:
--------------------------------------------
After rebasing and rerunning, these failures are legitimate and are caused by
batchlog replay of the newly created mutations.
We are trying to replay a batch which includes no entries, and are failing
because of it. I've added an assertion that we are generating non-empty
mutations, but it isn't being tripped, so I haven't yet found where the
empty batch is being created.
> Make MV Index creation robust for wide referent rows
> ----------------------------------------------------
>
> Key: CASSANDRA-12268
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12268
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jonathan Shook
> Assignee: Carl Yeksigian
> Fix For: 3.0.x, 3.x
>
> Attachments: 12268.py
>
>
> When creating an index for a materialized view for extant data, heap pressure
> is very dependent on the cardinality of rows associated with each index
> value. With the way that per-index value rows are created within the index,
> this can cause unbounded heap pressure, which can cause OOM. This appears to
> be a side-effect of how each index row is applied atomically as with batches.
> The commit logs can accumulate enough during the process to prevent the node
> from being restarted. Given that this occurs during global index creation,
> this can happen on multiple nodes, making stable recovery of a node set
> difficult, as co-replicas become unavailable to assist in back-filling data
> from commitlogs.
> While it is understandable that you want to avoid having relatively wide rows
> even in materialized views, this represents a particularly difficult
> scenario for triage.
> The basic recommendation for improving this is to sub-group the index
> creation into smaller chunks internally, placing an upper bound on
> heap pressure when it is needed.
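
The sub-grouping recommended in the description could look roughly like the
sketch below. It is not Cassandra code: `chunked`, `build_view_in_chunks`,
`apply_chunk`, and the `max_rows` limit are hypothetical names chosen for
illustration. The idea is only that the updates for one wide referent row are
applied in bounded groups instead of as one large unit, and that an empty
group is never emitted.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], max_rows: int) -> Iterator[List[T]]:
    """Yield lists of at most max_rows items; never yields an empty list."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    it = iter(rows)
    while True:
        chunk = list(islice(it, max_rows))
        if not chunk:
            # Source exhausted: stop instead of emitting an empty chunk.
            return
        yield chunk


def build_view_in_chunks(
    base_rows: Iterable[T],
    apply_chunk: Callable[[List[T]], None],
    max_rows: int = 1000,
) -> int:
    """Apply view updates one bounded chunk at a time.

    Only one chunk is materialized at a time, so memory use is bounded by
    max_rows rather than by the width of the referent row.
    Returns the number of chunks applied.
    """
    applied = 0
    for chunk in chunked(base_rows, max_rows):
        apply_chunk(chunk)  # hypothetical hook standing in for the real write
        applied += 1
    return applied
```

For example, `build_view_in_chunks(range(10), print, max_rows=4)` would hand
three lists to `print`, holding 4, 4 and 2 items. The trade-off, under these
assumptions, is that the updates for a wide row are no longer applied as a
single atomic unit, so a real implementation would need to decide how a
partially applied build is resumed.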