[
https://issues.apache.org/jira/browse/CASSANDRA-12888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821129#comment-15821129
]
victor commented on CASSANDRA-12888:
------------------------------------
Hi Benjamin,
thanks for the overall awesomeness :) Your response is very helpful!
Our use case relies on the atomicity of MVs in a non-append-only scenario. Aside
from there being less code to write in our application, it lets us skip
multi-partition batches as the means of achieving atomicity. It's still not 100%
clear to me, but the cluster seems to be under less stress maintaining
atomicity/denormalization with MVs than with multi-partition batches (at least
DataStax indicates there are performance gains compared with a manual
denormalization scenario, not even counting manual denormalization with batches; see
http://www.datastax.com/dev/blog/materialized-view-performance-in-cassandra-3-x).
Is this assumption correct?
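To make the trade-off concrete, here is a purely illustrative sketch (hypothetical
tables, not our actual schema) of what we'd be replacing: keeping a second read
path in sync by hand means a logged multi-partition batch on every write, whereas
the MV moves that maintenance server-side:
{code}
-- Base table, partitioned by user_id.
CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    email   text,
    name    text
);

-- Manual denormalization: a second table partitioned by email,
-- kept in sync from the application with a logged (atomic) batch.
CREATE TABLE users_by_email (
    email   text PRIMARY KEY,
    user_id uuid,
    name    text
);

BEGIN BATCH
    INSERT INTO users (user_id, email, name) VALUES (?, ?, ?);
    INSERT INTO users_by_email (email, user_id, name) VALUES (?, ?, ?);
APPLY BATCH;

-- MV alternative: the server maintains the second read path,
-- including cleaning up the old view row when email changes.
CREATE MATERIALIZED VIEW users_by_email_mv AS
    SELECT email, user_id, name FROM users
    WHERE email IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (email, user_id);
{code}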
>I'd recommend not to use MVs that use a different partition key on the MV
>than on the base table as this requires inter-node communication for EVERY
>write operation. So you can easily kill your cluster with bulk operations
>(like in streaming).
Excuse my ignorance, but isn't having a different partition key the whole point
of denormalizing with MVs (to get different read paths)? And would this node
coordination be worse than, or the same as, in a multi-partition batch scenario?
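To make sure I'm asking about the right distinction, a hypothetical sketch
(illustrative schema only): as I understand it, the concern applies to the second
view below, which re-partitions the data, while the first only re-clusters within
the base partition and so stays on the local replica:
{code}
CREATE TABLE events (
    device_id uuid,
    ts        timestamp,
    payload   text,
    PRIMARY KEY (device_id, ts)
);

-- Same partition key as the base table: the paired view replica is
-- local, so a base write updates the view without extra network hops.
CREATE MATERIALIZED VIEW events_recent_first AS
    SELECT device_id, ts, payload FROM events
    WHERE device_id IS NOT NULL AND ts IS NOT NULL
    PRIMARY KEY (device_id, ts)
    WITH CLUSTERING ORDER BY (ts DESC);

-- Different partition key: the paired view replica usually lives on
-- another node, so every base write incurs inter-node communication.
CREATE MATERIALIZED VIEW events_by_ts AS
    SELECT ts, device_id, payload FROM events
    WHERE ts IS NOT NULL AND device_id IS NOT NULL
    PRIMARY KEY (ts, device_id);
{code}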
Given that our system stores critical information, we've decided to skip MVs
altogether until the feature becomes more "ops friendly" in production.
Thanks a lot!
Víctor
> Incremental repairs broken for MVs and CDC
> ------------------------------------------
>
> Key: CASSANDRA-12888
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12888
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Stefan Podkowinski
> Assignee: Benjamin Roth
> Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> SSTables streamed during the repair process will first be written locally and
> afterwards either simply added to the pool of existing sstables or, in the case
> of existing MVs or active CDC, replayed on a per-mutation basis, as described
> in {{StreamReceiveTask.OnCompletionRunnable}}:
> {quote}
> We have a special path for views and for CDC.
> For views, since the view requires cleaning up any pre-existing state, we
> must put all partitions through the same write path as normal mutations. This
> also ensures any 2is are also updated.
> For CDC-enabled tables, we want to ensure that the mutations are run through
> the CommitLog so they can be archived by the CDC process on discard.
> {quote}
> Using the regular write path turns out to be an issue for incremental
> repairs, as we lose the {{repaired_at}} state in the process. Eventually the
> streamed rows will end up in the unrepaired set, in contrast to the rows on
> the sender side, which are moved to the repaired set. The next repair run will
> stream the same data back again, causing rows to bounce back and forth between
> nodes on each repair.
> See the linked dtest for steps to reproduce. An example of reproducing this
> manually using ccm can be found
> [here|https://gist.github.com/spodkowinski/2d8e0408516609c7ae701f2bf1e515e8]