[
https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922466#comment-16922466
]
Nadav Har'El commented on CASSANDRA-9928:
-----------------------------------------
This issue has recently turned 4 years old, and I'm curious how sure we are
about the *reasons* described above for why we forbid MV with two new key
columns - whether these reasons are correct, and whether we are sure these are
the only reasons.
As [~fsander] asked above, while a base-view inconsistency is indeed more
likely in the two-new-key-columns case, don't we have the same problem in the
regular one-new-key-column case - of scenarios where an unfortunate order of
node failures causes data to appear in a view replica which doesn't appear in
the base replica, and thus will never be deleted? I thought this was one of the
main reasons why MV was recently downgraded to "experimental" status.
But I also wonder if we didn't miss a second problem, that of row liveness,
similar to what we have in the case of unselected columns (see
[CASSANDRA-13826|https://jira.apache.org/jira/browse/CASSANDRA-13826]) where
if we add and remove different base columns which are view keys, but the view
row has just a *single* timestamp, we can end up being unable to add a view row
that we previously deleted. For example, here is a scenario I thought might be
problematic (I didn't actually test this; one would need to disable the check
in the code forbidding multiple new MV key columns to run a test case):
Assume that x,y are regular columns in the base, but key columns in the view. For
brevity, we leave out other base key columns and other regular columns.
Consider the following sequence of events on one row of the base table:
# Add x=1 at timestamp 1. Since y is still null, no view row is created yet.
# Add y=1 at timestamp 10. This creates a view row with key x=1, y=1. The row
only contains a CQL row marker, and a single timestamp is chosen for it: 10.
# Delete x at timestamp 2. This deletes x’s older (ts=1) value, and so the
view row should be deleted. Again, a timestamp needs to be chosen for this
deletion - it will be 10 again, and the deletion will override the creation
with the same timestamp from the previous step, and so far everything is fine.
# Add x=2 at timestamp 3. This overrides the deletion of x (which was at
timestamp 2) so again, both x and y have values and a view row should be
created with key x=2, y=1. However, this creation will again have timestamp 10
(y’s timestamp) and not be able to shadow the deletion from step 3 (in step 3,
deletion won over data, so here it will win again). So the view row we wanted
to add will not be added!
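
The timestamp arithmetic in these steps can be sketched in a few lines of Python. This is only a toy model, not Cassandra code, and like the scenario itself it is untested against a real cluster. It assumes (as the steps above do) that the view picks a single timestamp for each row write or row deletion, taken here as the larger of the x and y timestamps, and that a deletion shadows a row marker with an equal timestamp. It takes the scenario at face value and tracks the marker and tombstone of one view row only; it does not model view keys. The names ViewRow, view_timestamp and replay are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ViewRow:
    """Toy view row: one row-marker timestamp and one deletion timestamp."""
    marker_ts: int = -1      # newest row marker seen, -1 if none
    tombstone_ts: int = -1   # newest row deletion seen, -1 if none

    def insert(self, ts: int) -> None:
        self.marker_ts = max(self.marker_ts, ts)

    def delete(self, ts: int) -> None:
        self.tombstone_ts = max(self.tombstone_ts, ts)

    def is_live(self) -> bool:
        # Assumption: on a timestamp tie the deletion wins.
        return self.marker_ts > self.tombstone_ts


def view_timestamp(x_ts: int, y_ts: int) -> int:
    # Assumption: the single view timestamp is the larger of the
    # timestamps of the two base columns that form the view key.
    return max(x_ts, y_ts)


def replay() -> list:
    """Replay steps 2-4 and record whether the view row is live after each."""
    row = ViewRow()
    y_ts = 10                 # y=1 was written at timestamp 10 (step 2)
    liveness = []

    row.insert(view_timestamp(1, y_ts))   # step 2: x ts=1, y ts=10
    liveness.append(row.is_live())

    row.delete(view_timestamp(2, y_ts))   # step 3: x deleted at ts=2
    liveness.append(row.is_live())

    row.insert(view_timestamp(3, y_ts))   # step 4: x=2 at ts=3
    liveness.append(row.is_live())
    return liveness


if __name__ == "__main__":
    print(replay())
```

Under these assumptions every view write lands on timestamp 10, so the insert in step 4 cannot get ahead of the deletion from step 3.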
> Add Support for multiple non-primary key columns in Materialized View primary
> keys
> ----------------------------------------------------------------------------------
>
> Key: CASSANDRA-9928
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9928
> Project: Cassandra
> Issue Type: Improvement
> Components: Feature/Materialized Views
> Reporter: T Jake Luciani
> Priority: Normal
> Labels: materializedviews
> Fix For: 4.x
>
>
> Currently we don't allow > 1 non primary key from the base table in a MV
> primary key. We should remove this restriction assuming we continue
> filtering out nulls. With allowing nulls in the MV columns there are a lot
> of multiplicative implications we need to think through.