[
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631882#comment-14631882
]
Tupshin Harper commented on CASSANDRA-6477:
-------------------------------------------
OK, so let me summarize my view of the conflicting viewpoints here:
# If the MV shares the same partition key as the base table (and only reorders
the partition based on different clustering columns), then the problem is
relatively easy. Unfortunately, the general consensus is that a common case
will be an MV with a different partition key than the base table, so we can't
support only that easy case.
# If the MV has a different partition key than the base table, then there are
inherently more nodes involved in fulfilling the entire request, and we have to
address that case.
# As [~tjake] and [~jbellis] say, the more nodes involved in a query, the
higher the risk of unavailability if the MV is updated synchronously.
# Some use cases expect synchronous updates (as argued by [~rustyrazorblade]
and [~brianmhess]).
# But other use cases definitely do not. I think it is absurd to say that just
because a table has an MV, every write should care about the MV. It is even
more absurd to say that adding an MV to a table will reduce the availability of
all writes to the base table.
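To make the different-partition-key case concrete, here is a minimal sketch of why more nodes get involved. The toy token ring, node names, and keys below are all invented for illustration; real Cassandra uses a partitioner over the token ring, but the effect is the same: hashing a different partition key generally lands the MV row on a different replica set than the base row.

```python
import hashlib

NODES = ["n1", "n2", "n3", "n4", "n5", "n6"]
RF = 3  # replication factor

def replicas(partition_key: str) -> list[str]:
    """Toy token ring: hash the key, then take RF consecutive nodes."""
    start = int(hashlib.md5(partition_key.encode()).hexdigest(), 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(RF)]

# Hypothetical example: base table partitioned by user_id, MV by email.
base_replicas = replicas("user_id=42")
view_replicas = replicas("email=alice@example.com")

# The union of the two replica sets is usually larger than RF, so a single
# logical write must reach more nodes than a plain base-table write would.
print(sorted(set(base_replicas) | set(view_replicas)))
```

With the same partition key (point #1 above), `base_replicas` and `view_replicas` would be identical and the write stays on one replica set; with different keys (point #2), the coordinator has to involve both sets.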
Given all of those, the conclusion that both sync and async forms are necessary
seems totally unavoidable.
Ideally, I'd like to see an extension of what [~iamaleksey] proposed above but
be much more thorough and flexible about it.
If each request were able to pass multiple consistency-level contracts to the
coordinator, each one could represent the expectation for a separate callback
at the driver level.
e.g. A query to a table with a MV could express the following compound
consistency levels. {noformat} {LQ, LOCAL_ONE{DC3,DC4}, LQ{MV1,MV2}} {noformat}
That would tell the coordinator to deliver three separate notifications back to
the client: one when LQ in the local DC was fulfilled; another when at least
one copy was delivered to each of DC3 and DC4; and a third when LQ was
fulfilled in the local DC for MV1 and MV2.
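Client-side, the compound-consistency idea above might look something like the following sketch. The class, contract names, and ack thresholds are all hypothetical, not a real driver API; the point is just that one write carries several independent contracts, each of which completes its own future when its threshold is met.

```python
from concurrent.futures import Future

class CompoundWrite:
    """Hypothetical sketch: one write, multiple consistency contracts,
    each with its own independently-firing notification."""

    def __init__(self, contracts: dict[str, int]):
        # contract name -> number of replica acks required to satisfy it
        self.required = dict(contracts)
        self.acks = {name: 0 for name in contracts}
        self.notifications = {name: Future() for name in contracts}

    def on_replica_ack(self, contract: str) -> None:
        """Coordinator-side: count one ack toward the given contract and
        complete its notification once the threshold is reached."""
        self.acks[contract] += 1
        if self.acks[contract] == self.required[contract]:
            self.notifications[contract].set_result(f"{contract} fulfilled")

# e.g. the compound level {LQ, LOCAL_ONE{DC3,DC4}, LQ{MV1,MV2}} from the text
w = CompoundWrite({"LQ": 2, "LOCAL_ONE{DC3,DC4}": 2, "LQ{MV1,MV2}": 2})
w.on_replica_ack("LQ")
w.on_replica_ack("LQ")
print(w.notifications["LQ"].done())           # True: local quorum satisfied
print(w.notifications["LQ{MV1,MV2}"].done())  # False: MV replicas still pending
```

The design point is that the base-table write can be acknowledged as soon as its own contract is met, while the MV contract resolves later (or times out) without blocking it, which is exactly the sync/async split argued for above.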
I realize that this is a very far-fetched proposal, but I wanted to throw it
out there because, imo, it reflects the theoretically best option that fulfills
everybody's requirements (and it is also a very general mechanism that could be
used in other scenarios).
Short of that, I don't think there is any choice but to support both sync and
async forms of writes to tables with MVs.
One more point (not to distract from the above): with the current design of
MVs, there will always be a risk of inconsistent reads (timeouts leaving data
queryable in the primary table but not in one or more MVs) until the data is
eventually propagated to the MV. While it would come at a high cost, RAMP would
still be useful to provide read isolation in that scenario.
> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
> Issue Type: New Feature
> Components: API, Core
> Reporter: Jonathan Ellis
> Assignee: Carl Yeksigian
> Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the
> index across the cluster is a Good Thing. However, for high-cardinality
> data, local indexes require querying most nodes in the cluster even if only a
> handful of rows is returned.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)