[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626761#comment-14626761
 ] 

Tupshin Harper commented on CASSANDRA-6477:
-------------------------------------------

I find myself disagreeing with the hard requirement that all rows in the table 
must show up in the materialized views. While it would be nice, I believe that 
clearly documenting the limitation and providing a couple of reasonable choices 
is far preferable to handing users enough rope to hang themselves.

My suggestion:
* Create a formal notion of NOT NULL columns in the schema that can be applied 
to a table, irrespective of any MV usage. 
* Columns that are NOT NULL would have the exact same restrictions as PK 
columns, namely that they need to be included in all inserts and updates (with 
the possible exception of LWT updates)
* Document (and warn in cqlsh) the fact that if you create a MV with a PK using 
a nullable column from the table, then rows where that column is null will not 
be in the view
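
To make the intended semantics concrete, here is a rough sketch in Python. It is a toy in-memory model, not Cassandra code: `validate_write`, `build_view`, and the sample rows are all hypothetical names made up for illustration. It assumes a NOT NULL column is checked on write exactly like a PK column, and that a view keyed on a nullable column simply leaves out rows where that column is null.

```python
# Toy model of the proposal; every name here is hypothetical, not a Cassandra API.
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def validate_write(row: Row, primary_key: Set[str], not_null: Set[str]) -> None:
    """Reject a write that omits, or sets to None, any PK or NOT NULL column."""
    required = primary_key | not_null
    missing = sorted(c for c in required if row.get(c) is None)
    if missing:
        raise ValueError(f"missing required columns: {missing}")


def build_view(rows: List[Row], view_key: str) -> Dict[Any, List[Row]]:
    """Group base rows by view_key, leaving out rows where it is None."""
    view: Dict[Any, List[Row]] = {}
    for row in rows:
        key = row.get(view_key)
        if key is None:
            continue  # nullable column used as the view PK: row is not in the view
        view.setdefault(key, []).append(row)
    return view


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},  # allowed only because email is nullable here
]
view = build_view(rows, "email")
```

In this sketch, declaring `email` as NOT NULL would make `validate_write` reject the second row up front, so the base table and the view could never disagree about which rows exist.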

It seems to me that this is far less dangerous (and in many ways less 
surprising) than automatically creating a hotspot in the MV because lots of 
data with NULLs gets added.
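
A quick way to see the hotspot concern is to count rows per view partition under both behaviors. The snippet below is only an illustration with made-up numbers: `partition_sizes` is a hypothetical helper, and it assumes every NULL value would land in one shared partition of the view.

```python
from collections import Counter
from typing import Any, Iterable


def partition_sizes(keys: Iterable[Any], keep_nulls: bool) -> Counter:
    """Count rows per view partition; None is treated as one shared partition."""
    return Counter(k for k in keys if keep_nulls or k is not None)


# Hypothetical data: 1000 rows with distinct values and 9000 rows with no value.
keys = [f"user{i}" for i in range(1000)] + [None] * 9000

with_nulls = partition_sizes(keys, keep_nulls=True)
without_nulls = partition_sizes(keys, keep_nulls=False)

print(with_nulls.most_common(1))     # the NULL partition dominates
print(without_nulls.most_common(1))  # every remaining partition holds one row
```

With NULLs kept, a single partition holds 90% of the rows in this made-up data set; with NULLs left out, the view stays evenly spread but no longer covers every base row, which is the trade-off to document.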

Now with 8099 supporting NULLs for clustering columns, this might only apply to 
columns that would be a partition key in the MV, and that seems appealing. But 
I can't talk myself into liking inserting nulls into a MV partition key.

> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Carl Yeksigian
>              Labels: cql
>             Fix For: 3.0 beta 1
>
>         Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
