[
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626761#comment-14626761
]
Tupshin Harper commented on CASSANDRA-6477:
-------------------------------------------
I find myself disagreeing with the hard requirement that all rows in the table
must show up in the materialized views. While it would be nice, I believe that
clearly documenting the limitation and providing a couple of reasonable choices
is far preferable to handing users enough rope to hang themselves.
My suggestion:
* Create a formal notion of NOT NULL columns in the schema that can be applied
to a table, irrespective of any MV usage.
* Columns that are NOT NULL would have the exact same restrictions as PK
columns, namely that they need to be included in all inserts and updates (with
the possible exception of LWT updates).
* Document (and warn in cqlsh) the fact that if you create an MV with a PK using
a nullable column from the table, then rows where that column is null will not
appear in the view.
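To make the three points above concrete, here is a rough sketch in Python. It is a toy in-memory model, not Cassandra code: the names Table, write, and build_view are hypothetical, and it assumes that a NOT NULL column is enforced on every write exactly like a PK column, and that a view simply skips rows whose view key is null.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


class Table:
    """Toy model of a table with PK and NOT NULL columns (hypothetical)."""

    def __init__(self, primary_key: List[str], not_null: Set[str]):
        self.primary_key = primary_key
        self.not_null = not_null
        self.rows: Dict[tuple, Row] = {}

    def write(self, row: Row) -> None:
        # NOT NULL columns get the same treatment as PK columns:
        # every insert or update must supply a non-null value for them.
        required = set(self.primary_key) | self.not_null
        missing = sorted(c for c in required if row.get(c) is None)
        if missing:
            raise ValueError(f"missing required columns: {missing}")
        key = tuple(row[c] for c in self.primary_key)
        self.rows[key] = dict(row)


def build_view(table: Table, view_key: str) -> Dict[Any, List[Row]]:
    """Group rows by view_key, leaving out rows where it is null."""
    view: Dict[Any, List[Row]] = {}
    for row in table.rows.values():
        value = row.get(view_key)
        if value is None:
            continue  # nullable view key: the row is absent from the view
        view.setdefault(value, []).append(row)
    return view


users = Table(primary_key=["id"], not_null={"email"})
users.write({"id": 1, "email": "a@example.com", "city": "Austin"})
users.write({"id": 2, "email": "b@example.com", "city": None})
users_by_city = build_view(users, "city")  # contains only the row with id 1
```

Here "city" is nullable, so the row with id 2 stays in the table but is missing from the view, which is the behavior that would need to be documented. A write that omits "email" is rejected outright.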
It seems to me that this is far less dangerous (and in many ways less
surprising) than automatically creating a hotspot in the MV because lots of
data with NULLs gets added.
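The hotspot concern can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: partition_sizes and NULL_PARTITION are hypothetical names, and it assumes that keeping NULLs means mapping every null value to one shared partition key in the view.

```python
from collections import Counter
from typing import Iterable, Optional

NULL_PARTITION = "<null>"  # hypothetical key shared by all null values


def partition_sizes(values: Iterable[Optional[str]], keep_nulls: bool) -> Counter:
    """Count rows per view partition for a column used as the view's partition key."""
    sizes: Counter = Counter()
    for value in values:
        if value is None:
            if keep_nulls:
                # Every null lands in the same partition.
                sizes[NULL_PARTITION] += 1
        else:
            sizes[value] += 1
    return sizes


column = ["a", "b", "c", None, None, None, None]
with_nulls = partition_sizes(column, keep_nulls=True)
without_nulls = partition_sizes(column, keep_nulls=False)
```

With keep_nulls=True the null partition holds four of the seven rows, while every other partition holds one. With keep_nulls=False those four rows are absent from the view and no partition is oversized.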
Now with 8099 supporting NULLs for clustering columns, this might only apply to
columns that would be a partition key in the MV, and that seems appealing. But
I can't talk myself into liking inserting nulls into a MV partition key.
> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
> Issue Type: New Feature
> Components: API, Core
> Reporter: Jonathan Ellis
> Assignee: Carl Yeksigian
> Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the
> index across the cluster is a Good Thing. However, for high-cardinality
> data, local indexes require querying most nodes in the cluster even if only a
> handful of rows is returned.