[ https://issues.apache.org/jira/browse/CASSANDRA-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139147#comment-15139147 ]
Jeremiah Jordan commented on CASSANDRA-9231:
--------------------------------------------
I think we probably have other issues to solve besides CASSANDRA-9754 for
multi-GB partitions to be viable. Won't you still have operational issues
around repairing and compacting them?
> Support Routing Key as part of Partition Key
> --------------------------------------------
>
> Key: CASSANDRA-9231
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9231
> Project: Cassandra
> Issue Type: Wish
> Reporter: Matthias Broecheler
>
> Provide support for sub-dividing the partition key into a routing key and a
> non-routing key component. Currently, all columns that make up the partition
> key of the primary key are also routing keys, i.e. they determine which nodes
> store the data. This proposal would give the data modeler the ability to
> designate only a subset of the columns that comprise the partition key to be
> routing keys. The non-routing key columns of the partition key identify the
> partition but are not used to determine where to store the data.
> Consider the following example table definition:
> CREATE TABLE foo (
>     a int,
>     b int,
>     c int,
>     d int,
>     PRIMARY KEY (([a], b), c)
> );
> (a,b) is the partition key, c is the clustering key, and d is just a column.
> In addition, the square brackets identify the routing key as column a. This
> means that only the value of column a is used to determine the node for data
> placement (i.e. only the value of column a is murmur3 hashed to compute the
> token). Column b is still needed to identify the partition but does not
> influence the placement.
> This has the benefit that all rows with the same routing key (but potentially
> different non-routing key columns of the partition key) are stored on the
> same node and that knowledge of such co-locality can be exploited by
> applications built on top of Cassandra.
> Currently, the only way to achieve co-locality is within a partition.
> However, this approach has the limitations that: a) there are theoretical and
> (more importantly) practical limitations on the size of a partition and b)
> rows within a partition are ordered and an index is built to exploit such
> ordering. For large partitions, that overhead is significant if ordering
> isn't needed.
> In other words, routing keys afford a simple means to achieve scalable
> node-level co-locality without ordering while clustering keys afford
> page-level co-locality with ordering. As such, they address different
> co-locality needs, giving the data modeler the flexibility to choose what is
> needed for their application.
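
To illustrate the placement rule proposed in the description above, here is a
rough Python sketch, not Cassandra code. Everything in it is an assumption made
for illustration: the names token, Ring and place are hypothetical, the
three-node ring and its tokens are invented, and a BLAKE2b digest stands in for
Murmur3 (which is not in the Python standard library), so the token values will
not match what Cassandra computes. The only point is that hashing just the
routing columns sends every partition sharing the same value of a to the same
node.

```python
import bisect
import hashlib


def token(values):
    """Stand-in token function (NOT Murmur3): a stable signed 64-bit hash."""
    data = "\x00".join(str(v) for v in values).encode("utf-8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big", signed=True)


class Ring:
    """Hypothetical token ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # node_tokens maps node name -> the single token that node owns.
        self._entries = sorted((t, n) for n, t in node_tokens.items())
        self._tokens = [t for t, _ in self._entries]

    def node_for(self, tok):
        # First node whose token is >= tok; wrap to the first node otherwise.
        i = bisect.bisect_left(self._tokens, tok)
        return self._entries[i % len(self._entries)][1]


def place(ring, partition_key, routing_columns):
    """Pick a node by hashing only the routing columns of the partition key."""
    routing_values = [partition_key[c] for c in routing_columns]
    return ring.node_for(token(routing_values))


# Invented three-node ring spread over the signed 64-bit token space.
ring = Ring({"node1": -(2**62), "node2": 0, "node3": 2**62})

# Two distinct partitions (a=1, b=1) and (a=1, b=2) share routing key a=1,
# so they land on the same node under the proposed scheme.
same_node = place(ring, {"a": 1, "b": 1}, ["a"]) == place(ring, {"a": 1, "b": 2}, ["a"])
```

Passing ["a", "b"] as routing_columns reproduces the current behaviour in this
sketch, where the whole partition key is hashed and the two partitions are no
longer guaranteed to share a node.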
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)