[
https://issues.apache.org/jira/browse/GORA-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned GORA-267:
-----------------------------------------
Assignee: Lewis John McGibbney
> Cassandra composite primary key support
> ---------------------------------------
>
> Key: GORA-267
> URL: https://issues.apache.org/jira/browse/GORA-267
> Project: Apache Gora
> Issue Type: Improvement
> Components: gora-cassandra
> Reporter: [email protected]
> Assignee: Lewis John McGibbney
> Labels: features
> Fix For: 0.6
>
> Attachments: gora-267.diff
>
>
> The extension makes it possible to define primary keys that are represented
> by Avro classes. A mapping specifies how fields of the key class are mapped
> to the components of composite partition keys and composite column names.
> This gives users more control over how data is distributed across Cassandra
> database structures. It is now possible to store data in wide rows with
> custom indexes that allow for fast range scans on a single node. There is
> also no longer any need for an order-preserving partitioner, which is likely
> to compromise data distribution in the Cassandra cluster.
>
> In essence, composite primary keys with identical partition parts are
> written to the same Cassandra row (which is essentially a partition). Within
> a row, entities are stored in lexical order of their cluster key
> components. Avro field names are appended as the last component of the
> composite column name. The current implementation does not replace super
> columns, so complex Avro fields are still mapped to super columns. Super
> column families use the same composite primary keys as simple column
> families. As Gora always fully loads nested complex types, the use of super
> column families is not really a problem. Still, super columns could be
> replaced by another level of column name components below the field
> qualifiers in future work. It would also be possible to rethink the
> decomposition of complex nested types beyond the first level.
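
The layout described above can be pictured with a small in-memory model. This
is only a sketch, not the gora-cassandra code: the field names (sensor_id, day,
timestamp), the sample values, and the helper functions are hypothetical, and
Python tuple ordering stands in for Cassandra's composite comparator.

```python
from collections import defaultdict

# Hypothetical mapping: which key fields form the partition key and which
# form the leading components of the composite column name.
PARTITION_FIELDS = ("sensor_id", "day")
CLUSTER_FIELDS = ("timestamp",)


def to_cells(key, record):
    """Turn one (key, record) pair into (row_key, column_name, value) cells."""
    row_key = tuple(key[f] for f in PARTITION_FIELDS)
    cluster = tuple(key[f] for f in CLUSTER_FIELDS)
    # The record's field name becomes the last column name component.
    return [(row_key, cluster + (field,), value) for field, value in record.items()]


def write(store, key, record):
    """Store every field of the record as one column in the key's row."""
    for row_key, column, value in to_cells(key, record):
        store[row_key][column] = value


def read_row(store, row_key):
    """Return the columns of one row, ordered by composite column name."""
    return sorted(store.get(row_key, {}).items())


store = defaultdict(dict)
write(store, {"sensor_id": "s1", "day": "d1", "timestamp": 200}, {"temp": 21.5, "hum": 40})
write(store, {"sensor_id": "s1", "day": "d1", "timestamp": 100}, {"temp": 20.0, "hum": 42})
write(store, {"sensor_id": "s2", "day": "d1", "timestamp": 100}, {"temp": 18.0, "hum": 55})

for column, value in read_row(store, ("s1", "d1")):
    print(column, value)
```

Both "s1" entities share the partition parts ("s1", "d1") and therefore end up
in one row, with their columns ordered by timestamp first and field name last.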
>
> The implementation uses the concept of Gora partitionQueries to decompose
> row scanning queries into a set of queries that each operate on a single
> row. However, such a decomposition is not always possible, and real range
> scans are limited to wide rows (partitions).
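
A minimal sketch of the decomposition idea, assuming the partition keys touched
by a scan can be listed up front. It does not use Gora's PartitionQuery API;
the function names and the sample rows are hypothetical.

```python
def partition_queries(partition_keys, start, end):
    """Split a scan over [start, end] into one range query per row.

    partition_keys must be an explicit list of row keys. If they cannot be
    enumerated (None here), the scan cannot be decomposed this way.
    """
    if partition_keys is None:
        raise ValueError("cannot decompose: partition keys are not enumerable")
    return [(row_key, start, end) for row_key in partition_keys]


def run_query(rows, query):
    """Range scan inside a single row, returning columns in sorted order."""
    row_key, start, end = query
    columns = rows.get(row_key, {})
    return [(c, v) for c, v in sorted(columns.items()) if start <= c <= end]


# Hypothetical wide rows: row key -> {cluster key value: stored value}
rows = {
    ("s1", "d1"): {100: "a", 200: "b", 300: "c"},
    ("s1", "d2"): {150: "d"},
}

queries = partition_queries([("s1", "d1"), ("s1", "d2")], 100, 200)
results = [run_query(rows, q) for q in queries]
print(results)
```

Each sub-query stays within one row, which is where the ordered column names
make a real range scan possible.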
>
> The implementation is fully backward compatible. Simple key classes can still
> be used and row scans are still possible with an order-preserving
> partitioner. All existing JUnit tests pass. Furthermore, I have added an
> example and some unit tests to demonstrate the use of composite primary keys
> for time series data.
>
> As mentioned earlier, we are happy to share this extension. I've created a
> JIRA issue for it (GORA-267) and will provide the implementation on GitHub
> (https://github.com/zirpins/gora/tree/GORA-267).
>
> Regards,
> Christian
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)