two dimensional slicing

Bryce Allen Fri, 20 Jan 2012 09:21:06 -0800

I'm storing very large versioned lists of names, and I'd like to
query a range of names within a given range of versions (a
two-dimensional slice) in a single query. This is easy to do using
ByteOrderedPartitioner, but seems to require multiple (non-parallel)
queries and extra CFs when using RandomPartitioner.


I see two approaches when using RP:

1) Data is stored in a super column family, with one dimension being
the super column names and the other the sub column names. Since
slicing on sub columns requires a list of super column names, a
second standard CF is needed to get a range of names before doing a
query on the main super CF. With CASSANDRA-2710, the same is possible
using a standard CF with composite types instead of a super CF.

2) If one of the dimensions is small, a two-dimensional slice isn't
required. The data can be stored in a standard CF with linear ordering
on a composite type (large_dimension, small_dimension). Data is queried
based on the large dimension only, and the client throws out the
columns that fall outside the requested range in the small dimension.
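
To make the two approaches concrete, here is a minimal in-memory sketch
in Python. It does not use any Cassandra client API; sorted Python lists
stand in for sorted columns, and every function name, parameter and data
layout below is hypothetical. It assumes names are the large dimension
(and the super column names in approach 1) and versions are the small
dimension (the sub column names).

```python
from bisect import bisect_left, bisect_right


def slice_two_step(name_index, super_row, name_start, name_end,
                   ver_start, ver_end):
    """Approach 1 (sketch): index lookup, then a sub column slice.

    name_index: sorted list of names, standing in for the second standard CF.
    super_row:  dict mapping name -> list of (version, value) sub columns.
    """
    # Query 1: get the range of names from the index CF.
    lo = bisect_left(name_index, name_start)
    hi = bisect_right(name_index, name_end)
    names = name_index[lo:hi]
    # Query 2: for exactly those super columns, keep the versions in range.
    return {
        name: [(ver, val) for ver, val in super_row.get(name, [])
               if ver_start <= ver <= ver_end]
        for name in names
    }


def slice_composite(columns, name_start, name_end, ver_start, ver_end):
    """Approach 2 (sketch): one slice on the large dimension, client filter.

    columns: list of ((name, version), value) pairs sorted by the composite
             key, standing in for one row of a composite-typed standard CF.
    Returns (fetched, kept) so the over-fetch is visible.
    """
    keys = [key for key, _ in columns]
    # "Server side": a single contiguous slice bounded by name only.
    lo = bisect_left(keys, (name_start,))
    hi = bisect_right(keys, (name_end, float("inf")))
    fetched = columns[lo:hi]
    # "Client side": throw out versions outside the requested range.
    kept = [(key, val) for key, val in fetched
            if ver_start <= key[1] <= ver_end]
    return fetched, kept
```

In the second function the difference between len(fetched) and len(kept)
is the data transferred and then discarded, which is why this approach
only makes sense when the second dimension is small.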

Neither of the above solutions is ideal. Does anyone else have a use
case where two-dimensional slicing is useful? Given the disadvantages of
BOP, is it practical to make the composite column query model richer to
support this sort of use case?

Thanks,
Bryce
