Re: two dimensional slicing

aaron morton Mon, 23 Jan 2012 00:00:38 -0800

It depends a bit on the data and the query patterns. 

* How many versions do you have?
* How many names are in each version?
* When querying, do you know the version numbers you want to query from? How 
many are there normally?
* How frequent are the updates and the reads?


I would lean towards using two standard CFs, one to list all the version 
numbers (in a single row probably) and one to hold the names in a particular 
version. 

To run your query, slice the first CF and then run multigets against the second. 
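
As a rough illustration of that two-step read, here is a minimal sketch in plain Python. It is an in-memory model, not Cassandra client code: the dicts `versions_cf` and `names_cf`, the row key "all_versions", and both helper functions are hypothetical stand-ins for the two column families and the slice / multiget calls.

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in for the first CF: a single row whose column
# names are the version numbers, kept in sorted order.
versions_cf = {"all_versions": [1, 2, 3, 5, 8]}

# Hypothetical stand-in for the second CF: one row per version, whose
# column names are the names in that version, kept in sorted order.
names_cf = {
    1: ["alice", "bob"],
    2: ["alice", "bob", "carol"],
    3: ["bob", "carol", "dave"],
    5: ["carol", "dave", "erin"],
    8: ["dave", "erin", "frank"],
}


def slice_sorted(columns, start, finish):
    """Return the columns with start <= column <= finish (inclusive)."""
    lo = bisect_left(columns, start)
    hi = bisect_right(columns, finish)
    return columns[lo:hi]


def two_dimensional_slice(version_start, version_finish, name_start, name_finish):
    # Step 1: slice the version index row to learn which row keys exist.
    versions = slice_sorted(versions_cf["all_versions"], version_start, version_finish)
    # Step 2: the "multiget": apply the same name slice to every version row.
    return {v: slice_sorted(names_cf[v], name_start, name_finish) for v in versions}


result = two_dimensional_slice(2, 5, "bob", "dave")
```

In this model the second step only touches rows that the first step found, which is why the version index needs to be read first.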

That's probably not the best solution; if you can add some more info, it may get 
better. 

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/01/2012, at 6:20 AM, Bryce Allen wrote:

> I'm storing very large versioned lists of names, and I'd like to
> query a range of names within a given range of versions, which is a two
> dimensional slice, in a single query. This is easy to do using
> ByteOrderedPartitioner, but seems to require multiple (non parallel)
> queries and extra CFs when using RandomPartitioner.
> 
> I see two approaches when using RP:
> 
> 1) Data is stored in a super column family, with one dimension being
> the super column names and the other the sub column names. Since
> slicing on sub columns requires a list of super column names, a
> second standard CF is needed to get a range of names before doing a
> query on the main super CF. With CASSANDRA-2710, the same is possible
> using a standard CF with composite types instead of a super CF.
> 
> 2) If one of the dimensions is small, a two dimensional slice isn't
> required. The data can be stored in a standard CF with linear ordering
> on a composite type (large_dimension, small_dimension). Data is queried
> based on the large dimension, and the client throws out the extra data
> in the other dimension.
> 
> Neither of the above solutions is ideal. Does anyone else have a use
> case where two dimensional slicing is useful? Given the disadvantages of
> BOP, is it practical to make the composite column query model richer to
> support this sort of use case?
> 
> Thanks,
> Bryce
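
For reference, approach 2 in the quoted message could be modelled along these lines. This is a plain-Python sketch, not Cassandra client code, and it assumes names are the large dimension and versions the small one; the `row` data and the `slice_then_filter` helper are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical single row: column names are (name, version) composites,
# stored in sorted order with the large dimension (name) first.
row = sorted([
    ("alice", 1), ("alice", 2),
    ("bob", 1), ("bob", 2), ("bob", 3),
    ("carol", 2), ("carol", 3),
    ("dave", 3),
])


def slice_then_filter(columns, name_start, name_finish, version_start, version_finish):
    # One contiguous slice on the large dimension only. The one-element
    # tuple sorts before every composite that starts with name_start, and
    # the infinity sentinel sorts after every version of name_finish.
    lo = bisect_left(columns, (name_start,))
    hi = bisect_right(columns, (name_finish, float("inf")))
    fetched = columns[lo:hi]
    # The client discards columns outside the wanted version range.
    kept = [(n, v) for (n, v) in fetched if version_start <= v <= version_finish]
    return fetched, kept


fetched, kept = slice_then_filter(row, "bob", "carol", 2, 3)
```

The gap between `len(fetched)` and `len(kept)` is the extra data read and thrown away, which stays small only while the second dimension is small.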
