Re: Cassandra range scans

Jeremey.Barrett Mon, 22 Feb 2010 12:24:19 -0800

On Feb 22, 2010, at 12:19 AM, ext Jonathan Ellis wrote:

>>  2) is the row key model I suggested above the best approach in Cassandra, 
>> or is there something better? My testing so far has been using 
>> get_range_slice with a ColumnParent of just the CF and SlicePredicate 
>> listing the columns I want (though really I want all columns, is there a 
>> shorthand for that?)
> 
> Cassandra deals fine with millions of columns per row, and allows
> prefix queries on columns too.  So an alternate model would be to have
> userX as row key, and column keys "A:1, A:2, A:3, ..., B:1, B:2, B:3,
> ...".  This will be marginally faster than splitting by row, and has
> the added advantage of not requiring OPP.
> 
> You could use supercolumns here too (where the supercolumn name is the
> thing type).  If you always want to retrieve all things of type A at a
> time per user, then that is a more natural fit.  (Otherwise, the lack
> of subcolumn indexing could be a performance gotcha for you:
> http://issues.apache.org/jira/browse/CASSANDRA-598).
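
To check my understanding of the single-row model, here is a minimal sketch in plain Python. It is a toy, not the Thrift API: the row is a dict, and slice_by_prefix is a hypothetical helper that mimics a column slice over sorted column names. I am assuming column names sort lexicographically, so numeric suffixes would need zero-padding ("A:02" before "A:10") to come back in numeric order.

```python
from bisect import bisect_left

# Toy model of one row: Cassandra keeps the columns of a row sorted by name,
# so every column sharing a prefix sits in one contiguous run.
def slice_by_prefix(columns, prefix):
    """Return (name, value) pairs whose column name starts with prefix."""
    names = sorted(columns)
    start = bisect_left(names, prefix)  # first name >= prefix
    result = []
    for name in names[start:]:
        if not name.startswith(prefix):
            break  # past the end of the contiguous run
        result.append((name, columns[name]))
    return result

# Hypothetical row for "userX", using the "type:id" column naming from above.
user_row = {"A:1": "a1", "A:2": "a2", "B:1": "b1", "B:2": "b2"}
type_a = slice_by_prefix(user_row, "A:")
```

With supercolumns, the "A" and "B" prefixes would instead become supercolumn names, each holding its own set of subcolumns.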


Would you say the supercolumn approach is faster than scanning rows? Are there 
any particular advantages or disadvantages to writing to a bunch of supercolumns 
at once (e.g. in one user row) vs. writing to a bunch of rows at once (with the 
same key prefix, i.e. close together in an order-preserving store)?

> 
>>  3) schema changes (i.e. adding a new CF)... seems like currently you take 
>> the whole cluster down to accomplish this... is that likely to change in the 
>> future?
> 
> You have to take each node down, but a rolling restart is fine.  No
> reason for the whole cluster to be down at once.
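
For my own notes, the rolling restart could be sketched roughly as below. This is only an outline under assumptions: restart and is_up are hypothetical callables standing in for whatever actually stops/starts a node and checks that it has rejoined, and a real script would sleep between health checks.

```python
def rolling_restart(nodes, restart, is_up, max_checks=10):
    """Restart nodes one at a time, waiting for each to come back up.

    restart and is_up are caller-supplied (hypothetical) callables; nothing
    here talks to Cassandra directly.
    """
    done = []
    for node in nodes:
        restart(node)
        # Poll until the node reports healthy; give up after max_checks.
        for _ in range(max_checks):
            if is_up(node):
                break
        else:
            # Stop here so at most one node is down at any time.
            raise RuntimeError(f"{node} did not come back; stopping")
        done.append(node)
    return done
```

The point is just that only one node is down at any moment, so the cluster as a whole stays available while the new CF definition is picked up.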

OK, that's not a big deal.

Extremely helpful... thanks for the response!

Jeremey.
