No problems.
IMHO you should develop a sizable bruise banging your head against using
Standard CFs and the RandomPartitioner before using something else.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 23/12/2011, at 6:29 AM, Bryce wrote:
Thanks, that definitely has advantages over using a super column. We
ran into thrift timeouts when the super column got large, and with the
super column range query there is no way (AFAIK) to batch the request at
the subcolumn level.
-Bryce
On Thu, 22 Dec 2011 10:06:58 +1300, aaron morton wrote:
AFAIK there are no plans to kill the BOP, but I would still try to make your life
easier by using the RP.
My understanding of the problem is: at certain times you snapshot the files in a
dir, and the main query you want to handle is "At what points between time t0
and time t1 did files x, y and z exist?"
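Something like the following would express that against the composite layout
(a rough sketch with the Hector client; the CF name "Snapshots", the serializers
and the 1000-column page size are placeholders, not from your schema): slice the
columns from (t0) up to just past (t1), then keep the entries for the files you
care about.

import java.util.Set;
import java.util.UUID;

import me.prettyprint.cassandra.serializers.CompositeSerializer;
import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.serializers.UUIDSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.Composite;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.SliceQuery;

public class SnapshotRangeQuery {

    // "At what points between t0 and t1 did the given files exist?"
    static void filesBetween(Keyspace ks, UUID dirKey, long t0, long t1, Set<String> files) {
        // Start the slice at (t0) and end it just past (t1) so every
        // (timestamp, dir_entry) column with t0 <= timestamp <= t1 comes back.
        Composite start = new Composite();
        start.addComponent(t0, LongSerializer.get());
        Composite end = new Composite();
        end.addComponent(t1 + 1, LongSerializer.get());

        SliceQuery<UUID, Composite, String> q = HFactory.createSliceQuery(
                ks, UUIDSerializer.get(), CompositeSerializer.get(), StringSerializer.get());
        q.setColumnFamily("Snapshots");      // assumed CF name
        q.setKey(dirKey);
        q.setRange(start, end, false, 1000); // arbitrary page size

        // Filter to the files of interest on the client side.
        for (HColumn<Composite, String> col : q.execute().get().getColumns()) {
            long ts = col.getName().get(0, LongSerializer.get());
            String entry = col.getName().get(1, StringSerializer.get());
            if (files.contains(entry)) {
                System.out.println(entry + " existed at " + ts);
            }
        }
    }
}

The file-name filter still has to happen on the client (or as one narrower slice
per timestamp); the slice itself only bounds the timestamp component.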
Generally, RandomPartitioner is the recommended one.
If you already provide randomized keys it doesn't make much of a
difference; the nodes should be balanced with any partitioner.
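Roughly what happens under the hood: the RandomPartitioner derives each row's
token from an MD5 hash of the raw key bytes, so a key that is already a random
UUID simply gets hashed again. A stand-alone sketch of that (plain JDK, nothing
Cassandra-specific; it assumes the UUID is sent as its 16 raw bytes):

import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.util.UUID;

public class TokenSketch {
    public static void main(String[] args) throws Exception {
        UUID key = UUID.randomUUID();
        // Serialize the UUID the way a client typically does: 16 big-endian bytes.
        byte[] raw = ByteBuffer.allocate(16)
                .putLong(key.getMostSignificantBits())
                .putLong(key.getLeastSignificantBits())
                .array();
        // RandomPartitioner tokens are (the absolute value of) the 128-bit MD5 digest.
        BigInteger token = new BigInteger(MessageDigest.getInstance("MD5").digest(raw)).abs();
        System.out.println(key + " -> token " + token);
    }
}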
However, unless you have UUIDs in all keys of all column families
(highly unlikely), ByteOrderedPartitioner and OrderPreservingPartitioner
are likely to leave the ring unbalanced.
I think it comes down to how much you benefit from row range scans, and
how confident you are that going forward all data will continue to use
random row keys.
I'm considering using BOP as a way of working around the non-indexed
super column limitation. In my current schema, row keys are random
Bryce,
Have you considered using CompositeColumns and a standard CF? Row key
is the UUID, column name is (timestamp : dir_entry); you can then slice all
columns with a particular timestamp.
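A rough sketch of the write side with Hector (the CF name "Snapshots" and a
comparator of CompositeType(LongType, UTF8Type) are assumptions, not from your
schema): row key is the UUID, and each dir entry in a snapshot becomes one
(timestamp : dir_entry) composite column with an empty value.

import java.util.Set;
import java.util.UUID;

import me.prettyprint.cassandra.serializers.CompositeSerializer;
import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.serializers.UUIDSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.Composite;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class SnapshotWriter {

    // Record one snapshot: every dir entry becomes one composite column.
    static void recordSnapshot(Keyspace ks, UUID dirKey, long timestamp, Set<String> entries) {
        Mutator<UUID> mutator = HFactory.createMutator(ks, UUIDSerializer.get());
        for (String entry : entries) {
            Composite name = new Composite();
            name.addComponent(timestamp, LongSerializer.get());
            name.addComponent(entry, StringSerializer.get());
            mutator.addInsertion(dirKey, "Snapshots",
                    HFactory.createColumn(name, "", CompositeSerializer.get(), StringSerializer.get()));
        }
        mutator.execute();
    }
}

Slicing one timestamp is then a single range over the first component, and the
columns come back ordered by (timestamp, dir_entry) so they page naturally.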
Even if you have a random key, I would use the RP unless you have an
extreme use case.
I wasn't aware of CompositeColumns, thanks for the tip. However, I think
it still doesn't allow me to do the query I need - basically I need to
do a timestamp range query, limiting only to certain file names at
each timestamp. With BOP and a separate row for each timestamp,
prefixed by a random
Hey Guys,
I just came across http://wiki.apache.org/cassandra/ByteOrderedPartitioner and
it got me thinking. If the row keys are java.util.UUIDs which are generated
randomly (and securely), then what type of partitioner would be the best? Since
the key values are already random, would it make a difference?