Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-23 Thread aaron morton
No problems. IMHO you should develop a sizable bruise banging your head against using Standard CFs and the Random Partitioner before using something else. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 23/12/2011, at 6:29 AM, Bryce

Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-22 Thread Bryce Allen
Thanks, that definitely has advantages over using a super column. We ran into Thrift timeouts when the super column got large, and with the super column range query there is no way (AFAIK) to batch the request at the subcolumn level. -Bryce On Thu, 22 Dec 2011 10:06:58 +1300 aaron morton

Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-21 Thread aaron morton
AFAIK there are no plans to kill the BOP, but I would still try to make your life easier by using the RP. My understanding of the problem is that at certain times you snapshot the files in a dir, and the main query you want to handle is: At what points between time t0 and time t1 did files x, y and z

Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-20 Thread Filipe Gonçalves
Generally, RandomPartitioner is the recommended one. If you already provide randomized keys it doesn't make much of a difference; the nodes should be balanced with any partitioner. However, unless you have UUIDs in all keys of all column families (highly unlikely), ByteOrderedPartitioner and
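To illustrate the balance point above, here is a rough sketch, not Cassandra code. It assumes a hypothetical 4-node ring with equal-sized ranges, and it models the RandomPartitioner as "MD5 of the key" (my understanding of how it derives tokens) and the byte-ordered case as "the raw key bytes". The names NODES, random_uuid, node_by_key_bytes and node_by_md5 are made up for the example.

```python
import hashlib
import random
import uuid
from collections import Counter

NODES = 4                # hypothetical ring size
rng = random.Random(42)  # fixed seed so the sketch is repeatable


def random_uuid():
    # Build a version-4 UUID from seeded random bits (illustration only;
    # real keys would come from a secure generator).
    return uuid.UUID(int=rng.getrandbits(128), version=4)


def node_by_key_bytes(key):
    # Byte-ordered style: split the raw 128-bit key space into equal ranges.
    return (key.int * NODES) >> 128


def node_by_md5(key):
    # Hash-based style: MD5 the key bytes, then split the hash space equally.
    token = int.from_bytes(hashlib.md5(key.bytes).digest(), "big")
    return (token * NODES) >> 128


keys = [random_uuid() for _ in range(20000)]
ordered = Counter(node_by_key_bytes(k) for k in keys)
hashed = Counter(node_by_md5(k) for k in keys)

for node in range(NODES):
    print(node, ordered[node], hashed[node])
```

With random UUID keys both columns should come out roughly even. The difference only shows up once some column family has keys that are not random, in which case the byte-ordered split can become lopsided while the hashed one should not.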

Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-20 Thread Bryce Allen
I think it comes down to how much you benefit from row range scans, and how confident you are that going forward all data will continue to use random row keys. I'm considering using BOP as a way of working around the non-indexed super column limitation. In my current schema, row keys are random

Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-20 Thread aaron morton
Bryce, have you considered using CompositeColumns and a standard CF? The row key is the UUID and the column name is (timestamp : dir_entry); you can then slice all columns with a particular timestamp. Even if you have a random key, I would use the RP unless you have an extreme use case.
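The layout suggested above can be sketched with a small in-memory model. This is not the Cassandra or Thrift API: the Row class and its methods are hypothetical, integer timestamps are an assumption, and a sorted list of tuples stands in for columns ordered by their composite name.

```python
import bisect


class Row:
    """Stand-in for one row of a standard CF, keyed elsewhere by a UUID.

    Columns are kept sorted by composite name (timestamp, dir_entry).
    """

    def __init__(self):
        self._names = []   # sorted list of (timestamp, dir_entry)
        self._values = {}  # composite name -> column value

    def insert(self, timestamp, dir_entry, value):
        name = (timestamp, dir_entry)
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice_timestamp(self, timestamp):
        # All columns whose first component equals `timestamp`.
        # (timestamp,) sorts before every (timestamp, x), so the two
        # bisect calls bracket exactly that prefix.
        lo = bisect.bisect_left(self._names, (timestamp,))
        hi = bisect.bisect_left(self._names, (timestamp + 1,))
        return [(n, self._values[n]) for n in self._names[lo:hi]]


row = Row()
row.insert(100, "a.txt", "h1")
row.insert(200, "a.txt", "h3")
row.insert(100, "b.txt", "h2")
snapshot = row.slice_timestamp(100)
print(snapshot)
```

Because the timestamp is the first component of the column name, one snapshot is a contiguous run of columns. That is what makes the slice cheap, and it does not depend on the partitioner.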

Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-20 Thread Bryce Allen
I wasn't aware of CompositeColumns, thanks for the tip. However, I think it still doesn't allow me to do the query I need - basically I need to do a timestamp range query, limited to certain file names at each timestamp. With BOP and a separate row for each timestamp, prefixed by a random

Choosing a Partitioner Type for Random java.util.UUID Row Keys

2011-12-19 Thread Drew Kutcharian
Hey Guys, I just came across http://wiki.apache.org/cassandra/ByteOrderedPartitioner and it got me thinking. If the row keys are java.util.UUID which are generated randomly (and securely), then what type of partitioner would be the best? Since the key values are already random, would it make a