Re: How does cassandra page through low cardinality indexes?

2014-05-29 Thread Robert Coli
On Fri, May 16, 2014 at 10:53 AM, Kevin Burton bur...@spinn3r.com wrote: I'm struggling with cassandra secondary indexes since the documentation seems all over the place and I'm having to put together everything from blog posts. This mostly-complete summary content will eventually make it

Re: How does cassandra page through low cardinality indexes?

2014-05-29 Thread DuyHai Doan
Hello Robert, there are some maths involved when considering the performance of secondary indexes in C*. First, the current implementation is a distributed secondary index, meaning that each node that contains actual data also contains the index data. So considering a cluster of *N* nodes with

Re: How does cassandra page through low cardinality indexes?

2014-05-29 Thread Robert Coli
On Thu, May 29, 2014 at 1:08 PM, DuyHai Doan doanduy...@gmail.com wrote: Hello Robert There are some maths involved when considering the performance of secondary index in C* Yes, these are the maths which are behind my FIXMEs in the original post. I merely have not had time to explicitly

Re: How does cassandra page through low cardinality indexes?

2014-05-29 Thread Paulo Ricardo Motta Gomes
Really informative thread, thank you! We had a secondary index trauma a while ago, and since then we have known it is not a good idea in most cases, but now it's even clearer why. On Thu, May 29, 2014 at 5:31 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, May 29, 2014 at 1:08 PM,

Re: How does cassandra page through low cardinality indexes?

2014-05-16 Thread DuyHai Doan
Hello Kevin, for the internal workings of secondary indexes and LIMIT, you can have a look at this: https://issues.apache.org/jira/browse/CASSANDRA-5975 The comments and attached patch will give you a hint on how LIMIT is implemented. Alternatively you can look directly in the source code

How does cassandra page through low cardinality indexes?

2014-05-16 Thread Kevin Burton
I'm struggling with Cassandra secondary indexes since the documentation seems all over the place and I'm having to piece everything together from blog posts. Anyway, if I have a low-cardinality index of, say, 10 values and 1M records, this means each secondary index key will have references to
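
To put rough numbers on the question above, here is a minimal back-of-the-envelope sketch in Python. It assumes the 1M records are spread evenly across the 10 indexed values, which the post does not state. The page size is a made-up figure for illustration only, not a Cassandra default, and the function names are hypothetical.

```python
import math


def rows_per_index_value(total_rows: int, distinct_values: int) -> int:
    """Average number of rows behind one index key.

    Assumes an even spread of rows across the indexed values (an
    assumption for illustration; real data is often skewed).
    """
    return total_rows // distinct_values


def pages_needed(rows: int, page_size: int) -> int:
    """Number of fixed-size pages required to read `rows` rows."""
    return math.ceil(rows / page_size)


total_rows = 1_000_000   # "1M records" from the post
distinct_values = 10     # "10 values" from the post
page_size = 5_000        # hypothetical page size, chosen only for the example

per_value = rows_per_index_value(total_rows, distinct_values)
print(per_value)                           # 100000
print(pages_needed(per_value, page_size))  # 20
```

Under these assumptions each index key points at roughly 100,000 rows, so any query on a single indexed value has to be paged rather than returned in one response.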