Hi Jonathan,

Thanks for the explanation.

Thanks,
KajMagnus

On Mon, Sep 5, 2011 at 11:05 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> The first node can answer the question as long as you've requested
> less rows than the first node has on it. Hence the "low cardinality"
> point in what you quoted.
>
> On Sat, Sep 3, 2011 at 5:00 AM, Kaj Magnus Lindberg
> <kajmagnu...@gmail.com> wrote:
>> Hello Anyone
>>
>> I have a follow up question on a question from February 2011. In
>> short, I wonder why one won't have to query all Cassandra nodes when
>> doing a secondary index lookup -- although each node only indexes data
>> that it holds locally.
>>
>> The question and answer was:
>> ( http://www.mail-archive.com/user@cassandra.apache.org/msg10506.html )
>> === Question ===
>> As far as I understand automatic secondary indexes are generated for
>> node local data.
>> In this case query by secondary index involve all nodes storing part of
>> column family to get results (?) so (if i am right) if data is spread across
>> 50 nodes then 50 nodes are involved in single query?
>> [...]
>> === Answer ===
>> In practice, local secondary indexes scale to {RF * the limit of a single
>> machine} for -low cardinality- values (ex: users living in a certain state)
>> since the first node is likely to be able to answer your question. This also
>> means they are good for performing filtering for analytics.
>> [...]
>>
>> === Now I wonder ===
>> Why would the first node be likely to be able to answer the question?
>> It stores only index entries for users on that particular machine,
>> (says http://wiki.apache.org/cassandra/SecondaryIndexes:
>> "Each node only indexes data that it holds locally" )
>> but users might be stored by user name? And would thus be stored on
>> many different machines? Even if they happen to live in the same
>> state?
>>
>> Why won't the client need to query the indexes of [all servers that
>> store info on users] to find all relevant users, when doing a user
>> property lookup?
>>
>>
>> Best regards, KajMagnus
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
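
For illustration, the point about row limits and low cardinality can be sketched as a toy model in Python. This is not Cassandra code: the modulo "partitioner", the function names, and the assumption that nodes are asked one after another until the requested number of rows is collected are all hypothetical simplifications, and replication is ignored.

```python
# Toy model of node-local secondary indexes (hypothetical names, not Cassandra APIs).

def build_cluster(users, num_nodes):
    """Place each row on a node and index it there, and only there."""
    nodes = [dict() for _ in range(num_nodes)]
    for user_id, state in users:
        # Stand-in for the real partitioner: row key modulo node count.
        local_index = nodes[user_id % num_nodes]
        local_index.setdefault(state, []).append(user_id)
    return nodes

def indexed_query(nodes, state, limit):
    """Ask nodes in turn; stop as soon as `limit` matching rows are found."""
    rows, contacted = [], 0
    for local_index in nodes:
        contacted += 1
        rows.extend(local_index.get(state, []))
        if len(rows) >= limit:
            break
    return rows[:limit], contacted

# 1000 users on 10 nodes; about half live in "CA", half in "NY",
# and exactly one (stored on the last node) lives in "WY".
users = [(i, "CA" if (i // 10) % 2 == 0 else "NY") for i in range(1000)]
users[999] = (999, "WY")
nodes = build_cluster(users, 10)

print(indexed_query(nodes, "CA", 10)[1])    # 1: the first node alone has enough rows
print(indexed_query(nodes, "CA", 120)[1])   # 3: more rows requested than one node holds
print(indexed_query(nodes, "WY", 1)[1])     # 10: rare value, every node gets asked
```

In this sketch a common value with a small limit is answered by the first node, while a large limit or a rare value forces the query onto more nodes.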