Re: Secondary index read/write explanation

2012-09-07 Thread Sam Tunnicliffe
On 7 September 2012 00:42, aaron morton aa...@thelastpickle.com wrote:
 1.  When a write request is received, it is written to the base CF and
 the secondary index is written to a secondary (hidden) CF. If this is right,
 will the secondary index be written locally on the node, or will it follow
 RP/OPP to write to nodes?

 It's local.
 If an index is to be updated, the previous column values must be read from
 the primary CF so they can be deleted from the secondary index CF before
 inserting the new values.

https://issues.apache.org/jira/browse/CASSANDRA-2897 (in trunk)
removes that read of the previously indexed values from the update
path.
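
To make the difference concrete, here is a toy sketch of an index update that skips the read of the old value. The message above only says the read is removed from the update path; how stale entries are then dealt with is my assumption here (they are filtered and dropped when the index is queried), and LazyIndexNode and its methods are made-up names, not Cassandra classes.

```python
from collections import defaultdict


class LazyIndexNode:
    """Hypothetical toy node: base CF plus a local index CF, no read on write."""

    def __init__(self):
        self.base = {}                  # row key -> {column name: value}
        self.index = defaultdict(set)   # (column name, value) -> row keys

    def write(self, key, column, value):
        # No lookup of the previous value: just write the row and the new
        # index entry. An older entry for this key may now be stale.
        self.base.setdefault(key, {})[column] = value
        self.index[(column, value)].add(key)

    def lookup(self, column, value):
        hits = []
        # sorted() copies the keys, so discarding while looping is safe.
        for key in sorted(self.index[(column, value)]):
            if self.base.get(key, {}).get(column) == value:
                hits.append(key)
            else:
                # Assumed cleanup: drop the stale entry when it is noticed.
                self.index[(column, value)].discard(key)
        return hits


lazy = LazyIndexNode()
lazy.write("user1", "state", "UT")
lazy.write("user1", "state", "CA")    # leaves a stale ("state", "UT") entry
print(lazy.lookup("state", "UT"))     # []
print(lazy.lookup("state", "CA"))     # ['user1']
```

The trade-off in this sketch is a cheaper write in exchange for a little extra checking on the read side.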






Re: Secondary index read/write explanation

2012-09-06 Thread aaron morton
 1.  When a write request is received, it is written to the base CF and
 the secondary index is written to a secondary (hidden) CF. If this is right,
 will the secondary index be written locally on the node, or will it follow
 RP/OPP to write to nodes?
It's local.
If an index is to be updated, the previous column values must be read from the
primary CF so they can be deleted from the secondary index CF before inserting
the new values.
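
A toy sketch of that update path, in case it helps. This is not Cassandra code: Node, write and lookup are hypothetical names, and the "index CF" is just a dict kept on the same node as the base rows.

```python
from collections import defaultdict


class Node:
    """Hypothetical toy node: a base CF plus a local (hidden) index CF."""

    def __init__(self):
        self.base = {}                  # row key -> {column name: value}
        self.index = defaultdict(set)   # (column name, value) -> row keys on this node

    def write(self, key, column, value):
        row = self.base.setdefault(key, {})
        # Read the previously indexed value from the base CF ...
        old = row.get(column)
        if old is not None and old != value:
            # ... so the old entry can be deleted from the index CF.
            self.index[(column, old)].discard(key)
        # Then write the new value and its index entry, both locally.
        row[column] = value
        self.index[(column, value)].add(key)

    def lookup(self, column, value):
        return sorted(self.index[(column, value)])


node = Node()
node.write("user1", "state", "UT")
node.write("user2", "state", "UT")
node.write("user1", "state", "CA")    # update: the old "UT" entry for user1 goes away
print(node.lookup("state", "UT"))     # ['user2']
print(node.lookup("state", "CA"))     # ['user1']
```

The point is that the index entry lives next to the row it describes, so the index write never has to go to another node.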

 2.  When a coordinator receives a read request with say predicate x=y where 
 column x is the secondary index, how does the coordinator query relevant 
 node(s)? How does it avoid sending it to all nodes if it is locally indexed?

When you ask for x=y, the coordinator has no idea where the rows for that query
exist in the cluster. If you ask at CL ONE it only does a local read. If you ask
at a higher CL it asks CL nodes for each TokenRange in the cluster, or for a
restricted token range if you have a key restriction in the query.
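
Roughly, the fan-out looks like the sketch below. It is only an illustration under simplifying assumptions: the ring is a list of (token range, replicas) pairs, each replica is a plain dict standing in for a node's local index lookup, and indexed_read is a hypothetical name. It does not model the CL ONE local-read case.

```python
def indexed_read(ring, column, value, cl=1, key_range=None):
    """Return (matching row keys, number of replicas contacted).

    ring: list of (token_range, [replica, ...]) pairs (toy structure).
    replica: dict of row key -> {column name: value}.
    """
    results = {}
    contacted = 0
    for token_range, replicas in ring:
        if key_range is not None and token_range != key_range:
            continue  # a key restriction limits which ranges must be covered
        for replica in replicas[:cl]:
            contacted += 1
            # Stands in for the replica consulting its own local index.
            for key, row in replica.items():
                if row.get(column) == value:
                    results[key] = row
    return sorted(results), contacted


a = {"k1": {"x": "y"}}
b = {"k2": {"x": "z"}}
c = {"k3": {"x": "y"}}
ring = [("r1", [a, a]), ("r2", [b, b]), ("r3", [c, c])]

keys, contacted = indexed_read(ring, "x", "y", cl=2)
print(keys, contacted)      # ['k1', 'k3'] 6

keys_r3, contacted_r3 = indexed_read(ring, "x", "y", cl=1, key_range="r3")
print(keys_r3, contacted_r3)  # ['k3'] 1
```

Because nothing tells the coordinator which range holds matching rows, the unrestricted query has to touch every range.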

 If there is any article/blog that can help understand this better, please let 
 me know.

I think this is still mostly relevant 
http://www.datastax.com/docs/0.7/data_model/secondary_indexes

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

 



Secondary index read/write explanation

2012-09-05 Thread Venkat Rama
Hi All,

I am a newbie to Cassandra and am trying to understand how secondary indexes
work.  I have been going over the discussion on
https://issues.apache.org/jira/browse/CASSANDRA-749 about local secondary
indexes, and an interesting question at
http://www.mail-archive.com/user@cassandra.apache.org/msg16966.html.  The
discussion seems to assume that the most common use cases are ones with range
queries.  Is this right?

I am trying to understand the low-cardinality reasoning and how the read
gets executed.  I have the following questions; I hope I can explain them
well :)

1.  When a write request is received, it is written to the base CF and
the secondary index is written to a secondary (hidden) CF. If this is right,
will the secondary index be written locally on the node, or will it follow
RP/OPP to write to nodes?
2.  When a coordinator receives a read request with, say, predicate x=y where
column x has a secondary index, how does the coordinator query the relevant
node(s)? How does it avoid sending it to all nodes if it is locally indexed?

If there is any article/blog that can help understand this better, please
let me know.

Thanks again in advance.

VR