
Re: Node selection when both partition key and secondary index field constrained?

2013-01-31 Thread aaron morton

> So basically it's merging the results of 2 separate queries: Indexed scan
> (token-range) intersect foo.flag_index=true

No.
It is doing one query, on the secondary index. When it reads the row keys in
that index, it discards any that are outside of the token range.
That query is sent to the nodes whose token ranges intersect with the
token range you have supplied.

So if your query token range is included in one node's token range, the query
will be sent to CL nodes (as many as the consistency level requires) that
replicate that token range.
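
A minimal sketch of that node selection, under stated assumptions: the ring layout, the node names and the function `nodes_for_token_range` are hypothetical, and the placement rule (the owner of a range plus the next RF - 1 nodes on the ring) is a simplified single-data-center model, not Cassandra's actual code.

```python
from bisect import bisect_left, bisect_right


def nodes_for_token_range(ring, left, right, rf):
    """Return the nodes that replicate any token in (left, right].

    ring: list of (node_token, node_name) sorted by token. Each node owns the
    tokens up to and including its own token, and the next rf - 1 nodes on
    the ring hold the other replicas (simplified placement, an assumption).
    Assumes left < right, i.e. the query range does not wrap around the ring.
    """
    tokens = [t for t, _ in ring]
    n = len(ring)
    first = bisect_right(tokens, left)  # owner of the smallest token > left
    last = bisect_left(tokens, right)   # owner of `right` itself
    selected = []
    for primary in range(first, last + 1):
        for step in range(rf):
            name = ring[(primary + step) % n][1]
            if name not in selected:
                selected.append(name)
    return selected


# Hypothetical 6-node ring with evenly spaced tokens.
ring = [(-300, "n1"), (-200, "n2"), (-100, "n3"),
        (0, "n4"), (100, "n5"), (200, "n6")]

# A query range that sits inside one node's range only involves that
# node's replicas, not the whole cluster.
print(nodes_for_token_range(ring, -150, -120, rf=3))
# A wider range pulls in the replicas of every range it touches.
print(nodes_for_token_range(ring, -150, 50, rf=3))
```

In this model, nodes outside the returned list are never contacted, so whether they are up or down does not affect the query.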

Cheers
 
-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com


Re: Node selection when both partition key and secondary index field constrained?

2013-01-30 Thread Edward Capriolo

Any query is going to fail with QUORUM + RF=3 + 2 nodes down.

One thing about secondary indexes (both user defined and built in) is that
finding an answer using them requires more nodes to be up than just a single
get or slice.
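
The arithmetic behind that statement can be sketched as follows. The helper names `required_replicas` and `can_satisfy` are made up for illustration; the only fact relied on is that a quorum is a majority of the replicas, floor(RF / 2) + 1.

```python
def required_replicas(cl, rf):
    """Number of replicas that must answer for a given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": rf // 2 + 1,  # majority of the replicas
        "ALL": rf,
    }
    return levels[cl]


def can_satisfy(cl, rf, replicas_down):
    """True if enough replicas of a token range are still up."""
    return rf - replicas_down >= required_replicas(cl, rf)


# RF=3 with 2 replicas of a range down leaves 1 live replica.
print(can_satisfy("QUORUM", rf=3, replicas_down=2))  # quorum needs 2
print(can_satisfy("ONE", rf=3, replicas_down=2))     # one replica is enough
```

With RF=3, QUORUM needs 2 replicas, so a range with only 1 live replica cannot be read at QUORUM but can still be read at ONE.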


Re: Node selection when both partition key and secondary index field constrained?

2013-01-30 Thread Hiller, Dean

I recall someone doing some work in Astyanax (I don't know if it made it
back in) where Astyanax would retry at a lower CL when 2 nodes were down, so
things could continue to work, which was a VERY VERY cool feature. You may
want to look into that….I know at some point, I plan to.
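
A rough sketch of the retry-at-a-lower-CL idea, not the Astyanax or Hector API: `Unavailable`, `execute_with_downgrade` and `run_query` are hypothetical names, and `run_query` stands in for whatever client call actually issues the read. Note that a downgraded read gives weaker consistency guarantees than the level originally requested.

```python
class Unavailable(Exception):
    """Raised (in this sketch) when too few replicas are up for a CL."""


def execute_with_downgrade(run_query, levels=("QUORUM", "ONE")):
    """Try each consistency level in order, falling back on Unavailable.

    run_query: callable taking a consistency level name and returning rows.
    Returns (level_used, rows) so the caller knows a downgrade happened.
    """
    if not levels:
        raise ValueError("at least one consistency level is required")
    last_error = None
    for level in levels:
        try:
            return level, run_query(level)
        except Unavailable as exc:
            last_error = exc  # remember the failure, try the next level
    raise last_error


def fake_query(level):
    # Stand-in for a cluster where only one replica of the range is up.
    if level == "QUORUM":
        raise Unavailable("not enough replicas for QUORUM")
    return ["row-1"]


print(execute_with_downgrade(fake_query))
```

Returning the level that was actually used lets the application decide whether a result read at ONE is acceptable.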

Later,
Dean


Re: Node selection when both partition key and secondary index field constrained?

2013-01-30 Thread Edward Capriolo

Hector has this feature because Hector is awesome sauce, but Astyanax is
new, sexy, and blogged about by Netflix.

So the new cassandra trend to force everyone to use less functional new
stuff is at work here making you wish for something that already exists
elsewhere.


Re: Node selection when both partition key and secondary index field constrained?

2013-01-30 Thread Peter Lin

I'd also point out, Hector has better support for CQL3 features than
Astyanax. I contributed some stuff to Hector back in December, but I
don't have time to apply those changes to Astyanax.

I have other contributions in mind for Hector, which I hope to work on
later this year.


Re: Node selection when both partition key and secondary index field constrained?

2013-01-28 Thread aaron morton

It uses the index...

cqlsh:dev> tracing on;
Now tracing requests.
cqlsh:dev> SELECT id, flag FROM foo WHERE TOKEN(id) > '-9939393' AND TOKEN(id) <= '0' AND flag=true;

Tracing session: 128cab90-6982-11e2-8cd1-51eaa232562e

 activity                                           | timestamp    | source    | source_elapsed
----------------------------------------------------+--------------+-----------+----------------
                                 execute_cql3_query | 08:36:55,244 | 127.0.0.1 |              0
                                  Parsing statement | 08:36:55,244 | 127.0.0.1 |            600
                                 Peparing statement | 08:36:55,245 | 127.0.0.1 |           1408
                      Determining replicas to query | 08:36:55,246 | 127.0.0.1 |           1924
 Executing indexed scan for (max(-9939393), max(0)] | 08:36:55,247 | 127.0.0.1 |           2956
 Executing single-partition query on foo.flag_index | 08:36:55,247 | 127.0.0.1 |           3192
                       Acquiring sstable references | 08:36:55,247 | 127.0.0.1 |           3220
                          Merging memtable contents | 08:36:55,247 | 127.0.0.1 |           3265
                       Scanned 0 rows and matched 0 | 08:36:55,247 | 127.0.0.1 |           3396
                                   Request complete | 08:36:55,247 | 127.0.0.1 |           3644


It reads from the secondary index and discards keys that are outside of the 
token range. 
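
As a toy illustration of that behaviour (an assumption-laden sketch, not Cassandra's implementation): the index entries for `flag=true` are modelled as `(token, row_key)` pairs with made-up values, and `indexed_scan` is a hypothetical name. The bounds follow the `(max(-9939393), max(0)]` range in the trace, exclusive on the left and inclusive on the right.

```python
def indexed_scan(index_entries, left, right):
    """Read row keys from the index, keeping only tokens in (left, right]."""
    return [key for token, key in index_entries if left < token <= right]


# Made-up index contents for flag=true: (token of the row key, row key).
flag_true_entries = [
    (-9939394, "a"),  # below the range
    (-9939393, "b"),  # equal to the exclusive lower bound, discarded
    (-5, "c"),
    (0, "d"),         # equal to the inclusive upper bound, kept
    (1, "e"),         # above the range
]

print(indexed_scan(flag_true_entries, -9939393, 0))
```

There is a single pass over the index; the token range acts as a filter on the keys it returns rather than as a second query.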

Cheers
 

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/01/2013, at 4:24 PM, Mike Sample mike.sam...@gmail.com wrote:

> Does the following FAQ entry hold even when the partition key is also
> constrained in the query (by token())?
>
> http://wiki.apache.org/cassandra/SecondaryIndexes:
> ==
> Q: How does choice of Consistency Level affect cluster availability when
> using secondary indexes?
>
> A: Because secondary indexes are distributed, you must have CL nodes
> available for all token ranges in the cluster in order to complete a query.
> For example, with RF = 3, when two out of three consecutive nodes in the ring
> are unavailable, all secondary index queries at CL = QUORUM will fail,
> however secondary index queries at CL = ONE will succeed. This is true
> regardless of cluster size.
> ==
>
> For example:
>
> CREATE TABLE foo (
>     id uuid,
>     seq_num bigint,
>     flag boolean,
>     some_other_data blob,
>     PRIMARY KEY (id, seq_num)
> );
>
> CREATE INDEX flag_index ON foo (flag);
>
> SELECT id, flag FROM foo WHERE TOKEN(id) > '-9939393' AND TOKEN(id) <= '0'
> AND flag=true;
>
> Would the above query with LOCAL_QUORUM succeed given the following? I.e., is
> the token range used first to trim node selection?
>
> * the cluster has 18 nodes
> * foo is in a keyspace with a replication factor of 3 for that data center
> * 2 nodes in one of the replication groups are down
> * the token range in the query is not in the range of the down nodes
>
> Thanks in advance!

Re: Node selection when both partition key and secondary index field constrained?

2013-01-28 Thread Mike Sample

Thanks Aaron. So basically it's merging the results of 2 separate queries:
Indexed scan (token-range) intersect foo.flag_index=true, where the
latter query hits the entire cluster as per the secondary index FAQ
entry. Thus the overall query would fail if LOCAL_QUORUM was requested,
RF=3 and 2 nodes in a given replication group were down. Darn. Is there
any way of efficiently getting around this (i.e. scope the query to just the
nodes in the token range)?




