Re: is the select result grouped by the value of the partition key?

2013-09-11 Thread John Lumby
I would like to make quite sure about this implicit GROUP BY feature,
since it seems really important yet does not seem to be mentioned in the
CQL reference documentation.

Aaron, you said yes -- is that yes, always, in all scenarios no matter what,
or yes, usually? Is it something we can bet the farm and farmer's family on?



The kinds of scenarios where I am wondering if it's possible for partition-key
groups to get intermingled are:

  .   what if the node containing primary copy of a row is down and
      cassandra fetches this row from a replica on a different node
      (e.g. with CONSISTENCY ONE)

  .   what if there is a heavy stream of UPDATE activity from applications which
      connect to all nodes, causing different nodes to have different
      versions of replicas of same row?

Can you point me to some place in the cassandra source code where this grouping
is ensured?



Many thanks,

John Lumby

Re: is the select result grouped by the value of the partition key?

2013-09-11 Thread Aaron Morton
> GROUP BY feature,
I would not think of it like that, this is about physical order of rows.

> since it seems really important yet does not seem to be mentioned in the
> CQL reference documentation.
It's baked in, this is how the data is organised in the row.

http://www.datastax.com/dev/blog/thrift-to-cql3
We often say the PRIMARY KEY is made up of the PARTITION KEY and the GROUPING COLUMNS
(the CQL documentation calls these clustering columns)
http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_reference/create_table_r.html

See also http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html

> Is it something we can bet the farm and farmer's family on?
Sure.

> The kinds of scenarios where I am wondering if it's possible for
> partition-key groups to get intermingled are :
All instances of the table entity with the same value(s) for the PARTITION KEY
portion of the PRIMARY KEY exist in the same storage engine row.
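
The point above can be pictured with a toy in-memory model. This is only a sketch under stated assumptions, not Cassandra's storage engine: the class name ToyTable and its methods are hypothetical, and the choice to walk the requested partition keys in the order given is an assumption of the sketch. What it shows is that when every row with the same partition key lives in one container, a multi-key read that goes partition by partition cannot interleave rows from different partitions.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: one 'storage engine row' per partition key (hypothetical)."""

    def __init__(self):
        # partition key (k1) -> {grouping column (k2) -> value}
        self._partitions = defaultdict(dict)

    def insert(self, k1, k2, value):
        self._partitions[k1][k2] = value

    def select_in(self, keys):
        """Mimic: select * from t where k1 in (...)."""
        result = []
        for k1 in keys:  # assumption: partitions are read one after another
            partition = self._partitions.get(k1, {})
            for k2 in sorted(partition):  # rows ordered by the grouping column
                result.append((k1, k2, partition[k2]))
        return result


table = ToyTable()
# Insert in a deliberately interleaved order.
for row in [("a", "m", "x"), ("z", "0", "9"), ("a", "n", "y"), ("z", "1", "8")]:
    table.insert(*row)

rows = table.select_in(["a", "z"])
for row in rows:
    print(*row)
```

Even though the inserts alternate between partitions a and z, the read returns all of partition a before any row of partition z, because each partition is stored and read as a unit.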

>   .   what if the node containing primary copy of a row is down
There is no primary copy of a row.

>   .   what if there is a heavy stream of UPDATE activity from applications
>       which connect to all nodes, causing different nodes to have different
>       versions of replicas of same row?
That's fine with me.
It's only an issue when the data is read, and at that point the Consistency
Level determines what we do.
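
A rough way to picture the read-time behaviour is the sketch below. It is a simplified assumption, not Cassandra's read path: the function read_cell and the replica dictionaries are hypothetical, and "ask the first N replicas" stands in for whatever the coordinator really contacts at a given Consistency Level. It only illustrates that divergent replicas can affect which version of a value is returned (newest timestamp wins among the answers seen), not which partition a row belongs to.

```python
def read_cell(replicas, key, replicas_to_ask):
    """Merge the answers of the first `replicas_to_ask` replicas.

    Each replica maps (k1, k2) -> (value, write_timestamp).
    Among the answers received, the newest timestamp wins.
    """
    answers = [r[key] for r in replicas[:replicas_to_ask] if key in r]
    if not answers:
        return None
    newest = max(answers, key=lambda cell: cell[1])
    return newest[0]


# Three replicas of the same row; the first has not yet seen the latest UPDATE.
replica_1 = {("a", "m"): ("x_old", 100)}
replica_2 = {("a", "m"): ("x_new", 200)}
replica_3 = {("a", "m"): ("x_new", 200)}
replicas = [replica_1, replica_2, replica_3]

one = read_cell(replicas, ("a", "m"), 1)     # like reading with ONE
quorum = read_cell(replicas, ("a", "m"), 2)  # like reading with QUORUM of 3
print(one, quorum)
```

In this toy run the single-replica read may return the stale value while the two-replica read returns the newer one; in both cases the row is still the (a, m) row of partition a.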

Hope that helps. 


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com




Re: is the select result grouped by the value of the partition key?

2013-04-14 Thread aaron morton
 
> Is it guaranteed that the rows are grouped by the value of the
> partition key? That is, is it guaranteed that I'll get
Yes.


-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 12/04/2013, at 7:24 PM, Sorin Manolache sor...@gmail.com wrote:

> On 2013-04-11 22:10, aaron morton wrote:
>>> Is it guaranteed that the rows are grouped by the value of the
>>> partition key? That is, is it guaranteed that I'll get
>> Your primary key (k1, k2) is considered in two parts (partition_key,
>> grouping_columns). In your case the partition key is k1 and the grouping
>> column is k2. Columns are ordered by the grouping columns, k2.
>>
>> See http://thelastpickle.com/2013/01/11/primary-keys-in-cql/
>
> Thank you for the answer.
>
> However my question was about the _grouping_ (not ordering) of _rows_ (not
> columns).
>
> Sorin
 
 
 
 



is the select result grouped by the value of the partition key?

2013-04-11 Thread Sorin Manolache

Hello,

Let us consider that we have a table t created as follows:

create table t(k1 varchar, k2 varchar, value varchar, primary key (k1, k2));

Its contents are

a m x
a n y
z 0 9
z 1 8

and I perform a

select * from t where k1 in ('a', 'z');

Is it guaranteed that the rows are grouped by the value of the partition
key? That is, is it guaranteed that I'll get


a m x
a n y
z 0 9
z 1 8

or

a n y
a m x
z 1 8
z 0 9

or even

z 0 9
z 1 8
a n y
a m x

but NEVER

a m x
z 0 9
a n y
z 1 8


Thank you,
Sorin
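
The property being asked about can be stated precisely as a small check. The sketch below is only an illustration of the question's definition of "grouped": the function name is_grouped_by_partition_key is hypothetical, and it assumes each row is a tuple whose first element is the partition key k1. A result is grouped if, once the rows of one partition key have ended, that key never appears again.

```python
def is_grouped_by_partition_key(rows):
    """Return True if rows sharing a partition key (row[0]) are contiguous."""
    finished_or_current = set()
    previous = object()  # sentinel that equals no real key
    for row in rows:
        key = row[0]
        if key != previous:
            if key in finished_or_current:
                return False  # this key's group was already left once
            finished_or_current.add(key)
            previous = key
    return True


# The four orderings from the question, in the same order as above.
acceptable_1 = [("a", "m", "x"), ("a", "n", "y"), ("z", "0", "9"), ("z", "1", "8")]
acceptable_2 = [("a", "n", "y"), ("a", "m", "x"), ("z", "1", "8"), ("z", "0", "9")]
acceptable_3 = [("z", "0", "9"), ("z", "1", "8"), ("a", "n", "y"), ("a", "m", "x")]
intermingled = [("a", "m", "x"), ("z", "0", "9"), ("a", "n", "y"), ("z", "1", "8")]

for rows in (acceptable_1, acceptable_2, acceptable_3, intermingled):
    print(is_grouped_by_partition_key(rows))
```

The check deliberately ignores the order of rows inside a group and the order of the groups themselves, since the question is about grouping only.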


Re: is the select result grouped by the value of the partition key?

2013-04-11 Thread aaron morton
> Is it guaranteed that the rows are grouped by the value of the partition key?
> That is, is it guaranteed that I'll get
Your primary key (k1, k2) is considered in two parts (partition_key,
grouping_columns). In your case the partition key is k1 and the grouping column
is k2. Columns are ordered by the grouping columns, k2.

See http://thelastpickle.com/2013/01/11/primary-keys-in-cql/

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com
