Re: is the select result grouped by the value of the partition key?
I would like to make quite sure about this implicit GROUP BY feature, since it seems really important yet does not seem to be mentioned in the CQL reference documentation.

Aaron, you said yes -- is that yes, always, in all scenarios no matter what, or yes, usually? Is it something we can bet the farm and farmer's family on?

The kinds of scenarios where I am wondering if it's possible for partition-key groups to get intermingled are:

 . what if the node containing the primary copy of a row is down and Cassandra fetches this row from a replica on a different node (e.g. with CONSISTENCY ONE)
 . what if there is a heavy stream of UPDATE activity from applications which connect to all nodes, causing different nodes to have different versions of replicas of the same row?

Can you point me to some place in the Cassandra source code where this grouping is ensured?

Many thanks,
John Lumby
Re: is the select result grouped by the value of the partition key?
On 12/09/2013, at 7:43 AM, John Lumby johnlu...@hotmail.com wrote:

> GROUP BY feature,

I would not think of it like that; this is about the physical order of rows.

> since it seems really important yet does not seem to be mentioned in the CQL reference documentation.

It's baked in, this is how the data is organised on the row.
http://www.datastax.com/dev/blog/thrift-to-cql3

We often say the PRIMARY KEY is the PARTITION KEY and the GROUPING COLUMNS:
http://www.datastax.com/documentation/cql/3.0/webhelp/index.html#cql/cql_reference/create_table_r.html

See also http://thelastpickle.com/blog/2013/01/11/primary-keys-in-cql.html

> Is it something we can bet the farm and farmer's family on?

Sure.

> The kinds of scenarios where I am wondering if it's possible for partition-key groups to get intermingled are :

All instances of the table entity with the same value(s) for the PARTITION KEY portion of the PRIMARY KEY exist in the same storage engine row.

> . what if the node containing primary copy of a row is down

There is no primary copy of a row.

> . what if there is a heavy stream of UPDATE activity from applications which connect to all nodes, causing different nodes to have different versions of replicas of same row?

That's fine with me. It's only an issue when the data is read, and at that point the Consistency Level determines what we do.

Hope that helps.

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
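To make the point about reads and the Consistency Level concrete, here is a minimal toy sketch in Python. It is not Cassandra's code: the names `reconcile` and `read_partition`, the dict-based replicas, and treating the consistency level as "number of replicas consulted" are all assumptions made for illustration. The only behaviour it leans on is that conflicting versions are resolved by write timestamp (newest wins) and that the merge happens inside one partition at a time, so divergent replicas can make a result stale but cannot interleave partitions.

```python
def reconcile(replica_responses):
    """Merge responses for ONE partition: per grouping key, keep the newest write."""
    merged = {}
    for response in replica_responses:
        for k2, (value, ts) in response.items():
            if k2 not in merged or ts > merged[k2][1]:
                merged[k2] = (value, ts)
    return merged


def read_partition(k1, replicas, consistency_level):
    """Toy read: consult `consistency_level` replicas, merge, return rows sorted by k2."""
    responses = [replica.get(k1, {}) for replica in replicas[:consistency_level]]
    merged = reconcile(responses)
    return [(k1, k2, merged[k2][0]) for k2 in sorted(merged)]


# Two hypothetical replicas holding different versions of the same partitions:
# replica -> {k1: {k2: (value, write_timestamp)}}
replica_a = {"a": {"m": ("x", 1), "n": ("y", 1)}, "z": {"0": ("9", 1)}}
replica_b = {"a": {"m": ("x2", 2), "n": ("y", 1)}, "z": {"0": ("9", 1), "1": ("8", 2)}}


def select_in(keys, consistency_level):
    rows = []
    for k1 in keys:
        rows.extend(read_partition(k1, [replica_a, replica_b], consistency_level))
    return rows


rows_cl_two = select_in(["a", "z"], consistency_level=2)
rows_cl_one = select_in(["a", "z"], consistency_level=1)
```

With one replica consulted, the toy read returns older values and misses a row, but the rows for `a` still come out together, followed by the rows for `z`.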
Re: is the select result grouped by the value of the partition key?
> Is it guaranteed that the rows are grouped by the value of the partition key? That is, is it guaranteed that I'll get

Yes.

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 12/04/2013, at 7:24 PM, Sorin Manolache sor...@gmail.com wrote:

> On 2013-04-11 22:10, aaron morton wrote:
>> Columns are ordered by the grouping columns, k2.
>
> Thank you for the answer. However my question was about the _grouping_ (not ordering) of _rows_ (not columns).
>
> Sorin
is the select result grouped by the value of the partition key?
Hello,

Let us consider that we have a table t created as follows:

    create table t(k1 varchar, k2 varchar, value varchar, primary key (k1, k2));

Its contents are

    a m x
    a n y
    z 0 9
    z 1 8

and I perform a

    select * from t where k1 in ('a', 'z');

Is it guaranteed that the rows are grouped by the value of the partition key? That is, is it guaranteed that I'll get

    a m x
    a n y
    z 0 9
    z 1 8

or

    a n y
    a m x
    z 1 8
    z 0 9

or even

    z 0 9
    z 1 8
    a n y
    a m x

but NEVER

    a m x
    z 0 9
    a n y
    z 1 8

Thank you,
Sorin
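The property being asked about can be stated precisely: rows sharing a partition key must be contiguous in the result, whatever the order of the groups or of the rows inside a group. A small Python sketch of that check follows; the helper name `is_grouped_by_partition_key` is hypothetical, and it assumes each row is a tuple whose first element is k1.

```python
from itertools import groupby


def is_grouped_by_partition_key(rows):
    """Return True if all rows sharing a partition key (row[0]) are contiguous."""
    seen = set()
    # groupby yields one group per run of consecutive equal keys,
    # so a key showing up in a second run means the groups are intermingled.
    for key, _run in groupby(rows, key=lambda row: row[0]):
        if key in seen:
            return False
        seen.add(key)
    return True


# The four orderings from the question above.
ordering_1 = [("a", "m", "x"), ("a", "n", "y"), ("z", "0", "9"), ("z", "1", "8")]
ordering_2 = [("a", "n", "y"), ("a", "m", "x"), ("z", "1", "8"), ("z", "0", "9")]
ordering_3 = [("z", "0", "9"), ("z", "1", "8"), ("a", "n", "y"), ("a", "m", "x")]
intermingled = [("a", "m", "x"), ("z", "0", "9"), ("a", "n", "y"), ("z", "1", "8")]
```

The first three orderings satisfy the check; the last one, which the question hopes can never happen, does not.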
Re: is the select result grouped by the value of the partition key?
On 12/04/2013, at 3:19 AM, Sorin Manolache sor...@gmail.com wrote:

> Is it guaranteed that the rows are grouped by the value of the partition key? That is, is it guaranteed that I'll get

Your primary key (k1, k2) is considered in two parts (partition_key, grouping_columns). In your case the partition_key is k1 and the grouping column is k2. Columns are ordered by the grouping columns, k2.

See http://thelastpickle.com/2013/01/11/primary-keys-in-cql/

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
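The two-part primary key described in this reply can be pictured with a toy in-memory model in Python. This is a sketch under stated assumptions, not the storage engine: the class `ToyTable` and its methods are hypothetical, one dict entry stands in for one storage engine row per k1, and the partitions are emitted in the order the keys are requested, which is a choice of the toy and not a claim about Cassandra's result order.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of primary key (k1, k2): one entry per partition, rows ordered by k2."""

    def __init__(self):
        # k1 (partition key) -> {k2 (grouping column): value}
        self._partitions = defaultdict(dict)

    def insert(self, k1, k2, value):
        self._partitions[k1][k2] = value

    def select_in(self, keys):
        """Emit each requested partition as one contiguous block, sorted by k2."""
        result = []
        for k1 in keys:
            partition = self._partitions.get(k1, {})
            for k2 in sorted(partition):
                result.append((k1, k2, partition[k2]))
        return result


t = ToyTable()
# Insert in a deliberately scrambled order; storage still groups by k1.
for row in [("z", "1", "8"), ("a", "n", "y"), ("z", "0", "9"), ("a", "m", "x")]:
    t.insert(*row)

rows = t.select_in(["a", "z"])
```

Because every row with the same k1 lives in the same entry, a read of that partition returns its rows together, and within the block they come back ordered by k2.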