Re: Aggregate functions on collections, collection functions and MAXWRITETIME

2022-12-08 Thread Avi Kivity via dev
IMO it's wrong to change an aggregate's meaning from "aggregate across
GROUPs or entire SELECT" to "aggregate within column". Aggregation is
long established in SQL and it will just confuse experienced database
users.

PostgreSQL maintains the meaning of max:

CREATE TABLE tab (
    x int[]
);

INSERT INTO tab(x) VALUES ( '{1, 2}' );
INSERT INTO tab(x) VALUES ( '{3, 4}' );

SELECT max(x) FROM tab;

 max
-------
 {3,4}
(1 row)
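
To make the distinction concrete (a sketch; collection_max here refers to
the proposed within-collection functions discussed in the quoted message
below):

-- SQL-style aggregate, one result for the whole SELECT (as in the
-- PostgreSQL example above): returns the "largest" collection, {3,4}
SELECT max(x) FROM tab;

-- within-collection aggregation: one result per row,
-- 2 for {1,2} and 4 for {3,4}
SELECT collection_max(x) FROM tab;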

One option is to treat the collection as a tiny table:

SELECT (SELECT max(key) FROM a_set_column) AS m1, (SELECT max(value)
FROM a_map_column) FROM tab;

Though it's better to look for existing practice and emulate it than to
exercise creativity here, IMO.


On Tue, 2022-12-06 at 13:30 +, Benedict wrote:
> Thanks Andres, I think community input on direction here will be
> invaluable. There’s a bunch of interrelated tickets, and my opinions
> are as follows:
> 
> 1. I think it is a mistake to offer a function MAX that operates over
> rows containing collections, returning the collection with the most
> elements. This is just a nonsensical operation to support IMO. We
> should decide as a community whether we “fix” this aggregation, or
> remove it.
> 2. I think “collection_”-prefixed methods are non-intuitive for
> discovery, and all else equal it would be better to use MAX, MIN, etc.,
> the same as for aggregations.
> 3. I think it is peculiar to permit methods named collection_ to
> operate over non-collection types when they are explicitly collection
> variants.
> 
> Given (1), (2) becomes simple except for COUNT which remains
> ambiguous, but this could be solved by either providing a separate
> method for collections (e.g. SIZE) which seems fine to me, or by
> offering a precedence order for matching and a keyword for overriding
> the precedence order (e.g. COUNT(collection AS COLLECTION)).
> 
> Given (2), (3) is a little more difficult. However, I think this can
> be solved several ways. 
>  - We could permit explicit casts to collection types, that for a
> collection type would be a no-op, and for a single value would create
> a collection
>  - With precedence orders, by always selecting the scalar function
> last
>  - By permitting WRITETIME to accept a binary operator reduce
> function to resolve multiple values
> 
> These decisions all imply trade-offs on each other, and affect the
> evolution of CQL, so I think community input would be helpful.
> 
> > On 6 Dec 2022, at 12:44, Andrés de la Peña 
> > wrote:
> > 
> > 
> > This will require some long introduction for context:
> > 
> > The MAX/MIN functions aggregate rows to get the row with min/max
> > column value according to their comparator. For collections, the
> > comparison is on the lexicographical order of the collection
> > elements. That's the very same comparator that is used when
> > collections are used as clustering keys and for ORDER BY.
> > 
> > However, a bug in the MIN/MAX aggregate functions meant that the
> > results were presented in their unserialized form, although the
> > row selection was correct. That bug was recently solved by
> > CASSANDRA-17811. During that ticket the option of simply disabling
> > MIN/MAX on collections was also considered, since applying those
> > functions to collections doesn't seem super useful. However, that
> > option was quickly discarded and the operation was fixed so the
> > MIN/MAX functions correctly work for every data type.
> > 
> > As a byproduct of the internal improvements of that fix, CASSANDRA-
> > 8877 introduced a new set of functions that can perform
> > aggregations of the elements of a collection. Those were named
> > "map_keys", "map_values", "collection_min", "collection_max",
> > "collection_sum", and "collection_count". Those are the names
> > mentioned in the mailing list thread about function naming
> > conventions. Despite doing a kind of within-collection aggregation,
> > these functions are not what we usually call aggregate functions,
> > since they don't aggregate multiple rows together.
> > 
> > On a different line of work, CASSANDRA-17425 added to trunk a
> > MAXWRITETIME function to get the max timestamp of a multi-cell
> > column. However, the new collection functions can be used in
> > combination with the WRITETIME and TTL functions to retrieve the
> > min/max/sum/avg timestamp or ttl of a multi-cell column. Since the
> > new functions give a generic way of aggregating timestamps and TTLs
> > of multi-cell columns, CASSANDRA-18078 proposed to remove that
> > MAXWRITETIME function.
> > 
> > Yifan Cai, author of the MAXWRITETIME function, agreed to remove
> > that function in favour of the new generic collection functions.
> > However, the MAXWRITETIME function can work on both single-cell and
> > multi-cell columns, whereas "COLLECTION_MAX(WRITETIME(column))"
> > would only work on multi-cell columns. That's because MAXWRITETIME
> > of a non-multi-cell column doesn't return a collection, and one
> > should simply use "WRITETIME(column)" 

Re: [DISCUSS] CEP-20: Dynamic Data Masking

2022-08-30 Thread Avi Kivity via dev
Agree with views, or alternatively, column permissions together with 
computed columns:



CREATE TABLE foo (

  id int PRIMARY KEY,

  unmasked_name text,

  name text GENERATED ALWAYS AS (some_mask_function(unmasked_name, 'xxx', 7)) STORED

)


(syntax from postgresql)


GRANT SELECT ON foo.name TO general_use;

GRANT SELECT ON foo.unmasked_name TO top_secret;
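
With those grants, each role would query the column it is allowed to see
(a sketch):

-- general_use sees only the masked, generated column:
SELECT id, name FROM foo;

-- top_secret can read the raw value:
SELECT id, unmasked_name FROM foo;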


On 26/08/2022 00.10, Benedict wrote:
I’m inclined to agree that this seems a more straightforward approach 
that makes fewer implied promises.


Perhaps we could deliver simple views backed by virtual tables, and 
model our approach on that of Postgres, MySQL et al?


Views in C* would be very simple, just offering a subset of fields 
with some UDFs applied. It would allow users to define roles with 
access only to the views, or for applications to use the views for 
presentation purposes.
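
A minimal sketch of what such a view might look like (hypothetical syntax,
since C* has no plain views today; the masking UDF is borrowed from the
earlier example):

CREATE VIEW foo_masked AS
  SELECT id, some_mask_function(name, 'xxx', 7) AS name
  FROM foo;

GRANT SELECT ON foo_masked TO general_use;
GRANT SELECT ON foo TO top_secret;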


It feels like a cleaner approach to me, and we’d get two features for 
the price of one. BUT I don’t feel super strongly about this.


On 25 Aug 2022, at 20:16, Derek Chen-Becker  
wrote:



To make sure I understand, if I wanted to use a masked column for a 
conditional update, you're saying we would need SELECT_MASKED to use 
it in the IF clause? I worry that this proposal is increasing in 
complexity; I would actually be OK starting with something smaller in 
scope. Perhaps just providing the masking functions and not tying 
masking to schema would be sufficient for an initial goal? That 
wouldn't preclude additional permissions, schema integration, or 
perhaps just plain Views in the future.


Cheers,

Derek

On Thu, Aug 25, 2022 at 11:12 AM Andrés de la Peña 
 wrote:


I have modified the proposal adding a new SELECT_MASKED
permission. Using masked columns on WHERE/IF clauses would
require having SELECT and either UNMASK or SELECT_MASKED
permissions. Seeing the unmasked values in the query results
would always require both SELECT and UNMASK.
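
For illustration, the grants might look something like this (a sketch,
with a hypothetical table and roles; permission names as proposed):

-- can filter on masked columns in WHERE/IF, but results stay masked:
GRANT SELECT ON ks.users TO app_role;
GRANT SELECT_MASKED ON ks.users TO app_role;

-- sees the unmasked values (and can also filter on them):
GRANT SELECT ON ks.users TO auditor_role;
GRANT UNMASK ON ks.users TO auditor_role;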

This way we can have the best of both worlds, allowing admins to
decide whether they trust their immediate users or not. wdyt?

On Wed, 24 Aug 2022 at 16:06, Henrik Ingo
 wrote:

This is the difference between security and compliance I
guess :-D

The way I see this, the attacker or threat in this concept is
not the developer with access to the database. Rather a
feature like this is just a convenient way to apply some
masking rule in a centralized way. The protection is against
an end user of the application, who should not be able to see
the personal data of someone else. Or themselves, even. As
long as the application end user doesn't have access to run
arbitrary CQL, then these forms of masking prevent
accidental unauthorized use/leaking of personal data.

henrik



On Wed, Aug 24, 2022 at 10:40 AM Benedict
 wrote:

Is it typical for a masking feature to make no effort to
prevent unmasking? I’m just struggling to see the value
of this without such mechanisms. Otherwise it’s just a
default formatter, and we should consider renaming the
feature IMO


On 23 Aug 2022, at 21:27, Andrés de la Peña
 wrote:


As mentioned in the CEP document, dynamic data masking
doesn't try to prevent malicious users with SELECT
permissions from indirectly guessing the real value of the
masked data. This can easily be done by just trying
values in the WHERE clause of SELECT queries. DDM would
not be a replacement for proper column-level permissions.
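
For example (hypothetical table and column names), a user holding SELECT
could narrow down a masked value simply by probing it in predicates:

-- 'salary' is masked in results, but the predicate still evaluates
-- against the real value, so the row's presence/absence leaks it
SELECT id FROM staff WHERE id = 42 AND salary = 100000 ALLOW FILTERING;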

The data served by the database is usually consumed by
applications that present this data to end users. These
end users are not necessarily the users directly
connecting to the database. With DDM, it would be easy
for applications to mask sensitive data that is going to
be consumed by the end users. However, the users
directly connecting to the database should be trusted,
provided that they have the right SELECT permissions.

In other words, DDM doesn't directly protect the data,
but it eases the production of protected data.

That said, we could later go one step further and add a
way to prevent untrusted users from inferring the masked
data. That could be done by adding a new permission
required to use certain columns in WHERE clauses,
different to the current SELECT permission. That would
play especially well with column-level permissions,
which is something that we still have pending.

On Tue, 23 Aug 2022 at 19:13, Aaron Ploetz
 wrote:

Applying this should prevent querying on a
field, else you could leak its contents, surely?


   

Re: CEP-15 multi key transaction syntax

2022-08-22 Thread Avi Kivity via dev
I wasn't referring to specific syntax but to the concept. If a SQL 
dialect (or better, the standard) has a way to select data into a 
variable, let's adopt it.


If such syntax doesn't exist, LET (a, b, c) = (SELECT x, y, z FROM tab) 
is my preference.


On 8/22/22 19:13, Patrick McFadin wrote:

The replies got trashed pretty badly in the responses.
When you say: "Agree it's better to reuse existing syntax than invent 
new syntax."


Which syntax are you referring to?

Patrick


On Mon, Aug 22, 2022 at 1:36 AM Avi Kivity via dev 
 wrote:


Agree it's better to reuse existing syntax than invent new syntax.

On 8/21/22 16:52, Konstantin Osipov wrote:
    > * Avi Kivity via dev  [22/08/14 15:59]:
>
> MySQL supports SELECT ... INTO ... FROM ... WHERE ...
>
> PostgreSQL supports pretty much the same syntax.
>
> Maybe instead of LET use the ANSI/MySQL/PostgreSQL DECLARE var
TYPE and
> MySQL/PostgreSQL SELECT ... INTO?
>
>> On 14/08/2022 01.29, Benedict Elliott Smith wrote:
>>> 
>>> I’ll do my best to express my thinking, as well as how I would
>>> explain the feature to a user.
>>>
>>> My mental model for LET statements is that they are simply SELECT
>>> statements where the columns that are selected become variables
>>> accessible anywhere in the scope of the transaction. That is
to say, you
>>> should be able to run something like s/LET/SELECT and
>>> s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET
statement
>>> and produce a valid SELECT statement, and vice versa. Both should
>>> perform identically.
>>>
>>> e.g.
>>> SELECT pk AS key, v AS value FROM table
>>>
>>> =>
>>> LET key = pk, value = v FROM table
>>
>> "=" is a CQL/SQL operator. Cassandra doesn't support it yet,
but SQL
>> supports selecting comparisons:
>>
>>
>> $ psql
>> psql (14.3)
>> Type "help" for help.
>>
>> avi=# SELECT 1 = 2, 3 = 3, NULL = NULL;
>>  ?column? | ?column? | ?column?
>> ----------+----------+----------
>>  f        | t        |
>> (1 row)
>>
>>
>> Using "=" as a syntactic element in LET would make SELECT and LET
>> incompatible once comparisons become valid selectors. Unless
they become
>> mandatory (and then you'd write "LET q = a = b" if you wanted
to select a
>> comparison).
>>
>>
>> I personally prefer the nested query syntax:
>>
>>
>>      LET (a, b, c) = (SELECT foo, bar, x+y FROM ...);
>>
>>
>> So there aren't two similar-but-not-quite-the-same syntaxes.
SELECT is
>> immediately recognizable by everyone as a query, LET is not.
>>
>>
>>> Identical form, identical behaviour. Every statement should be
directly
>>> translatable with some simple text manipulation.
>>>
>>> We can then make this more powerful for users by simply
expanding SELECT
>>> statements, e.g. by permitting them to declare constants and
tuples in
>>> the column results. In this scheme LET x = * is simply
syntactic sugar
>>> for LET x = (pk, ck, field1, …) This scheme then supports
options 2, 4
>>> and 5 all at once, consistently alongside each other.
>>>
>>> Option 6 is in fact very similar, but is strictly less
flexible for the
>>> user as they have no way to declare multiple scalar variables
without
>>> scoping them inside a tuple.
>>>
>>> e.g.
>>> LET key = pk, value = v FROM table
>>> IF key > 1 AND value > 1 THEN...
>>>
>>> =>
>>> LET row = SELECT pk AS key, v AS value FROM table
>>> IF row.key > 1 AND row.value > 1 THEN…
>>>
>>> However, both are expressible in the existing proposal, as if
you prefer
>>> this naming scheme you can simply write
>>>
>>> LET row = (pk AS key, v AS value) FROM table
>>> IF row.key > 1 AND row.value > 1 THEN…
>>>
>>> With respect to auto converting single column results to a
scalar, we do
>>> need a way for the user to say they care whether the row was
null or the
>>> column. I think an implicit conversion here could be
surprising. However
>>> we 

Re: CEP-15 multi key transaction syntax

2022-08-22 Thread Avi Kivity via dev

Agree it's better to reuse existing syntax than invent new syntax.

On 8/21/22 16:52, Konstantin Osipov wrote:

* Avi Kivity via dev  [22/08/14 15:59]:

MySQL supports SELECT ... INTO ... FROM ... WHERE ...

PostgreSQL supports pretty much the same syntax.

Maybe instead of LET use the ANSI/MySQL/PostgreSQL DECLARE var TYPE and
MySQL/PostgreSQL SELECT ... INTO?
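
Roughly, that form would look like this in the MySQL stored-procedure
style (a loose sketch; variable names are illustrative):

DECLARE k INT;
DECLARE val INT;
SELECT pk, v INTO k, val FROM tab WHERE pk = ?;
IF k > 1 AND val > 1 THEN ... END IF;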


On 14/08/2022 01.29, Benedict Elliott Smith wrote:


I’ll do my best to express my thinking, as well as how I would
explain the feature to a user.

My mental model for LET statements is that they are simply SELECT
statements where the columns that are selected become variables
accessible anywhere in the scope of the transaction. That is to say, you
should be able to run something like s/LET/SELECT and
s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET statement
and produce a valid SELECT statement, and vice versa. Both should
perform identically.

e.g.
SELECT pk AS key, v AS value FROM table

=>
LET key = pk, value = v FROM table


"=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL
supports selecting comparisons:


$ psql
psql (14.3)
Type "help" for help.

avi=# SELECT 1 = 2, 3 = 3, NULL = NULL;
 ?column? | ?column? | ?column?
----------+----------+----------
 f        | t        |
(1 row)


Using "=" as a syntactic element in LET would make SELECT and LET
incompatible once comparisons become valid selectors. Unless they become
mandatory (and then you'd write "LET q = a = b" if you wanted to select a
comparison).


I personally prefer the nested query syntax:


     LET (a, b, c) = (SELECT foo, bar, x+y FROM ...);


So there aren't two similar-but-not-quite-the-same syntaxes. SELECT is
immediately recognizable by everyone as a query, LET is not.



Identical form, identical behaviour. Every statement should be directly
translatable with some simple text manipulation.

We can then make this more powerful for users by simply expanding SELECT
statements, e.g. by permitting them to declare constants and tuples in
the column results. In this scheme LET x = * is simply syntactic sugar
for LET x = (pk, ck, field1, …) This scheme then supports options 2, 4
and 5 all at once, consistently alongside each other.

Option 6 is in fact very similar, but is strictly less flexible for the
user as they have no way to declare multiple scalar variables without
scoping them inside a tuple.

e.g.
LET key = pk, value = v FROM table
IF key > 1 AND value > 1 THEN...

=>
LET row = SELECT pk AS key, v AS value FROM table
IF row.key > 1 AND row.value > 1 THEN…

However, both are expressible in the existing proposal, as if you prefer
this naming scheme you can simply write

LET row = (pk AS key, v AS value) FROM table
IF row.key > 1 AND row.value > 1 THEN…

With respect to auto converting single column results to a scalar, we do
need a way for the user to say they care whether the row was null or the
column. I think an implicit conversion here could be surprising. However
we could implement tuple expressions anyway and let the user explicitly
declare v as a tuple as Caleb has suggested for the existing proposal as
well.

Assigning constants or other values not selected from a table would also
be a little clunky:

LET v1 = someFunc(), v2 = someOtherFunc(?)
IF v1 > 1 AND v2 > 1 THEN…

=>
LET row = SELECT someFunc() AS v1, someOtherFunc(?) AS v2
IF row.v1 > 1 AND row.v2 > 1 THEN...

That said, the proposals are /close/ to identical, it is just slightly
more verbose and slightly less flexible.

Which one would be most intuitive to users is hard to predict. It might
be that Option 6 would be slightly easier, but I’m unsure if there would
be a huge difference.



On 13 Aug 2022, at 16:59, Patrick McFadin  wrote:

I'm really happy to see CEP-15 getting closer to a final
implementation. I'm going to walk through my reasoning for your
proposals wrt trying to explain this to somebody new.

Looking at all the options, the first thing that comes up for me is
the Cassandra project's complicated relationship with NULL.  We have
prior art with EXISTS/NOT EXISTS when creating new tables. IS
NULL/IS NOT NULL is used in materialized views similarly to
proposals 2,4 and 5.

CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name
   AS SELECT [ (column_list) ]
   FROM [keyspace_name.]table_name
   [ WHERE column_name IS NOT NULL
   [ AND column_name IS NOT NULL ... ] ]
   [ AND relation [ AND ... ] ]
   PRIMARY KEY ( column_list )
   [ WITH [ table_properties ]
   [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) ] ] ;

  Based on that, I believe 1 and 3 would just confuse users, so -1 on
those.

Trying to explain the difference between row and column operations
with LET, I can't see the difference between a row and column in #2.

#4 introduces a boolean instead of column names and just adds more
syntax.

#5 is verbose and, in my opinion, easier to rea

Re: [Proposal] add pull request template

2022-08-18 Thread Avi Kivity via dev


On 18/08/2022 18.46, Mick Semb Wever wrote:




Until IDEs auto cross-reference JIRA,

I'm going to lightly touch the lid of Pandora's Box here and walk
away slowly. It drives me *nuts* when I'm git blaming a file to
understand the context of why a change was made (to make sure I
continue to respect it!) and I see "merge 3.11 into trunk" or some
other such useless commit message, then have to dig into the git
integration and history, then figure out which merge commits were
real and which were -s ours and silently changed, etc.

So about those merge commits... ;)



The beef I have with this is that it's just not that difficult: just look 
at the second parent commit of the merge.


```
git log -n1 <merge-commit>^2     # show the second parent of the merge
```
(you can also use `git log --follow .` if you like history without 
merge commits)




There's `git merge --log` which provides a short-form log in the merge 
commit.





Re: CEP-15 multi key transaction syntax

2022-08-14 Thread Avi Kivity via dev


On 14/08/2022 17.50, Benedict Elliott Smith wrote:


> SELECT and LET incompatible once comparisons become valid selectors

I don’t think this would be ambiguous, as = is required in the LET 
syntax as we have to bind the result to a variable name.


But, I like the deconstructed tuple syntax improvement over “Option 
6”. This would also seem to easily support assigning from non-query 
statements, such as LET (a, b) = (someFunc(), someOtherFunc(?))


I don’t think it is ideal to depend on relative position in the tuple 
for assigning results to a variable name, as it leaves more scope for 
errors. It would be nice to have a simple way to deconstruct safely. 
But, I think this proposal is good, and I’d be fine with it as an 
alternative if others concur. I agree that seeing the SELECT 
independently may be more easily recognisable to users.


With this approach there remains the question of how we handle single 
column results. I’d be inclined to treat in the following way:


LET (a) = SELECT val FROM table
IF a > 1 THEN...

LET a = SELECT val FROM table
IF a.val > 1 THEN...



I think SQL dialects require subqueries to be parenthesized (not sure). 
If that's the case I think we should keep the tradition.
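
i.e. keeping the parentheses, something like (a sketch):

LET (a) = (SELECT val FROM table);
IF a > 1 THEN...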





There is also the question of whether we support SELECT without a FROM 
clause, e.g.

LET x = SELECT someFunc() AS v1, someOtherFunc() AS v2

Or just LET (since they are no longer equivalent)
e.g.
LET x = (someFunc() AS v1, someOtherFunc() as v2)
LET (v1, v2) = (someFunc(), someOtherFunc())



I see no harm in making FROM optional, as it's recognized by other SQL 
dialects.





Also since LET is only binding variables, is there any reason we 
shouldn’t support multiple SELECT assignments in a single LET?, e.g.

LET (x, y) = ((SELECT x FROM…), (SELECT y FROM))



What if an inner select returns a tuple? Would y be a tuple?


I think this is redundant and atypical enough to not be worth 
supporting. Most people would use separate LETs.
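
i.e. rather than the combined form, simply (a sketch):

LET (x) = (SELECT x FROM ...);
LET (y) = (SELECT y FROM ...);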





Also whether we support tuples in SELECT statements anyway, e.g.
LET (tuple1, tuple2) = SELECT (a, b), (c, d) FROM..
IF tuple1.a > 1 AND tuple2.d > 1…



Absolutely, this just flows naturally from having tuples. There's no 
difference between "SELECT (a, b)" and "SELECT a_but_a_is_a_tuple".






and whether we support nested deconstruction, e.g.
LET (a, b, (c, d)) = SELECT a, b, someTuple FROM..
IF a > 1 AND d > 1…



I think this can be safely deferred. Most people would again separate it 
into separate LETs.



I'd add (to the specification) that LETs cannot override a previously 
defined variable, just to reduce ambiguity.
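
i.e. something like this would be rejected (a sketch):

LET (a) = (SELECT v FROM t1);
LET (a) = (SELECT v FROM t2);   -- error: 'a' is already bound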










On 14 Aug 2022, at 13:55, Avi Kivity via dev 
 wrote:



On 14/08/2022 01.29, Benedict Elliott Smith wrote:


I’ll do my best to express my thinking, as well as how I would 
explain the feature to a user.


My mental model for LET statements is that they are simply SELECT 
statements where the columns that are selected become variables 
accessible anywhere in the scope of the transaction. That is to say, 
you should be able to run something like s/LET/SELECT and 
s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET 
statement and produce a valid SELECT statement, and vice versa. Both 
should perform identically.


e.g.
SELECT pk AS key, v AS value FROM table

=>
LET key = pk, value = v FROM table



"=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL 
supports selecting comparisons:



$ psql
psql (14.3)
Type "help" for help.

avi=# SELECT 1 = 2, 3 = 3, NULL = NULL;
 ?column? | ?column? | ?column?
----------+----------+----------
 f        | t        |
(1 row)


Using "=" as a syntactic element in LET would make SELECT and LET 
incompatible once comparisons become valid selectors. Unless they 
become mandatory (and then you'd write "LET q = a = b" if you wanted 
to select a comparison).



I personally prefer the nested query syntax:


    LET (a, b, c) = (SELECT foo, bar, x+y FROM ...);


So there aren't two similar-but-not-quite-the-same syntaxes. SELECT 
is immediately recognizable by everyone as a query, LET is not.





Identical form, identical behaviour. Every statement should be 
directly translatable with some simple text manipulation.


We can then make this more powerful for users by simply expanding 
SELECT statements, e.g. by permitting them to declare constants and 
tuples in the column results. In this scheme LET x = * is simply 
syntactic sugar for LET x = (pk, ck, field1, …) This scheme then 
supports options 2, 4 and 5 all at once, consistently alongside each 
other.


Option 6 is in fact very similar, but is strictly less flexible for 
the user as they have no way to declare multiple scalar variables 
without scoping them inside a tuple.


e.g.
LET key = pk, value = v FROM table
IF key > 1 AND value > 1 THEN...

=>
LET row = SELECT pk AS key, v AS value FROM table
I

Re: CEP-15 multi key transaction syntax

2022-08-14 Thread Avi Kivity via dev


On 14/08/2022 01.29, Benedict Elliott Smith wrote:


I’ll do my best to express my thinking, as well as how I would 
explain the feature to a user.


My mental model for LET statements is that they are simply SELECT 
statements where the columns that are selected become variables 
accessible anywhere in the scope of the transaction. That is to say, 
you should be able to run something like s/LET/SELECT and 
s/([^=]+)=([^,]+)(,|$)/\2 AS \1\3/g on the columns of a LET statement 
and produce a valid SELECT statement, and vice versa. Both should 
perform identically.


e.g.
SELECT pk AS key, v AS value FROM table

=>
LET key = pk, value = v FROM table



"=" is a CQL/SQL operator. Cassandra doesn't support it yet, but SQL 
supports selecting comparisons:



$ psql
psql (14.3)
Type "help" for help.

avi=# SELECT 1 = 2, 3 = 3, NULL = NULL;
 ?column? | ?column? | ?column?
----------+----------+----------
 f        | t        |
(1 row)


Using "=" as a syntactic element in LET would make SELECT and LET 
incompatible once comparisons become valid selectors. Unless they become 
mandatory (and then you'd write "LET q = a = b" if you wanted to select 
a comparison).



I personally prefer the nested query syntax:


    LET (a, b, c) = (SELECT foo, bar, x+y FROM ...);


So there aren't two similar-but-not-quite-the-same syntaxes. SELECT is 
immediately recognizable by everyone as a query, LET is not.





Identical form, identical behaviour. Every statement should be 
directly translatable with some simple text manipulation.


We can then make this more powerful for users by simply expanding 
SELECT statements, e.g. by permitting them to declare constants and 
tuples in the column results. In this scheme LET x = * is simply 
syntactic sugar for LET x = (pk, ck, field1, …) This scheme then 
supports options 2, 4 and 5 all at once, consistently alongside each 
other.


Option 6 is in fact very similar, but is strictly less flexible for 
the user as they have no way to declare multiple scalar variables 
without scoping them inside a tuple.


e.g.
LET key = pk, value = v FROM table
IF key > 1 AND value > 1 THEN...

=>
LET row = SELECT pk AS key, v AS value FROM table
IF row.key > 1 AND row.value > 1 THEN…

However, both are expressible in the existing proposal, as if you 
prefer this naming scheme you can simply write


LET row = (pk AS key, v AS value) FROM table
IF row.key > 1 AND row.value > 1 THEN…

With respect to auto converting single column results to a scalar, we 
do need a way for the user to say they care whether the row was null 
or the column. I think an implicit conversion here could be 
surprising. However we could implement tuple expressions anyway and 
let the user explicitly declare v as a tuple as Caleb has suggested 
for the existing proposal as well.


Assigning constants or other values not selected from a table would 
also be a little clunky:


LET v1 = someFunc(), v2 = someOtherFunc(?)
IF v1 > 1 AND v2 > 1 THEN…

=>
LET row = SELECT someFunc() AS v1, someOtherFunc(?) AS v2
IF row.v1 > 1 AND row.v2 > 1 THEN...

That said, the proposals are /close/ to identical, it is just slightly 
more verbose and slightly less flexible.


Which one would be most intuitive to users is hard to predict. It 
might be that Option 6 would be slightly easier, but I’m unsure if 
there would be a huge difference.




On 13 Aug 2022, at 16:59, Patrick McFadin  wrote:

I'm really happy to see CEP-15 getting closer to a final 
implementation. I'm going to walk through my reasoning for your 
proposals wrt trying to explain this to somebody new.


Looking at all the options, the first thing that comes up for me is 
the Cassandra project's complicated relationship with NULL.  We have 
prior art with EXISTS/NOT EXISTS when creating new tables. IS NULL/IS 
NOT NULL is used in materialized views similarly to proposals 2,4 and 5.


CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] [keyspace_name.]view_name
  AS SELECT [ (column_list) ]
  FROM [keyspace_name.]table_name
  [ WHERE column_name IS NOT NULL
  [ AND column_name IS NOT NULL ... ] ]
  [ AND relation [ AND ... ] ]
  PRIMARY KEY ( column_list )
  [ WITH [ table_properties ]
  [ [ AND ] CLUSTERING ORDER BY (cluster_column_name order_option) ] ] ;

 Based on that, I believe 1 and 3 would just confuse users, so -1 on 
those.


Trying to explain the difference between row and column operations 
with LET, I can't see the difference between a row and column in #2.


#4 introduces a boolean instead of column names and just adds more 
syntax.


#5 is verbose and, in my opinion, easier to reason when writing a 
query. Thinking top down, I need to know if these exact rows and/or 
column values exist before changing them, so I'll define them first. 
Then I'll iterate over the state I created in my actual changes so I 
know I'm changing precisely what I want.


#5 could use a bit more to be clearer to somebody who doesn't write 
CQL queries daily and wouldn't require memorizing subtle 

Re: Evolving the client protocol

2018-04-29 Thread Avi Kivity
"Bullied"? Neither me nor anyone else made any demands or threats. I 
proposed cooperation, and acknowledged up front, in my first email, that 
cooperation might not be wanted by Cassandra.





On 2018-04-28 20:50, Jeff Jirsa wrote:


You're a committer Mick, if you think it belongs in the database, write the
patches and get them reviewed.  Until then, the project isn't going to be
bullied into changing the protocol without an implementation.

- Jeff







Re: Evolving the client protocol

2018-04-24 Thread Avi Kivity



On 2018-04-24 04:18, Nate McCall wrote:

Folks,
Before this goes much further, let's take a step back for a second.

I am hearing the following: Folks are fine with CASSANDRA-14311 and
CASSANDRA-2848 *BUT* they don't make much sense from the project's
perspective without a reference implementation. I think the shard
concept is too abstract for the project right now, so we should
probably set that one aside.

Dor and Avi, I appreciate you both engaging directly on this. Where
can we find common ground on this?



I started with three options:

1. Scylla (or other protocol implementers) contribute spec changes, and 
each implementer implements them on their own


This was rejected.

2. Scylla defines and implements spec changes on its own, and when 
Cassandra implements similar changes, it will retroactively apply the 
Scylla change if it makes technical sense


IOW, no gratuitous divergence, but no hard commitment either.

I received no feedback on this.

3. No cooperation.

This is the fall-back option, which I would like to avoid if possible. 
Its main advantage is that it avoids long email threads and flamewars.


There was also a suggestion made in this thread:

4. Scylla defines spec changes and also implements them for Cassandra

That works for some changes but not all (for example, thread-per-core 
awareness, or changes that require significant effort). I would like to 
find a way that works for all of the changes that we want to make.






Re: Evolving the client protocol

2018-04-24 Thread Avi Kivity



On 2018-04-23 17:59, Ben Bromhead wrote:


>> This doesn't work without additional changes, for RF>1. The
token ring could place two replicas of the same token range on the
same physical server, even though those are two separate cores of
the same server. You could add another element to the hierarchy
(cluster -> datacenter -> rack -> node -> core/shard), but that
generates unneeded range movements when a node is added.
> I have seen rack awareness used/abused to solve this.
>

But then you lose real rack awareness. It's fine for a quick hack,
but
not a long-term solution.

(it also creates a lot more tokens, something nobody needs)


I'm having trouble understanding how you lose "real" rack awareness, 
as these shards are in the same rack anyway, because the address and 
port are on the same server in the same rack. So it behaves as 
expected. Could you explain a situation where the shards on a single 
server would be in different racks (or fault domains)?


You're right - it continues to work.



If you wanted to support a situation where you have a single rack per 
DC for simple deployments, extending NetworkTopologyStrategy to behave 
the way it did before 
https://issues.apache.org/jira/browse/CASSANDRA-7544 with respect to 
treating InetAddresses as servers rather than the address and port 
would be simple. Both this implementation in Apache Cassandra and the 
respective load balancing classes in the drivers are explicitly 
designed to be pluggable so that would be an easier integration point 
for you.


I'm not sure how it creates more tokens? If a server normally owns 256 
tokens, each shard on a different port would just advertise ownership 
of 256/# of cores (e.g. 4 tokens if you had 64 cores).


Having just 4 tokens results in imbalance. CASSANDRA-7032 mitigates it, 
but only for one replication factor, and doesn't work for decommission.


(and if you have 60 lcores then you get between 4 and 5 tokens per 
lcore, which is a 20% imbalance right there)




> Regards,
> Ariel
    >
>> On Apr 22, 2018, at 8:26 AM, Avi Kivity <a...@scylladb.com
<mailto:a...@scylladb.com>> wrote:
>>
>>
>>
>>> On 2018-04-19 21:15, Ben Bromhead wrote:
>>> Re #3:
>>>
>>> Yup I was thinking each shard/port would appear as a discrete
server to the
>>> client.
>> This doesn't work without additional changes, for RF>1. The
token ring could place two replicas of the same token range on the
same physical server, even though those are two separate cores of
the same server. You could add another element to the hierarchy
(cluster -> datacenter -> rack -> node -> core/shard), but that
generates unneeded range movements when a node is added.
>>
>>> If the per port suggestion is unacceptable due to hardware
requirements,
>>> remembering that Cassandra is built with the concept scaling
*commodity*
>>> hardware horizontally, you'll have to spend your time and
energy convincing
>>> the community to support a protocol feature it has no
(current) use for or
>>> find another interim solution.
>> Those servers are commodity servers (not x86, but still
commodity). In any case 60+ logical cores are common now (hello
AWS i3.16xlarge or even i3.metal), and we can only expect logical
core count to continue to increase (there are 48-core ARM
processors now).
>>
>>> Another way, would be to build support and consensus around a
clear
>>> technical need in the Apache Cassandra project as it stands today.
>>>
>>> One way to build community support might be to contribute an
Apache
>>> licensed thread per core implementation in Java that matches
the protocol
>>> change and shard concept you are looking for ;P
>> I doubt I'll survive the egregious top-posting that is going on
in this list.
>>
>>>
>>>> On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg
<ar...@weisberg.ws <mailto:ar...@weisberg.ws>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> So at technical level I don't understand this yet.
>>>>
>>>> So you have a database consisting of single threaded shards
and a socket
>>>> for accept that is generating TCP connections and in advance
you don't know
>>>> which connection is going to send messages to which shard.
>>>>
>>>> What is the mechanism by which you get the packets for a
given TCP
>>>> connection delivered to a spec

Re: Evolving the client protocol

2018-04-24 Thread Avi Kivity

I have not asked this list to do any work on the drivers.


If Cassandra agrees to Scylla protocol changes (either proactively or 
retroactively) then the benefit to Cassandra is that if the drivers are 
changed (by the driver maintainers or by Scylla developers) then 
Cassandra developers need not do additional work to update the drivers. 
So there is less work for you, in the future, if those features are of 
interest to you.



On 2018-04-24 02:13, Jonathan Haddad wrote:

 From where I stand it looks like you've got only two options for any
feature that involves updating the protocol:

1. Don't build the feature
2. Build it in Cassandra & ScyllaDB, update the drivers accordingly

I don't think you have a third option, which is built it only in ScyllaDB,
because that means you have to fork *all* the drivers and make it work,
then maintain them.  Your business model appears to be built on not doing
any of the driver work yourself, and you certainly aren't giving back to
the open source community via a permissive license on ScyllaDB itself, so
I'm a bit lost here.

To me it looks like you're asking a bunch of volunteers that work on
Cassandra to accommodate you.  What exactly do we get out of this
relationship?  What incentive do I or anyone else have to spend time
helping you instead of working on something that interests me?

Jon


On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead <b...@instaclustr.com> wrote:


This doesn't work without additional changes, for RF>1. The token ring

could place two replicas of the same token range on the same physical
server, even though those are two separate cores of the same server. You
could add another element to the hierarchy (cluster -> datacenter -> rack
-> node -> core/shard), but that generates unneeded range movements when

a

node is added.

I have seen rack awareness used/abused to solve this.


But then you lose real rack awareness. It's fine for a quick hack, but
not a long-term solution.

(it also creates a lot more tokens, something nobody needs)


I'm having trouble understanding how you lose "real" rack awareness, as
these shards are in the same rack anyway, because the address and port are
on the same server in the same rack. So it behaves as expected. Could you
explain a situation where the shards on a single server would be in
different racks (or fault domains)?

If you wanted to support a situation where you have a single rack per DC
for simple deployments, extending NetworkTopologyStrategy to behave the way
it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
respect to treating InetAddresses as servers rather than the address and
port would be simple. Both this implementation in Apache Cassandra and the
respective load balancing classes in the drivers are explicitly designed to
be pluggable so that would be an easier integration point for you.

I'm not sure how it creates more tokens? If a server normally owns 256
tokens, each shard on a different port would just advertise ownership of
256/# of cores (e.g. 4 tokens if you had 64 cores).



Regards,
Ariel


On Apr 22, 2018, at 8:26 AM, Avi Kivity <a...@scylladb.com> wrote:




On 2018-04-19 21:15, Ben Bromhead wrote:
Re #3:

Yup I was thinking each shard/port would appear as a discrete server

to the

client.

This doesn't work without additional changes, for RF>1. The token ring

could place two replicas of the same token range on the same physical
server, even though those are two separate cores of the same server. You
could add another element to the hierarchy (cluster -> datacenter -> rack
-> node -> core/shard), but that generates unneeded range movements when

a

node is added.

If the per port suggestion is unacceptable due to hardware

requirements,

remembering that Cassandra is built with the concept scaling

*commodity*

hardware horizontally, you'll have to spend your time and energy

convincing

the community to support a protocol feature it has no (current) use

for or

find another interim solution.

Those servers are commodity servers (not x86, but still commodity). In

any case 60+ logical cores are common now (hello AWS i3.16xlarge or even
i3.metal), and we can only expect logical core count to continue to
increase (there are 48-core ARM processors now).

Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the

protocol

change and shard concept you are looking for ;P

I doubt I'll survive the egregious top-posting that is going on in

this

list.

On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg <ar...@weisberg.ws>

wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a

socket

for accept that is generating TCP connect

Re: Evolving the client protocol

2018-04-23 Thread Avi Kivity



On 2018-04-22 23:35, Josh McKenzie wrote:

The drivers are not part of Cassandra, so what "the server" is for drivers is 
up to their maintainer.

I'm pretty sure the driver communities don't spend a lot of time
worrying about their Scylla compatibility. That's your cross to bear.


To clarify, I wasn't asking this list for help with the client drivers. 
The purpose of this thread was to see if we can find a way to avoid 
forking the protocol.



On Sun, Apr 22, 2018 at 11:00 AM, Ariel Weisberg <adwei...@fastmail.fm> wrote:

Hi,


This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.

I have seen rack awareness used/abused to solve this.

Regards,
Ariel


On Apr 22, 2018, at 8:26 AM, Avi Kivity <a...@scylladb.com> wrote:




On 2018-04-19 21:15, Ben Bromhead wrote:
Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.

This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.


If the per port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for or
find another interim solution.

Those servers are commodity servers (not x86, but still commodity). In any case 
60+ logical cores are common now (hello AWS i3.16xlarge or even i3.metal), and 
we can only expect logical core count to continue to increase (there are 
48-core ARM processors now).


Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P

I doubt I'll survive the egregious top-posting that is going on in this list.




On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg <ar...@weisberg.ws> wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket
for accept that is generating TCP connections and in advance you don't know
which connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP
connection delivered to a specific core? I know that a given TCP connection
will normally have all of its packets delivered to the same queue from the
NIC because the tuple of source address + port and destination address +
port is typically hashed to pick one of the queues the NIC presents. I
might have the contents of the tuple slightly wrong, but it always includes
a component you don't get to control.

Since it's hashing how do you manipulate which queue packets for a TCP
connection go to and how is it made worse by having an accept socket per
shard?

You also mention 160 ports as bad, but it doesn't sound like a big number
resource wise. Is it an operational headache?

RE tokens distributed amongst shards. The way that would work right now is
that each port number appears to be a discrete instance of the server. So
you could have shards be actual shards that are simply colocated on the
same box, run in the same process, and share resources. I know this pushes
more of the complexity into the server vs the driver as the server expects
all shards to share some client-visible state like system tables and certain
identifiers.

Ariel

On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit well with the NIC's ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)


On 2018-04-19 19:46, Ben Bromhead wrote:
WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support th

Re: Evolving the client protocol

2018-04-23 Thread Avi Kivity



On 2018-04-22 18:00, Ariel Weisberg wrote:

Hi,


This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.

I have seen rack awareness used/abused to solve this.



But then you lose real rack awareness. It's fine for a quick hack, but 
not a long-term solution.


(it also creates a lot more tokens, something nobody needs)


Regards,
Ariel


On Apr 22, 2018, at 8:26 AM, Avi Kivity <a...@scylladb.com> wrote:




On 2018-04-19 21:15, Ben Bromhead wrote:
Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.

This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.


If the per port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for or
find another interim solution.

Those servers are commodity servers (not x86, but still commodity). In any case 
60+ logical cores are common now (hello AWS i3.16xlarge or even i3.metal), and 
we can only expect logical core count to continue to increase (there are 
48-core ARM processors now).


Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P

I doubt I'll survive the egregious top-posting that is going on in this list.




On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg <ar...@weisberg.ws> wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket
for accept that is generating TCP connections and in advance you don't know
which connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP
connection delivered to a specific core? I know that a given TCP connection
will normally have all of its packets delivered to the same queue from the
NIC because the tuple of source address + port and destination address +
port is typically hashed to pick one of the queues the NIC presents. I
might have the contents of the tuple slightly wrong, but it always includes
a component you don't get to control.

Since it's hashing how do you manipulate which queue packets for a TCP
connection go to and how is it made worse by having an accept socket per
shard?

You also mention 160 ports as bad, but it doesn't sound like a big number
resource wise. Is it an operational headache?

RE tokens distributed amongst shards. The way that would work right now is
that each port number appears to be a discrete instance of the server. So
you could have shards be actual shards that are simply colocated on the
same box, run in the same process, and share resources. I know this pushes
more of the complexity into the server vs the driver as the server expects
all shards to share some client-visible state like system tables and certain
identifiers.

Ariel

On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit well with the NIC's ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)


On 2018-04-19 19:46, Ben Bromhead wrote:
WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so there might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Arie

Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-19 21:15, Ben Bromhead wrote:

Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.


This doesn't work without additional changes, for RF>1. The token ring 
could place two replicas of the same token range on the same physical 
server, even though those are two separate cores of the same server. You 
could add another element to the hierarchy (cluster -> datacenter -> 
rack -> node -> core/shard), but that generates unneeded range movements 
when a node is added.



If the per port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for or
find another interim solution.


Those servers are commodity servers (not x86, but still commodity). In 
any case 60+ logical cores are common now (hello AWS i3.16xlarge or even 
i3.metal), and we can only expect logical core count to continue to 
increase (there are 48-core ARM processors now).




Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P


I doubt I'll survive the egregious top-posting that is going on in this 
list.





On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg <ar...@weisberg.ws> wrote:


Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket
for accept that is generating TCP connections and in advance you don't know
which connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP
connection delivered to a specific core? I know that a given TCP connection
will normally have all of its packets delivered to the same queue from the
NIC because the tuple of source address + port and destination address +
port is typically hashed to pick one of the queues the NIC presents. I
might have the contents of the tuple slightly wrong, but it always includes
a component you don't get to control.

Since it's hashing how do you manipulate which queue packets for a TCP
connection go to and how is it made worse by having an accept socket per
shard?

You also mention 160 ports as bad, but it doesn't sound like a big number
resource wise. Is it an operational headache?

RE tokens distributed amongst shards. The way that would work right now is
that each port number appears to be a discrete instance of the server. So
you could have shards be actual shards that are simply colocated on the
same box, run in the same process, and share resources. I know this pushes
more of the complexity into the server vs the driver as the server expects
all shards to share some client-visible state like system tables and certain
identifiers.

Ariel
On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:

Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit well with the NIC's ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)

On 2018-04-19 19:46, Ben Bromhead wrote:

WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so there might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg <ar...@weisberg.ws>

wrote:

Hi,

I think that updating the protocol spec to Cassandra puts the onus on

the

party changing the protocol specification to have an implementation

of the

spec in Cassandra as well as the Java and Python driver (those are

both

used in the Cassandra repo). Until it's implemented in Cassandra we

haven't

fully evaluated the specification change. There is no substitute for

trying

to make it work.

There are also realities to consider as to what the maintainers of the
drivers are willing to commit.

RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range

scans.

In JIRA Jeremiah made the point that you can still do this from the

client

by breaking up the tok

Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity
You're right in principle, but in practice we haven't seen problems with 
the term.



On 2018-04-19 20:31, Michael Shuler wrote:

This is purely my own opinion, but I find the use of the term 'shard'
quite unfortunate in the context of a distributed database. The
historical usage of the term has been the notion of data partitions that
reside on separate database servers. There is a learning curve with
distributed databases, and I can foresee the use of the term adding
additional confusion for new users. Not a fan.







Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-20 12:03, Sylvain Lebresne wrote:


Those were just given as examples. Each would be discussed on its own,
assuming we are able to find a way to cooperate.


These are relatively simple and it wouldn't be hard for us to patch
Cassandra. But I want to find a way to make more complicated protocol
changes where it wouldn't be realistic for us to modify Cassandra.


That's where I'm confused with what you are truly asking.

The native protocol is the protocol of the Apache Cassandra project and was
never meant to be a standard protocol. If the ask is to move towards handling
the protocol more as a standard that would evolve independently of
whether Cassandra implements it (would the project commit to implement it
eventually?), then let's be clear on what the concrete suggestion is and
have this discussion (but to be upfront, the short version of my personal
opinion is that this would likely be a big distraction with relatively low
merits for the project, so I'm very unconvinced).


I proposed several ways to cooperate. Yes, my "mode 1" essentially makes 
the protocol a standard.


For better or for worse, there are now at least 4 server-side 
implementations of the protocol, 5 if you count dse as a separate 
implementation. So it is de-facto a standard.




But if that's not the ask, what is it exactly? That we agree to commit changes
to the protocol spec before we have actually implemented them? If so, I just
don't get it. The downsides are clear (we risk that the feature is either never
implemented due to lack of contributions/loss of interest, or that the protocol
changes committed are not fully suitable to the final implementation) but what
benefit to the project can that ever have?


If another implementation defines a protocol change, and drivers are 
patched to implement that change, then when Cassandra implements that 
change it gets those driver changes for free. Provided of course that 
the protocol change has a technical match with the implementation.




Don't get me wrong, protocol-impacting changes/additions are very much welcome
if reasonable for Cassandra, and both CASSANDRA-14311 and CASSANDRA-2848 are
certainly worthy. The definition of done for both of those tickets certainly
includes the server implementation imo, not just changing the protocol spec
file. As for the shard notion, it makes no sense for Cassandra at this point
in time, so unless an additional contribution makes it so that it starts to
make sense, I'm not sure why we'd add anything related to it to the protocol.

--
Sylvain




RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves in by
defining a spec we haven't implemented, tested, and decided we are satisfied
with. Having it in ScyllaDB de-risks it to a certain extent, but what if
Cassandra decides to go a different direction in some way?

Such a proposal would include negotiation about the sharding algorithm
used to prevent Cassandra being boxed in. Of course it's impossible to
guarantee that a new idea won't come up that requires more changes.


I don't think there is much discussion to be had without an example of the
changes to the CQL specification to look at, but even then if it looks risky
I am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:

On 2018/04/19 07:19:27, kurt greaves  wrote:

1. The protocol change is developed using the Cassandra process in
 a JIRA ticket, culminating in a patch to
 doc/native_protocol*.spec when consensus is achieved.

I don't think forking would be desirable (for anyone) so this seems
the most reasonable to me. For 1 and 2 it certainly makes sense but
can't say I know enough about sharding to comment on 3 - seems to me
like it could be locking in a design before anyone truly knows what
sharding in C* looks like. But hopefully I'm wrong and there are
devs out there that have already thought that through.

Thanks. That is our view and is great to hear.

About our proposal number 3: In my view, good protocol designs are
future proof and flexible. We certainly don't want to propose a design
that works just for Scylla, but would support reasonable
implementations regardless of what they may look like.


Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority. ​











Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-19 20:43, Ariel Weisberg wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket for 
accept that is generating TCP connections and in advance you don't know which 
connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP connection 
delivered to a specific core? I know that a given TCP connection will normally 
have all of its packets delivered to the same queue from the NIC because the 
tuple of source address + port and destination address + port is typically 
hashed to pick one of the queues the NIC presents. I might have the contents of 
the tuple slightly wrong, but it always includes a component you don't get to 
control.


Right, that's how it's done. The component you typically don't get to 
control is the client-side local port, but you can bind to a local port 
if you want.
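
For illustration only, a minimal Python sketch (not from the thread) of a
client pinning its local port before connect(), so that the source-port
component of the NIC's hash is something the client controls. How the client
learns which local ports land on which server-side queue/shard is exactly the
open question being discussed; local_port is a placeholder the caller would
have to pick based on that knowledge.

import socket

def connect_to_shard(host, port, local_port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Pin the client-side port of the (src addr, src port, dst addr, dst port)
    # tuple; the NIC's hash of the connection becomes predictable if the hash
    # function and queue mapping are known.
    sock.bind(("", local_port))
    sock.connect((host, port))
    return sock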



Since it's hashing how do you manipulate which queue packets for a TCP 
connection go to and how is it made worse by having an accept socket per shard?


It's not made worse, it's just not made better.

There are three ways at least to get multiqueue to work with 
thread-per-core without software movement of packets, none of them pretty:


1. The client tells the server which shard to connect to. The server 
uses "Flow Director" [1] or an equivalent to bypass the hash and bind 
the connection to a particular queue. This is problematic since you need 
to bypass the tcp stack, and since there are a limited number of entries 
in the flow director table.
2. The client asks the server which shard it happened to connect to. 
This requires the client to open many connections in order to reach all 
shards, and then close any excess connections (did I mention it wasn't 
pretty?); see the sketch below.
3. The server communicates the hash function to the client, or perhaps 
suggests local ports for the client to use in order to reach a shard. 
This can be problematic if the server doesn't know the hash function 
(can happen in some virtualized environments, or with new NICs, or with 
limited knowledge of the hardware topology). See similar approach in [2].


[1] 
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/intel-ethernet-flow-director.pdf
[2] 
https://github.com/scylladb/seastar/blob/0b8b851b432a1d04522a80d9830e07449d71caa2/net/tcp.hh#L790
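
For option 2 above, a hedged Python sketch of the connection dance: open
connections until every shard has been seen, ask each connection which shard
it landed on, and close the surplus. query_shard() stands in for a
hypothetical protocol message (for example a vendor-prefixed OPTIONS key); it
is not part of the current native protocol.

def connections_per_shard(open_conn, query_shard, shard_count, max_attempts=1000):
    # open_conn() returns a new connection; query_shard(conn) asks the server
    # which shard the connection happened to land on (hypothetical message).
    by_shard = {}
    surplus = []
    for _ in range(max_attempts):
        if len(by_shard) == shard_count:
            break
        conn = open_conn()
        shard = query_shard(conn)
        if shard in by_shard:
            surplus.append(conn)   # shard already covered, close later
        else:
            by_shard[shard] = conn
    for conn in surplus:
        conn.close()
    return by_shard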




You also mention 160 ports as bad, but it doesn't sound like a big number 
resource wise. Is it an operational headache?


Port 9042 + N can easily conflict with another statically allocated port 
on the server. I guess you can listen on ephemeral ports, but then if 
you firewall them, you need to adjust the firewall rules.


In any case it doesn't solve the problem of directing a connection's 
packets to a specific queue.




RE tokens distributed amongst shards. The way that would work right now is that 
each port number appears to be a discrete instance of the server. So you could 
have shards be actual shards that are simply colocated on the same box, run in 
the same process, and share resources. I know this pushes more of the 
complexity into the server vs the driver as the server expects all shards to 
share some client-visible state like system tables and certain identifiers.


This has its own problems, I'll address them in the other sub-thread (or 
using our term, other continuation).




Ariel
On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:

Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit well with the NIC's ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)

On 2018-04-19 19:46, Ben Bromhead wrote:

WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so there might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg <ar...@weisberg.ws> wrote:


Hi,

I think that updating the protocol spec to Cassandra puts the onus on the
party changing the protocol specification to have an implementation of the
spec in Cassandra as well as the Java and Python driver (those are both
used in the Cassandra repo). Until it's implemented in Cassandra we haven't
fully evaluated the specification change. There is no substitute for trying to
make it work.

Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-19 20:33, Ariel Weisberg wrote:

Hi,


That basically means a fork in the protocol (perhaps a temporary fork if
we go for mode 2 where Cassandra retroactively adopts our protocol
changes, if they fit well).

Implementing a protocol change may be easy for some simple changes, but
in the general case, it is not realistic to expect it.
Can you elaborate? No one is forcing driver maintainers to update their
drivers to support new features, either for Cassandra or Scylla, but
there should be no reason for them to reject a contribution adding that
support.

I think it's unrealistic to expect the next version of the protocol spec to 
include functionality that is not supported by either  the server or drivers 
once a version of the server or driver supporting that protocol version is  
released. Putting something in the spec is making a hard commitment for the 
driver and server without also specifying who will do the work.

So yes a temporary fork is fine, but then you run into things like "we" don't 
like the spec change and find we want to change it again. For us it's fine because we 
never committed to supporting the fork either way. For the driver maintainers it's fine 
because they probably never accepted the spec change either and didn't update the 
drivers. This is because the maintainers aren't going to accept changes that are 
incompatible with what the Cassandra server implements.

So if you have a temporary fork of the spec you might also be committing to a 
temporary fork of the drivers as well as the headaches that come with the final 
version of the spec not matching your fork. We would do what we can to avoid 
that by having the conversation around the protocol design up front.

What I am largely getting at is that I think Apache Cassandra and its drivers 
can only truly commit to a spec where there is a released implementation in the 
server and drivers.


The drivers are not part of Cassandra, so what "the server" is for 
drivers is up to their maintainer.



  Up until that point the spec is subject to change. We are less likely to 
change it if there is an implementation because we have already done the work 
and dug up most of the issues.

For sharding this is thorny and I think Ben makes a really good suggestion RE 
leveraging CASSANDRA-7544.  For paging state and timeouts I think it's likely 
we could stick to what we work out spec wise and we are happy to have the 
discussion and learn from ScyllaDB de-risking protocol changes, but if no one 
commits to doing the work you might find we release the next protocol version 
without the tentative spec changes.


So I think my proposed mode 1 (where the protocol, but not the server, 
is updated in cassandra.git) is rejected. Let's discuss the two remaining 
options:


mode 2: cassandra.git reserves the prefix "SCYLLA" for the 
OPTIONS/SUPPORTED message, and, when it comes to implementing a protocol 
extension, it will consider the Scylla extensions and incorporate them into 
cassandra.git if they are found to be technically acceptable (but may of 
course extend the protocol in a different way if there is a technical 
reason)


mode 3: cassandra.git ignores Scylla


For Cassandra, the advantage of mode 2 is that if driver maintainers add 
support for the change (on their own or by merging changes authored by 
Scylla developers), then Cassandra developers get driver support with 
less effort.
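
As a sketch of what mode 2 could look like on the wire: the SUPPORTED
response already carries a string multimap of options, so a driver could
partition it into standard keys and keys under a reserved vendor prefix and
only opt into the latter when it understands them. The SCYLLA_SHARD_COUNT key
below is hypothetical, for illustration only.

RESERVED_PREFIX = "SCYLLA_"

def split_supported(supported):
    # Partition a SUPPORTED string multimap into standard options and
    # vendor-prefixed extensions a driver may choose to enable.
    standard = {k: v for k, v in supported.items() if not k.startswith(RESERVED_PREFIX)}
    extensions = {k: v for k, v in supported.items() if k.startswith(RESERVED_PREFIX)}
    return standard, extensions

standard, extensions = split_supported({
    "CQL_VERSION": ["3.4.5"],
    "COMPRESSION": ["lz4", "snappy"],
    "SCYLLA_SHARD_COUNT": ["160"],   # hypothetical extension key
})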




Ariel
On Thu, Apr 19, 2018, at 12:53 PM, Avi Kivity wrote:


On 2018-04-19 19:10, Ariel Weisberg wrote:

Hi,

I think that updating the protocol spec to Cassandra puts the onus on the party 
changing the protocol specification to have an implementation of the spec in 
Cassandra as well as the Java and Python driver (those are both used in the 
Cassandra repo). Until it's implemented in Cassandra we haven't fully evaluated 
the specification change. There is no substitute for trying to make it work.

That basically means a fork in the protocol (perhaps a temporary fork if
we go for mode 2 where Cassandra retroactively adopts our protocol
changes, if they fit well).

Implementing a protocol change may be easy for some simple changes, but
in the general case, it is not realistic to expect it.


There are also realities to consider as to what the maintainers of the drivers 
are willing to commit.

Can you elaborate? No one is forcing driver maintainers to update their
drivers to support new features, either for Cassandra or Scylla, but
there should be no reason for them to reject a contribution adding that
support.

If you refer to a potential politically-motivated rejection by the
DataStax-maintained drivers, then those drivers should and will be
forked. That's not true open source. However, I'm not assuming that will
happen.


RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client by 
breaking up the token ranges

Re: Evolving the client protocol

2018-04-19 Thread Avi Kivity



On 2018-04-19 10:19, kurt greaves wrote:

1. The protocol change is developed using the Cassandra process in a JIRA
ticket, culminating in a patch to doc/native_protocol*.spec when consensus
is achieved.

I don't think forking would be desirable (for anyone) so this seems the
most reasonable to me. For 1 and 2 it certainly makes sense but can't say I
know enough about sharding to comment on 3 - seems to me like it could be
locking in a design before anyone truly knows what sharding in C* looks
like. But hopefully I'm wrong and there are devs out there that have
already thought that through.


Too bad you missed your flight or you'd have seen my NGCC presentation 
about all the mistakes we made when developing the sharding algorithm.




Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority.



Why is that?




Re: Evolving the client protocol

2018-04-19 Thread Avi Kivity



On 2018-04-19 19:10, Ariel Weisberg wrote:

Hi,

I think that updating the protocol spec to Cassandra puts the onus on the party 
changing the protocol specification to have an implementation of the spec in 
Cassandra as well as the Java and Python driver (those are both used in the 
Cassandra repo). Until it's implemented in Cassandra we haven't fully evaluated 
the specification change. There is no substitute for trying to make it work.


That basically means a fork in the protocol (perhaps a temporary fork if 
we go for mode 2 where Cassandra retroactively adopts our protocol 
changes, if they fit well).


Implementing a protocol change may be easy for some simple changes, but 
in the general case, it is not realistic to expect it.



There are also realities to consider as to what the maintainers of the drivers 
are willing to commit.


Can you elaborate? No one is forcing driver maintainers to update their 
drivers to support new features, either for Cassandra or Scylla, but 
there should be no reason for them to reject a contribution adding that 
support.


If you refer to a potential politically-motivated rejection by the 
DataStax-maintained drivers, then those drivers should and will be 
forked. That's not true open source. However, I'm not assuming that will 
happen.




RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client by 
breaking up the token ranges, but it's a leaky abstraction to have a paging 
interface that isn't a vanilla ResultSet interface. Serial vs. parallel is kind 
of orthogonal as the driver can do either.

I agree it looks like the current specification doesn't make what should be 
simple as simple as it could be for driver implementers.

RE #2,

+1 on this change assuming an implementation in Cassandra and the Java and 
Python drivers.


Those were just given as examples. Each would be discussed on its own, 
assuming we are able to find a way to cooperate.



These are relatively simple and it wouldn't be hard for us to patch 
Cassandra. But I want to find a way to make more complicated protocol 
changes where it wouldn't be realistic for us to modify Cassandra.



RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves in by 
defining a spec we haven't implemented, tested, and decided we are satisfied 
with. Having it in ScyllaDB de-risks it to a certain extent, but what if 
Cassandra decides to go a different direction in some way?


Such a proposal would include negotiation about the sharding algorithm 
used to prevent Cassandra being boxed in. Of course it's impossible to 
guarantee that a new idea won't come up that requires more changes.



I don't think there is much discussion to be had without an example of the 
changes to the CQL specification to look at, but even then if it looks risky I 
am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:


On 2018/04/19 07:19:27, kurt greaves  wrote:

1. The protocol change is developed using the Cassandra process in
a JIRA ticket, culminating in a patch to
doc/native_protocol*.spec when consensus is achieved.

I don't think forking would be desirable (for anyone) so this seems
the most reasonable to me. For 1 and 2 it certainly makes sense but
can't say I know enough about sharding to comment on 3 - seems to me
like it could be locking in a design before anyone truly knows what
sharding in C* looks like. But hopefully I'm wrong and there are
devs out there that have already thought that through.

Thanks. That is our view and is great to hear.

About our proposal number 3: In my view, good protocol designs are
future proof and flexible. We certainly don't want to propose a design
that works just for Scylla, but would support reasonable
implementations regardless of what they may look like.


Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority. ​









Re: Evolving the client protocol

2018-04-19 Thread Avi Kivity
Port-per-shard is likely the easiest option but it's too ugly to 
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t 
IIRC), it will be just horrible to have 160 open ports.



It also doesn't fit well with the NIC's ability to automatically 
distribute packets among cores using multiple queues, so the kernel 
would have to shuffle those packets around. Much better to have those 
packets delivered directly to the core that will service them.



(also, some protocol changes are needed so the driver knows how tokens 
are distributed among shards)


On 2018-04-19 19:46, Ben Bromhead wrote:

WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so there might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg  wrote:


Hi,

I think that updating the protocol spec to Cassandra puts the onus on the
party changing the protocol specification to have an implementation of the
spec in Cassandra as well as the Java and Python driver (those are both
used in the Cassandra repo). Until it's implemented in Cassandra we haven't
fully evaluated the specification change. There is no substitute for trying
to make it work.

There are also realities to consider as to what the maintainers of the
drivers are willing to commit.

RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client
by breaking up the token ranges, but it's a leaky abstraction to have a
paging interface that isn't a vanilla ResultSet interface. Serial vs.
parallel is kind of orthogonal as the driver can do either.

I agree it looks like the current specification doesn't make what should
be simple as simple as it could be for driver implementers.

RE #2,

+1 on this change assuming an implementation in Cassandra and the Java and
Python drivers.

RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves in
by defining a spec we haven't implemented, tested, and decided we are
satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
what if Cassandra decides to go a different direction in some way?

I don't think there is much discussion to be had without an example of the
changes to the CQL specification to look at, but even then if it looks
risky I am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:


On 2018/04/19 07:19:27, kurt greaves  wrote:

1. The protocol change is developed using the Cassandra process in
a JIRA ticket, culminating in a patch to
doc/native_protocol*.spec when consensus is achieved.

I don't think forking would be desirable (for anyone) so this seems
the most reasonable to me. For 1 and 2 it certainly makes sense but
can't say I know enough about sharding to comment on 3 - seems to me
like it could be locking in a design before anyone truly knows what
sharding in C* looks like. But hopefully I'm wrong and there are
devs out there that have already thought that through.

Thanks. That is our view and is great to hear.

About our proposal number 3: In my view, good protocol designs are
future proof and flexible. We certainly don't want to propose a design
that works just for Scylla, but would support reasonable
implementations regardless of what they may look like.


Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority. ​



--

Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer







Evolving the client protocol

2018-04-18 Thread Avi Kivity

Hello Cassandra developers,


We're starting to see client protocol limitations impact performance, 
and so we'd like to evolve the protocol to remove the limitations. In 
order to avoid fragmenting the driver ecosystem and reduce work 
duplication for driver authors, we'd like to avoid forking the protocol. 
Since these issues affect Cassandra, either now or in the future, I'd 
like to cooperate on protocol development.



Some issues that we'd like to work on near-term are:


1. Token-aware range queries


When the server returns a page in a range query, it will also return a 
token to continue on. In case that token is on a different node, the 
client selects a new coordinator based on the token. This eliminates a 
network hop for range queries.



For the first page, the PREPARE message returns information allowing the 
client to compute where the first page is held, given the query 
parameters. This is just information identifying how to compute the 
token, given the query parameters (non-range queries already do this).



https://issues.apache.org/jira/browse/CASSANDRA-14311
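
A hedged sketch of how a driver might use this: if each page carried the
token at which the next page starts, the driver could re-route the next page
request to a replica owning that token instead of going back through the
original coordinator. next_page(), replicas_for_token() and the
continue_token/paging_state fields are illustrative names, not the actual
protocol fields, which CASSANDRA-14311 would define.

def token_aware_scan(first_coordinator, next_page, replicas_for_token):
    # Generator over all rows of a range scan, switching coordinator whenever
    # the continuation token moves to a range owned by a different node.
    coordinator = first_coordinator
    paging_state = None
    while True:
        page = next_page(coordinator, paging_state)
        yield from page.rows
        if page.continue_token is None:
            return
        coordinator = replicas_for_token(page.continue_token)[0]
        paging_state = page.paging_state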


2. Per-request timeouts


Allow each request to have its own timeout. This allows the user to set 
short timeouts on business-critical queries that are invalid if not 
served within a short time, long timeouts for scanning or indexed 
queries, and even longer timeouts for administrative tasks like TRUNCATE 
and DROP.



https://issues.apache.org/jira/browse/CASSANDRA-2848
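
As a sketch of the intended usage (the timeout_ms field is hypothetical; the
actual frame-level encoding is what CASSANDRA-2848 would define), requests of
different criticality would carry different budgets:

from dataclasses import dataclass

@dataclass
class QueryRequest:
    cql: str
    timeout_ms: int   # hypothetical per-request server-side timeout

requests = [
    # fail fast: useless if not answered within a few tens of milliseconds
    QueryRequest("SELECT price FROM quotes WHERE symbol = ?", timeout_ms=20),
    # scans and indexed queries get a longer budget
    QueryRequest("SELECT * FROM events WHERE token(pk) > ? AND token(pk) <= ?",
                 timeout_ms=30000),
    # administrative statements get the longest budget
    QueryRequest("TRUNCATE staging_events", timeout_ms=600000),
]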


3. Shard-aware driver


This admittedly is a burning issue for ScyllaDB, but not so much for 
Cassandra at this time.



In the same way that drivers are token-aware, they can be shard-aware - 
know how many shards each node has, and the sharding algorithm. They can 
then open a connection per shard and send cql requests directly to the 
shard that will serve them, instead of requiring cross-core 
communication to happen on the server.



https://issues.apache.org/jira/browse/CASSANDRA-10989
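
A minimal sketch of the driver side, under the assumption that the server
advertises its shard count and sharding algorithm: keep one connection per
shard and send each request on the connection for the shard owning the
partition's token. The modulo mapping below is only a placeholder; specifying
the real algorithm (and how it is negotiated) is precisely what the protocol
change would have to cover.

class ShardAwarePool:
    def __init__(self, connections, shard_count):
        self.connections = connections   # one open connection per shard, by shard id
        self.shard_count = shard_count

    def shard_of(self, token):
        # Placeholder sharding function; the real token-to-shard mapping would
        # be advertised or negotiated by the server.
        return token % self.shard_count

    def execute(self, token, request):
        conn = self.connections[self.shard_of(token)]
        return conn.send(request)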


I see three possible modes of cooperation:


1. The protocol change is developed using the Cassandra process in a 
JIRA ticket, culminating in a patch to doc/native_protocol*.spec when 
consensus is achieved.



The advantage to this mode is that Cassandra developers can verify that 
the change is easily implementable; when they are ready to implement the 
feature, drivers that were already adapted to support it will just work.



2. The protocol change is developed outside the Cassandra process.


In this mode, we develop the change in a forked version of 
native_protocol*.spec; Cassandra can still retroactively merge that 
change when (and if) it is implemented, but the ability to influence the 
change during development is reduced.



If we agree on this, I'd like to allocate a prefix for feature names in 
the SUPPORTED message for our use.



3. No cooperation.


This requires the least amount of effort from Cassandra developers (just 
enough to reach this point in this email), but will cause duplication of 
effort for driver authors who wish to support both projects, and may 
cause Cassandra developers to redo work that we already did.



Looking forward to your views.


Avi

