[jira] [Commented] (CASSANDRA-8959) More efficient frozen UDT, tuple and collection serialization format

2022-09-06 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600853#comment-17600853
 ] 

Avi Kivity commented on CASSANDRA-8959:
---

If you're changing the representation in the native protocol, please make it 
negotiated, so an old driver can continue talking to a new node, and so that 
new and old nodes can coexist.

> More efficient frozen UDT, tuple and collection serialization format
> 
>
> Key: CASSANDRA-8959
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8959
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Aleksey Yeschenko
>Priority: Normal
>  Labels: performance
> Fix For: 4.x
>
>
> The current serialization format for UDTs has a fixed overhead of 4 bytes per 
> defined field (encoding the size of the field).
> It is inefficient for sparse UDTs - ones with many defined fields, but few of 
> them present. We could keep a bitset to indicate the missing fields, if any.
> It's sub-optimal for encoding UDTs with all the values present as well. We 
> could use varint encoding for the field sizes of blob/text fields and encode 
> 'fixed' sized types directly, without the 4-bytes size prologue.
> That or something more brilliant. Any improvement right now is low-hanging fruit.
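
To make the proposal above concrete, here is a minimal Python sketch that compares a layout with a 4-byte length per defined field against one possible sparse layout (a presence bitset followed by varint-prefixed values). The function names, the LEB128-style varint, the bit ordering, and the use of -1 as the "missing" length are illustrative assumptions, not Cassandra's actual or planned wire format.

```python
import struct


def encode_current(fields):
    """Fixed overhead: a 4-byte big-endian length per defined field.

    A length of -1 marks a missing field (assumed convention for this sketch).
    """
    out = bytearray()
    for value in fields:
        if value is None:
            out += struct.pack(">i", -1)
        else:
            out += struct.pack(">i", len(value)) + value
    return bytes(out)


def encode_uvarint(n):
    """Unsigned varint, 7 payload bits per byte (an assumed encoding)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_sparse(fields):
    """Hypothetical layout: presence bitset, then varint length + bytes
    for each field that is present."""
    bitset = bytearray((len(fields) + 7) // 8)
    body = bytearray()
    for i, value in enumerate(fields):
        if value is not None:
            bitset[i // 8] |= 1 << (i % 8)
            body += encode_uvarint(len(value)) + value
    return bytes(bitset) + bytes(body)


# A sparse UDT: 16 defined fields, only the first one present.
fields = [b"abc"] + [None] * 15
print(len(encode_current(fields)), len(encode_sparse(fields)))
```

For this input the per-field layout spends 4 bytes on every missing field, while the bitset layout spends one bit. The sketch does not cover the "encode fixed-size types directly" part of the proposal.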



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17762) LWT IF col = NULL is inconsistent with SQL NULL

2022-08-23 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583746#comment-17583746
 ] 

Avi Kivity commented on CASSANDRA-17762:


One problem is that X = NULL is valid SQL; it just has a different meaning than 
in CQL (it evaluates to UNKNOWN, i.e. NULL). So the syntax can't be deprecated, 
only the interpretation. I think it makes sense to have a config item that 
selects which interpretation to use, so applications can be migrated 
independently of the update schedule (and problems with mixed-version clusters 
avoided).

> LWT IF col = NULL is inconsistent with SQL NULL
> ---
>
> Key: CASSANDRA-17762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17762
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Avi Kivity
>Priority: Normal
> Fix For: 4.x
>
>
> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in a 
> condition. To test for NULLness, you use IS NULL or IS NOT NULL.
> But LWT uses IF col = NULL as a NULLness test. This is likely to confuse 
> people coming from SQL and hamper attempts to extend the dialect.






[jira] [Comment Edited] (CASSANDRA-17762) LWT IF col = NULL is inconsistent with SQL NULL

2022-07-24 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570456#comment-17570456
 ] 

Avi Kivity edited comment on CASSANDRA-17762 at 7/24/22 10:55 AM:
--

I tested 4.0.4 with NULL provided via a bind variable to the WHERE clause, and 
it rejects the query.


was (Author: avi.kivity):
I tested 4.0.4 with NULL provided via a bind variable, and it rejects the query.

> LWT IF col = NULL is inconsistent with SQL NULL
> ---
>
> Key: CASSANDRA-17762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17762
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Avi Kivity
>Priority: Normal
> Fix For: 4.x
>
>
> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in a 
> condition. To test for NULLness, you use IS NULL or IS NOT NULL.
> But LWT uses IF col = NULL as a NULLness test. This is likely to confuse 
> people coming from SQL and hamper attempts to extend the dialect.






[jira] [Commented] (CASSANDRA-17762) LWT IF col = NULL is inconsistent with SQL NULL

2022-07-24 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570456#comment-17570456
 ] 

Avi Kivity commented on CASSANDRA-17762:


I tested 4.0.4 with NULL provided via a bind variable, and it rejects the query.

> LWT IF col = NULL is inconsistent with SQL NULL
> ---
>
> Key: CASSANDRA-17762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17762
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Avi Kivity
>Priority: Normal
> Fix For: 4.x
>
>
> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in a 
> condition. To test for NULLness, you use IS NULL or IS NOT NULL.
> But LWT uses IF col = NULL as a NULLness test. This is likely to confuse 
> people coming from SQL and hamper attempts to extend the dialect.






[jira] [Commented] (CASSANDRA-17762) LWT IF col = NULL is inconsistent with SQL NULL

2022-07-20 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17569121#comment-17569121
 ] 

Avi Kivity commented on CASSANDRA-17762:


{{WHERE}} is protected by a grammar limitation:

 
{noformat}
cassandra@cqlsh> select * from system.local where key = null ALLOW FILTERING;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid null value in condition for column key"
cassandra@cqlsh> select * from system.local where data_center = null ALLOW FILTERING;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Unsupported null value for column data_center"
{noformat}
However, there's a good chance this check can be bypassed by supplying the NULL 
via a bind variable.

> LWT IF col = NULL is inconsistent with SQL NULL
> ---
>
> Key: CASSANDRA-17762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17762
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Avi Kivity
>Priority: Normal
> Fix For: 4.x
>
>
> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in a 
> condition. To test for NULLness, you use IS NULL or IS NOT NULL.
> But LWT uses IF col = NULL as a NULLness test. This is likely to confuse 
> people coming from SQL and hamper attempts to extend the dialect.






[jira] [Commented] (CASSANDRA-17762) LWT IF col = NULL is inconsistent with SQL NULL

2022-07-20 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17568993#comment-17568993
 ] 

Avi Kivity commented on CASSANDRA-17762:


A possible solution is to add a configuration variable specifying how to handle 
= NULL: legacy, legacy+warn, sql+warn, sql. Eventually the legacy 
interpretation should be deprecated.
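
As a rough illustration of the four proposed settings, the Python sketch below evaluates an "IF col = NULL" condition under each mode. The enum, the function name, and the warning text are hypothetical; no such configuration option is confirmed to exist in Cassandra.

```python
import warnings
from enum import Enum


class NullEqMode(Enum):
    # Hypothetical names mirroring the four settings proposed above.
    LEGACY = "legacy"
    LEGACY_WARN = "legacy+warn"
    SQL_WARN = "sql+warn"
    SQL = "sql"


def eval_eq_null(column_value, mode):
    """Evaluate `IF col = NULL` for one column value (None stands for NULL)."""
    if mode in (NullEqMode.LEGACY_WARN, NullEqMode.SQL_WARN):
        warnings.warn("`= NULL` is ambiguous; prefer an explicit NULL test")
    if mode in (NullEqMode.LEGACY, NullEqMode.LEGACY_WARN):
        # Legacy LWT interpretation: `= NULL` acts as a NULLness test.
        return column_value is None
    # SQL interpretation: the comparison yields NULL, which a condition
    # treats as FALSE regardless of the column value.
    return False
```

The two "+warn" modes keep the behaviour of their base mode and only add a warning, which is what would let applications migrate before the default changes.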

> LWT IF col = NULL is inconsistent with SQL NULL
> ---
>
> Key: CASSANDRA-17762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17762
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Avi Kivity
>Priority: Normal
>
> In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in a 
> condition. To test for NULLness, you use IS NULL or IS NOT NULL.
> But LWT uses IF col = NULL as a NULLness test. This is likely to confuse 
> people coming from SQL and hamper attempts to extend the dialect.






[jira] [Created] (CASSANDRA-17762) LWT IF col = NULL is inconsistent with SQL NULL

2022-07-20 Thread Avi Kivity (Jira)
Avi Kivity created CASSANDRA-17762:
--

 Summary: LWT IF col = NULL is inconsistent with SQL NULL
 Key: CASSANDRA-17762
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17762
 Project: Cassandra
  Issue Type: Bug
  Components: CQL/Semantics
Reporter: Avi Kivity


In SQL, any comparison with NULL is NULL, which is interpreted as FALSE in a 
condition. To test for NULLness, you use IS NULL or IS NOT NULL.

But LWT uses IF col = NULL as a NULLness test. This is likely to confuse people 
coming from SQL and hamper attempts to extend the dialect.
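
The SQL behaviour described above can be sketched in a few lines of Python, using None to stand for NULL/UNKNOWN. The function names are illustrative only.

```python
def sql_eq(a, b):
    """SQL-style equality: any comparison involving NULL yields NULL."""
    if a is None or b is None:
        return None
    return a == b


def condition_passes(result):
    """A condition passes only on TRUE; FALSE and NULL both fail."""
    return result is True


def is_null(a):
    """The explicit NULLness test (IS NULL)."""
    return a is None


# `col = NULL` never passes under SQL rules, even when col is NULL.
print(condition_passes(sql_eq(None, None)))  # False
print(is_null(None))                         # True
```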






[jira] [Updated] (CASSANDRA-17693) Missing type checking on reused bind variables

2022-06-16 Thread Avi Kivity (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-17693:
---
Bug Category: Parent values: Correctness(12982)Level 1 values: API / 
Semantic Implementation(12988)  (was: Parent values: Correctness(12982)Level 1 
values: Recoverable Corruption / Loss(12986))

> Missing type checking on reused bind variables
> --
>
> Key: CASSANDRA-17693
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17693
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Semantics
>Reporter: Avi Kivity
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x
>
> Attachments: bind-var-type-conflict.py
>
>
> A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
> VALUES(:var)}}, {{:var}} has the same type as {{tab.a}}). But what if the same 
> variable is used in contexts that have different types? Assume {{a}} and 
> {{b}} have incompatible types:
> {{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}
> The server should reject the query, but it doesn't.
> I think what happens is that one :a shadows the other, so the other :a can't 
> even get a value.
> More complete reproducer attached.






[jira] [Updated] (CASSANDRA-17693) Missing type checking on reused bind variables

2022-06-15 Thread Avi Kivity (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-17693:
---
Description: 
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var, :var)}} has the same type as {{tab.a}}. But what if the same 
variable is used in contexts that have different types? Assume {{a}} and {{b}} 
have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.

  was:
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var, :var)}} has the same type as {{{}tab.a{}}}). But what if the same 
variable is used in contexts that have different types? Assume {{a}} and {{b}} 
have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.


> Missing type checking on reused bind variables
> --
>
> Key: CASSANDRA-17693
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17693
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Avi Kivity
>Priority: Normal
> Attachments: bind-var-type-conflict.py
>
>
> A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
> VALUES(:var, :var)}} has the same type as {{tab.a}}. But what if the same 
> variable is used in contexts that have different types? Assume {{a}} and 
> {{b}} have incompatible types:
> {{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}
> The server should reject the query, but it doesn't.
> I think what happens is that one :a shadows the other, so the other :a can't 
> even get a value.
> More complete reproducer attached.






[jira] [Updated] (CASSANDRA-17693) Missing type checking on reused bind variables

2022-06-15 Thread Avi Kivity (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-17693:
---
Description: 
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var) ,:var}} has the same type as {{{}tab.a{}}}). But what if the same 
variable is used in contexts that have different types? Assume {{a}} and {{b}} 
have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.

  was:
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var)}} , {{:var}} has the same type as {{{}tab.a{}}}). But what if the 
same variable is used in contexts that have different types? Assume {{a}} and 
{{b}} have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.


> Missing type checking on reused bind variables
> --
>
> Key: CASSANDRA-17693
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17693
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Avi Kivity
>Priority: Normal
> Attachments: bind-var-type-conflict.py
>
>
> A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
> VALUES(:var) ,:var}} has the same type as {{{}tab.a{}}}). But what if the 
> same variable is used in contexts that have different types? Assume {{a}} and 
> {{b}} have incompatible types:
> {{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}
> The server should reject the query, but it doesn't.
> I think what happens is that one :a shadows the other, so the other :a can't 
> even get a value.
> More complete reproducer attached.






[jira] [Updated] (CASSANDRA-17693) Missing type checking on reused bind variables

2022-06-15 Thread Avi Kivity (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-17693:
---
Description: 
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var)}}, {{:var}} has the same type as {{tab.a}}. But what if the same 
variable is used in contexts that have different types? Assume {{a}} and {{b}} 
have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.

  was:
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var)}} has the same type as {{tab.a}}. But what if the same variable is 
used in contexts that have different types? Assume {{a}} and {{b}} have 
incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.


> Missing type checking on reused bind variables
> --
>
> Key: CASSANDRA-17693
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17693
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Avi Kivity
>Priority: Normal
> Attachments: bind-var-type-conflict.py
>
>
> A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
> VALUES(:var)}}, {{:var}} has the same type as {{tab.a}}. But what if the same 
> variable is used in contexts that have different types? Assume {{a}} and 
> {{b}} have incompatible types:
> {{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}
> The server should reject the query, but it doesn't.
> I think what happens is that one :a shadows the other, so the other :a can't 
> even get a value.
> More complete reproducer attached.






[jira] [Updated] (CASSANDRA-17693) Missing type checking on reused bind variables

2022-06-15 Thread Avi Kivity (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-17693:
---
Description: 
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var)}} has the same type as {{tab.a}}. But what if the same variable is 
used in contexts that have different types? Assume {{a}} and {{b}} have 
incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.

  was:
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var, :var)}} has the same type as {{tab.a}}. But what if the same 
variable is used in contexts that have different types? Assume {{a}} and {{b}} 
have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.


> Missing type checking on reused bind variables
> --
>
> Key: CASSANDRA-17693
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17693
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Avi Kivity
>Priority: Normal
> Attachments: bind-var-type-conflict.py
>
>
> A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
> VALUES(:var)}} has the same type as {{tab.a}}. But what if the same variable 
> is used in contexts that have different types? Assume {{a}} and {{b}} have 
> incompatible types:
> {{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}
> The server should reject the query, but it doesn't.
> I think what happens is that one :a shadows the other, so the other :a can't 
> even get a value.
> More complete reproducer attached.






[jira] [Updated] (CASSANDRA-17693) Missing type checking on reused bind variables

2022-06-15 Thread Avi Kivity (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-17693:
---
Description: 
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var, :var)}} has the same type as {{{}tab.a{}}}). But what if the same 
variable is used in contexts that have different types? Assume {{a}} and {{b}} 
have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.

  was:
A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var) ,:var}} has the same type as {{{}tab.a{}}}). But what if the same 
variable is used in contexts that have different types? Assume {{a}} and {{b}} 
have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.


> Missing type checking on reused bind variables
> --
>
> Key: CASSANDRA-17693
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17693
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Avi Kivity
>Priority: Normal
> Attachments: bind-var-type-conflict.py
>
>
> A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
> VALUES(:var, :var)}} has the same type as {{{}tab.a{}}}). But what if the 
> same variable is used in contexts that have different types? Assume {{a}} and 
> {{b}} have incompatible types:
> {{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}
> The server should reject the query, but it doesn't.
> I think what happens is that one :a shadows the other, so the other :a can't 
> even get a value.
> More complete reproducer attached.






[jira] [Created] (CASSANDRA-17693) Missing type checking on reused bind variables

2022-06-15 Thread Avi Kivity (Jira)
Avi Kivity created CASSANDRA-17693:
--

 Summary: Missing type checking on reused bind variables
 Key: CASSANDRA-17693
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17693
 Project: Cassandra
  Issue Type: Bug
Reporter: Avi Kivity
 Attachments: bind-var-type-conflict.py

A bind variable gets its type from its context (e.g. in {{INSERT INTO tab(a) 
VALUES(:var)}}, {{:var}} has the same type as {{tab.a}}). But what if the 
same variable is used in contexts that have different types? Assume {{a}} and 
{{b}} have incompatible types:

{{INSERT INTO ks.tab (id, a, b) VALUES(:id, :a, :a)}}


The server should reject the query, but it doesn't.

I think what happens is that one :a shadows the other, so the other :a can't 
even get a value.

More complete reproducer attached.
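
The missing check could look roughly like the Python sketch below. This is not the attached reproducer and not Cassandra's code. The function name, the string type names, and the strict equality test (real type-compatibility rules would be looser) are all assumptions made for illustration.

```python
def infer_bind_types(columns, markers, column_types):
    """Give each named bind marker the type of the column it is bound to.

    Raises TypeError if the same marker appears in two contexts whose
    types differ, instead of letting one use silently shadow the other.
    """
    inferred = {}
    for column, marker in zip(columns, markers):
        col_type = column_types[column]
        seen = inferred.setdefault(marker, col_type)
        if seen != col_type:
            raise TypeError(
                f"bind variable {marker} used as both {seen} and {col_type}"
            )
    return inferred


# Hypothetical schema where a and b have incompatible types.
schema = {"id": "int", "a": "int", "b": "text"}
try:
    infer_bind_types(["id", "a", "b"], [":id", ":a", ":a"], schema)
except TypeError as exc:
    print("rejected:", exc)
```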






[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol

2020-12-17 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250971#comment-17250971
 ] 

Avi Kivity commented on CASSANDRA-13304:


I filed https://issues.apache.org/jira/browse/CASSANDRA-16360 proposing to 
change to CRC32C.

> Add checksumming to the native protocol
> ---
>
> Key: CASSANDRA-13304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Michael Kjellman
>Assignee: Sam Tunnicliffe
>Priority: Urgent
> Fix For: 4.0, 4.0-alpha1
>
> Attachments: 13304_v1.diff, boxplot-read-throughput.png, 
> boxplot-write-throughput.png
>
>
> The native binary transport implementation doesn't include checksums. This 
> makes it highly susceptible to silently inserting corrupted data either due 
> to hardware issues causing bit flips on the sender/client side, C*/receiver 
> side, or network in between.
> Attaching an implementation that makes checksum'ing mandatory (assuming both 
> client and server know about a protocol version that supports checksums) -- 
> and also adds checksumming to clients that request compression.
> The serialized format looks something like this:
> {noformat}
>  *                      1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
>  *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Number of Compressed Chunks  |     Compressed Length (e1)    /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * /  Compressed Length cont. (e1) |    Uncompressed Length (e1)   /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Checksum of Lengths cont. (e1)|    Compressed Bytes (e1)    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                     CRC32 Checksum (e1)                      ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Length (e2)                     |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                   Uncompressed Length (e2)                    |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                CRC32 Checksum of Lengths (e2)                 |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Bytes (e2)                    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                     CRC32 Checksum (e2)                      ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Length (en)                     |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                   Uncompressed Length (en)                    |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                CRC32 Checksum of Lengths (en)                 |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Bytes (en)                    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                     CRC32 Checksum (en)                      ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> {noformat}
> The first pass here adds checksums only to the actual contents of the frame 
> body itself (and doesn't actually checksum lengths and headers). While it 
> would be great to fully add checksumming across the entire protocol, the 
> proposed implementation will ensure we at least catch corrupted data and 
> likely protect ourselves pretty well anyways.
> I didn't go to the trouble of implementing a Snappy Checksum'ed Compressor 
> implementation as it's been deprecated for a while -- is really slow and 
> crappy compared to LZ4 -- and we should do everything in our power to make 
> sure no one in the community is still using it. I left it in (for obvious 
> backwards compatibility aspects) for old clients that don't know about the 
> new protocol.
> The current protocol has a 256MB (max) frame body -- where the serialized 
> contents are simply written in to the frame body.
> If the client sends a compression option in the startup, we will install a 
> FrameCompressor inline. Unfortunately, we went with a decision to treat the 
> frame body separately from the header bits etc in a given message. So, 
> instead we put a compressor implementation in the options and then 

[jira] [Created] (CASSANDRA-16360) CRC32 is inefficient on x86

2020-12-17 Thread Avi Kivity (Jira)
Avi Kivity created CASSANDRA-16360:
--

 Summary: CRC32 is inefficient on x86
 Key: CASSANDRA-16360
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16360
 Project: Cassandra
  Issue Type: Improvement
  Components: Messaging/Client
Reporter: Avi Kivity


The client/server protocol specifies CRC24 and CRC32 as the checksum algorithms 
(cql_protocol_V5_framing.asc). Those, however, are expensive to compute; this 
affects both the client and the server.

 

A better checksum algorithm is CRC32C, which has hardware support on x86 (as 
well as other modern architectures).
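
For reference, the two checksums differ only in the generator polynomial. The Python sketch below computes CRC32 with the standard library and CRC32C with a bit-at-a-time software loop. It shows that the two algorithms give different results for the same input. It does not demonstrate the speed difference, because the hardware-assisted path is not used here. The crc32c helper is written for this example and is not a library API.

```python
import zlib

# Reflected form of the Castagnoli polynomial used by CRC32C.
CRC32C_POLY = 0x82F63B78


def crc32c(data, crc=0):
    """Bitwise CRC32C: initial value and final XOR are both 0xFFFFFFFF."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial when a 1 falls out.
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


payload = b"123456789"
print(hex(zlib.crc32(payload)))  # CRC32 (zlib polynomial)
print(hex(crc32c(payload)))      # CRC32C (Castagnoli polynomial)
```

Because the results differ, switching algorithms is a wire-format change, so both sides of a connection would have to agree on which checksum is in use.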






[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol

2020-12-13 Thread Avi Kivity (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248576#comment-17248576
 ] 

Avi Kivity commented on CASSANDRA-13304:


Please consider using the CRC32C polynomial instead of the CRC32 polynomial. 
The CRC32C polynomial has hardware implementations on x86, while the CRC32 
polynomial does not.

 

CRC32C is natively supported by Java: 
https://docs.oracle.com/javase/9/docs/api/java/util/zip/CRC32C.html

> Add checksumming to the native protocol
> ---
>
> Key: CASSANDRA-13304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Michael Kjellman
>Assignee: Sam Tunnicliffe
>Priority: Urgent
> Fix For: 4.0, 4.0-alpha1
>
> Attachments: 13304_v1.diff, boxplot-read-throughput.png, 
> boxplot-write-throughput.png
>
>
> The native binary transport implementation doesn't include checksums. This 
> makes it highly susceptible to silently inserting corrupted data either due 
> to hardware issues causing bit flips on the sender/client side, C*/receiver 
> side, or network in between.
> Attaching an implementation that makes checksum'ing mandatory (assuming both 
> client and server know about a protocol version that supports checksums) -- 
> and also adds checksumming to clients that request compression.
> The serialized format looks something like this:
> {noformat}
>  *                      1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
>  *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Number of Compressed Chunks  |     Compressed Length (e1)    /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * /  Compressed Length cont. (e1) |    Uncompressed Length (e1)   /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Checksum of Lengths cont. (e1)|    Compressed Bytes (e1)    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                     CRC32 Checksum (e1)                      ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Length (e2)                     |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                   Uncompressed Length (e2)                    |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                CRC32 Checksum of Lengths (e2)                 |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Bytes (e2)                    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                     CRC32 Checksum (e2)                      ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Length (en)                     |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                   Uncompressed Length (en)                    |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                CRC32 Checksum of Lengths (en)                 |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Bytes (en)                    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                     CRC32 Checksum (en)                      ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> {noformat}
> The first pass here adds checksums only to the actual contents of the frame 
> body itself (and doesn't actually checksum lengths and headers). While it 
> would be great to fully add checksumming across the entire protocol, the 
> proposed implementation will ensure we at least catch corrupted data and 
> likely protect ourselves pretty well anyways.
> I didn't go to the trouble of implementing a Snappy checksummed compressor, 
> as Snappy has been deprecated for a while (it is really slow and crappy 
> compared to LZ4), and we should do everything in our power to make sure no 
> one in the community is still using it. I left the existing Snappy support 
> in place, for obvious backwards-compatibility reasons, for old clients that 
> don't know about the new protocol.
> The current protocol has a 256MB (max) frame body -- where the serialized 
> contents are simply written into the frame body.
> If the client sends a compression option in the startup, we will install a 
> FrameCompressor inline. 
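
As a rough illustration of the layout in the diagram above, the following Python sketch encodes and decodes a chunked, checksummed body. The field widths (a 16-bit chunk count, 32-bit lengths and CRC32 values), the big-endian byte order, the chunk size, the choice to checksum the compressed bytes, and the use of zlib in place of LZ4 are all assumptions made for the example; none of them are taken from the attached implementation.

```python
import struct
import zlib

CHUNK_SIZE = 32 * 1024  # hypothetical chunk size, not taken from the patch


def encode_body(payload: bytes) -> bytes:
    """Split the payload into chunks and frame each one with lengths and CRC32s."""
    chunks = [payload[i:i + CHUNK_SIZE] for i in range(0, len(payload), CHUNK_SIZE)]
    out = [struct.pack(">H", len(chunks))]  # number of compressed chunks
    for chunk in chunks:
        compressed = zlib.compress(chunk)  # zlib stands in for LZ4 here
        lengths = struct.pack(">II", len(compressed), len(chunk))
        out.append(lengths)
        out.append(struct.pack(">I", zlib.crc32(lengths)))     # checksum of lengths
        out.append(compressed)
        out.append(struct.pack(">I", zlib.crc32(compressed)))  # checksum of bytes
    return b"".join(out)


def decode_body(frame: bytes) -> bytes:
    """Verify both checksums of every chunk before decompressing it."""
    (count,) = struct.unpack_from(">H", frame, 0)
    pos = 2
    parts = []
    for _ in range(count):
        lengths = frame[pos:pos + 8]
        compressed_len, uncompressed_len = struct.unpack(">II", lengths)
        (lengths_crc,) = struct.unpack_from(">I", frame, pos + 8)
        if zlib.crc32(lengths) != lengths_crc:
            raise ValueError("corrupted chunk lengths")
        pos += 12
        compressed = frame[pos:pos + compressed_len]
        pos += compressed_len
        (data_crc,) = struct.unpack_from(">I", frame, pos)
        pos += 4
        if zlib.crc32(compressed) != data_crc:
            raise ValueError("corrupted chunk data")
        chunk = zlib.decompress(compressed)
        if len(chunk) != uncompressed_len:
            raise ValueError("unexpected uncompressed length")
        parts.append(chunk)
    return b"".join(parts)
```

Checksumming the lengths separately means a flipped bit in a length field is caught before the reader tries to slice or allocate based on it.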

[jira] [Created] (CASSANDRA-15317) CAST AS function vulnerable to integer overflow

2019-09-08 Thread Avi Kivity (Jira)
Avi Kivity created CASSANDRA-15317:
--

 Summary: CAST AS function vulnerable to integer overflow
 Key: CASSANDRA-15317
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15317
 Project: Cassandra
  Issue Type: Bug
  Components: CQL/Interpreter
Reporter: Avi Kivity


 
{noformat}
cqlsh:ks1> create table bigdec (k decimal  primary key);

cqlsh:ks1> insert into bigdec (k) values (100);

cqlsh:ks1> select * from bigdec;
 k
-
 100
(1 rows)

cqlsh:ks1> select cast(k as int) from bigdec;
 cast(k as int)

      276447232{noformat}
This overflow is unexpected for the user and can lead to incorrect results. 
Better to refuse to execute the query.
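
To make the failure mode concrete, here is a small Python sketch contrasting a wrapping narrowing conversion with a checked one. It assumes the cast keeps only the low-order 32 bits of the integer value, which is how a two's-complement narrowing behaves; whether Cassandra's CAST is implemented that way is not stated in this report. Under that assumption, 10^14 narrows to exactly 276447232, the value shown above, so the inserted literal was plausibly larger than it appears in this archive. The function names are hypothetical.

```python
from decimal import Decimal

INT_MIN, INT_MAX = -2**31, 2**31 - 1


def wrapping_cast_to_int(value: Decimal) -> int:
    """Keep only the low 32 bits, as a two's-complement narrowing does."""
    low = int(value) & 0xFFFFFFFF
    return low - 2**32 if low > INT_MAX else low


def checked_cast_to_int(value: Decimal) -> int:
    """Refuse values that do not fit in a signed 32-bit int."""
    n = int(value)
    if not INT_MIN <= n <= INT_MAX:
        raise OverflowError(f"{value} does not fit in a 32-bit int")
    return n


print(wrapping_cast_to_int(Decimal(10**14)))  # 276447232
```

The checked variant corresponds to the suggestion of refusing to execute the query rather than returning a silently wrapped value.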



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15250) Large cartesian produces in IN queries cause the server to run out of memory

2019-07-27 Thread Avi Kivity (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894414#comment-16894414
 ] 

Avi Kivity commented on CASSANDRA-15250:


Note: low priority as far as I am concerned; I just hit the same problem in 
Scylla, saw that Cassandra is also vulnerable, and so reported it.

 

It should be possible to serve such large queries by giving up the 
sort-by-token order and not materializing the cartesian product, but it's 
simpler to just limit the number of rows.
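
A minimal sketch of the simpler option, in Python: the size of the cartesian product is just the product of the IN-list lengths, so it can be checked before any combination is generated. The limit value and the function names are hypothetical, and duplicate values inside an IN list are ignored for simplicity.

```python
from itertools import product
from math import prod

MAX_PARTITION_KEYS = 100_000  # hypothetical limit, not an actual Cassandra setting


def count_combinations(in_lists):
    """Number of partition keys selected by the combined IN restrictions."""
    return prod(len(values) for values in in_lists)


def partition_keys(in_lists, limit=MAX_PARTITION_KEYS):
    """Reject oversized queries up front; otherwise yield keys lazily."""
    total = count_combinations(in_lists)
    if total > limit:
        raise ValueError(
            f"IN restrictions select {total} partition keys, limit is {limit}")
    return product(*in_lists)  # an iterator, nothing is materialized here
```

For the query in this issue, nine lists of ten values give 10^9 combinations, so the check fails immediately instead of exhausting memory.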

> Large cartesian produces in IN queries cause the server to run out of memory
> 
>
> Key: CASSANDRA-15250
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15250
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Interpreter
>Reporter: Avi Kivity
>Priority: Normal
>
> The queries
>  
> {{    create table tab (pk1 int, pk2 int, pk3 int, pk4 int, pk5 int, pk6 int, 
> pk7 int, pk8 int, pk9 int, primary key((pk1, pk2, pk3, pk4, pk5, pk6, pk7, 
> pk8, pk9)));}}
>  
> {{    select * from tab where pk1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk2 
> in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk3 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) 
> and pk4 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk5 in (1, 2, 3, 4, 5, 6, 7, 
> 8, 9, 10) and pk6 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk7 in (1, 2, 3, 4, 
> 5, 6, 7, 8, 9, 10) and pk8 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk9 in (1, 
> 2, 3, 4, 5, 6, 7, 8, 9, 10) ; }}
>  
> Will cause the server to enter a garbage collection spiral from which it does 
> not recover. The queries generate a large (1 billion row) cartesian product 
> which the server presumably materializes in memory, and fails.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)




[jira] [Updated] (CASSANDRA-15250) Large cartesian produces in IN queries cause the server to run out of memory

2019-07-27 Thread Avi Kivity (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-15250:
---
Description: 
The queries

 

{{    create table tab (pk1 int, pk2 int, pk3 int, pk4 int, pk5 int, pk6 int, 
pk7 int, pk8 int, pk9 int, primary key((pk1, pk2, pk3, pk4, pk5, pk6, pk7, pk8, 
pk9)));}}

 

{{    select * from tab where pk1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk2 in 
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk3 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and 
pk4 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk5 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 
10) and pk6 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk7 in (1, 2, 3, 4, 5, 6, 7, 
8, 9, 10) and pk8 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk9 in (1, 2, 3, 4, 5, 
6, 7, 8, 9, 10); }}

 

Will cause the server to enter a garbage collection spiral from which it does 
not recover. The queries generate a large (1 billion row) cartesian product 
which the server presumably materializes in memory, and fails.

 

  was:
The queries

 

{{    create table tab (pk1 int, pk2 int, pk3 int, pk4 int, pk5 int, pk6 int, 
pk7 int, pk8 int, pk9 int, primary key((pk1, pk2, pk3, pk4, pk5, pk6, pk7, pk8, 
pk9)));}}

 

{{    select * from tab where pk1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk2 in 
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk3 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and 
pk4 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk5 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 
10) and pk6 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk7 in (1, 2, 3, 4, 5, 6, 7, 
8, 9, 10) and pk8 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk9 in (1, 2, 3, 4, 5, 
6, 7, 8, 9, 10) ; }}

 

Will cause the server to enter a garbage collection spiral from which it does 
not recover. The queries generate a large (1 billion row) cartesian product 
which the server presumably materializes in memory, and fails.

 


> Large cartesian produces in IN queries cause the server to run out of memory
> 
>
> Key: CASSANDRA-15250
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15250
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Interpreter
>Reporter: Avi Kivity
>Priority: Normal
>
> The queries
>  
> {{    create table tab (pk1 int, pk2 int, pk3 int, pk4 int, pk5 int, pk6 int, 
> pk7 int, pk8 int, pk9 int, primary key((pk1, pk2, pk3, pk4, pk5, pk6, pk7, 
> pk8, pk9)));}}
>  
> {{    select * from tab where pk1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk2 
> in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk3 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) 
> and pk4 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk5 in (1, 2, 3, 4, 5, 6, 7, 
> 8, 9, 10) and pk6 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk7 in (1, 2, 3, 4, 
> 5, 6, 7, 8, 9, 10) and pk8 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk9 in (1, 
> 2, 3, 4, 5, 6, 7, 8, 9, 10); }}
>  
> Will cause the server to enter a garbage collection spiral from which it does 
> not recover. The queries generate a large (1 billion row) cartesian product 
> which the server presumably materializes in memory, and fails.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)




[jira] [Created] (CASSANDRA-15250) Large cartesian produces in IN queries cause the server to run out of memory

2019-07-27 Thread Avi Kivity (JIRA)
Avi Kivity created CASSANDRA-15250:
--

 Summary: Large cartesian produces in IN queries cause the server 
to run out of memory
 Key: CASSANDRA-15250
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15250
 Project: Cassandra
  Issue Type: Bug
  Components: CQL/Interpreter
Reporter: Avi Kivity


The queries

 

{{    create table tab (pk1 int, pk2 int, pk3 int, pk4 int, pk5 int, pk6 int, 
pk7 int, pk8 int, pk9 int, primary key((pk1, pk2, pk3, pk4, pk5, pk6, pk7, pk8, 
pk9)));}}

 

{{    select * from tab where pk1 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk2 in 
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk3 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and 
pk4 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk5 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 
10) and pk6 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk7 in (1, 2, 3, 4, 5, 6, 7, 
8, 9, 10) and pk8 in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and pk9 in (1, 2, 3, 4, 5, 
6, 7, 8, 9, 10) ; }}

 

Will cause the server to enter a garbage collection spiral from which it does 
not recover. The queries generate a large (1 billion row) cartesian product 
which the server presumably materializes in memory, and fails.

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)




[jira] [Updated] (CASSANDRA-14541) Order of warning and custom payloads is unspecified in the protocol specification

2018-06-24 Thread Avi Kivity (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avi Kivity updated CASSANDRA-14541:
---
Attachment: v1-0001-Document-order-of-tracing-warning-and-custom-payl.patch
Status: Patch Available  (was: Open)

> Order of warning and custom payloads is unspecified in the protocol 
> specification
> -
>
> Key: CASSANDRA-14541
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14541
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Avi Kivity
>Priority: Trivial
> Attachments: 
> v1-0001-Document-order-of-tracing-warning-and-custom-payl.patch
>
>
> Section 2.2 of the protocol specification documents the types of tracing, 
> warning, and custom payloads, but does not document their order in the body.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14541) Order of warning and custom payloads is unspecified in the protocol specification

2018-06-24 Thread Avi Kivity (JIRA)
Avi Kivity created CASSANDRA-14541:
--

 Summary: Order of warning and custom payloads is unspecified in 
the protocol specification
 Key: CASSANDRA-14541
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14541
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation and Website
Reporter: Avi Kivity


Section 2.2 of the protocol specification documents the types of tracing, 
warning, and custom payloads, but does not document their order in the body.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (CASSANDRA-14348) Per-request timeouts

2018-03-28 Thread Avi Kivity (JIRA)
Avi Kivity created CASSANDRA-14348:
--

 Summary: Per-request timeouts
 Key: CASSANDRA-14348
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14348
 Project: Cassandra
  Issue Type: Improvement
  Components: Coordination
Reporter: Avi Kivity


Cassandra currently allows separate timeout configuration for writes, 
single-partition reads, and range queries. However, this suffers from several 
deficiencies:
 * configuration file changes must be replicated across all nodes, and the 
nodes must be restarted for them to take effect
 * the single-partition vs. range-query split doesn't correlate with short vs. 
long queries once large partitions are taken into account
 * the same cluster may need to serve time-critical queries and non-critical 
queries simultaneously; there is no way to configure that.

We should have a way to configure the timeout at the request level, in the same 
way we can configure the consistency level. An alternative is to add a WITH 
TIMEOUT clause to CQL, similar to USING TTL or TIMESTAMP.
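
As a sketch of the idea, a per-request timeout could be resolved the same way a per-request consistency level is: use the value carried by the request if present, and fall back to the node-wide default for that request kind otherwise. Everything below (the `Request` type, the field names, the default values) is hypothetical and only illustrates the fallback logic.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical node-wide defaults in milliseconds, keyed by request kind.
DEFAULT_TIMEOUTS_MS = {"write": 2000, "read": 5000, "range": 10000}


@dataclass
class Request:
    kind: str                          # "write", "read" or "range"
    consistency: str = "ONE"           # already configurable per request
    timeout_ms: Optional[int] = None   # proposed per-request override


def effective_timeout_ms(request: Request) -> int:
    """Prefer the request's own timeout; otherwise use the configured default."""
    if request.timeout_ms is not None:
        return request.timeout_ms
    return DEFAULT_TIMEOUTS_MS[request.kind]
```

With this shape, a time-critical read and a batch-style range scan can share a cluster without any configuration file change or restart.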



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14311) Allow Token-Aware drivers for range scans

2018-03-15 Thread Avi Kivity (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400621#comment-16400621
 ] 

Avi Kivity commented on CASSANDRA-14311:


Thanks JIRA for converting my code into emojis. Is nowhere safe?

> Allow Token-Aware drivers for range scans
> -
>
> Key: CASSANDRA-14311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14311
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Avi Kivity
>Priority: Major
>
> Currently, range scans are not token aware. This means that an extra hop is 
> needed for most requests. Since range scans are usually data intensive, this 
> causes significant extra traffic.
>  
> Token awareness could be enabled by having the coordinator return the token 
> for the next (still unread) row in the response, so the driver can select a 
> next coordinator that owns this row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (CASSANDRA-14311) Allow Token-Aware drivers for range scans

2018-03-15 Thread Avi Kivity (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400619#comment-16400619
 ] 

Avi Kivity commented on CASSANDRA-14311:


{quote}We would need to add some hints here about the token ranges covered by 
the query for the driver to use.
{quote}
There's no need for hints. You send the first page (which will likely miss the 
replica), and in addition to has_more_pages you also get a token for the next 
page.
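
On the driver side, this could look roughly like the Python sketch below. It assumes a simplified ring with one token per node and no replication, where each node owns the tokens up to and including its own, and it assumes the response carries a hypothetical next-row token alongside has_more_pages. Real drivers would consult their existing token map and replica sets instead.

```python
from bisect import bisect_left
from typing import Optional

# Hypothetical ring: sorted (token, node) pairs; a node owns the range that
# ends at its own token (inclusive).
RING = [(-6000, "node-a"), (0, "node-b"), (6000, "node-c")]
TOKENS = [token for token, _ in RING]


def owner(token: int) -> str:
    """Find the node whose range contains the token, wrapping around the ring."""
    index = bisect_left(TOKENS, token)
    return RING[index % len(RING)][1]


def pick_coordinator(next_token: Optional[int], fallback: str) -> str:
    """Route the next page to the owner of the next unread row, if known."""
    if next_token is None:
        return fallback
    return owner(next_token)
```

The query stays strictly sequential: only the choice of coordinator for each page changes.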

 

We could also optimize the first page by providing more metadata. There are a 
few cases to consider:

 
 # SELECT ... FROM ... WHERE token(pk) >= ?
 # SELECT ... FROM ... WHERE token(pk) >= token(?)
 # SELECT ... FROM ... WHERE (no lower-bound specified)

(1 and 2 also need to support >).

 

If the metadata describes these cases, then we can send the first page query to 
a coordinator that is also a replica.

> Allow Token-Aware drivers for range scans
> -
>
> Key: CASSANDRA-14311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14311
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Avi Kivity
>Priority: Major
>
> Currently, range scans are not token aware. This means that an extra hop is 
> needed for most requests. Since range scans are usually data intensive, this 
> causes significant extra traffic.
>  
> Token awareness could be enabled by having the coordinator return the token 
> for the next (still unread) row in the response, so the driver can select a 
> next coordinator that owns this row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (CASSANDRA-14311) Allow Token-Aware drivers for range scans

2018-03-15 Thread Avi Kivity (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400592#comment-16400592
 ] 

Avi Kivity commented on CASSANDRA-14311:


{quote}this allows querying multiple token ranges in parallel and gives even 
more benefits than just node hopping for the “next” page
{quote}
 

That changes the query semantics; not all users are prepared for a parallel 
scan in a single thread. It is certainly viable, though.

 

My proposal optimizes the existing use case where you have a sequential query 
and want to keep it sequential.

> Allow Token-Aware drivers for range scans
> -
>
> Key: CASSANDRA-14311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14311
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Avi Kivity
>Priority: Major
>
> Currently, range scans are not token aware. This means that an extra hop is 
> needed for most requests. Since range scans are usually data intensive, this 
> causes significant extra traffic.
>  
> Token awareness could be enabled by having the coordinator return the token 
> for the next (still unread) row in the response, so the driver can select a 
> next coordinator that owns this row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (CASSANDRA-14311) Allow Token-Aware drivers for range scans

2018-03-14 Thread Avi Kivity (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398201#comment-16398201
 ] 

Avi Kivity commented on CASSANDRA-14311:


{quote}Interesting. If the page border spans to another token, you'd still want 
to fill the whole page, wouldn't you? I assume so.
{quote}
 

You could, but you could also return a short page. In any case node-crossings 
are rare, so it doesn't matter much.

 
{quote}Doesn't the driver already have an option to choose the next host in the 
ring if it determines the paging state is beyond the original coordinator's 
boundary?
{quote}
 

paging_state is opaque; the driver cannot interpret it.

 

> Allow Token-Aware drivers for range scans
> -
>
> Key: CASSANDRA-14311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14311
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Avi Kivity
>Priority: Major
>
> Currently, range scans are not token aware. This means that an extra hop is 
> needed for most requests. Since range scans are usually data intensive, this 
> causes significant extra traffic.
>  
> Token awareness could be enabled by having the coordinator return the token 
> for the next (still unread) row in the response, so the driver can select a 
> next coordinator that owns this row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (CASSANDRA-14311) Allow Token-Aware drivers for range scans

2018-03-13 Thread Avi Kivity (JIRA)
Avi Kivity created CASSANDRA-14311:
--

 Summary: Allow Token-Aware drivers for range scans
 Key: CASSANDRA-14311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14311
 Project: Cassandra
  Issue Type: Improvement
  Components: Coordination
Reporter: Avi Kivity


Currently, range scans are not token aware. This means that an extra hop is 
needed for most requests. Since range scans are usually data intensive, this 
causes significant extra traffic.

 

Token awareness could be enabled by having the coordinator return the token for 
the next (still unread) row in the response, so the driver can select a next 
coordinator that owns this row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (CASSANDRA-13550) Partition-level isolation of batch writes almost impossible to use

2017-05-24 Thread Avi Kivity (JIRA)
Avi Kivity created CASSANDRA-13550:
--

 Summary: Partition-level isolation of batch writes almost 
impossible to use
 Key: CASSANDRA-13550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13550
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation and Website
Reporter: Avi Kivity


The documentation for the {{BATCH}} statement states:

bq. All updates in a @BATCH@ belonging to a given partition key are performed 
in isolation.

However, it is almost impossible to make use of this guarantee while reading:
 - if paging is enabled, then the server may insert a page boundary at any 
point; and there is no isolation among different pages of a single query
 - repairs can cause individual rows within a partition to be repaired, even if 
they belonged to a larger isolated update
 - in Cassandra 3 and above, reconciliation is incremental, again breaking 
isolation
 - if rows are overwritten, then even row-level isolation is not guaranteed 
(the old timestamp collision problem)

While it's possible to write an application that makes use of the existing 
isolation guarantees, it is very hard, and will likely impose constraints on 
the application (paging must be disabled and overwrites and row-level deletes 
never used).

The rules should be clarified here and relaxed; relaxation could allow a more 
efficient implementation (in Scylla we went full MVCC) and prevent users from 
relying on a guarantee that is not in fact provided.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)




[jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

2016-07-14 Thread Avi Kivity (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376513#comment-15376513
 ] 

Avi Kivity commented on CASSANDRA-9318:
---

FWIW this is exactly what we do.  It's not perfect but it seems to work.

> Bound the number of in-flight requests at the coordinator
> -
>
> Key: CASSANDRA-9318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths, Streaming and Messaging
>Reporter: Ariel Weisberg
>Assignee: Sergio Bossa
> Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, 
> limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
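
The watermark scheme described above can be sketched in a few lines of Python. This is only an illustration of the hysteresis, with hypothetical names; it tracks bytes only (a request count could be tracked the same way) and leaves out the actual socket handling.

```python
class InflightLimiter:
    """Pause client reads at a high watermark, resume below a low watermark."""

    def __init__(self, high_bytes: int, low_bytes: int):
        if low_bytes > high_bytes:
            raise ValueError("low watermark must not exceed high watermark")
        self.high_bytes = high_bytes
        self.low_bytes = low_bytes
        self.inflight_bytes = 0
        self.reads_enabled = True

    def on_request(self, size: int) -> None:
        # A request was accepted from a client connection.
        self.inflight_bytes += size
        if self.inflight_bytes >= self.high_bytes:
            self.reads_enabled = False  # stop reading from client sockets

    def on_response(self, size: int) -> None:
        # A request completed and its bytes are no longer outstanding.
        self.inflight_bytes -= size
        if not self.reads_enabled and self.inflight_bytes < self.low_bytes:
            self.reads_enabled = True   # resume reading from client sockets
```

Using two watermarks rather than one avoids toggling reads on and off for every request when the load hovers around the limit.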



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)