[jira] [Commented] (CASSANDRA-16550) Improve LICENSE/NOTICE compliance with ASF guidance

2021-03-30 Thread Ben Bromhead (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312078#comment-17312078
 ] 

Ben Bromhead commented on CASSANDRA-16550:
--

Thanks [~Anthony Grasso], addressed. 

I think this can get merged and I can address [~jmclean]'s completeness 
comments in a separate ticket.

Otherwise, if the ticket is still open tomorrow, I'll update here.

> Improve LICENSE/NOTICE compliance with ASF guidance 
> 
>
> Key: CASSANDRA-16550
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16550
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Normal
>
> PRs on GitHub:
> [https://github.com/apache/cassandra/pull/943]
> [https://github.com/apache/cassandra/pull/944] 
>  
> A number of issues were identified with our LICENSE.txt and NOTICE.txt files 
> (https://lists.apache.org/thread.html/r66496e495c96efeb31c6531eb748ec739bfb734d5c115077d925ebac%40%3Cdev.cassandra.apache.org%3E),
>  specifically related to identifying bundled sources and their respective 
> licenses in accordance with ASF guidance 
> ([https://infra.apache.org/licensing-howto.html]).
>  
> *LICENSE.txt*
> We don't specifically identify the licenses of a number of bundled components 
> included with the source distro of Apache Cassandra in our LICENSE file, as 
> required by the ASF guidance ([https://infra.apache.org/licensing-howto.html]). 
> Specifically:
>  # src/java/org/apache/cassandra/index/sasi/utils/AbstractIterator.java
>  # src/java/org/apache/cassandra/utils/LongTimSort.java
>  # src/java/org/apache/cassandra/index/sasi/utils/trie/Cursor.java
>  # test/resources/tokenization/adventures_of_huckleberry_finn_mark_twain.txt
>  # content in doc/source/data_modeling/
> Note: src/java/org/apache/cassandra/utils/vint/VIntCoding.java makes 
> reference to borrowing ideas from Google Protocol Buffers.
> I'm not sure whether this refers to code, to concepts, or to the documentation 
> of those concepts. I've included it, as it is a compatible license, to be on 
> the safe side.
> I've also removed the reference to the lib/ folder, as this license (as I 
> understand it) currently applies to the source release rather than to the 
> convenience binaries.
>  
> *NOTICE.txt*
> Removing references to dependencies that are not bundled (e.g. pulled in 
> dynamically).
> The bundled dependency src/java/org/apache/cassandra/utils/LongTimSort.java 
> uses ALv2 but is not owned by the ASF, so attribution is provided.
>  






[jira] [Updated] (CASSANDRA-16550) Improve LICENSE/NOTICE compliance with ASF guidance

2021-03-30 Thread Ben Bromhead (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead updated CASSANDRA-16550:
-
Description: 
PRs on GitHub:

[https://github.com/apache/cassandra/pull/943]

[https://github.com/apache/cassandra/pull/944] 

 

A number of issues were identified with our LICENSE.txt and NOTICE.txt files 
(https://lists.apache.org/thread.html/r66496e495c96efeb31c6531eb748ec739bfb734d5c115077d925ebac%40%3Cdev.cassandra.apache.org%3E),
 specifically related to identifying bundled sources and their respective 
licenses in accordance with ASF guidance 
([https://infra.apache.org/licensing-howto.html]).

 

*LICENSE.txt*

We don't specifically identify the licenses of a number of bundled components 
included with the source distro of Apache Cassandra in our LICENSE file, as 
required by the ASF guidance ([https://infra.apache.org/licensing-howto.html]). Specifically:
 # src/java/org/apache/cassandra/index/sasi/utils/AbstractIterator.java
 # src/java/org/apache/cassandra/utils/LongTimSort.java
 # src/java/org/apache/cassandra/index/sasi/utils/trie/Cursor.java
 # test/resources/tokenization/adventures_of_huckleberry_finn_mark_twain.txt
 # content in doc/source/data_modeling/

Note: src/java/org/apache/cassandra/utils/vint/VIntCoding.java makes reference 
to borrowing ideas from Google Protocol Buffers.

I'm not sure whether this refers to code, to concepts, or to the documentation 
of those concepts. I've included it, as it is a compatible license, to be on 
the safe side.

I've also removed the reference to the lib/ folder, as this license (as I 
understand it) currently applies to the source release rather than to the 
convenience binaries.

 

*NOTICE.txt*

Removing references to dependencies that are not bundled (e.g. pulled in 
dynamically).

The bundled dependency src/java/org/apache/cassandra/utils/LongTimSort.java 
uses ALv2 but is not owned by the ASF, so attribution is provided.

 

  was:
PRs on GitHub:

[https://github.com/apache/cassandra/pull/943]

[https://github.com/apache/cassandra/pull/944] 

 

A number of issues were identified with our LICENSE.txt and NOTICE.txt files, 
specifically related to identifying bundled sources and their respective licenses 
in accordance with ASF guidance 
([https://infra.apache.org/licensing-howto.html]).

 

*LICENSE.txt*

We don't specifically identify the licenses of a number of bundled components 
included with the source distro of Apache Cassandra in our LICENSE file, as 
required by the ASF guidance ([https://infra.apache.org/licensing-howto.html]). Specifically:
 # src/java/org/apache/cassandra/index/sasi/utils/AbstractIterator.java
 # src/java/org/apache/cassandra/utils/LongTimSort.java
 # src/java/org/apache/cassandra/index/sasi/utils/trie/Cursor.java
 # test/resources/tokenization/adventures_of_huckleberry_finn_mark_twain.txt
 # content in doc/source/data_modeling/

Note: src/java/org/apache/cassandra/utils/vint/VIntCoding.java makes reference 
to borrowing ideas from Google Protocol Buffers.

I'm not sure whether this refers to code, to concepts, or to the documentation 
of those concepts. I've included it, as it is a compatible license, to be on 
the safe side.

I've also removed the reference to the lib/ folder, as this license (as I 
understand it) currently applies to the source release rather than to the 
convenience binaries.

 

*NOTICE.txt*

Removing references to dependencies that are not bundled (e.g. pulled in 
dynamically).

The bundled dependency src/java/org/apache/cassandra/utils/LongTimSort.java 
uses ALv2 but is not owned by the ASF, so attribution is provided.

 


> Improve LICENSE/NOTICE compliance with ASF guidance 
> 
>
> Key: CASSANDRA-16550
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16550
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Normal
>
> PRs on GitHub:
> [https://github.com/apache/cassandra/pull/943]
> [https://github.com/apache/cassandra/pull/944] 
>  
> A number of issues were identified with our LICENSE.txt and NOTICE.txt files 
> (https://lists.apache.org/thread.html/r66496e495c96efeb31c6531eb748ec739bfb734d5c115077d925ebac%40%3Cdev.cassandra.apache.org%3E),
>  specifically related to identifying bundled sources and their respective 
> licenses in accordance with ASF guidance 
> ([https://infra.apache.org/licensing-howto.html]).
>  
> *LICENSE.txt*
> We don't specifically identify the licenses of a number of bundled components 
> included with the source distro of Apache Cassandra in our LICENSE file, as 
> required by the ASF guidance ([https://infra.apache.org/licensing-howto.html]). 
> Specifically:
>  # src/java/org/apache/cassandra/index/sasi/utils/AbstractIterator.java
>  # src/java/org/apache/cassandra/utils/LongTimSort.java
>  # 

[jira] [Created] (CASSANDRA-16550) Improve LICENSE/NOTICE compliance with ASF guidance

2021-03-30 Thread Ben Bromhead (Jira)
Ben Bromhead created CASSANDRA-16550:


 Summary: Improve LICENSE/NOTICE compliance with ASF guidance 
 Key: CASSANDRA-16550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16550
 Project: Cassandra
  Issue Type: Bug
Reporter: Ben Bromhead
Assignee: Ben Bromhead


PRs on GitHub:

[https://github.com/apache/cassandra/pull/943]

[https://github.com/apache/cassandra/pull/944] 

 

A number of issues were identified with our LICENSE.txt and NOTICE.txt files, 
specifically related to identifying bundled sources and their respective licenses 
in accordance with ASF guidance 
([https://infra.apache.org/licensing-howto.html]).

 

*LICENSE.txt*

We don't specifically identify the licenses of a number of bundled components 
included with the source distro of Apache Cassandra in our LICENSE file, as 
required by the ASF guidance ([https://infra.apache.org/licensing-howto.html]). Specifically:
 # src/java/org/apache/cassandra/index/sasi/utils/AbstractIterator.java
 # src/java/org/apache/cassandra/utils/LongTimSort.java
 # src/java/org/apache/cassandra/index/sasi/utils/trie/Cursor.java
 # test/resources/tokenization/adventures_of_huckleberry_finn_mark_twain.txt
 # content in doc/source/data_modeling/

Note: src/java/org/apache/cassandra/utils/vint/VIntCoding.java makes reference 
to borrowing ideas from Google Protocol Buffers.

I'm not sure whether this refers to code, to concepts, or to the documentation 
of those concepts. I've included it, as it is a compatible license, to be on 
the safe side.

I've also removed the reference to the lib/ folder, as this license (as I 
understand it) currently applies to the source release rather than to the 
convenience binaries.

 

*NOTICE.txt*

Removing references to dependencies that are not bundled (e.g. pulled in 
dynamically).

The bundled dependency src/java/org/apache/cassandra/utils/LongTimSort.java 
uses ALv2 but is not owned by the ASF, so attribution is provided.

 






[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2020-10-12 Thread Ben Bromhead (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212789#comment-17212789
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

Ah that will teach me for blindly clicking accept on the patch notes in GitHub 
:P

I've reverted to the old behavior for `getAllByNameOverrideDefaults`, as the 
subsequent call to `getByAddressOverrideDefaults` will actually do the null 
check and set the default for us.

I've also included a new boolean conf value to revert to the old behavior. I 
wasn't 100% sure on naming conventions, so let me know.
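
For illustration, a minimal sketch of what such a boolean switch could look like; the flag name and wiring below are assumptions, not the option actually added in the PR:

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: the flag name and wiring are assumptions,
// not the actual Cassandra config option added by the patch.
public final class SeedResolution
{
    // e.g. backed by a cassandra.yaml option (name here is made up)
    public static volatile boolean resolveMultipleIpsPerDnsRecord = true;

    public static List<InetAddress> resolve(String host) throws UnknownHostException
    {
        return resolveMultipleIpsPerDnsRecord
               ? Arrays.asList(InetAddress.getAllByName(host))           // new behavior: every record
               : Collections.singletonList(InetAddress.getByName(host)); // old behavior: first record only
    }
}
{code}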

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Low
> Fix For: 4.0.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.
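
The description above boils down to a small resolution change; as a rough illustration only (not the actual SimpleSeedProvider code), resolving every address behind each configured seed entry looks roughly like this:

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative helper mirroring the getByName -> getAllByName change;
// not the real SimpleSeedProvider implementation.
public final class DnsSeedResolver
{
    public static List<InetAddress> resolve(String commaSeparatedSeeds)
    {
        List<InetAddress> seeds = new ArrayList<>();
        for (String host : commaSeparatedSeeds.split(","))
        {
            try
            {
                // getAllByName returns every address record for the name,
                // whereas getByName returns only the first one.
                seeds.addAll(Arrays.asList(InetAddress.getAllByName(host.trim())));
            }
            catch (UnknownHostException e)
            {
                // Unresolvable entries are skipped here; a real provider would
                // presumably log a warning instead.
            }
        }
        return seeds;
    }
}
{code}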






[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2020-10-11 Thread Ben Bromhead (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212119#comment-17212119
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

Updated the PR to address comments and nits!

Added a note in NEWS.txt as well.

Yeah, I don't think we need a flag to revert to the old behaviour, given this 
lands in a new major version.

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Low
> Fix For: 4.0.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.






[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2020-05-25 Thread Ben Bromhead (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116318#comment-17116318
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

Rebased (I think; my git-fu is weak).

I'll look to move the threshold property into cassandra.yaml as well.

I'm not sure if we want to revert to the old behaviour, as the old behaviour is 
probably less deterministic than people realise.

For example, if you get multiple entries in an A record for a single query, most 
Java implementations will maintain the order of those records, but some OSs will 
do ordering differently (e.g. Windows Vista will order records based on which 
address shares the largest number of significant bits with the network adapter's 
IP??... which broke a lot of round-robin implementations back in the day).

On top of this, the DNS resolver also doesn't provide ordering guarantees, and 
most will by default round-robin the order of the entries in an A record. Plus, 
this is all ignoring DNS caching behaviour, which, on the other hand, is 
probably masking a lot of the above.

So any reliance on old behavior is potentially non-deterministic anyway!

Plus it's 4.0, and I would guess the majority of people are using IP addresses 
or single-entry A records anyway?
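
To make this concrete, a tiny sketch (the hostname is a placeholder) of why "the first address returned" is not a stable thing to rely on:

{code:java}
import java.net.InetAddress;
import java.util.Arrays;

// The set of addresses behind a name is stable, but their order depends on the
// resolver/OS and may rotate between lookups, so relying on the first entry is
// already non-deterministic. "seeds.example.com" is a placeholder.
public class ResolverOrderDemo
{
    public static void main(String[] args) throws Exception
    {
        System.out.println(Arrays.toString(InetAddress.getAllByName("seeds.example.com")));
        System.out.println(Arrays.toString(InetAddress.getAllByName("seeds.example.com")));
        // With JVM DNS caching enabled, both lines will usually match; across
        // processes or after the cache TTL expires, the order can differ.
    }
}
{code}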

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Low
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.






[jira] [Comment Edited] (CASSANDRA-10789) Allow DBAs to kill individual client sessions without bouncing JVM

2018-05-08 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467993#comment-16467993
 ] 

Ben Bromhead edited comment on CASSANDRA-10789 at 5/8/18 9:34 PM:
--

I agree with your point on having it query-able (e.g. I would advocate doing 
this via JMX, again to keep things simple).

CASSANDRA-13985 can do similar things, but at the user level. I wish 
applications that use Cassandra would ensure each client instance had an 
individual set of user credentials and rotated secrets etc., but based on our 
experience this is not normally the case.

The current suite of tools and admin commands that Cassandra supports pushes 
this kind of coordination to external tools, and I'm not sure it's worth 
waiting for internal management of cluster-wide commands unless they are just 
around the corner (which would be awesome).


was (Author: benbromhead):
I agree with your point on having it query-able (e.g. I would advocate doing 
this via JMX, again to keep things simple).

CASSANDRA-13985 can do similar things, but at the user level. I wish each 
client instance had an individual user and rotated credentials, but this is not 
normally the case.

The current suite of tools and admin commands that Cassandra supports pushes 
this kind of coordination to external tools, and I'm not sure it's worth 
waiting for internal management of cluster-wide commands unless they are just 
around the corner (which would be awesome).

> Allow DBAs to kill individual client sessions without bouncing JVM
> --
>
> Key: CASSANDRA-10789
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10789
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Wei Deng
>Assignee: Damien Stevenson
>Priority: Major
> Fix For: 4.x
>
> Attachments: 10789-trunk-dtest.txt, 10789-trunk.txt
>
>
> In production, there could be hundreds of clients connected to a Cassandra 
> cluster (maybe even from different applications), and if they use DataStax 
> Java Driver, each client will establish at least one TCP connection to a 
> Cassandra server (see 
> https://datastax.github.io/java-driver/2.1.9/features/pooling/). This is all 
> normal and at any given time, you can indeed see hundreds of ESTABLISHED 
> connections to port 9042 on a C* server (from netstat -na). The problem is 
> that sometimes, when a C* cluster is under heavy load and the DBA identifies 
> some client session that sends an abusive amount of traffic to the C* server 
> and would like to stop it, they would like a lightweight approach rather than 
> shutting down the JVM or rolling-restarting the whole cluster to kill all 
> hundreds of connections in order to kill a single client session. If the DBA 
> had root privilege, they would have been able to do something at the OS 
> network level to achieve the same goal but oftentimes enterprise DBA role is 
> separate from OS sysadmin role, so the DBAs usually don't have that privilege.
> This is especially helpful when you have a multi-tenant C* cluster and you 
> want the impact of handling such a client to be minimal for the other 
> applications. This feature (killing an individual session) seems to be a common 
> feature in other databases (regardless of whether the client has some 
> reconnect logic or not). It could be implemented as a JMX MBean method and 
> exposed through nodetool to the DBAs.
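
As a sketch of the JMX approach mentioned in the last sentence of the description (the interface and method names here are hypothetical, not the API of the attached patch):

{code:java}
import java.util.List;

// Hypothetical shape of a JMX-exposed "kill client session" operation; names are
// illustrative only, not taken from 10789-trunk.txt.
public interface ClientSessionMBean
{
    /** List currently connected native-protocol clients, e.g. "10.0.0.5:51234". */
    List<String> listConnections();

    /** Close every connection originating from the given client address. */
    void disconnectClient(String clientAddress);
}
{code}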






[jira] [Commented] (CASSANDRA-10789) Allow DBAs to kill individual client sessions without bouncing JVM

2018-05-08 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467993#comment-16467993
 ] 

Ben Bromhead commented on CASSANDRA-10789:
--

I agree with your point on having it query-able (e.g. I would advocate doing 
this via JMX, again to keep things simple).

CASSANDRA-13985 can do similar things, but at the user level. I wish each 
client instance had an individual user and rotated credentials, but this is not 
normally the case.

The current suite of tools and admin commands that Cassandra supports pushes 
this kind of coordination to external tools, and I'm not sure it's worth 
waiting for internal management of cluster-wide commands unless they are just 
around the corner (which would be awesome).

> Allow DBAs to kill individual client sessions without bouncing JVM
> --
>
> Key: CASSANDRA-10789
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10789
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Wei Deng
>Assignee: Damien Stevenson
>Priority: Major
> Fix For: 4.x
>
> Attachments: 10789-trunk-dtest.txt, 10789-trunk.txt
>
>
> In production, there could be hundreds of clients connected to a Cassandra 
> cluster (maybe even from different applications), and if they use DataStax 
> Java Driver, each client will establish at least one TCP connection to a 
> Cassandra server (see 
> https://datastax.github.io/java-driver/2.1.9/features/pooling/). This is all 
> normal and at any given time, you can indeed see hundreds of ESTABLISHED 
> connections to port 9042 on a C* server (from netstat -na). The problem is 
> that sometimes, when a C* cluster is under heavy load and the DBA identifies 
> some client session that sends an abusive amount of traffic to the C* server 
> and would like to stop it, they would like a lightweight approach rather than 
> shutting down the JVM or rolling-restarting the whole cluster to kill all 
> hundreds of connections in order to kill a single client session. If the DBA 
> had root privilege, they would have been able to do something at the OS 
> network level to achieve the same goal but oftentimes enterprise DBA role is 
> separate from OS sysadmin role, so the DBAs usually don't have that privilege.
> This is especially helpful when you have a multi-tenant C* cluster and you 
> want the impact of handling such a client to be minimal for the other 
> applications. This feature (killing an individual session) seems to be a common 
> feature in other databases (regardless of whether the client has some 
> reconnect logic or not). It could be implemented as a JMX MBean method and 
> exposed through nodetool to the DBAs.






[jira] [Comment Edited] (CASSANDRA-10789) Allow DBAs to kill individual client sessions without bouncing JVM

2018-05-08 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467541#comment-16467541
 ] 

Ben Bromhead edited comment on CASSANDRA-10789 at 5/8/18 3:15 PM:
--

[~aweisberg] - Given we allow per-node management of non-persistent settings 
via nodetool (e.g. compactionthroughput etc.) and we might not want to blacklist 
clients on all nodes (e.g. only in specific DCs), I think doing this at a 
per-node level makes sense.

If it becomes a burden from an operations perspective, then we can improve on 
the underlying work. I'd love to see this land without having to wait on an 
underlying in-Cassandra coordination mechanism.

[~spo...@gmail.com] - Throttling and connection limiting solve a different 
problem.
 * Throttling/Limiting = I want to enforce good behavior on my clients.
 * Blocking = I don't want a client to connect to this node, no matter how well 
behaved it is.


was (Author: benbromhead):
[~aweisberg]

Given we allow per-node management of non-persistent settings via nodetool 
(e.g. compactionthroughput etc.) and we might not want to blacklist clients on 
all nodes (e.g. only in specific DCs), I think doing this at a per-node level 
makes sense.

If it becomes a burden from an operations perspective, then we can improve on 
the underlying work.

[~spo...@gmail.com] - Throttling and connection limiting solve a different 
problem.
 * Throttling/Limiting = I want to enforce good behavior on my clients.
 * Blocking = I don't want a client to connect to this node, no matter how well 
behaved it is.

> Allow DBAs to kill individual client sessions without bouncing JVM
> --
>
> Key: CASSANDRA-10789
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10789
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Wei Deng
>Assignee: Damien Stevenson
>Priority: Major
> Fix For: 4.x
>
> Attachments: 10789-trunk-dtest.txt, 10789-trunk.txt
>
>
> In production, there could be hundreds of clients connected to a Cassandra 
> cluster (maybe even from different applications), and if they use DataStax 
> Java Driver, each client will establish at least one TCP connection to a 
> Cassandra server (see 
> https://datastax.github.io/java-driver/2.1.9/features/pooling/). This is all 
> normal and at any given time, you can indeed see hundreds of ESTABLISHED 
> connections to port 9042 on a C* server (from netstat -na). The problem is 
> that sometimes, when a C* cluster is under heavy load and the DBA identifies 
> some client session that sends an abusive amount of traffic to the C* server 
> and would like to stop it, they would like a lightweight approach rather than 
> shutting down the JVM or rolling-restarting the whole cluster to kill all 
> hundreds of connections in order to kill a single client session. If the DBA 
> had root privilege, they would have been able to do something at the OS 
> network level to achieve the same goal but oftentimes enterprise DBA role is 
> separate from OS sysadmin role, so the DBAs usually don't have that privilege.
> This is especially helpful when you have a multi-tenant C* cluster and you 
> want the impact of handling such a client to be minimal for the other 
> applications. This feature (killing an individual session) seems to be a common 
> feature in other databases (regardless of whether the client has some 
> reconnect logic or not). It could be implemented as a JMX MBean method and 
> exposed through nodetool to the DBAs.






[jira] [Commented] (CASSANDRA-10789) Allow DBAs to kill individual client sessions without bouncing JVM

2018-05-08 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467541#comment-16467541
 ] 

Ben Bromhead commented on CASSANDRA-10789:
--

[~aweisberg]

Given we allow per-node management of non-persistent settings via nodetool 
(e.g. compactionthroughput etc.) and we might not want to blacklist clients on 
all nodes (e.g. only in specific DCs), I think doing this at a per-node level 
makes sense.

If it becomes a burden from an operations perspective, then we can improve on 
the underlying work.

[~spo...@gmail.com] - Throttling and connection limiting solve a different 
problem.
 * Throttling/Limiting = I want to enforce good behavior on my clients.
 * Blocking = I don't want a client to connect to this node, no matter how well 
behaved it is.

> Allow DBAs to kill individual client sessions without bouncing JVM
> --
>
> Key: CASSANDRA-10789
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10789
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Wei Deng
>Assignee: Damien Stevenson
>Priority: Major
> Fix For: 4.x
>
> Attachments: 10789-trunk-dtest.txt, 10789-trunk.txt
>
>
> In production, there could be hundreds of clients connected to a Cassandra 
> cluster (maybe even from different applications), and if they use DataStax 
> Java Driver, each client will establish at least one TCP connection to a 
> Cassandra server (see 
> https://datastax.github.io/java-driver/2.1.9/features/pooling/). This is all 
> normal and at any given time, you can indeed see hundreds of ESTABLISHED 
> connections to port 9042 on a C* server (from netstat -na). The problem is 
> that sometimes, when a C* cluster is under heavy load and the DBA identifies 
> some client session that sends an abusive amount of traffic to the C* server 
> and would like to stop it, they would like a lightweight approach rather than 
> shutting down the JVM or rolling-restarting the whole cluster to kill all 
> hundreds of connections in order to kill a single client session. If the DBA 
> had root privilege, they would have been able to do something at the OS 
> network level to achieve the same goal but oftentimes enterprise DBA role is 
> separate from OS sysadmin role, so the DBAs usually don't have that privilege.
> This is especially helpful when you have a multi-tenant C* cluster and you 
> want the impact of handling such a client to be minimal for the other 
> applications. This feature (killing an individual session) seems to be a common 
> feature in other databases (regardless of whether the client has some 
> reconnect logic or not). It could be implemented as a JMX MBean method and 
> exposed through nodetool to the DBAs.






[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-09 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431361#comment-16431361
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

Updated the branch to include additional logging and a warning for seed lists with >= 20 entries.

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.






[jira] [Comment Edited] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425809#comment-16425809
 ] 

Ben Bromhead edited comment on CASSANDRA-14361 at 4/4/18 4:30 PM:
--

I would argue that you need to be intentional about your DNS records, but I'm 
willing to concede that it moves config further away from the system, which can 
hide the reason for its behavior.

Would it be worth adding some info/debug-level logging around which IPs get 
resolved each time getSeeds() gets called?

At least that way the behavior of what gets looked up is observable, and the 
answer to any potential WTFs is in the logs.


was (Author: benbromhead):
I would argue you need to be intentional about your DNS records, but I'm 
willing to concede that it moves config further away from the system, which can 
hide the reason for its behavior.

Would it be worth adding some info/debug-level logging around which IPs get 
resolved each time getSeeds() gets called?

At least that way the behavior of what gets looked up is observable, and the 
answer to any potential WTFs is in the logs.

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.






[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425809#comment-16425809
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

I would argue you need to be intentional about your DNS records, but I'm 
willing to concede that it moves config further away from the system, which can 
hide the reason for its behavior.

Would it be worth adding some info/debug-level logging around which IPs get 
resolved each time getSeeds() gets called?

At least that way the behavior of what gets looked up is observable, and the 
answer to any potential WTFs is in the logs.
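
A sketch of the suggested logging (names are illustrative, slf4j-style logging assumed):

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: record which addresses each seed name resolved to on every lookup.
// Class and method names are illustrative, not from an actual patch.
public final class LoggingSeedResolution
{
    private static final Logger logger = LoggerFactory.getLogger(LoggingSeedResolution.class);

    public static List<InetAddress> resolveAndLog(String host) throws UnknownHostException
    {
        List<InetAddress> resolved = Arrays.asList(InetAddress.getAllByName(host));
        logger.info("Seed entry {} resolved to {}", host, resolved);
        return resolved;
    }
}
{code}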

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.






[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425713#comment-16425713
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

Currently, nothing stops users from adding as many seeds as they want with the 
current implementation, so this change doesn't make the situation any worse.

I do agree with you that gossip convergence based on the information provided 
in the seed list needs to be further explored, but I don't think this ticket is 
the right place to address it.

I would also argue that the seed provider's only job is to act as an external 
oracle about membership, independent of gossip. Based on the current interface 
contract, the seed provider should be able to return 1 or 1000 seeds, and 
whatever consumes that seed list needs to make the correct decision about how 
to use that list.

However, that's just my opinion, and one I'm not inclined to argue that 
vigorously :)
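
In sketch form, that contract amounts to no more than the following (an illustration, not the SeedProvider interface copied from the tree):

{code:java}
import java.net.InetAddress;
import java.util.List;

// An external membership oracle, independent of gossip: it may return 1 or 1000
// seeds, and whatever consumes the list decides how many to actually use.
// Illustrative shape only.
public interface ExternalSeedOracle
{
    List<InetAddress> getSeeds();
}
{code}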

 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.






[jira] [Comment Edited] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425647#comment-16425647
 ] 

Ben Bromhead edited comment on CASSANDRA-14361 at 4/4/18 3:15 PM:
--

{quote}JVM property {{networkaddress.cache.ttl}} must be set otherwise 
operators will have to do a rolling restart of the cluster each time the seed 
list changes (unless default is not {{-1}} on their platforms).
{quote}
Caching behavior remains the same, given operators relying on hostnames in the 
seed list would need to be aware of this anyway, whether it looks up one IP or 
multiple. Also, the JVM will only cache forever by default if a security manager 
is installed. If a security manager is not installed, each specific JVM 
implementation can have a potentially different TTL, but a defined TTL 
nonetheless. See 
[https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html].
{quote}This would also affect the third gossip round: 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L185]
{quote}
Ah, of course.

If you are already using hostnames in the seed list, the behavior would still 
be the same, other than your list of seeds potentially containing seed IPs you 
might not expect (if you are returning multiple IPs per A record but not 
relying on that behavior), resulting in potentially more gossip rounds to hit 
the IP you used to expect?

I've marked this as a 4.0 patch, so I think a change in behavior would be fine. 
I've created another ticket 
https://issues.apache.org/jira/browse/CASSANDRA-14364 to document seed behavior 
in relation to the third gossip round and DNS (including caching) behavior.


was (Author: benbromhead):
{quote}JVM property {{networkaddress.cache.ttl}} must be set otherwise 
operators will have to do a rolling restart of the cluster each time the seed 
list changes (unless default is not {{-1}} on their platforms).
{quote}
Caching behavior remains the same, given operators relying on hostnames in the 
seed list would need to be aware of this anyway, whether it looks up one IP or 
multiple. Also, the JVM will only cache forever by default if a security manager 
is installed. If a security manager is not installed, each specific JVM 
implementation can have a potentially different TTL, but a defined TTL 
nonetheless. See 
[https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html].
{quote}This would also affect the third gossip round: 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L185]
{quote}
Ah, of course.

If you are already using hostnames in the seed list, the behavior would still 
be the same, other than your list of seeds potentially containing seed IPs you 
might not expect (if you are returning multiple IPs per A record but not 
relying on that behavior), resulting in potentially more gossip rounds to hit 
the IP you used to expect?

I've marked this as a 4.0 patch, so I think a change in behavior would be fine. 
I've created another ticket 
https://issues.apache.org/jira/browse/CASSANDRA-14364 to document seed behavior 
in relation to the third gossip round and DNS (including caching) behavior.

 

 

 

 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.





[jira] [Commented] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425647#comment-16425647
 ] 

Ben Bromhead commented on CASSANDRA-14361:
--

{quote}

JVM property networkaddress.cache.ttl must be set otherwise operators 
will have to do a rolling restart of the cluster each time the seed list 
changes (unless default is not -1 on their platforms).

{quote}

Caching behavior remains the same, given operators relying on hostnames in the 
seed list would need to be aware of this anyway, whether it looks up one IP or 
multiple. Also, the JVM will only cache forever by default if a security manager 
is installed. If a security manager is not installed, each specific JVM 
implementation can have a potentially different TTL, but a defined TTL 
nonetheless. See 
[https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html].

{quote}

This would also affect the third gossip round: 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L185]

{quote}

Ah, of course.

If you are already using hostnames in the seed list, the behavior would still 
be the same, other than your list of seeds potentially containing seed IPs you 
might not expect (if you are returning multiple IPs per A record but not 
relying on that behavior), resulting in potentially more gossip rounds to hit 
the IP you used to expect?

I've marked this as a 4.0 patch, so I think a change in behavior would be fine. 
I've created another ticket 
https://issues.apache.org/jira/browse/CASSANDRA-14364 to document seed behavior 
in relation to the third gossip round and DNS (including caching) behavior.
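
For reference, a minimal example of controlling that TTL; networkaddress.cache.ttl is a JVM security property, so it is normally set in the JRE's java.security file or programmatically before any lookups happen. The 30/10-second values below are arbitrary examples:

{code:java}
import java.security.Security;

// Example only: the TTL values are arbitrary. These properties must be set before
// the first name lookup to take effect for the whole JVM.
public class DnsCacheTtlExample
{
    public static void main(String[] args)
    {
        Security.setProperty("networkaddress.cache.ttl", "30");          // cache successful lookups for 30s
        Security.setProperty("networkaddress.cache.negative.ttl", "10"); // cache failed lookups for 10s
    }
}
{code}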

 

 

 

 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.






[jira] [Comment Edited] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-04 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425647#comment-16425647
 ] 

Ben Bromhead edited comment on CASSANDRA-14361 at 4/4/18 2:52 PM:
--

{quote}JVM property {{networkaddress.cache.ttl}} must be set otherwise 
operators will have to do a rolling restart of the cluster each time the seed 
list changes (unless default is not {{-1}} on their platforms).
{quote}
Caching behavior remains the same, given operators relying on hostnames in the 
seed list would need to be aware of this anyway, whether it looks up one IP or 
multiple. Also, the JVM will only cache forever by default if a security manager 
is installed. If a security manager is not installed, each specific JVM 
implementation can have a potentially different TTL, but a defined TTL 
nonetheless. See 
[https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html].
{quote}This would also affect the third gossip round: 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L185]
{quote}
Ah, of course.

If you are already using hostnames in the seed list, the behavior would still 
be the same, other than your list of seeds potentially containing seed IPs you 
might not expect (if you are returning multiple IPs per A record but not 
relying on that behavior), resulting in potentially more gossip rounds to hit 
the IP you used to expect?

I've marked this as a 4.0 patch, so I think a change in behavior would be fine. 
I've created another ticket 
https://issues.apache.org/jira/browse/CASSANDRA-14364 to document seed behavior 
in relation to the third gossip round and DNS (including caching) behavior.

 

 

 

 


was (Author: benbromhead):
{quote}

JVM property networkaddress.cache.ttl must be set otherwise operators 
will have to do a rolling restart of the cluster each time the seed list 
changes (unless default is not -1 on their platforms).

{quote}

Caching behavior remains the same, given operators relying on hostnames in the 
seed list would need to be aware of this anyway, whether it looks up one IP or 
multiple. Also, the JVM will only cache forever by default if a security manager 
is installed. If a security manager is not installed, each specific JVM 
implementation can have a potentially different TTL, but a defined TTL 
nonetheless. See 
[https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html].

{quote}

This would also affect the third gossip round: 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L185]

{quote}

Ah, of course.

If you are already using hostnames in the seed list, the behavior would still 
be the same, other than your list of seeds potentially containing seed IPs you 
might not expect (if you are returning multiple IPs per A record but not 
relying on that behavior), resulting in potentially more gossip rounds to hit 
the IP you used to expect?

I've marked this as a 4.0 patch, so I think a change in behavior would be fine. 
I've created another ticket 
https://issues.apache.org/jira/browse/CASSANDRA-14364 to document seed behavior 
in relation to the third gossip round and DNS (including caching) behavior.

 

 

 

 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, making seed 
> discovery by DNS a little easier.
> Some examples of improved workflows with this change include: 
>  * specify the DNS name of a headless service in Kubernetes which will 
> resolve to all IP addresses of pods within that service. 
>  * seed discovery for multi-region clusters via AWS route53, AzureDNS etc
>  * Other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.




[jira] [Created] (CASSANDRA-14364) Update Seed provider documentation

2018-04-04 Thread Ben Bromhead (JIRA)
Ben Bromhead created CASSANDRA-14364:


 Summary: Update Seed provider documentation
 Key: CASSANDRA-14364
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14364
 Project: Cassandra
  Issue Type: Improvement
  Components: Documentation and Website
Reporter: Ben Bromhead


Update documentation to describe how Cassandra uses the seed list. Include 
details about nuances of using DNS hostnames vs IP addresses. 

 

[http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html#seed-provider]
 






[jira] [Commented] (CASSANDRA-12106) Add ability to blacklist a CQL partition so all requests are ignored

2018-04-02 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422742#comment-16422742
 ] 

Ben Bromhead commented on CASSANDRA-12106:
--

+1 to just setting it at the single node level. 

> Add ability to blacklist a CQL partition so all requests are ignored
> 
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Geoffrey Yu
>Assignee: Sumanth Pasupuleti
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the 
> data present. It would be useful to have a manual way to blacklist such 
> partitions so all read and write requests to them are rejected.






[jira] [Updated] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-02 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead updated CASSANDRA-14361:
-
Status: Patch Available  (was: Open)

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specifying the DNS name of a headless service in Kubernetes, which will 
> resolve to all IP addresses of pods within that service; 
>  * seed discovery for multi-region clusters via AWS Route 53, Azure DNS, etc.;
>  * other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Issue Comment Deleted] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-02 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead updated CASSANDRA-14361:
-
Comment: was deleted

(was: Currently SimpleSeedProvider can accept a comma-separated string of IPs 
or hostnames as the set of Cassandra seeds. Hostnames are resolved via 
InetAddress.getByName, which will only return the first IP associated with an 
A, AAAA, or CNAME record.

By changing to InetAddress.getAllByName, existing behavior is preserved, but 
now Cassandra can discover multiple IP addresses per record, allowing seed 
discovery by DNS to be a little easier.

Some examples of improved workflows with this change include: 
 * specifying the DNS name of a headless service in Kubernetes, which will 
resolve to all IP addresses of pods within that service; 
 * seed discovery for multi-region clusters via AWS Route 53, Azure DNS, etc.;
 * other common DNS service discovery mechanisms.

The only behavior this is likely to impact would be where users are relying on 
the fact that getByName only returns a single IP address.

I can't imagine any scenario where that is a sane choice. Even when that choice 
has been made, it only impacts the first startup of Cassandra and would not be 
on any critical path.

 )

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-02 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead updated CASSANDRA-14361:
-
Description: 
Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
hostnames as the set of Cassandra seeds. Hostnames are resolved via 
InetAddress.getByName, which will only return the first IP associated with an 
A, AAAA, or CNAME record.

By changing to InetAddress.getAllByName, existing behavior is preserved, but 
now Cassandra can discover multiple IP addresses per record, allowing seed 
discovery by DNS to be a little easier.

Some examples of improved workflows with this change include: 
 * specifying the DNS name of a headless service in Kubernetes, which will 
resolve to all IP addresses of pods within that service; 
 * seed discovery for multi-region clusters via AWS Route 53, Azure DNS, etc.;
 * other common DNS service discovery mechanisms.

The only behavior this is likely to impact would be where users are relying on 
the fact that getByName only returns a single IP address.

I can't imagine any scenario where that is a sane choice. Even when that choice 
has been made, it only impacts the first startup of Cassandra and would not be 
on any critical path.

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>
> Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
> hostnames as the set of Cassandra seeds. Hostnames are resolved via 
> InetAddress.getByName, which will only return the first IP associated with an 
> A, AAAA, or CNAME record.
> By changing to InetAddress.getAllByName, existing behavior is preserved, but 
> now Cassandra can discover multiple IP addresses per record, allowing seed 
> discovery by DNS to be a little easier.
> Some examples of improved workflows with this change include: 
>  * specifying the DNS name of a headless service in Kubernetes, which will 
> resolve to all IP addresses of pods within that service; 
>  * seed discovery for multi-region clusters via AWS Route 53, Azure DNS, etc.;
>  * other common DNS service discovery mechanisms.
> The only behavior this is likely to impact would be where users are relying 
> on the fact that getByName only returns a single IP address.
> I can't imagine any scenario where that is a sane choice. Even when that 
> choice has been made, it only impacts the first startup of Cassandra and 
> would not be on any critical path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-02 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead updated CASSANDRA-14361:
-

Currently SimpleSeedProvider can accept a comma-separated string of IPs or 
hostnames as the set of Cassandra seeds. Hostnames are resolved via 
InetAddress.getByName, which will only return the first IP associated with an 
A, AAAA, or CNAME record.

By changing to InetAddress.getAllByName, existing behavior is preserved, but 
now Cassandra can discover multiple IP addresses per record, allowing seed 
discovery by DNS to be a little easier.

Some examples of improved workflows with this change include: 
 * specifying the DNS name of a headless service in Kubernetes, which will 
resolve to all IP addresses of pods within that service; 
 * seed discovery for multi-region clusters via AWS Route 53, Azure DNS, etc.;
 * other common DNS service discovery mechanisms.

The only behavior this is likely to impact would be where users are relying on 
the fact that getByName only returns a single IP address.

I can't imagine any scenario where that is a sane choice. Even when that choice 
has been made, it only impacts the first startup of Cassandra and would not be 
on any critical path.

 

> Allow SimpleSeedProvider to resolve multiple IPs per DNS name
> -
>
> Key: CASSANDRA-14361
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
> Fix For: 4.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14361) Allow SimpleSeedProvider to resolve multiple IPs per DNS name

2018-04-02 Thread Ben Bromhead (JIRA)
Ben Bromhead created CASSANDRA-14361:


 Summary: Allow SimpleSeedProvider to resolve multiple IPs per DNS 
name
 Key: CASSANDRA-14361
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14361
 Project: Cassandra
  Issue Type: Improvement
  Components: Configuration
Reporter: Ben Bromhead
Assignee: Ben Bromhead
 Fix For: 4.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13910) Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance

2017-10-02 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188753#comment-16188753
 ] 

Ben Bromhead commented on CASSANDRA-13910:
--

read_repair_chance has caused more pain than it has solved, IMHO. 

* Inexperienced users sometimes crank this setting up to insane values when 
regular repairs are not working, or when they are not doing the right thing 
with their CL. 
* It rarely makes a significant difference to consistency given most real-world 
workload and query distributions; it generally only ends up triggering on the 
hottest partitions, which, as mentioned in the issue description, will get hit 
by read repair anyway. 

The only good thing about read_repair_chance is that it's something you can 
turn off on a cluster that is under load pressure. This is backed up by my many 
experiences of turning it off :)

+1 to creating a thread on @dev / @user for feedback 

> Consider deprecating (then removing) 
> read_repair_chance/dclocal_read_repair_chance
> --
>
> Key: CASSANDRA-13910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13910
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Priority: Minor
>  Labels: CommunityFeedbackRequested
>
> First, let me clarify, so this is not misunderstood, that I'm not *at all* 
> suggesting removing the read-repair mechanism of detecting and repairing 
> inconsistencies between read responses: that mechanism is imo fine and 
> useful.  But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} 
> options have never been about _enabling_ that mechanism; they are about 
> querying all replicas (even when this is not required by the consistency 
> level) for the sole purpose of maybe read-repairing some of the replicas that 
> wouldn't have been queried otherwise. Which, btw, brings me to reason 1 for 
> considering their removal: their naming/behavior is super confusing. Over the 
> years, I've seen countless users (and not only newbies) misunderstand what 
> those options do, and as a consequence misunderstand when read-repair itself 
> was happening.
> But my 2nd reason for suggesting this is that I suspect 
> {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially 
> nowadays, more harmful than anything else when enabled. When those options 
> kick in, what you trade off is additional resource consumption (all nodes 
> have to execute the read) for a _fairly remote chance_ of having some 
> inconsistencies repaired on _some_ replica _a bit faster_ than they would 
> otherwise be. To justify that last part, let's recall that:
> # most inconsistencies are actually fixed by hints in practice; and in the 
> case where a node stays dead for so long that hints end up timing out, 
> you really should repair the node when it comes back (if not simply 
> re-bootstrap it).  Read-repair probably doesn't fix _that_ much stuff in 
> the first place.
> # again, read-repair does happen without those options kicking in. If you do 
> reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all 
> the same.  Just a tiny bit less quickly.
> # I suspect almost everyone uses a low "chance" for those options at best 
> (because the extra resource consumption is real), so at the end of the day, 
> it's up to chance how much faster this fixes inconsistencies.
> Overall, I'm having a hard time imagining real cases where that trade-off 
> really makes sense. Don't get me wrong, those options had their place a long 
> time ago when hints weren't working all that well, but I think they bring 
> more confusion than benefit now.
> And I think it's sane to reconsider things every once in a while, and to 
> clean up anything that may not make all that much sense anymore, which I 
> think is the case here.
> Tl;dr, I feel the benefits brought by those options are very slim at best, 
> well overshadowed by the confusion they bring, and not worth maintaining the 
> code that supports them (which, to be fair, isn't huge, but getting rid of 
> {{ReadCallback.AsyncRepairRunner}} wouldn't hurt, for instance).
> Lastly, if the consensus here ends up being that they can have their use in 
> weird cases and that we feel supporting those cases is worth confusing 
> everyone else and maintaining that code, I would still suggest disabling them 
> totally by default.
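
To make the trade-off concrete, here is a rough, simplified sketch (not Cassandra's actual read path, and the 0.1 value is just an example setting) of what a "chance"-style option boils down to: on some fraction of reads, extra replicas are queried purely so their responses can be compared and repaired earlier.

{code:java}
import java.util.concurrent.ThreadLocalRandom;

public class ReadRepairChanceSketch
{
    // Simplified stand-in for the decision those options control: should this
    // read also query replicas the consistency level doesn't require?
    static boolean queryExtraReplicas(double dclocalChance, double globalChance)
    {
        double roll = ThreadLocalRandom.current().nextDouble();
        return roll < Math.max(dclocalChance, globalChance);
    }

    public static void main(String[] args)
    {
        int reads = 100_000, extra = 0;
        for (int i = 0; i < reads; i++)
            if (queryExtraReplicas(0.1, 0.0)) // e.g. dclocal_read_repair_chance = 0.1
                extra++;
        // Roughly 10% of reads pay the extra-replica cost for a chance at an
        // earlier repair; the rest rely on hints, repair, and CL-driven read repair.
        System.out.printf("%d of %d reads queried extra replicas%n", extra, reads);
    }
}
{code}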



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-05-12 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008528#comment-16008528
 ] 

Ben Bromhead commented on CASSANDRA-11471:
--

bq. The original intent of this ticket was to enable multiple mechanisms to be 
supported simultaneously (e.g. use common name auth for encrypted connections 
if the certificates would allow it and fall back to password auth if not), but 
the patch as it is doesn't exactly do that. It seems to me that an admin could 
provide a custom IAuthenticator which had a list of mechanisms > 1 but it feels 
like that doesn't really improve on the status quo that much. Ideally, I think 
we need to be able to configure multiple IAuthenticators in yaml and have the 
client choose which of them to interact with. There are a few places which make 
an assumption that there is only a single IAuthenticator, so those would need 
to be addressed.

The idea behind this code (and admittedly it is somewhat shoehorned into this 
existing ticket) is to support negotiation of SASL mechanisms. While the SASL 
standard leaves the negotiation mechanism to the underlying protocol, the SASL 
authentication mechanisms themselves are well defined (e.g. PLAIN, GSSAPI, etc.) 
and controlled. One interpretation of the SASL standard is that the underlying 
protocol should define the negotiation mechanism, and that having a negotiation 
mechanism which changes based on the authentication implementation is actually 
not correct with respect to the standard. 

Given this is a protocol-level change (to CQL), I think it makes sense for 
Cassandra to be opinionated about how the auth mechanism is negotiated rather 
than leaving it up to the individual IAuthenticator itself. 

This would also allow a driver to implement, say, SCRAM or MD5 once and not 
actually care about the IAuthenticator implementation. By introducing a defined 
negotiation mechanism into Cassandra, driver authors only need to implement each 
authentication mechanism once, and you won't end up with multiple GSSAPI 
implementations that target different IAuthenticator implementations. This 
change will still let implementors of Authenticators do their own thing if 
they want to use a non-standard authentication mechanism. Supporting multiple 
auth mechanisms, while not the main intent of the ticket, was done mainly 
because it was easy, given we are already changing the CQL protocol to have a 
well-defined authentication mechanism negotiation process.

bq. Following from that, I don't think that negotiation of the actual mechanism 
ought to be a function of the SASLNegotiator itself, at least not in its 
current form (NegotiatingSaslNegotiator). Maybe we can compose the 
available/supported IAuthenticators into some class which aggregates them & 
have it perform the negotiation (i.e. selecting the instance based on the 
client's chosen mechanism). Or maybe this just happens in 
AuthResponse::execute. Basically, the actual IAuthenticator doesn't need to get 
involved until its mechanism has been selected.

This makes sense, but sticking with the above reasoning, I think Cassandra 
should still be responsible for negotiating which auth mechanism is used and 
should then hand over to the implementation. It might make sense to pull this 
out; I left it in as part of IAuth mainly to reduce the amount of code changed 
and to fit it into the existing implementation. However, by my own logic, if C* 
is going to dictate negotiation, shouldn't it do this outside of the pluggable 
IAuth interface? I don't have any strong feelings either way, TBH.

bq. Rather than adding a new factory method to IAuthenticator, wouldn't it be 
cleaner to add a withCertificates(Certificate[]) method with a default no-op 
implementation to SaslNegotiator? That way, the branching in ServerConnection 
is simplified, the need for the Optional is removed (because we just don't call 
it if the certs are null) and IAuthenticator impls which don't care about certs 
don't have to change at all.

That certainly seems cleaner to me! I'll give it a go.
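
A rough sketch of that suggestion, assuming a SaslNegotiator-style interface (names simplified; this is not the exact Cassandra API):

{code:java}
import java.security.cert.Certificate;

public interface SaslNegotiatorSketch
{
    byte[] evaluateResponse(byte[] clientResponse);

    boolean isComplete();

    // Default no-op: negotiators that don't care about client certificates don't
    // need to change at all, and callers can simply skip the call (or pass what
    // they have) without branching on the authenticator type or using Optional.
    default void withCertificates(Certificate[] certificates)
    {
    }
}
{code}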



> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Fix For: 4.x
>
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> java classname of the configured {{IAuthenticator}}. This could be 

[jira] [Commented] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-04-14 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969643#comment-15969643
 ] 

Ben Bromhead commented on CASSANDRA-11471:
--

So I had a look at the cassandra-test branch for the python driver. The default 
highest supported protocol is v4 
(https://github.com/datastax/python-driver/blob/cassandra-test/cassandra/cluster.py#L358),
 which means it should have hit the new code path (i.e. it received a list of 
SASL mechanisms instead of the Java class name).

The thing is, the python PasswordAuthenticator code doesn't actually care 
about the first server challenge; it just sends the username and password 
irrespective of what the server says as part of the startup message. In 
connection.py I can't see where the authenticator_class ever gets used (it 
does, however, get assigned).

That would go some way toward explaining why it worked.

In the meantime I've updated my branch to use Optional. 

It might make sense to land support for SASL negotiation in the python driver 
before committing this; however, like any protocol change, both C* and the 
driver that the tests depend on will need to change simultaneously, with tests 
potentially breaking in between. Is there a process around this? 

> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Fix For: 4.x
>
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the {{EXTERNAL}} mechanism.
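
As a hedged illustration of the negotiation flow described above (the mechanism names and the preference order are examples only, not a mandated list), the client-side choice could look roughly like this:

{code:java}
import java.util.Arrays;
import java.util.List;

public class MechanismSelectionSketch
{
    public static void main(String[] args)
    {
        // Advertised by the server, possibly tailored per connection
        // (e.g. EXTERNAL offered only on encrypted connections)
        List<String> serverMechanisms = Arrays.asList("EXTERNAL", "PLAIN");
        // The client's own preference order
        List<String> clientPreference = Arrays.asList("GSSAPI", "EXTERNAL", "PLAIN");

        String chosen = clientPreference.stream()
                                        .filter(serverMechanisms::contains)
                                        .findFirst()
                                        .orElseThrow(() -> new IllegalStateException("no common SASL mechanism"));

        // The client would then send its choice plus any initial response (for PLAIN,
        // the encoded credentials) and continue with AUTH_CHALLENGE / AUTH_RESPONSE
        // rounds exactly as in protocol v4.
        System.out.println("negotiated mechanism: " + chosen);
    }
}
{code}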



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-04-03 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953956#comment-15953956
 ] 

Ben Bromhead commented on CASSANDRA-11471:
--

I've created a new Jira for the DataStax Java Driver 
(https://datastax-oss.atlassian.net/browse/JAVA-1434); I'll look to do one for 
the Python driver as well, as IIRC cqlsh depends on it.

Updated to address nits:
* The authenticator flow is now only for v5 and up
* Fixed a spelling error in IAuthenticator

Any thoughts on passing null certs vs. an extra method in IAuthenticator?

> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Fix For: 4.x
>
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the {{EXTERNAL}} mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-03-24 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15941051#comment-15941051
 ] 

Ben Bromhead edited comment on CASSANDRA-11471 at 3/24/17 8:06 PM:
---

Sorry for the delay, the joy of a new baby :)

Addressed all the comments except one.

bq. Only if encryption is optional? Basically because the authenticator can 
only work if the certificates are there? It seems like this can NPE?

Currently getSaslNegotiator will try to get the certificate chain from the 
channel if client encryption is enabled and connecting on an encrypted session 
is not optional. 

This means null, rather than a certificate chain, will be passed in when 
getting the new SASL authenticator. I couldn't think of a nice way to pass the 
certificate chain to the authenticator while still respecting the fact that 
some authenticators just don't care about certificates. 

Originally my thinking was that Optional does not appear to be used in the 
project and I didn't want to add even more methods to IAuthenticator. 

Thinking about it again, it probably just makes sense to overload 
newV5SaslNegotiator and not have to pass in certificates, which would reduce 
the chance of someone implementing a new Authenticator getting an NPE 
(roughly as sketched below).

||4.0||
|[Branch|https://github.com/apache/cassandra/compare/trunk...benbromhead:11471]|
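
A minimal sketch of the overload idea (type and method names are illustrative, not the committed API):

{code:java}
import java.security.cert.Certificate;

public interface AuthenticatorSketch
{
    interface SaslNegotiator
    {
        byte[] evaluateResponse(byte[] clientResponse);
        boolean isComplete();
    }

    SaslNegotiator newV5SaslNegotiator();

    // Overload instead of passing null: authenticators that ignore client
    // certificates inherit this default and never see a null array.
    default SaslNegotiator newV5SaslNegotiator(Certificate[] clientCertificates)
    {
        return newV5SaslNegotiator();
    }
}
{code}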




was (Author: benbromhead):
Sorry for the delay, the joy of a new baby :)

Addressed all the comments except one.

bq. Only if encryption is optional? Basically because the authenticator can 
only work if the certificates are there? It seems like this can NPE?

Currently getSaslNegotiator will try to get the certificate chain from the 
channel if client encryption is enabled and connecting on an encrypted session 
is not optional. 

This means null, rather than a certificate chain, will be passed in when 
getting the new SASL authenticator. I couldn't think of a nice way to pass the 
certificate chain to the authenticator while still respecting the fact that 
some authenticators just don't care about certificates. 

Given that Optional does not appear to be used in the project, I didn't want 
to add even more methods to IAuthenticator. Thinking about it again, it 
probably just makes sense to overload newV5SaslNegotiator and not have to pass 
in certificates, which would reduce the chance of someone implementing a new 
Authenticator getting an NPE.

||4.0||
|[Branch|https://github.com/apache/cassandra/compare/trunk...benbromhead:11471]|



> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement 

[jira] [Comment Edited] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-03-24 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15941051#comment-15941051
 ] 

Ben Bromhead edited comment on CASSANDRA-11471 at 3/24/17 8:05 PM:
---

Sorry for the delay, the joy of a new baby :)

Addressed all the comments except one.

bq. Only if encryption is optional? Basically because the authenticator can 
only work if the certificates are there? It seems like this can NPE?

Currently getSaslNegotiator will try to get the certificate chain from the 
channel if client encryption is enabled and connecting on an encrypted session 
is not optional. 

This means null, rather than a certificate chain, will be passed in when 
getting the new SASL authenticator. I couldn't think of a nice way to pass the 
certificate chain to the authenticator while still respecting the fact that 
some authenticators just don't care about certificates. 

Given that Optional does not appear to be used in the project, I didn't want 
to add even more methods to IAuthenticator. Thinking about it again, it 
probably just makes sense to overload newV5SaslNegotiator and not have to pass 
in certificates, which would reduce the chance of someone implementing a new 
Authenticator getting an NPE.

||4.0||
|[Branch|https://github.com/apache/cassandra/compare/trunk...benbromhead:11471]|




was (Author: benbromhead):
Addressed all the comments except one.

bq. Only if encryption is optional? Basically because the authenticator can 
only work if the certificates are there? It seems like this can NPE?

Currently getSaslNegotiator will try to get the certificate chain from the 
channel if client encryption is enabled and connecting on an encrypted session 
is not optional. 

This means null, rather than a certificate chain, will be passed in when 
getting the new SASL authenticator. I couldn't think of a nice way to pass the 
certificate chain to the authenticator while still respecting the fact that 
some authenticators just don't care about certificates. 

Given that Optional does not appear to be used in the project, I didn't want 
to add even more methods to IAuthenticator. Thinking about it again, it 
probably just makes sense to overload newV5SaslNegotiator and not have to pass 
in certificates, which would reduce the chance of someone implementing a new 
Authenticator getting an NPE.

||4.0||
|[Branch|https://github.com/apache/cassandra/compare/trunk...benbromhead:11471]|



> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the 

[jira] [Commented] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-03-24 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15941051#comment-15941051
 ] 

Ben Bromhead commented on CASSANDRA-11471:
--

Addressed all the comments except one.

bq. Only if encryption is optional? Basically because the authenticator can 
only work if the certificates are there? It seems like this can NPE?

Currently getSaslNegotiator will try to get the certificate chain from the 
channel if client encryption is enabled and connecting on an encrypted session 
is not optional. 

This means null, rather than a certificate chain, will be passed in when 
getting the new SASL authenticator. I couldn't think of a nice way to pass the 
certificate chain to the authenticator while still respecting the fact that 
some authenticators just don't care about certificates. 

Given that Optional does not appear to be used in the project, I didn't want 
to add even more methods to IAuthenticator. Thinking about it again, it 
probably just makes sense to overload newV5SaslNegotiator and not have to pass 
in certificates, which would reduce the chance of someone implementing a new 
Authenticator getting an NPE.

||4.0||
|[Branch|https://github.com/apache/cassandra/compare/trunk...benbromhead:11471]|



> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the {{EXTERNAL}} mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-03-06 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897645#comment-15897645
 ] 

Ben Bromhead commented on CASSANDRA-11471:
--

Thanks, I will look to resolve the comments this week.

> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the {{EXTERNAL}} mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13241) Lower default chunk_length_in_kb from 64kb to 4kb

2017-02-28 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887529#comment-15887529
 ] 

Ben Bromhead commented on CASSANDRA-13241:
--

Given this is an optimization for the read performance of SSTables not in the 
page cache, further sacrificing off-heap memory that would likely be occupied 
by the page cache anyway might not be a big deal. I have only come across one 
deployment that tries to keep everything in the page cache...

Still, 2GB of memory just for storing chunk offsets is pretty crazy, and 
improving that to support smaller chunks that can align with the much smaller 
SSD page sizes would be a pretty good win, even if that doesn't end up being 
the new default.
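
Back-of-the-envelope numbers for that overhead, assuming roughly one 8-byte offset per compressed chunk (the 1 TiB data size is just an example figure, not from this ticket):

{code:java}
public class ChunkOffsetOverheadSketch
{
    public static void main(String[] args)
    {
        long dataBytes = 1L << 40; // 1 TiB of SSTable data per node (example)
        for (long chunkKb : new long[] { 64, 16, 4 })
        {
            long chunks = dataBytes / (chunkKb * 1024);
            long offsetBytes = chunks * 8L; // ~8 bytes of off-heap offset per chunk
            System.out.printf("chunk_length_in_kb=%d -> ~%.2f GiB of chunk offsets%n",
                              chunkKb, offsetBytes / (double) (1L << 30));
        }
    }
}
{code}

At 4kb chunks that works out to roughly 2 GiB of offsets per TiB of data, which lines up with the figure mentioned above.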

> Lower default chunk_length_in_kb from 64kb to 4kb
> -
>
> Key: CASSANDRA-13241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13241
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Benjamin Roth
>
> Having a too low chunk size may result in some wasted disk space. A too high 
> chunk size may lead to massive overreads and may have a critical impact on 
> overall system performance.
> In my case, the default chunk size led to peak read IOs of up to 1GB/s and 
> avg reads of 200MB/s. After lowering the chunk size (of course aligned with 
> read ahead), the avg read IO went below 20 MB/s, more like 10-15MB/s.
> The risk of (physical) overreads increases with a lower (page cache size) / 
> (total data size) ratio.
> High chunk sizes are mostly appropriate for bigger payloads per request, but 
> if the model consists mostly of small rows or small result sets, the read 
> overhead with a 64kb chunk size is insanely high. This applies for example for 
> (small) skinny rows.
> Please also see here:
> https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
> To give you some insights what a difference it can make (460GB data, 128GB 
> RAM):
> - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L
> - Disk throughput: https://cl.ly/2a0Z250S1M3c
> - This shows, that the request distribution remained the same, so no "dynamic 
> snitch magic": https://cl.ly/3E0t1T1z2c0J



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13241) Lower default chunk_length_in_kb from 64kb to 4kb

2017-02-27 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886494#comment-15886494
 ] 

Ben Bromhead edited comment on CASSANDRA-13241 at 2/27/17 8:43 PM:
---

I had a quick look at the original SSTable compression ticket 
https://issues.apache.org/jira/browse/CASSANDRA-47 and I can't see any specific 
reasons for the choice of 64kb. Maybe the folks originally working on that 
ticket could comment if there is some reason I'm missing. 
 
Regardless, I've included a trivial patch.

||Branch||
|[4.0|https://github.com/apache/cassandra/compare/trunk...benbromhead:13241]|

Edit: Sorry, I didn't see your most recent comment; I've included the trivial 
patch for what it's worth :)


was (Author: benbromhead):
I had a quick look at the original SSTable compression ticket 
https://issues.apache.org/jira/browse/CASSANDRA-47 and I can't see any specific 
reasons for the choice of 64kb. Maybe the folks originally working on that 
ticket could comment if there is some reason I'm missing. 
 
Regardless, I've included a trivial patch.

||Branch||
|[4.0|https://github.com/apache/cassandra/compare/trunk...benbromhead:13241]|

> Lower default chunk_length_in_kb from 64kb to 4kb
> -
>
> Key: CASSANDRA-13241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13241
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Benjamin Roth
>
> Having a too low chunk size may result in some wasted disk space. A too high 
> chunk size may lead to massive overreads and may have a critical impact on 
> overall system performance.
> In my case, the default chunk size led to peak read IOs of up to 1GB/s and 
> avg reads of 200MB/s. After lowering the chunk size (of course aligned with 
> read ahead), the avg read IO went below 20 MB/s, more like 10-15MB/s.
> The risk of (physical) overreads increases with a lower (page cache size) / 
> (total data size) ratio.
> High chunk sizes are mostly appropriate for bigger payloads per request, but 
> if the model consists mostly of small rows or small result sets, the read 
> overhead with a 64kb chunk size is insanely high. This applies for example for 
> (small) skinny rows.
> Please also see here:
> https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
> To give you some insights what a difference it can make (460GB data, 128GB 
> RAM):
> - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L
> - Disk throughput: https://cl.ly/2a0Z250S1M3c
> - This shows, that the request distribution remained the same, so no "dynamic 
> snitch magic": https://cl.ly/3E0t1T1z2c0J



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13241) Lower default chunk_length_in_kb from 64kb to 4kb

2017-02-27 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886494#comment-15886494
 ] 

Ben Bromhead commented on CASSANDRA-13241:
--

I had a quick look at the original SSTable compression ticket 
https://issues.apache.org/jira/browse/CASSANDRA-47 and I can't see any specific 
reasons for the choice of 64kb. Maybe the folks originally working on that 
ticket could comment if there is some reason I'm missing. 
 
Regardless, I've included a trivial patch.

||Branch||
|[4.0|https://github.com/apache/cassandra/compare/trunk...benbromhead:13241]|

> Lower default chunk_length_in_kb from 64kb to 4kb
> -
>
> Key: CASSANDRA-13241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13241
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Benjamin Roth
>
> Having a too low chunk size may result in some wasted disk space. A too high 
> chunk size may lead to massive overreads and may have a critical impact on 
> overall system performance.
> In my case, the default chunk size led to peak read IOs of up to 1GB/s and 
> avg reads of 200MB/s. After lowering the chunk size (of course aligned with 
> read ahead), the avg read IO went below 20 MB/s, more like 10-15MB/s.
> The risk of (physical) overreads increases with a lower (page cache size) / 
> (total data size) ratio.
> High chunk sizes are mostly appropriate for bigger payloads per request, but 
> if the model consists mostly of small rows or small result sets, the read 
> overhead with a 64kb chunk size is insanely high. This applies for example for 
> (small) skinny rows.
> Please also see here:
> https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
> To give you some insights what a difference it can make (460GB data, 128GB 
> RAM):
> - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L
> - Disk throughput: https://cl.ly/2a0Z250S1M3c
> - This shows, that the request distribution remained the same, so no "dynamic 
> snitch magic": https://cl.ly/3E0t1T1z2c0J



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13241) Lower default chunk_length_in_kb from 64kb to 4kb

2017-02-27 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886361#comment-15886361
 ] 

Ben Bromhead commented on CASSANDRA-13241:
--

We generally end up recommending that our customers reduce the default 
chunk_length_in_kb for most applications to roughly the average size of their 
reads (depending on their latency goals), with a floor of the underlying disk's 
smallest read unit (for SSDs this is generally the page size rather than the 
block size, IIRC); see the rough sketch below. This ends up being anywhere from 
2kb to 16kb depending on hardware. I would say driving higher IOPS/lower 
latencies through the disk, rather than throughput, is much more aligned with 
the standard use cases for Cassandra.

4kb is pretty common and I would be very happy with it as the default chunk 
length, especially given that SSDs are a pretty much standard recommendation 
for C*. Increasing the chunk length for better compression whilst sacrificing 
read performance should be opt-in rather than the default.

+1
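
A hypothetical helper expressing that rule of thumb (the 4 KiB SSD page-size floor and the 64 KiB ceiling are assumptions for illustration; this is not part of Cassandra):

{code:java}
public class ChunkLengthRuleOfThumb
{
    static int suggestedChunkKb(int avgReadBytes)
    {
        // Round the average read size down to a power-of-two number of KiB...
        int kb = Integer.highestOneBit(Math.max(1, avgReadBytes / 1024));
        // ...then clamp between the assumed SSD page size (4 KiB) and the old 64 KiB default.
        return Math.min(64, Math.max(4, kb));
    }

    public static void main(String[] args)
    {
        for (int avgRead : new int[] { 500, 3_000, 10_000, 200_000 })
            System.out.printf("avg read %d bytes -> chunk_length_in_kb=%d%n",
                              avgRead, suggestedChunkKb(avgRead));
    }
}
{code}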


> Lower default chunk_length_in_kb from 64kb to 4kb
> -
>
> Key: CASSANDRA-13241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13241
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Benjamin Roth
>
> Having a too low chunk size may result in some wasted disk space. A too high 
> chunk size may lead to massive overreads and may have a critical impact on 
> overall system performance.
> In my case, the default chunk size led to peak read IOs of up to 1GB/s and 
> avg reads of 200MB/s. After lowering the chunk size (of course aligned with 
> read ahead), the avg read IO went below 20 MB/s, more like 10-15MB/s.
> The risk of (physical) overreads increases with a lower (page cache size) / 
> (total data size) ratio.
> High chunk sizes are mostly appropriate for bigger payloads per request, but 
> if the model consists mostly of small rows or small result sets, the read 
> overhead with a 64kb chunk size is insanely high. This applies for example for 
> (small) skinny rows.
> Please also see here:
> https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
> To give you some insights what a difference it can make (460GB data, 128GB 
> RAM):
> - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L
> - Disk throughput: https://cl.ly/2a0Z250S1M3c
> - This shows, that the request distribution remained the same, so no "dynamic 
> snitch magic": https://cl.ly/3E0t1T1z2c0J



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-02-15 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead reassigned CASSANDRA-11471:


Assignee: Ben Bromhead

> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>Assignee: Ben Bromhead
>  Labels: client-impacting
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the {{EXTERNAL}} mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2017-02-15 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868750#comment-15868750
 ] 

Ben Bromhead commented on CASSANDRA-11471:
--

OK, I finally got to work on this a little more and it's now ready for some 
feedback while I finish up tests. I've also put together a brief overview of 
the patch and negotiation flow in a Google Doc 
[here|https://docs.google.com/document/d/1u-d9ZMgZ4Fn1VW19-iReo7Kks8aCkDUKqrF4a-3R1Ew/edit?usp=sharing]

||4.0||
|[Branch|https://github.com/apache/cassandra/compare/trunk...benbromhead:11471]|
|[Java 
Driver|https://github.com/datastax/java-driver/compare/3.x...benbromhead:11471]|

> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>  Labels: client-impacting
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> Java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the {{EXTERNAL}} mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CASSANDRA-13048) Support SASL mechanism negotiation in existing Authenticators

2016-12-15 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead reassigned CASSANDRA-13048:


Assignee: Ben Bromhead

> Support SASL mechanism negotiation in existing Authenticators
> -
>
> Key: CASSANDRA-13048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13048
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Ben Bromhead
>Assignee: Ben Bromhead
>Priority: Minor
>  Labels: client-auth, client-impacting
>
> [CASSANDRA-11471|https://issues.apache.org/jira/browse/CASSANDRA-11471] adds 
> support for SASL mechanism negotiation to the native protocol. Existing 
> Authenticators should follow the SASL negotiation mechanism used in 
> [CASSANDRA-11471|https://issues.apache.org/jira/browse/CASSANDRA-11471].
> It may make sense to make the SASL negotiation mechanism extensible so other 
> Authenticators can use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-13048) Support SASL mechanism negotiation in existing Authenticators

2016-12-15 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead updated CASSANDRA-13048:
-
Issue Type: Sub-task  (was: Task)
Parent: CASSANDRA-9362

> Support SASL mechanism negotiation in existing Authenticators
> -
>
> Key: CASSANDRA-13048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13048
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Ben Bromhead
>Priority: Minor
>  Labels: client-auth, client-impacting
>
> [CASSANDRA-11471|https://issues.apache.org/jira/browse/CASSANDRA-11471] adds 
> support for SASL mechanism negotiation to the native protocol. Existing 
> Authenticators should follow the SASL negotiation mechanism used in 
> [CASSANDRA-11471|https://issues.apache.org/jira/browse/CASSANDRA-11471].
> It may make sense to make the SASL negotiation mechanism extensible so other 
> Authenticators can use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-13048) Support SASL mechanism negotiation in existing Authenticators

2016-12-15 Thread Ben Bromhead (JIRA)
Ben Bromhead created CASSANDRA-13048:


 Summary: Support SASL mechanism negotiation in existing 
Authenticators
 Key: CASSANDRA-13048
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13048
 Project: Cassandra
  Issue Type: Task
Reporter: Ben Bromhead
Priority: Minor


[CASSANDRA-11471|https://issues.apache.org/jira/browse/CASSANDRA-11471] adds 
support for SASL mechanism negotiation to the native protocol. Existing 
Authenticators should follow the SASL negotiation mechanism used in 
[CASSANDRA-11471|https://issues.apache.org/jira/browse/CASSANDRA-11471].

It may make sense to make the SASL negotiation mechanism extensible so other 
Authenticators can use it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11471) Add SASL mechanism negotiation to the native protocol

2016-12-13 Thread Ben Bromhead (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Bromhead updated CASSANDRA-11471:
-
Attachment: CASSANDRA-11471

First cut of support for SASL negotiation in V4, up for comment.

The Authenticate message returns a comma-separated list of SASL mechanisms 
supported by the IAuthenticator class. This is backwards compatible with V3 
(which returns the classname instead).

It currently does not support detection of downgrade attacks, as there is no 
integrity mechanism. I think it's sufficient to rely on SSL for this.

Client selection of the SASL mechanism is currently left up to the 
IAuthenticator implementation (the selection should arrive in the first 
response from the client).

I have not updated the CQL spec in this patch either.
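
To give a feel for the client side of the negotiation, here is a rough, 
self-contained sketch using the JDK's javax.security.sasl. This is illustrative 
only and not code from the patch; the "cql" protocol name, server name and 
credentials are placeholders:

{code:java}
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import java.util.Collections;

public class SaslNegotiationSketch
{
    public static void main(String[] args) throws Exception
    {
        // Mechanisms as the server might advertise them in the new-style
        // Authenticate message, e.g. "PLAIN,EXTERNAL".
        String[] advertised = "PLAIN,EXTERNAL".split(",");

        // Supplies credentials when the selected mechanism asks for them.
        CallbackHandler handler = callbacks -> {
            for (Callback cb : callbacks)
            {
                if (cb instanceof NameCallback)
                    ((NameCallback) cb).setName("cassandra");
                else if (cb instanceof PasswordCallback)
                    ((PasswordCallback) cb).setPassword("cassandra".toCharArray());
                else
                    throw new UnsupportedCallbackException(cb);
            }
        };

        // Picks the first advertised mechanism the client can satisfy
        // (PLAIN here); returns null if none match.
        SaslClient client = Sasl.createSaslClient(advertised, null, "cql", "localhost",
                                                  Collections.emptyMap(), handler);

        // The chosen mechanism plus this initial response is what the client
        // would send back in its first response message.
        byte[] initialResponse = client.hasInitialResponse()
                               ? client.evaluateChallenge(new byte[0])
                               : new byte[0];

        System.out.println("selected mechanism: " + client.getMechanismName());
        System.out.println("initial response length: " + initialResponse.length);
    }
}
{code}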

> Add SASL mechanism negotiation to the native protocol
> -
>
> Key: CASSANDRA-11471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11471
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: CQL
>Reporter: Sam Tunnicliffe
>  Labels: client-impacting
> Attachments: CASSANDRA-11471
>
>
> Introducing an additional message exchange into the authentication sequence 
> would allow us to support multiple authentication schemes and [negotiation of 
> SASL mechanisms|https://tools.ietf.org/html/rfc4422#section-3.2]. 
> The current {{AUTHENTICATE}} message sent from Client to Server includes the 
> java classname of the configured {{IAuthenticator}}. This could be superseded 
> by a new message which lists the SASL mechanisms supported by the server. The 
> client would then respond with a new message which indicates its choice of 
> mechanism.  This would allow the server to support multiple mechanisms, for 
> example enabling both {{PLAIN}} for username/password authentication and 
> {{EXTERNAL}} for a mechanism for extracting credentials from SSL 
> certificates\* (see the example in 
> [RFC-4422|https://tools.ietf.org/html/rfc4422#appendix-A]). Furthermore, the 
> server could tailor the list of supported mechanisms on a per-connection 
> basis, e.g. only offering certificate based auth to encrypted clients. 
> The client's response should include the selected mechanism and any initial 
> response data. This is mechanism-specific; the {{PLAIN}} mechanism consists 
> of a single round in which the client sends encoded credentials as the 
> initial response data and the server response indicates either success or 
> failure with no further challenges required.
> From a protocol perspective, after the mechanism negotiation the exchange 
> would continue as in protocol v4, with one or more rounds of 
> {{AUTH_CHALLENGE}} and {{AUTH_RESPONSE}} messages, terminated by an 
> {{AUTH_SUCCESS}} sent from Server to Client upon successful authentication or 
> an {{ERROR}} on auth failure. 
> XMPP performs mechanism negotiation in this way, 
> [RFC-3920|http://tools.ietf.org/html/rfc3920#section-6] includes a good 
> overview.
> \* Note: this would require some a priori agreement between client and server 
> over the implementation of the {{EXTERNAL}} mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9633) Add ability to encrypt sstables

2016-11-17 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675196#comment-15675196
 ] 

Ben Bromhead commented on CASSANDRA-9633:
-

ok, sounds great! Thanks for the hard work on this patch :) 

Though if I could put in a minor (very small) request: that compression be off 
by default / user configurable.

I wouldn't be comfortable with compression before encryption, as it creates 
another avenue for leaking information and increases the possibility of a 
CRIME-style attack. I know the accepted wisdom used to be that you should 
compress before you encrypt, since that is the only way to reduce the size of 
what you encrypt, but it has since been shown to weaken the security of the 
chosen encryption scheme.

If a user understands the added risk, or it is unlikely an attacker would have 
control over the plaintext, then they can opt in to using compression.

> Add ability to encrypt sstables
> ---
>
> Key: CASSANDRA-9633
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9633
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jason Brown
>Assignee: Jason Brown
>  Labels: encryption, security, sstable
> Fix For: 3.x
>
>
> Add option to allow encrypting of sstables.
> I have a version of this functionality built on cassandra 2.0 that 
> piggy-backs on the existing sstable compression functionality and ICompressor 
> interface (similar in nature to what DataStax Enterprise does). However, if 
> we're adding the feature to the main OSS product, I'm not sure if we want to 
> use the pluggable compression framework or if it's worth investigating a 
> different path. I think there's a lot of upside in reusing the sstable 
> compression scheme, but perhaps add a new component in cqlsh for table 
> encryption and a corresponding field in CFMD.
> Encryption configuration in the yaml can use the same mechanism as 
> CASSANDRA-6018 (which is currently pending internal review).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9633) Add ability to encrypt sstables

2016-11-17 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674920#comment-15674920
 ] 

Ben Bromhead commented on CASSANDRA-9633:
-

There appears to be a bit of a stall on this ticket; I'm happy to address, 
review, or help out on any outstanding comments/nits. I've included a few of my 
initial thoughts on some of the above issues below, though I'm still going 
through the code and testing:

bq. Enabling encryption on a table silently discards any compression settings 
on the table

One note on supporting compression with encryption is that it's tricky to do 
correctly. Performing compression on input that is potentially 
attacker-controlled is generally a big no-no. For an example of why this is not 
a good idea, check out the CRIME attack against SSL, which uses the 
compression-influenced ciphertext size as an oracle 
(https://en.wikipedia.org/wiki/CRIME).
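
To make the size leak concrete, here is a toy, self-contained sketch (nothing 
to do with the actual patch; the secret and guesses are made up) showing that a 
correct guess deflates better than a wrong one when compressed alongside the 
secret:

{code:java}
import java.util.zip.Deflater;

public class CompressionOracleToy
{
    // Deflate the input and return the compressed size in bytes.
    static int compressedSize(byte[] input)
    {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[input.length + 64];
        int size = deflater.deflate(buffer);
        deflater.end();
        return size;
    }

    public static void main(String[] args)
    {
        // A secret the attacker wants to recover, compressed together with
        // data the attacker controls.
        String secret = "token=SECRET42;";
        String[] guesses = { "token=QWJXKZ", "token=SECRET" }; // same length

        for (String guess : guesses)
        {
            byte[] plaintext = (guess + secret).getBytes();
            System.out.println(guess + " -> " + compressedSize(plaintext) + " bytes");
        }
        // The guess that matches the secret typically deflates a few bytes
        // smaller, and encryption does not hide ciphertext length -- that
        // length difference is the oracle a CRIME-style attack exploits.
    }
}
{code}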

I know this is done in DSE. I'm not familiar with the Datastax implementation, 
as it is closed source, but I would be cautious about using 
EncryptingSnappyCompressor et al.

On the flip side, performing compression after the SSTable has been encrypted 
tends not to be particularly effective, due to the apparent increase in entropy 
of the SSTable after encryption.

Due to the above, I would not worry about supporting the ability to encrypt AND 
compress SSTables in this first release of the feature.

bq. Not every cipher mode supports initialization vectors

Supporting cipher modes that don't use IVs allows users to shoot themselves in 
the foot. The example given above of AES/ECB/NoPadding is definitely a cipher 
mode we don't want to support. See https://blog.filippo.io/the-ecb-penguin/ 
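
A minimal sketch of the difference, using plain JCE rather than anything from 
the patch: with ECB, identical plaintext blocks encrypt to identical ciphertext 
blocks, whereas a mode with a fresh random IV per chunk (GCM here) does not 
have that problem:

{code:java}
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;
import java.util.Arrays;

public class IvSketch
{
    public static void main(String[] args) throws Exception
    {
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        byte[] block = "16-byte-blk-data".getBytes(); // exactly one AES block

        // AES/ECB: two identical plaintext blocks produce two identical
        // ciphertext blocks, so structure in the data leaks straight through.
        Cipher ecb = Cipher.getInstance("AES/ECB/NoPadding");
        ecb.init(Cipher.ENCRYPT_MODE, key);
        byte[] ecbOut = ecb.doFinal(concat(block, block));
        System.out.println("ECB ciphertext blocks equal: " + Arrays.equals(
                Arrays.copyOfRange(ecbOut, 0, 16), Arrays.copyOfRange(ecbOut, 16, 32)));

        // AES/GCM with a random IV (stored alongside the data): identical
        // plaintext blocks no longer produce identical ciphertext blocks.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
        gcm.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] gcmOut = gcm.doFinal(concat(block, block));
        System.out.println("GCM ciphertext blocks equal: " + Arrays.equals(
                Arrays.copyOfRange(gcmOut, 0, 16), Arrays.copyOfRange(gcmOut, 16, 32)));
    }

    static byte[] concat(byte[] a, byte[] b)
    {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }
}
{code}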



> Add ability to encrypt sstables
> ---
>
> Key: CASSANDRA-9633
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9633
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jason Brown
>Assignee: Jason Brown
>  Labels: encryption, security, sstable
> Fix For: 3.x
>
>
> Add option to allow encrypting of sstables.
> I have a version of this functionality built on cassandra 2.0 that 
> piggy-backs on the existing sstable compression functionality and ICompressor 
> interface (similar in nature to what DataStax Enterprise does). However, if 
> we're adding the feature to the main OSS product, I'm not sure if we want to 
> use the pluggable compression framework or if it's worth investigating a 
> different path. I think there's a lot of upside in reusing the sstable 
> compression scheme, but perhaps add a new component in cqlsh for table 
> encryption and a corresponding field in CFMD.
> Encryption configuration in the yaml can use the same mechanism as 
> CASSANDRA-6018 (which is currently pending internal review).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12629) All Nodes Replication Strategy

2016-10-11 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566292#comment-15566292
 ] 

Ben Bromhead commented on CASSANDRA-12629:
--

Getting this committed is not super important since, as you mentioned, 
replication strategies are pluggable; I'm just keen to figure out what we are 
missing here.

We are still not hugely comfortable with the default RF of the system_auth 
keyspace and the way in which authN/Z information is replicated, as the current 
default makes it pretty easy to shoot yourself in the foot (from what we have 
seen helping folks out). Also, while maintaining a separate process to manage 
the system_auth keyspace RF and repairs works (this is our current approach, 
sketched below), it is somewhat unwieldy.

After doing some reading, particularly of 
https://issues.apache.org/jira/browse/CASSANDRA-826, my gut feel is that 
replication of the authN/Z keyspace requires a more elegant solution than what 
an "Everywhere" strategy would provide, and should be more in line with the way 
the schema keyspaces behave. Perhaps such a discussion warrants a new ticket, 
unless there is an existing one for it?
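
For reference, the separate process mentioned above boils down to something 
like the following sketch (DataStax Java driver 3.x; the contact point, DC 
names and RFs are placeholders for your cluster), followed by a repair of 
system_auth on each node:

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SystemAuthRfSketch
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect())
        {
            // Bump system_auth to a sensible RF in every DC; the change only
            // fully takes effect once the keyspace has been repaired.
            session.execute("ALTER KEYSPACE system_auth WITH replication = "
                          + "{'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}");
        }
    }
}
{code}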

> All Nodes Replication Strategy
> --
>
> Key: CASSANDRA-12629
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12629
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Alwyn Davis
>Priority: Minor
> Attachments: 12629-trunk.patch
>
>
> When adding a new DC, keyspaces must be manually updated to replicate to the 
> new DC.  This is problematic for system_auth, as it cannot achieve LOCAL_ONE 
> consistency (for a non-cassandra user), until its replication options have 
> been updated on an existing node.
> Ideally, system_auth could be set to an "All Nodes strategy" that will 
> replicate it to all nodes, as they join the cluster.  It also removes the 
> need to update the replication factor for system_auth when adding nodes to 
> the cluster to keep with the recommendation of RF=number of nodes (at least 
> for small clusters).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs

2015-08-06 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660844#comment-14660844
 ] 

Ben Bromhead commented on CASSANDRA-9291:
-

It worked for us, but it took running it a few times on the impacted nodes 
until we got them to stream from the right nodes. You need to have more good 
nodes than broken ones in this case (or at least one good node).

resetLocalSchema() in the MigrationManager picks the first node from 
Gossiper.instance.getLiveMembers(), so if you have enough good nodes (or at 
least one) in your cluster, you can eventually get back to a state of normality.

I think you can get the live members list via JMX (correct me if I'm wrong 
here) to help figure out which node it will use; otherwise you can roll through 
the whole cluster, excluding the known good ones.
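
For example, a small JMX sketch that lists the live members a node currently 
sees (assuming the usual StorageService MBean with its LiveNodes attribute and 
the default JMX port; it won't give you the exact ordering Gossiper uses, but 
it narrows down the candidates):

{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import java.util.List;

public class LiveNodesSketch
{
    public static void main(String[] args) throws Exception
    {
        // Default Cassandra JMX port; adjust host/port for your node.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url))
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName storageService =
                    new ObjectName("org.apache.cassandra.db:type=StorageService");
            // LiveNodes is backed by StorageServiceMBean.getLiveNodes().
            @SuppressWarnings("unchecked")
            List<String> liveNodes =
                    (List<String>) mbs.getAttribute(storageService, "LiveNodes");
            System.out.println("live members: " + liveNodes);
        }
    }
}
{code}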

 Too many tombstones in schema_columns from creating too many CFs
 

 Key: CASSANDRA-9291
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9291
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Production Cluster with 2 DCs of 3 nodes each and 1 DC 
 of 7 nodes, running on dedicated Xeon hexacore, 96GB ram, RAID for Data and 
 SSF for commitlog, running Debian 7 (with Java 1.7.0_76-b13 64-Bit, 8GB and 
 16GB of heap tested).
 Dev Cluster with 1 DC with 3 nodes and 1 DC with 1 node, running on 
 virtualized env., Ubuntu 12.04.5 (with Java 1.7.0_72-b14 64-Bit 1GB, 4GB 
 heap) 
Reporter: Luis Correia
Priority: Blocker
 Attachments: after_schema.txt, before_schema.txt, schemas500.cql


 When creating lots of columnfamilies (about 200) the system.schema_columns 
 gets filled with tombstones and therefore prevents clients using the binary 
 protocol from connecting.
 Clients already connected continue normal operation (reading and inserting).
 Log messages are:
 For the first tries (sorry for the lack of precision):
 bq. ERROR [main] 2015-04-22 00:01:38,527 SliceQueryFilter.java (line 200) 
 Scanned over 10 tombstones in system.schema_columns; query aborted (see 
 tombstone_failure_threshold)
 For each client that tries to connect but fails with timeout:
  bq. WARN [ReadStage:35] 2015-04-27 15:40:10,600 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:40] 2015-04-27 15:40:10,609 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:61] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:51] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:55] 2015-04-27 15:40:10,675 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:35] 2015-04-27 15:40:10,707 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:40] 2015-04-27 15:40:10,708 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:43] 2015-04-27 15:40:10,715 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:51] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:61] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:35] 2015-04-27 15:40:10,750 SliceQueryFilter.java (line 
 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147281748 columns was requested, slices=[-]
 bq. WARN [ReadStage:40] 2015-04-27 15:40:10,751 SliceQueryFilter.java (line 
 231) Read 864 live and 2664 

[jira] [Commented] (CASSANDRA-9291) Too many tombstones in schema_columns from creating too many CFs

2015-08-06 Thread Ben Bromhead (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660628#comment-14660628
 ] 

Ben Bromhead commented on CASSANDRA-9291:
-

FYI for those stumbling across this ticket experiencing similar issues: 
`nodetool resetlocalschema` will drop the schema info from the node and resync 
it with the other nodes, clearing out the existing data (and tombstones).

So this would be the workaround until 3.0.

 Too many tombstones in schema_columns from creating too many CFs
 

 Key: CASSANDRA-9291
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9291
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Production Cluster with 2 DCs of 3 nodes each and 1 DC 
 of 7 nodes, running on dedicated Xeon hexacore, 96GB ram, RAID for Data and 
 SSF for commitlog, running Debian 7 (with Java 1.7.0_76-b13 64-Bit, 8GB and 
 16GB of heap tested).
 Dev Cluster with 1 DC with 3 nodes and 1 DC with 1 node, running on 
 virtualized env., Ubuntu 12.04.5 (with Java 1.7.0_72-b14 64-Bit 1GB, 4GB 
 heap) 
Reporter: Luis Correia
Priority: Blocker
 Attachments: after_schema.txt, before_schema.txt, schemas500.cql


 When creating lots of columnfamilies (about 200) the system.schema_columns 
 gets filled with tombstones and therefore prevents clients using the binary 
 protocol from connecting.
 Clients already connected continue normal operation (reading and inserting).
 Log messages are:
 For the first tries (sorry for the lack of precision):
 bq. ERROR [main] 2015-04-22 00:01:38,527 SliceQueryFilter.java (line 200) 
 Scanned over 10 tombstones in system.schema_columns; query aborted (see 
 tombstone_failure_threshold)
 For each client that tries to connect but fails with timeout:
  bq. WARN [ReadStage:35] 2015-04-27 15:40:10,600 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:40] 2015-04-27 15:40:10,609 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:61] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:51] 2015-04-27 15:40:10,670 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:55] 2015-04-27 15:40:10,675 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:35] 2015-04-27 15:40:10,707 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:40] 2015-04-27 15:40:10,708 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:43] 2015-04-27 15:40:10,715 SliceQueryFilter.java (line 
 231) Read 395 live and 1217 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147283441 columns was requested, slices=[-]
 bq. WARN [ReadStage:51] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:61] 2015-04-27 15:40:10,736 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:35] 2015-04-27 15:40:10,750 SliceQueryFilter.java (line 
 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147281748 columns was requested, slices=[-]
 bq. WARN [ReadStage:40] 2015-04-27 15:40:10,751 SliceQueryFilter.java (line 
 231) Read 864 live and 2664 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147281748 columns was requested, slices=[-]
 bq. WARN [ReadStage:55] 2015-04-27 15:40:10,759 SliceQueryFilter.java (line 
 231) Read 1146 live and 3534 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147282894 columns was requested, slices=[-]
 bq. WARN [ReadStage:51] 2015-04-27 15:40:10,821