[jira] [Commented] (CASSANDRA-14507) OutboundMessagingConnection backlog is not fully written in case of race conditions

2018-06-08 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506789#comment-16506789
 ] 

Dinesh Joshi commented on CASSANDRA-14507:
--

[~sbtourist] thank you for the bug report. 

{quote}3) The writer threads are scheduled back and add to the backlog, but the 
channel state is READY at this point, so those writes would sit in the backlog 
and expire.
{quote}

I am not clear on how the messages would simply sit in the backlog queue and 
expire. Wouldn't they be picked up by 
{{MessageOutHandler::channelWritabilityChanged}} and then get drained? What am 
I missing here?
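
For anyone following along, here is a minimal sketch (hypothetical names, not 
the actual {{OutboundMessagingConnection}} code) of the check-then-act window 
described in the quoted steps: the state check and the backlog enqueue are two 
separate steps, so a drain that happens between them is missed.

{code}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the race: the READY check and the backlog offer are
// not atomic, so the handshake thread's one-time drain can run between them.
class ConnectionSketch
{
    enum State { NOT_READY, READY }

    private final AtomicReference<State> state = new AtomicReference<>(State.NOT_READY);
    private final Queue<Object> backlog = new ConcurrentLinkedQueue<>();

    void sendMessage(Object message)
    {
        if (state.get() != State.READY)   // 1) writer sees NOT_READY, is descheduled
        {
            // 2) handshake thread flips state to READY and drains the backlog here
            backlog.offer(message);       // 3) enqueued after the drain: sits and expires
            return;
        }
        writeToChannel(message);
    }

    void handshakeComplete()
    {
        state.set(State.READY);
        for (Object m; (m = backlog.poll()) != null; )
            writeToChannel(m);            // the one-time drain late writers can miss
    }

    private void writeToChannel(Object message) { /* netty write elided */ }
}
{code}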

> OutboundMessagingConnection backlog is not fully written in case of race 
> conditions
> ---
>
> Key: CASSANDRA-14507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Sergio Bossa
>Priority: Major
>
> The {{OutboundMessagingConnection}} writes into a backlog queue before the 
> connection handshake is successfully completed, and then writes such backlog 
> to the channel as soon as the successful handshake moves the channel state to 
> {{READY}}.
> This is unfortunately race prone, as the following could happen:
> 1) One or more writer threads see the channel state as {{NOT_READY}} in 
> {{#sendMessage()}} and are about to enqueue to the backlog, but they get 
> descheduled by the OS.
> 2) The handshake thread is scheduled by the OS and moves the channel state to 
> {{READY}}, emptying the backlog.
> 3) The writer threads are scheduled back and add to the backlog, but the 
> channel state is {{READY}} at this point, so those writes would sit in the 
> backlog and expire.
> Please note a similar race condition exists between 
> {{OutboundMessagingConnection#sendMessage()}} and 
> {{MessageOutHandler#channelWritabilityChanged()}}, which would be way more 
> serious, as the channel writability could change frequently. Luckily, it 
> looks like {{ChannelWriter#write()}} never gets invoked with 
> {{checkWritability}} set to {{true}} (so writes never go to the backlog when 
> the channel is not writable).






[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jordan West (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506754#comment-16506754
 ] 

Jordan West commented on CASSANDRA-14499:
-

{quote}disabling gossip alone is insufficient, also need to disable native
{quote}
Agreed. I hadn't updated the description to reflect it, but what I am working 
on does this as well.
{quote}still not sure I buy the argument that it’s wrong to serve reads in this 
case - it may be true that some table is getting out of sync, but that doesn’t 
mean every table is,
{quote}
I agree it depends on the workload for each specific dataset, but since we 
can't know which workload we have, we have to assume it could get really out 
of sync. 
{quote}and we already have a mechanism to deal with nodes that can serve reads 
but not writes (speculating on the read repair).
{quote}
Even if we speculate, we still attempt it. That work will always be for 
naught, and being at quota is likely a prolonged state (the ways out of it 
take a while).
{quote}If you don’t serve reads either, then any GC pause will be guaranteed 
to impact client request latency, as we can’t speculate around it in the 
common rf=3 case.
{quote}
This is true. But that's almost the same as losing a node because its disk has 
been filled up completely. If we have one unhealthy node, we are another 
unhealthy node away from unavailability in the rf=3/quorum case. 

That said, I'll consider the reads more over the weekend. It's a valid concern. 

 

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  
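
As an illustration of the precedence rule in the description, a minimal sketch 
with hypothetical names (the ticket does not prescribe an implementation):

{code}
// Hypothetical sketch: an absolute quota, when set, takes precedence over the
// percentage-based one, and crossing the threshold marks the node unhealthy.
class DiskQuotaSketch
{
    static long effectiveQuotaBytes(long absoluteQuotaBytes,  // 0 or less = unset
                                    double quotaPercent,      // e.g. 0.80 for 80%
                                    long totalDiskBytes)
    {
        return absoluteQuotaBytes > 0
             ? absoluteQuotaBytes
             : (long) (totalDiskBytes * quotaPercent);
    }

    static boolean overQuota(long liveDataBytes, long quotaBytes)
    {
        return liveDataBytes >= quotaBytes;
    }
}
{code}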






[jira] [Commented] (CASSANDRA-14509) AsyncOneResponse uses the incorrect timeout

2018-06-08 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506751#comment-16506751
 ] 

Dinesh Joshi commented on CASSANDRA-14509:
--

[~krummas] I have updated the branch with a unit test.
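
For context, a minimal sketch (hypothetical names, not the actual 
{{AsyncOneResponse}}) of the bug class the description below spells out: 
awaiting with the caller's raw timeout value in a hardcoded unit, instead of 
the remaining (adjusted) time in that unit.

{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the bug and the fix described in this ticket.
class TimeoutSketch
{
    private final CountDownLatch signal = new CountDownLatch(1);
    private final long start = System.nanoTime();

    boolean buggyGet(long timeout, TimeUnit unit) throws InterruptedException
    {
        // Bug: 'timeout' is interpreted as nanoseconds no matter what 'unit'
        // is, so get(10, TimeUnit.SECONDS) waits ~10ns rather than 10s.
        return signal.await(timeout, TimeUnit.NANOSECONDS);
    }

    boolean fixedGet(long timeout, TimeUnit unit) throws InterruptedException
    {
        // Fix sketch: convert once, then wait only the time still remaining.
        long adjustedNanos = unit.toNanos(timeout) - (System.nanoTime() - start);
        return signal.await(adjustedNanos, TimeUnit.NANOSECONDS);
    }
}
{code}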

> AsyncOneResponse uses the incorrect timeout
> ---
>
> Key: CASSANDRA-14509
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14509
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
> Fix For: 4.x
>
>
> {{AsyncOneResponse}} has a bug where it uses the initial timeout value 
> instead of the {{adjustedTimeout}}. Combined with passing in the wrong 
> {{TimeUnit}}, it leads to a shorter timeout than expected. This can have 
> unintended consequences: for example, in 
> {{StorageService::sendReplicationNotification}}, instead of waiting 10 
> seconds ({{request_timeout_in_ms}}), we wait for {{1}} nanosecond.






[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506748#comment-16506748
 ] 

Jeff Jirsa commented on CASSANDRA-14499:


1) disabling gossip alone is insufficient, also need to disable native

2) recovery is likely some combination of compactions and host replacement

3) still not sure I buy the argument that it’s wrong to serve reads in this 
case - it may be true that some table is getting out of sync, but that doesn’t 
mean every table is, and we already have a mechanism to deal with nodes that 
can serve reads but not writes (speculating on the read repair). If you don’t 
serve reads either, then any GC pause will be guaranteed to impact client 
request latency, as we can’t speculate around it in the common rf=3 case. 

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  






[jira] [Comment Edited] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jordan West (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506746#comment-16506746
 ] 

Jordan West edited comment on CASSANDRA-14499 at 6/9/18 12:36 AM:
--

The other reason the OS level wouldn't work is that we are trying to track 
*live* data, and the OS can't tell the difference between live and dead data. 
EDIT: also to clarify, the goal here isn't to implement a perfect quota; there 
will be some room for error where the quota can be exceeded. The goal is to 
mark the node unhealthy when it reaches this level and to have enough headroom 
for compaction or other operations to get it back to a healthy state. 

Regarding taking reads, [~jasobrown], [~krummas], and I discussed this some 
offline. Since the node can only get more and more out of sync while not taking 
write traffic and can't participate in (read) repair until the amount of 
storage used is below quota, we thought it better to disable both reads and 
writes. Less-blocking and speculative read repair makes us more available in 
this case (as it should).

Disabling gossip is a quick route to disabling reads/writes. Is it the best 
approach to doing so? I'm not 100% sure. My concern is how the operator gets 
back to a healthy state once a quota is reached on a node. They have a few 
options: migrate data to a bigger node, let compaction catch up and delete 
data, raise the quota so it's no longer exceeded, add node(s) to take storage 
responsibility away from the node, or forcefully delete data from the node. 
We need to ensure we don't prevent those operations from taking place. I've 
been discussing this with [~jasobrown] offline as well. 


was (Author: jrwest):
The other reason the OS level wouldn't work is that we are trying to track 
*live* data, and the OS can't tell the difference between live and dead data.

Regarding taking reads, [~jasobrown], [~krummas], and I discussed this some 
offline. Since the node can only get more and more out of sync while not taking 
write traffic and can't participate in (read) repair until the amount of 
storage used is below quota, we thought it better to disable both reads and 
writes. Less-blocking and speculative read repair makes us more available in 
this case (as it should).

Disabling gossip is a quick route to disabling reads/writes. Is it the best 
approach to doing so? I'm not 100% sure. My concern is how the operator gets 
back to a healthy state once a quota is reached on a node. They have a few 
options: migrate data to a bigger node, let compaction catch up and delete 
data, raise the quota so it's no longer exceeded, add node(s) to take storage 
responsibility away from the node, or forcefully delete data from the node. 
We need to ensure we don't prevent those operations from taking place. I've 
been discussing this with [~jasobrown] offline as well. 

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  






[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jordan West (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506746#comment-16506746
 ] 

Jordan West commented on CASSANDRA-14499:
-

The other reason the OS level wouldn't work is that we are trying to track 
*live* data, and the OS can't tell the difference between live and dead data.

Regarding taking reads, [~jasobrown], [~krummas], and I discussed this some 
offline. Since the node can only get more and more out of sync while not taking 
write traffic and can't participate in (read) repair until the amount of 
storage used is below quota, we thought it better to disable both reads and 
writes. Less-blocking and speculative read repair makes us more available in 
this case (as it should).

Disabling gossip is a quick route to disabling reads/writes. Is it the best 
approach to doing so? I'm not 100% sure. My concern is how the operator gets 
back to a healthy state once a quota is reached on a node. They have a few 
options: migrate data to a bigger node, let compaction catch up and delete 
data, raise the quota so it's no longer exceeded, add node(s) to take storage 
responsibility away from the node, or forcefully delete data from the node. 
We need to ensure we don't prevent those operations from taking place. I've 
been discussing this with [~jasobrown] offline as well. 

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  






[jira] [Resolved] (CASSANDRA-14489) Test cqlsh authentication

2018-06-08 Thread Patrick Bannister (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Bannister resolved CASSANDRA-14489.
---
Resolution: Not A Problem

This problem didn't actually exist.

> Test cqlsh authentication
> -
>
> Key: CASSANDRA-14489
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14489
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Patrick Bannister
>Priority: Critical
>  Labels: cqlsh, security, test
> Fix For: 4.x
>
>
> Coverage analysis of the cqlshlib unittests (pylib/cqlshlib/test/test*.py) 
> and the dtest cqlsh_tests (cqlsh_tests.py and cqlsh_copy_tests.py) showed no 
> coverage of authentication related code.
> Before we can release a port of cqlsh, we should identify an existing test 
> for cqlsh authentication, or write a new one.






[jira] [Commented] (CASSANDRA-14489) Test cqlsh authentication

2018-06-08 Thread Patrick Bannister (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506718#comment-16506718
 ] 

Patrick Bannister commented on CASSANDRA-14489:
---

There are some tests for cqlsh username/password login in the dtests; several 
of them are in cqlsh_tests/cqlsh_tests.py::TestCqlLogin. I re-read my coverage 
report, and in fact we did observe coverage of connecting with cqlsh using 
cassandra.auth.PlainTextAuthProvider.

Furthermore, the relevant dtests are passing for the pure Python 3 port.

We don't have coverage of using the LOGIN command during a connected cqlsh 
session, but I think it's sufficient that we're already testing the initial 
login with a password.

> Test cqlsh authentication
> -
>
> Key: CASSANDRA-14489
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14489
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Patrick Bannister
>Priority: Critical
>  Labels: cqlsh, security, test
> Fix For: 4.x
>
>
> Coverage analysis of the cqlshlib unittests (pylib/cqlshlib/test/test*.py) 
> and the dtest cqlsh_tests (cqlsh_tests.py and cqlsh_copy_tests.py) showed no 
> coverage of authentication related code.
> Before we can release a port of cqlsh, we should identify an existing test 
> for cqlsh authentication, or write a new one.






[jira] [Created] (CASSANDRA-14510) Flaky uTest: RemoveTest.testRemoveHostId

2018-06-08 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14510:
--

 Summary: Flaky uTest: RemoveTest.testRemoveHostId
 Key: CASSANDRA-14510
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14510
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Jay Zhuang


https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/619/testReport/org.apache.cassandra.service/RemoveTest/testRemoveHostId/

{noformat}
Failed 13 times in the last 30 runs. Flakiness: 31%, Stability: 56%
{noformat}






[jira] [Updated] (CASSANDRA-14509) AsyncOneResponse uses the incorrect timeout

2018-06-08 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14509:
-
Fix Version/s: 4.x
   Status: Patch Available  (was: Open)

||14509||
|[branch|https://github.com/dineshjoshi/cassandra/tree/trunk-14509]|
|[utests & dtests|https://circleci.com/gh/dineshjoshi/workflows/cassandra/tree/trunk-14509]|

> AsyncOneResponse uses the incorrect timeout
> ---
>
> Key: CASSANDRA-14509
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14509
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
> Fix For: 4.x
>
>
> {{AsyncOneResponse}} has a bug where it uses the initial timeout value 
> instead of the {{adjustedTimeout}}. Combined with passing in the wrong 
> {{TimeUnit}}, it leads to a shorter timeout than expected. This can have 
> unintended consequences: for example, in 
> {{StorageService::sendReplicationNotification}}, instead of waiting 10 
> seconds ({{request_timeout_in_ms}}), we wait for {{1}} nanosecond.






[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506701#comment-16506701
 ] 

Jeff Jirsa commented on CASSANDRA-14499:


Not clear to me how you'd do this as gracefully at the OS level as you can at 
the Cassandra level (by, e.g., blocking writes and inbound streaming).

It's also not clear to me that disabling gossip is the right answer. You can 
still serve reads; the coordinator will know if it's out of sync and can 
attempt a (now non-blocking and speculative) read repair if necessary. If read 
repair is required to meet consistency, we'll fail there, but that's still 
likely better than not serving the already-consistent read. 


> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  






[jira] [Created] (CASSANDRA-14509) AsyncOneResponse uses the incorrect timeout

2018-06-08 Thread Dinesh Joshi (JIRA)
Dinesh Joshi created CASSANDRA-14509:


 Summary: AsyncOneResponse uses the incorrect timeout
 Key: CASSANDRA-14509
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14509
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Dinesh Joshi
Assignee: Dinesh Joshi


{{AsyncOneResponse}} has a bug where it uses the initial timeout value instead 
of the {{adjustedTimeout}}. Combined with passing in the wrong {{TimeUnit}}, 
it leads to a shorter timeout than expected. This can have unintended 
consequences: for example, in {{StorageService::sendReplicationNotification}}, 
instead of waiting 10 seconds ({{request_timeout_in_ms}}), we wait for {{1}} 
nanosecond.







[jira] [Comment Edited] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2018-06-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506659#comment-16506659
 ] 

Joseph Lynch edited comment on CASSANDRA-14459 at 6/8/18 11:36 PM:
---

I think we need some way to get latency measurements for hosts that have been 
excluded from traffic due to high minimums. For example, if during the initial 
{{PingMessages}} a local DC host gets a very high measurement (e.g. 100ms), we 
will never send traffic to it again. My understanding is that's why we reset 
in the first place.

I'll try to come up with a solution that doesn't involve additional traffic.


was (Author: jolynch):
I think we need some way to get latency measurements for hosts that have been 
excluded from traffic due to high minimums. For example, if during the initial 
{{PingMessages}} a local DC host gets a very high measurement (e.g. 100ms), we 
will never send traffic to it again. My understanding is that's why we reset 
in the first place.

I'll work on a feedback mechanism for the {{DES}} to ask for latency probes 
(which I guess would be best implemented as {{PingMessages}}, since you're 
concerned about {{EchoMessages}}). I see two possible designs: one where I 
send the probes directly from the {{DES}}, and one where a method expressing 
the desire for probes propagates up to e.g. the {{MessagingService}}. Are 
there better options?

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.x
>
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three-state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).
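
As an illustration of the YES/NO/MAYBE proposal above, a minimal sketch with 
hypothetical names (not the actual DynamicEndpointSnitch code):

{code}
import java.net.InetAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed three-state isLatencyForSnitch signal.
class SnitchSketch
{
    enum LatencyUsage { YES, NO, MAYBE }

    private final Map<InetAddress, Integer> sampleCounts = new ConcurrentHashMap<>();

    void receiveTiming(InetAddress host, long latencyNanos, LatencyUsage usage)
    {
        if (usage == LatencyUsage.NO)
            return;                          // never feed the snitch
        // MAYBE samples (PingMessage/EchoMessage round trips) only seed hosts
        // with one or fewer samples; they never override real read latencies.
        if (usage == LatencyUsage.MAYBE && sampleCounts.getOrDefault(host, 0) > 1)
            return;
        sampleCounts.merge(host, 1, Integer::sum);
        recordSample(host, latencyNanos);
    }

    private void recordSample(InetAddress host, long latencyNanos) { /* histogram elided */ }
}
{code}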






[jira] [Resolved] (CASSANDRA-14237) Unittest failed: org.apache.cassandra.utils.BitSetTest.compareBitSets

2018-06-08 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang resolved CASSANDRA-14237.

Resolution: Won't Fix

Makes sense to me. Closing as "Won't Fix".

> Unittest failed: org.apache.cassandra.utils.BitSetTest.compareBitSets
> -
>
> Key: CASSANDRA-14237
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14237
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Jay Zhuang
>Priority: Minor
>  Labels: testing
>
> {noformat}
> [junit] Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 0.822 sec
> [junit]
> [junit] Testcase: compareBitSets(org.apache.cassandra.utils.BitSetTest):  
>   Caused an ERROR
> [junit] java.io.FileNotFoundException: /usr/share/dict/words (No such 
> file or directory)
> [junit] java.lang.RuntimeException: java.io.FileNotFoundException: 
> /usr/share/dict/words (No such file or directory)
> [junit] at 
> org.apache.cassandra.utils.KeyGenerator$WordGenerator.reset(KeyGenerator.java:137)
> [junit] at 
> org.apache.cassandra.utils.KeyGenerator$WordGenerator.<init>(KeyGenerator.java:126)
> [junit] at 
> org.apache.cassandra.utils.BitSetTest.compareBitSets(BitSetTest.java:50)
> [junit] Caused by: java.io.FileNotFoundException: /usr/share/dict/words 
> (No such file or directory)
> [junit] at java.io.FileInputStream.open0(Native Method)
> [junit] at java.io.FileInputStream.open(FileInputStream.java:195)
> [junit] at java.io.FileInputStream.<init>(FileInputStream.java:138)
> [junit] at java.io.FileInputStream.<init>(FileInputStream.java:93)
> [junit] at 
> org.apache.cassandra.utils.KeyGenerator$WordGenerator.reset(KeyGenerator.java:135)
> [junit]
> [junit]
> [junit] Test org.apache.cassandra.utils.BitSetTest FAILED
> {noformat}
> Works fine on my Mac but fails on some Linux hosts that do not have 
> {{/usr/share/dict/words}}. It's the same issue as CASSANDRA-7389; should we 
> backport that?






[jira] [Commented] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2018-06-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506659#comment-16506659
 ] 

Joseph Lynch commented on CASSANDRA-14459:
--

I think we need some way to get latency measurements for hosts that have been 
excluded from traffic due to high minimums. For example, if during the initial 
{{PingMessages}} a local DC host gets a very high measurement (e.g. 100ms), we 
will never send traffic to it again. My understanding is that's why we reset 
in the first place.

I'll work on a feedback mechanism for the {{DES}} to ask for latency probes 
(which I guess would be best implemented as {{PingMessages}}, since you're 
concerned about {{EchoMessages}}). I see two possible designs: one where I 
send the probes directly from the {{DES}}, and one where a method expressing 
the desire for probes propagates up to e.g. the {{MessagingService}}. Are 
there better options?

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.x
>
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three-state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).






[jira] [Created] (CASSANDRA-14508) Jenkins Slave doesn't have permission to write to /tmp/ directory

2018-06-08 Thread Jay Zhuang (JIRA)
Jay Zhuang created CASSANDRA-14508:
--

 Summary: Jenkins Slave doesn't have permission to write to /tmp/ 
directory
 Key: CASSANDRA-14508
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14508
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Jay Zhuang


This is causing unit test failures, e.g.:
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-test/lastCompletedBuild/testReport/org.apache.cassandra.streaming.compression/CompressedInputStreamTest/testCompressedReadUncompressedChunks/

h3. Error Message
{noformat}
java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db
{noformat}
h3. Stacktrace
{noformat}
java.lang.RuntimeException: java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db
    at org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:119)
    at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:141)
    at org.apache.cassandra.io.compress.CompressedSequentialWriter.<init>(CompressedSequentialWriter.java:82)
    at org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testCompressedReadWith(CompressedInputStreamTest.java:118)
    at org.apache.cassandra.streaming.compression.CompressedInputStreamTest.testCompressedReadUncompressedChunks(CompressedInputStreamTest.java:83)
Caused by: java.nio.file.AccessDeniedException: /tmp/na-1-big-Data.db
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
    at java.nio.channels.FileChannel.open(FileChannel.java:287)
    at java.nio.channels.FileChannel.open(FileChannel.java:335)
    at org.apache.cassandra.io.util.SequentialWriter.openChannel(SequentialWriter.java:100)
{noformat}
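
One plausible direction on the test side, offered purely as an assumption (not 
the committed fix), is to resolve the scratch directory from 
{{java.io.tmpdir}} instead of hardcoding {{/tmp}}:

{code}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Assumption/sketch: allocate a per-run scratch directory via java.io.tmpdir
// (overridable with -Djava.io.tmpdir=...) rather than a fixed /tmp path.
class TmpDirSketch
{
    static File testDataFile(String name) throws IOException
    {
        File dir = Files.createTempDirectory("cassandra-test").toFile();
        dir.deleteOnExit();
        return new File(dir, name);   // e.g. "na-1-big-Data.db"
    }
}
{code}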






[jira] [Updated] (CASSANDRA-14146) [DTEST] cdc_test::TestCDC::test_insertion_and_commitlog_behavior_after_reaching_cdc_total_space assertion always fails (Extra items in the left set)

2018-06-08 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14146:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> [DTEST] 
> cdc_test::TestCDC::test_insertion_and_commitlog_behavior_after_reaching_cdc_total_space
>  assertion always fails (Extra items in the left set)
> 
>
> Key: CASSANDRA-14146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Michael Kjellman
>Priority: Major
>
> Dtest 
> cdc_test::TestCDC::test_insertion_and_commitlog_behavior_after_reaching_cdc_total_space
>  always fails on an assertion.
> The assert is the final step of the test; it checks that 
> {{pre_non_cdc_write_cdc_raw_segments == _get_cdc_raw_files(node.get_path())}}.
> This fails 100% of the time locally, 100% of the time on CircleCI executed 
> under pytest, and 100% of the time for the past 40 test runs on ASF Jenkins 
> against trunk.
> This is the only test failure (excluding flaky one-off failures) remaining on 
> the pytest dtest branch. I'm going to annotate the test with a skip marker 
> (including a reason reference to this JIRA)... when it's fixed we should also 
> remove the skip annotation from the test.
> {code}
> >   assert pre_non_cdc_write_cdc_raw_segments == 
> > _get_cdc_raw_files(node.get_path())
> E   AssertionError: assert {'/tmp/dtest-...169.log', ...} == 
> {'/tmp/dtest-v...169.log', ...}
> E Extra items in the left set:
> E 
> '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005097.log'
> E 
> '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005098.log'
> E Extra items in the right set:
> E 
> '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005099.log'
> E 
> '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005100.log'
> E Use -v to get the full diff
> {code}






[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jeremiah Jordan (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506549#comment-16506549
 ] 

Jeremiah Jordan commented on CASSANDRA-14499:
-

Isn't this pretty easy to do with OS-level settings? Getting this tracking 
right across all the places we use disk seems like something we are bound to 
fail at, whereas using the OS would not.

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  






[jira] [Commented] (CASSANDRA-14457) Add a virtual table with current compactions

2018-06-08 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506543#comment-16506543
 ] 

Jeff Jirsa commented on CASSANDRA-14457:


[~iamaleksey]  Re: {{CompactionMetrics}} - [~krummas] recently noted in passing 
that most of compaction has been slowly evolving over time and could probably 
use a nice, thorough, ground-up rewrite in the near future. May be worth a chat 
on the dev@ list about a potential redesign.


> Add a virtual table with current compactions
> 
>
> Key: CASSANDRA-14457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>







[jira] [Resolved] (CASSANDRA-14505) Removal of last element on a List deletes the entire row

2018-06-08 Thread Jeremiah Jordan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan resolved CASSANDRA-14505.
-
Resolution: Duplicate

> Removal of last element on a List deletes the entire row
> 
>
> Key: CASSANDRA-14505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14505
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: * Java: 1.8.0_171
>  * OS: Ubuntu 18.04 LTS
>  * Cassandra: 3.11.2 
>Reporter: André Paris
>Assignee: Benjamin Lerer
>Priority: Major
>
> The behavior of an element removal from a list by an UPDATE differs by how 
> the row was created:
> Given the table
> {noformat}
> CREATE TABLE table_test (
>     id int PRIMARY KEY,
>     list list<text>
> )
> {noformat}
> If the row is created by an INSERT, the row remains after the UPDATE to 
> remove the last element on the list:
> {noformat}
> cqlsh:ks_test> INSERT INTO table_test (id, list) VALUES (1, ['foo']);
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+---------
>   1 | ['foo']
>
> (1 rows)
> cqlsh:ks_test> UPDATE table_test SET list = list - ['foo'] WHERE id=1;
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+------
>   1 | null
>
> (1 rows)
> {noformat}
> But, if the row is created by an UPDATE, the row is deleted after the UPDATE 
> to remove the last element on the list:
> {noformat}
> cqlsh:ks_test> UPDATE table_test SET list = list + ['foo'] WHERE id=2;
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+---------
>   2 | ['foo']
>
> (1 rows)
> cqlsh:ks_test> UPDATE table_test SET list = list - ['foo'] WHERE id=2;
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+------
>
> (0 rows)
> {noformat}
>  
> Thanks in advance.






[jira] [Commented] (CASSANDRA-14505) Removal of last element on a List deletes the entire row

2018-06-08 Thread Jeremiah Jordan (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506538#comment-16506538
 ] 

Jeremiah Jordan commented on CASSANDRA-14505:
-

This has nothing to do with lists. In general, if you "create" a row using 
UPDATE rather than INSERT, the row will be gone once all of its columns are 
nulled out. This is because INSERT creates a row marker, while UPDATE does not.

> Removal of last element on a List deletes the entire row
> 
>
> Key: CASSANDRA-14505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14505
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: * Java: 1.8.0_171
>  * OS: Ubuntu 18.04 LTS
>  * Cassandra: 3.11.2 
>Reporter: André Paris
>Assignee: Benjamin Lerer
>Priority: Major
>
> The behavior of an element removal from a list by an UPDATE differs by how 
> the row was created:
> Given the table
> {noformat}
> CREATE TABLE table_test (
>     id int PRIMARY KEY,
>     list list<text>
> )
> {noformat}
> If the row is created by an INSERT, the row remains after the UPDATE to 
> remove the last element on the list:
> {noformat}
> cqlsh:ks_test> INSERT INTO table_test (id, list) VALUES (1, ['foo']);
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+---------
>   1 | ['foo']
>
> (1 rows)
> cqlsh:ks_test> UPDATE table_test SET list = list - ['foo'] WHERE id=1;
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+------
>   1 | null
>
> (1 rows)
> {noformat}
> But, if the row is created by an UPDATE, the row is deleted after the UPDATE 
> to remove the last element on the list:
> {noformat}
> cqlsh:ks_test> UPDATE table_test SET list = list + ['foo'] WHERE id=2;
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+---------
>   2 | ['foo']
>
> (1 rows)
> cqlsh:ks_test> UPDATE table_test SET list = list - ['foo'] WHERE id=2;
> cqlsh:ks_test> SELECT * FROM table_test;
>
>  id | list
> ----+------
>
> (0 rows)
> {noformat}
>  
> Thanks in advance.






[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jordan West (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506533#comment-16506533
 ] 

Jordan West commented on CASSANDRA-14499:
-

[~jeromatron] I understand those concerns. This would be opt-in for folks who 
want automatic action taken, and any such action should take care not to cause 
the node to flap, for example. One use case where we see this as valuable is 
QA/perf/test clusters that may not have the full monitoring setup but need to 
be protected from errant clients filling up disks to a point where worse 
things happen. The warning system can be accomplished today with monitoring 
and alerting on the same metrics.

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  






[jira] [Updated] (CASSANDRA-14481) Using nodetool status after enabling Cassandra internal auth for JMX access fails with currently documented permissions

2018-06-08 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14481:
-
Labels: security  (was: )

> Using nodetool status after enabling Cassandra internal auth for JMX access 
> fails with currently documented permissions
> ---
>
> Key: CASSANDRA-14481
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14481
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
> Environment: Apache Cassandra 3.11.2
> Centos 6.9
>Reporter: Valerie Parham-Thompson
>Priority: Minor
>  Labels: security
>
> Using the documentation here:
> [https://cassandra.apache.org/doc/latest/operating/security.html#cassandra-integrated-auth]
> Running `nodetool status` on a cluster fails as follows:
> {noformat}
> error: Access Denied
> -- StackTrace --
> java.lang.SecurityException: Access Denied
> at 
> org.apache.cassandra.auth.jmx.AuthorizationProxy.invoke(AuthorizationProxy.java:172)
> at com.sun.proxy.$Proxy4.invoke(Unknown Source)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
> at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1408)
> at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
> at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
> at sun.rmi.transport.Transport$1.run(Transport.java:200)
> at sun.rmi.transport.Transport$1.run(Transport.java:197)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
> at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:835)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at 
> sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:283)
> at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:260)
> at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
> at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
> at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
> at 
> javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1020)
> at 
> javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298)
> at com.sun.proxy.$Proxy7.effectiveOwnership(Unknown Source)
> at org.apache.cassandra.tools.NodeProbe.effectiveOwnership(NodeProbe.java:489)
> at org.apache.cassandra.tools.nodetool.Status.execute(Status.java:74)
> at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:255)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:169) {noformat}
> Permissions on two additional mbeans were required:
> {noformat}
> GRANT SELECT, EXECUTE ON MBEAN 'org.apache.cassandra.db:type=StorageService' 
> TO jmx;
> GRANT EXECUTE ON MBEAN 'org.apache.cassandra.db:type=EndpointSnitchInfo' TO 
> jmx;
> {noformat}
> I've updated the documentation in my fork here and would like to do a pull 
> request for the addition:
> [https://github.com/dataindataout/cassandra/blob/trunk/doc/source/operating/security.rst#cassandra-integrated-auth]
>  






[jira] [Updated] (CASSANDRA-14489) Test cqlsh authentication

2018-06-08 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14489:
-
Labels: cqlsh security test  (was: cqlsh test)

> Test cqlsh authentication
> -
>
> Key: CASSANDRA-14489
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14489
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Patrick Bannister
>Priority: Critical
>  Labels: cqlsh, security, test
> Fix For: 4.x
>
>
> Coverage analysis of the cqlshlib unittests (pylib/cqlshlib/test/test*.py) 
> and the dtest cqlsh_tests (cqlsh_tests.py and cqlsh_copy_tests.py) showed no 
> coverage of authentication related code.
> Before we can release a port of cqlsh, we should identify an existing test 
> for cqlsh authentication, or write a new one.






[jira] [Updated] (CASSANDRA-14497) Add Role login cache

2018-06-08 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14497:
-
Labels: security  (was: )

> Add Role login cache
> 
>
> Key: CASSANDRA-14497
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14497
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Auth
>Reporter: Jay Zhuang
>Assignee: Sam Tunnicliffe
>Priority: Major
>  Labels: security
>
> The 
> [{{ClientState.login()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ClientState.java#L313]
>  function is used for all auth messages: 
> [{{AuthResponse.java:82}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/messages/AuthResponse.java#L82].
>  But the 
> [{{role.canLogin}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L521]
>  information is not cached, so it hits the database every time: 
> [{{CassandraRoleManager.java:407}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L407].
>  For a cluster with lots of new connections, this causes a performance issue. 
> The mitigation for us is to increase the {{system_auth}} replication factor 
> to match the number of nodes, so 
> [{{local_one}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L488]
>  would be very cheap. The P99 dropped immediately, but I don't think it is 
> a good solution.
> I would propose to add {{Role.canLogin}} to the RolesCache to improve the 
> auth performance.
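
As an illustration of the proposed caching, a minimal sketch using a plain 
Guava cache rather than Cassandra's actual {{AuthCache}} machinery (names and 
TTL are illustrative assumptions):

{code}
import java.util.concurrent.TimeUnit;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

// Hypothetical sketch: memoize the canLogin lookup so each new connection
// avoids a round trip to the system_auth tables.
class CanLoginCacheSketch
{
    private final LoadingCache<String, Boolean> canLoginCache = CacheBuilder.newBuilder()
            .expireAfterWrite(2, TimeUnit.SECONDS)   // bound staleness after role changes
            .build(new CacheLoader<String, Boolean>()
            {
                @Override
                public Boolean load(String roleName)
                {
                    return queryCanLogin(roleName);  // the expensive query
                }
            });

    boolean canLogin(String roleName)
    {
        return canLoginCache.getUnchecked(roleName);
    }

    private boolean queryCanLogin(String roleName)
    {
        return true;   // placeholder for the read from the roles table
    }
}
{code}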






[jira] [Commented] (CASSANDRA-14499) node-level disk quota

2018-06-08 Thread Jeremy Hanna (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506516#comment-16506516
 ] 

Jeremy Hanna commented on CASSANDRA-14499:
--

I just want to add a note of caution about anything automatic happening when 
certain metrics trigger. I've seen metrics misfire under certain 
circumstances, which leads to unpredictable cluster behavior. If it were my 
cluster, I would favor a warning system over anything done automatically.

> node-level disk quota
> -
>
> Key: CASSANDRA-14499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it from taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to grow more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores drops below the quota.
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  
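
A minimal sketch of the quota computation described above, assuming 
hypothetical options disk_quota_in_mb (absolute, takes precedence) and 
disk_quota_percentage; toggling gossip and the native transport is left to a 
scheduled task that calls this check:

{code:java}
import java.io.File;

public final class DiskQuotaChecker
{
    private final File dataDir;
    private final long absoluteQuotaBytes; // <= 0 means "not configured"
    private final double quotaPercentage;  // of total disk space

    public DiskQuotaChecker(File dataDir, long absoluteQuotaBytes, double quotaPercentage)
    {
        this.dataDir = dataDir;
        this.absoluteQuotaBytes = absoluteQuotaBytes;
        this.quotaPercentage = quotaPercentage;
    }

    public boolean overQuota()
    {
        long total = dataDir.getTotalSpace();
        long used = total - dataDir.getUsableSpace();
        long quota = absoluteQuotaBytes > 0
                   ? absoluteQuotaBytes // absolute value takes precedence
                   : (long) (total * quotaPercentage / 100.0);
        return used >= quota;
    }
}
{code}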



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14457) Add a virtual table with current compactions

2018-06-08 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14457:
-
Labels: virtual-tables  (was: )

> Add a virtual table with current compactions
> 
>
> Key: CASSANDRA-14457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14457
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14458) Add virtual table to list active connections

2018-06-08 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14458:
-
Labels: virtual-tables  (was: )

> Add virtual table to list active connections
> 
>
> Key: CASSANDRA-14458
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14458
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.x
>
>
> List all active connections in virtual table like:
> {code:sql}
> cqlsh:system> select * from system_views.clients ;
>  
>  client_address   | cipher    | driver_name | driver_version | keyspace | protocol  | requests | ssl   | user      | version
> ------------------+-----------+-------------+----------------+----------+-----------+----------+-------+-----------+---------
>  /127.0.0.1:63903 | undefined |   undefined |      undefined |          | undefined |       13 | False | anonymous |       4
>  /127.0.0.1:63904 | undefined |   undefined |      undefined |   system | undefined |       16 | False | anonymous |       4
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory

2018-06-08 Thread Jay Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13929:
---
   Resolution: Fixed
Reproduced In: 3.11.0, 3.8  (was: 3.11.0)
   Status: Resolved  (was: Ready to Commit)

> BTree$Builder / io.netty.util.Recycler$Stack leaking memory
> ---
>
> Key: CASSANDRA-13929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13929
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Thomas Steinmaurer
>Assignee: Jay Zhuang
>Priority: Major
> Fix For: 3.11.3
>
> Attachments: cassandra_3.11.0_min_memory_utilization.jpg, 
> cassandra_3.11.1_NORECYCLE_memory_utilization.jpg, 
> cassandra_3.11.1_mat_dominator_classes.png, 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, 
> cassandra_3.11.1_vs_3.11.2recyclernullingpatch.png, 
> cassandra_heapcpu_memleak_patching_test_30d.png, 
> dtest_example_80_request.png, dtest_example_80_request_fix.png, 
> dtest_example_heap.png, memleak_heapdump_recyclerstack.png
>
>
> Different to CASSANDRA-13754, there seems to be another memory leak in 
> 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
> * heap utilization increase after upgrading to 3.11.0 => 
> cassandra_3.11.0_min_memory_utilization.jpg
> * No difference after upgrading to 3.11.1 (snapshot build) => 
> cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing 
> CASSANDRA-13754, more visible now
> * MAT shows io.netty.util.Recycler$Stack as top contributing class => 
> cassandra_3.11.1_mat_dominator_classes.png
> * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart 
> after ~ 72 hours
> Verified the following fix, namely explicitly unreferencing the 
> _recycleHandle_ member (making it non-final). In 
> _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_
> {code}
> public void recycle()
> {
>     if (recycleHandle != null)
>     {
>         this.cleanup();
>         builderRecycler.recycle(this, recycleHandle);
>         recycleHandle = null; // ADDED
>     }
> }
> {code}
> Patched a single node in our loadtest cluster with this change and after ~ 10 
> hours uptime, no sign of the previously offending class in MAT anymore => 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png
> Can't say if this has any other side effects, but I doubt it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory

2018-06-08 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506462#comment-16506462
 ] 

Jay Zhuang commented on CASSANDRA-13929:


Thanks [~jasobrown] again for the review. Committed as 
[{{ed5f834}}|https://github.com/apache/cassandra/commit/ed5f8347ef0c7175cd96e59bc8bfaf3ed1f4697a].

> BTree$Builder / io.netty.util.Recycler$Stack leaking memory
> ---
>
> Key: CASSANDRA-13929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13929
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Thomas Steinmaurer
>Assignee: Jay Zhuang
>Priority: Major
> Fix For: 3.11.3
>
> Attachments: cassandra_3.11.0_min_memory_utilization.jpg, 
> cassandra_3.11.1_NORECYCLE_memory_utilization.jpg, 
> cassandra_3.11.1_mat_dominator_classes.png, 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, 
> cassandra_3.11.1_vs_3.11.2recyclernullingpatch.png, 
> cassandra_heapcpu_memleak_patching_test_30d.png, 
> dtest_example_80_request.png, dtest_example_80_request_fix.png, 
> dtest_example_heap.png, memleak_heapdump_recyclerstack.png
>
>
> Different to CASSANDRA-13754, there seems to be another memory leak in 
> 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
> * heap utilization increase after upgrading to 3.11.0 => 
> cassandra_3.11.0_min_memory_utilization.jpg
> * No difference after upgrading to 3.11.1 (snapshot build) => 
> cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing 
> CASSANDRA-13754, more visible now
> * MAT shows io.netty.util.Recycler$Stack as top contributing class => 
> cassandra_3.11.1_mat_dominator_classes.png
> * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart 
> after ~ 72 hours
> Verified the following fix, namely explicitly unreferencing the 
> _recycleHandle_ member (making it non-final). In 
> _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_
> {code}
> public void recycle()
> {
>     if (recycleHandle != null)
>     {
>         this.cleanup();
>         builderRecycler.recycle(this, recycleHandle);
>         recycleHandle = null; // ADDED
>     }
> }
> {code}
> Patched a single node in our loadtest cluster with this change and after ~ 10 
> hours uptime, no sign of the previously offending class in MAT anymore => 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png
> Can't say if this has any other side effects, but I doubt it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[2/3] cassandra git commit: Remove BTree.Builder Recycler to reduce memory usage

2018-06-08 Thread jzhuang
Remove BTree.Builder Recycler to reduce memory usage

patch by Jay Zhuang; reviewed by jasobrown for CASSANDRA-13929


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ed5f8347
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ed5f8347
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ed5f8347

Branch: refs/heads/trunk
Commit: ed5f8347ef0c7175cd96e59bc8bfaf3ed1f4697a
Parents: b174819
Author: Jay Zhuang 
Authored: Mon Jan 29 18:17:56 2018 -0800
Committer: Jay Zhuang 
Committed: Fri Jun 8 10:40:06 2018 -0700

--
 CHANGES.txt |  1 +
 build.xml   |  4 +-
 .../columniterator/SSTableReversedIterator.java |  2 +-
 .../org/apache/cassandra/db/rows/BTreeRow.java  |  2 +-
 .../cassandra/db/rows/ComplexColumnData.java|  5 +-
 .../org/apache/cassandra/utils/btree/BTree.java | 69 +-
 .../test/microbench/BTreeBuildBench.java| 96 
 .../org/apache/cassandra/utils/BTreeTest.java   | 33 ++-
 8 files changed, 161 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2e77d2e..7f4b655 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.3
+ * Remove BTree.Builder Recycler to reduce memory usage (CASSANDRA-13929)
  * Reduce nodetool GC thread count (CASSANDRA-14475)
  * Fix New SASI view creation during Index Redistribution (CASSANDRA-14055)
  * Remove string formatting lines from BufferPool hot path (CASSANDRA-14416)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/build.xml
--
diff --git a/build.xml b/build.xml
index 4edfbb1..54c5372 100644
--- a/build.xml
+++ b/build.xml
@@ -422,8 +422,8 @@
   
 
 
-  
-  
+  
+  
 
   
   

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
index cf8798d..6a0b7be 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
@@ -426,7 +426,7 @@ public class SSTableReversedIterator extends 
AbstractSSTableIterator
 public void reset()
 {
 built = null;
-rowBuilder = BTree.builder(metadata.comparator);
+rowBuilder.reuse();
 deletionBuilder = 
MutableDeletionInfo.builder(partitionLevelDeletion, metadata().comparator, 
false);
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/src/java/org/apache/cassandra/db/rows/BTreeRow.java
--
diff --git a/src/java/org/apache/cassandra/db/rows/BTreeRow.java 
b/src/java/org/apache/cassandra/db/rows/BTreeRow.java
index 15ac30a..c70e0e2 100644
--- a/src/java/org/apache/cassandra/db/rows/BTreeRow.java
+++ b/src/java/org/apache/cassandra/db/rows/BTreeRow.java
@@ -738,7 +738,7 @@ public class BTreeRow extends AbstractRow
 this.clustering = null;
 this.primaryKeyLivenessInfo = LivenessInfo.EMPTY;
 this.deletion = Deletion.LIVE;
-this.cells_ = null;
+this.cells_.reuse();
 this.hasComplex = false;
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
--
diff --git a/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java 
b/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
index 380af7a..1395782 100644
--- a/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
+++ b/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
@@ -242,7 +242,10 @@ public class ComplexColumnData extends ColumnData 
implements Iterable
 {
 this.column = column;
 this.complexDeletion = DeletionTime.LIVE; // default if 
writeComplexDeletion is not called
-this.builder = BTree.builder(column.cellComparator());
+if (builder == null)
+builder = BTree.builder(column.cellComparator());
+else
+builder.reuse(column.cellComparator());
 }
 
 public void addComplexDeletion(DeletionTime 

[1/3] cassandra git commit: Remove BTree.Builder Recycler to reduce memory usage

2018-06-08 Thread jzhuang
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 b1748198e -> ed5f8347e
  refs/heads/trunk 800f0b394 -> 958e13d16


Remove BTree.Builder Recycler to reduce memory usage

patch by Jay Zhuang; reviewed by jasobrown for CASSANDRA-13929


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ed5f8347
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ed5f8347
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ed5f8347

Branch: refs/heads/cassandra-3.11
Commit: ed5f8347ef0c7175cd96e59bc8bfaf3ed1f4697a
Parents: b174819
Author: Jay Zhuang 
Authored: Mon Jan 29 18:17:56 2018 -0800
Committer: Jay Zhuang 
Committed: Fri Jun 8 10:40:06 2018 -0700

--
 CHANGES.txt |  1 +
 build.xml   |  4 +-
 .../columniterator/SSTableReversedIterator.java |  2 +-
 .../org/apache/cassandra/db/rows/BTreeRow.java  |  2 +-
 .../cassandra/db/rows/ComplexColumnData.java|  5 +-
 .../org/apache/cassandra/utils/btree/BTree.java | 69 +-
 .../test/microbench/BTreeBuildBench.java| 96 
 .../org/apache/cassandra/utils/BTreeTest.java   | 33 ++-
 8 files changed, 161 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 2e77d2e..7f4b655 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.3
+ * Remove BTree.Builder Recycler to reduce memory usage (CASSANDRA-13929)
  * Reduce nodetool GC thread count (CASSANDRA-14475)
  * Fix New SASI view creation during Index Redistribution (CASSANDRA-14055)
  * Remove string formatting lines from BufferPool hot path (CASSANDRA-14416)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/build.xml
--
diff --git a/build.xml b/build.xml
index 4edfbb1..54c5372 100644
--- a/build.xml
+++ b/build.xml
@@ -422,8 +422,8 @@
   
 
 
-  
-  
+  
+  
 
   
   

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
--
diff --git 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
index cf8798d..6a0b7be 100644
--- 
a/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
+++ 
b/src/java/org/apache/cassandra/db/columniterator/SSTableReversedIterator.java
@@ -426,7 +426,7 @@ public class SSTableReversedIterator extends 
AbstractSSTableIterator
 public void reset()
 {
 built = null;
-rowBuilder = BTree.builder(metadata.comparator);
+rowBuilder.reuse();
 deletionBuilder = 
MutableDeletionInfo.builder(partitionLevelDeletion, metadata().comparator, 
false);
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/src/java/org/apache/cassandra/db/rows/BTreeRow.java
--
diff --git a/src/java/org/apache/cassandra/db/rows/BTreeRow.java 
b/src/java/org/apache/cassandra/db/rows/BTreeRow.java
index 15ac30a..c70e0e2 100644
--- a/src/java/org/apache/cassandra/db/rows/BTreeRow.java
+++ b/src/java/org/apache/cassandra/db/rows/BTreeRow.java
@@ -738,7 +738,7 @@ public class BTreeRow extends AbstractRow
 this.clustering = null;
 this.primaryKeyLivenessInfo = LivenessInfo.EMPTY;
 this.deletion = Deletion.LIVE;
-this.cells_ = null;
+this.cells_.reuse();
 this.hasComplex = false;
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ed5f8347/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
--
diff --git a/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java 
b/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
index 380af7a..1395782 100644
--- a/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
+++ b/src/java/org/apache/cassandra/db/rows/ComplexColumnData.java
@@ -242,7 +242,10 @@ public class ComplexColumnData extends ColumnData 
implements Iterable
 {
 this.column = column;
 this.complexDeletion = DeletionTime.LIVE; // default if 
writeComplexDeletion is not called
-this.builder = BTree.builder(column.cellComparator());
+if (builder == null)
+builder = BTree.builder(column.cellComparator()

[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2018-06-08 Thread jzhuang
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/958e13d1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/958e13d1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/958e13d1

Branch: refs/heads/trunk
Commit: 958e13d1667391c69ec82f54da7d371e6eba29d6
Parents: 800f0b3 ed5f834
Author: Jay Zhuang 
Authored: Fri Jun 8 10:47:14 2018 -0700
Committer: Jay Zhuang 
Committed: Fri Jun 8 10:48:15 2018 -0700

--
 CHANGES.txt |  1 +
 build.xml   |  4 +-
 .../columniterator/SSTableReversedIterator.java |  2 +-
 .../org/apache/cassandra/db/rows/BTreeRow.java  |  2 +-
 .../cassandra/db/rows/ComplexColumnData.java|  5 +-
 .../org/apache/cassandra/utils/btree/BTree.java | 69 +-
 .../test/microbench/BTreeBuildBench.java| 96 
 .../org/apache/cassandra/utils/BTreeTest.java   | 33 ++-
 8 files changed, 161 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/958e13d1/CHANGES.txt
--
diff --cc CHANGES.txt
index 9857704,7f4b655..27c9561
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,253 -1,5 +1,254 @@@
 +4.0
 + * Add option to sanity check tombstones on reads/compactions 
(CASSANDRA-14467)
 + * Add a virtual table to expose all running sstable tasks (CASSANDRA-14457)
 + * Let nodetool import take a list of directories (CASSANDRA-14442)
 + * Avoid unneeded memory allocations / cpu for disabled log levels 
(CASSANDRA-14488)
 + * Implement virtual keyspace interface (CASSANDRA-7622)
 + * nodetool import cleanup and improvements (CASSANDRA-14417)
 + * Bump jackson version to >= 2.9.5 (CASSANDRA-14427)
 + * Allow nodetool toppartitions without specifying table (CASSANDRA-14360)
 + * Audit logging for database activity (CASSANDRA-12151)
 + * Clean up build artifacts in docs container (CASSANDRA-14432)
 + * Minor network authz improvements (Cassandra-14413)
 + * Automatic sstable upgrades (CASSANDRA-14197)
 + * Replace deprecated junit.framework.Assert usages with org.junit.Assert 
(CASSANDRA-14431)
 + * Cassandra-stress throws NPE if insert section isn't specified in user 
profile (CASSSANDRA-14426)
 + * List clients by protocol versions `nodetool clientstats --by-protocol` 
(CASSANDRA-14335)
 + * Improve LatencyMetrics performance by reducing write path processing 
(CASSANDRA-14281)
 + * Add network authz (CASSANDRA-13985)
 + * Use the correct IP/Port for Streaming when localAddress is left unbound 
(CASSANDRA-14389)
 + * nodetool listsnapshots is missing local system keyspace snapshots 
(CASSANDRA-14381)
 + * Remove StreamCoordinator.streamExecutor thread pool (CASSANDRA-14402)
 + * Rename nodetool --with-port to --print-port to disambiguate from --port 
(CASSANDRA-14392)
 + * Client TOPOLOGY_CHANGE messages have wrong port. (CASSANDRA-14398)
 + * Add ability to load new SSTables from a separate directory (CASSANDRA-6719)
 + * Eliminate background repair and probablistic read_repair_chance table 
options
 +   (CASSANDRA-13910)
 + * Bind to correct local address in 4.0 streaming (CASSANDRA-14362)
 + * Use standard Amazon naming for datacenter and rack in Ec2Snitch 
(CASSANDRA-7839)
 + * Fix junit failure for SSTableReaderTest (CASSANDRA-14387)
 + * Abstract write path for pluggable storage (CASSANDRA-14118)
 + * nodetool describecluster should be more informative (CASSANDRA-13853)
 + * Compaction performance improvements (CASSANDRA-14261) 
 + * Refactor Pair usage to avoid boxing ints/longs (CASSANDRA-14260)
 + * Add options to nodetool tablestats to sort and limit output 
(CASSANDRA-13889)
 + * Rename internals to reflect CQL vocabulary (CASSANDRA-14354)
 + * Add support for hybrid MIN(), MAX() speculative retry policies
 +   (CASSANDRA-14293, CASSANDRA-14338, CASSANDRA-14352)
 + * Fix some regressions caused by 14058 (CASSANDRA-14353)
 + * Abstract repair for pluggable storage (CASSANDRA-14116)
 + * Add meaningful toString() impls (CASSANDRA-13653)
 + * Add sstableloader option to accept target keyspace name (CASSANDRA-13884)
 + * Move processing of EchoMessage response to gossip stage (CASSANDRA-13713)
 + * Add coordinator write metric per CF (CASSANDRA-14232)
 + * Correct and clarify SSLFactory.getSslContext method and call sites 
(CASSANDRA-14314)
 + * Handle static and partition deletion properly on 
ThrottledUnfilteredIterator (CASSANDRA-14315)
 + * NodeTool clientstats should show SSL Cipher (CASSANDRA-14322)
 + * Add ability to specify driver name and version (CASSANDRA-14275)
 + * Abstract streaming for pluggable storage (CASSANDRA-14115)
 + * Forced incremental repairs should promote sstables if they can 
(CASSANDRA-14294)
 + * Use Murm

[jira] [Commented] (CASSANDRA-14358) OutboundTcpConnection can hang for many minutes when nodes restart

2018-06-08 Thread Kevin Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506404#comment-16506404
 ] 

Kevin Zhang commented on CASSANDRA-14358:
-

[~jolynch] currently we have an issue with gossip going one way after a node 
suffered a sudden loss of its storage controller. After the offending node 
comes back online, all the other nodes show TCP connections for gossip, but 
those connections (in bold below) are not seen on the offending node. On the 
offending node, nodetool gossipinfo shows generation 0 for all other nodes and 
nodetool status shows DN for all other nodes. On the other nodes, nodetool 
gossipinfo seems fine but nodetool status shows the offending node down. This 
can be resolved by restarting Cassandra on all nodes except the offending 
node, or by waiting for 2 hours after the crash event (tcp_keepalive is set to 
7200s on Debian?). I don't know if this is related, but is there any way to 
verify (like TRACE logging on org.apache.cassandra.gms and/or 
org.apache.cassandra.net, or maybe a packet capture)? So far we can reproduce 
it with about a 30% chance. Thanks in advance. 

node 10.96.105.4
*tcp 0 0 10.96.105.4:7001 10.96.105.6:55629 ESTABLISHED keepalive (729.79/0/0)*
*tcp 0 0 10.96.105.4:39219 10.96.105.6:7001 ESTABLISHED keepalive (783.04/0/0)*
*tcp 0 0 10.96.105.4:7001 10.96.105.6:60007 ESTABLISHED keepalive (729.79/0/0)*
tcp 0 0 10.96.105.4:7001 10.96.105.6:45318 ESTABLISHED keepalive (1471.16/0/0)
node 10.96.105.6
tcp 0 0 10.96.105.6:7001 0.0.0.0:* LISTEN off (0.00/0/0)
tcp 0 0 10.96.105.6:45318 10.96.105.4:7001 ESTABLISHED keepalive (1477.00/0/0)

> OutboundTcpConnection can hang for many minutes when nodes restart
> --
>
> Key: CASSANDRA-14358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14358
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Cassandra 2.1.19 (also reproduced on 3.0.15), running 
> with {{internode_encryption: all}} and the EC2 multi region snitch on Linux 
> 4.13 within the same AWS region. Smallest cluster I've seen the problem on is 
> 12 nodes, reproduces more reliably on 40+ and 300 node clusters consistently 
> reproduce on at least one node in the cluster.
> So all the connections are SSL and we're connecting on the internal ip 
> addresses (not the public endpoint ones).
> Potentially relevant sysctls:
> {noformat}
> /proc/sys/net/ipv4/tcp_syn_retries = 2
> /proc/sys/net/ipv4/tcp_synack_retries = 5
> /proc/sys/net/ipv4/tcp_keepalive_time = 7200
> /proc/sys/net/ipv4/tcp_keepalive_probes = 9
> /proc/sys/net/ipv4/tcp_keepalive_intvl = 75
> /proc/sys/net/ipv4/tcp_retries2 = 15
> {noformat}
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Major
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.x
>
> Attachments: 10 Minute Partition.pdf
>
>
> edit summary: This primarily impacts networks with stateful firewalls such as 
> AWS. I'm working on a proper patch for trunk but unfortunately it relies on 
> the Netty refactor in 4.0 so it will be hard to backport to previous 
> versions. A workaround for earlier versions is to set the 
> {{net.ipv4.tcp_retries2}} sysctl to ~5. This can be done with the following:
> {code:java}
> $ cat /etc/sysctl.d/20-cassandra-tuning.conf
> net.ipv4.tcp_retries2=5
> $ # Reload all sysctls
> $ sysctl --system{code}
> Original Bug Report:
> I've been trying to debug nodes not being able to see each other during 
> longer (~5 minute+) Cassandra restarts in 3.0.x and 2.1.x which can 
> contribute to {{UnavailableExceptions}} during rolling restarts of 3.0.x and 
> 2.1.x clusters for us. I think I finally have a lead. It appears that prior 
> to trunk (with the awesome Netty refactor) we do not set socket connect 
> timeouts on SSL connections (in 2.1.x, 3.0.x, or 3.11.x) nor do we set 
> {{SO_TIMEOUT}} as far as I can tell on outbound connections either. I believe 
> that this means that we could potentially block forever on {{connect}} or 
> {{recv}} syscalls, and we could block forever on the SSL Handshake as well. I 
> think that the OS will protect us somewhat (and that may be what's causing 
> the eventual timeout) but I think that given the right network conditions our 
> {{OutboundTCPConnection}} threads can just be stuck never making any progress 
> until the OS intervenes.
> I have attached some logs of such a network partition during a rolling 
> restart where an old node in the cluster has a completely foobarred 
> {{OutboundTcpConnection}} for ~10 minutes before finally getting a 
> {{java.net.SocketException: Connection timed out (Write failed)}} and 
> immediately successfully reconnecting. I conclude that the old node is the 
> problem because the new node (the one that restarted) is sending ECHOs to the 
> 
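
For illustration only (this is not the actual OutboundTcpConnection code), the 
missing guards described above amount to bounding both the connect and the 
blocking reads; the timeout values below are arbitrary:

{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public final class BoundedConnect
{
    public static Socket open(String host, int port) throws IOException
    {
        Socket s = new Socket();
        s.connect(new InetSocketAddress(host, port), 2000); // connect timeout (ms)
        s.setSoTimeout(10_000); // SO_TIMEOUT: bound blocking recv() calls
        return s;
    }
}
{code}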

[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2018-06-08 Thread Vinay Chella (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506282#comment-16506282
 ] 

Vinay Chella commented on CASSANDRA-14482:
--

(y)(y). Looking forward to your contributions [~sushm...@gmail.com]

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: Wish
>  Components: Compression, Libraries
>Reporter: Sushma A Devendrappa
>Assignee: Sushma A Devendrappa
>Priority: Major
>  Labels: performance
> Fix For: 3.11.x, 4.x
>
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  
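
For reference, a hedged sketch of the zstd-jni calls an eventual ICompressor 
implementation would wrap (this is not the proposed Cassandra integration):

{code:java}
import com.github.luben.zstd.Zstd;

import java.util.Arrays;

public class ZstdDemo
{
    public static void main(String[] args)
    {
        byte[] raw = "some sstable block bytes".getBytes();
        int level = 3; // compression level trades speed for ratio
        byte[] packed = Zstd.compress(raw, level);
        byte[] back = Zstd.decompress(packed, raw.length);
        System.out.println(Arrays.equals(raw, back) ? "round-trip ok" : "mismatch");
    }
}
{code}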



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2018-06-08 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506243#comment-16506243
 ] 

Jason Brown commented on CASSANDRA-14459:
-

tbqh, I am -1 on sending an {{EchoMessage}} on every gossip round. This 
increases the gossip traffic by 66% [1], if not 100%, adds more processing 
demands to the single-threaded Gossip stage, and will not even give you 
realistic latency data (except, possibly, a rudimentary floor latency, but that 
assumes a small cluster that is rather quiescent). Seed nodes would also bear a 
lot of this additional traffic. 

If we don't have any latency data in DES for a host, it's because we have not 
communicated meaningfully with it (as far as latency numbers go). I am totally 
fine with that, and we don't need to goose the traffic to get latencies for a 
node which we don't talk to.

Your original patch was probably good enough to start a proper review, as I 
believe this behavior is a worthwhile addition.

[1] Currently there are 2-3 messages per gossip round (the Ack2 is optional); 
an EchoMessage plus its response adds two more, hence the 66-100% increase.

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.x
>
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).
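
A minimal sketch of the three-state idea, with a hypothetical per-host sample 
store standing in for the real DynamicEndpointSnitch internals:

{code:java}
import java.util.ArrayList;
import java.util.List;

public final class LatencySamplesSketch
{
    enum LatencyForSnitch { YES, NO, MAYBE }

    private final List<Double> samples = new ArrayList<>();

    void record(double latencyMillis, LatencyForSnitch hint)
    {
        // Ping/Echo round trips are a latency floor, not a typical request
        // latency, so MAYBE only counts when we have at most one real sample.
        if (hint == LatencyForSnitch.YES
            || (hint == LatencyForSnitch.MAYBE && samples.size() <= 1))
            samples.add(latencyMillis);
    }

    void reset()
    {
        // Reset to the minimum observed latency instead of clearing, so a
        // reset never makes a previously latent host look perfectly fast.
        double min = samples.stream().mapToDouble(Double::doubleValue).min().orElse(Double.NaN);
        samples.clear();
        if (!Double.isNaN(min))
            samples.add(min);
    }
}
{code}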



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2018-06-08 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14482:
---
Fix Version/s: 4.x

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: Wish
>  Components: Compression, Libraries
>Reporter: Sushma A Devendrappa
>Assignee: Sushma A Devendrappa
>Priority: Major
>  Labels: performance
> Fix For: 3.11.x, 4.x
>
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2018-06-08 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14482:
---
Component/s: Compression

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: Wish
>  Components: Compression, Libraries
>Reporter: Sushma A Devendrappa
>Assignee: Sushma A Devendrappa
>Priority: Major
>  Labels: performance
> Fix For: 3.11.x, 4.x
>
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2018-06-08 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14482:
---
Labels: performance  (was: )

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: Wish
>  Components: Compression, Libraries
>Reporter: Sushma A Devendrappa
>Assignee: Sushma A Devendrappa
>Priority: Major
>  Labels: performance
> Fix For: 3.11.x, 4.x
>
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2018-06-08 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506156#comment-16506156
 ] 

Jeff Jirsa commented on CASSANDRA-14482:


I think that's more than fair and reasonable.


> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: Wish
>  Components: Libraries
>Reporter: Sushma A Devendrappa
>Assignee: Sushma A Devendrappa
>Priority: Major
> Fix For: 3.11.x
>
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2018-06-08 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-14482:
--

Assignee: Sushma A Devendrappa  (was: Vinay Chella)

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: Wish
>  Components: Libraries
>Reporter: Sushma A Devendrappa
>Assignee: Sushma A Devendrappa
>Priority: Major
> Fix For: 3.11.x
>
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14482) ZSTD Compressor support in Cassandra

2018-06-08 Thread Sushma A Devendrappa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506145#comment-16506145
 ] 

Sushma A Devendrappa commented on CASSANDRA-14482:
--

[~vinaykumarcse] [~jjirsa] [~zznate] do you guys mind if I work on this? I am 
already working on this internally and would love to take it forward; this 
will be my first chance to contribute to the community. 

 

Thanks

Sushma

> ZSTD Compressor support in Cassandra
> 
>
> Key: CASSANDRA-14482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14482
> Project: Cassandra
>  Issue Type: Wish
>  Components: Libraries
>Reporter: Sushma A Devendrappa
>Assignee: Vinay Chella
>Priority: Major
> Fix For: 3.11.x
>
>
> ZStandard has a great speed and compression ratio tradeoff. 
> ZStandard is open source compression from Facebook.
> More about ZSTD
> [https://github.com/facebook/zstd]
> https://code.facebook.com/posts/1658392934479273/smaller-and-faster-data-compression-with-zstandard/
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-14462) CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X

2018-06-08 Thread Fabien Rousseau (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabien Rousseau reassigned CASSANDRA-14462:
---

Assignee: Fabien Rousseau

> CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X
> ---
>
> Key: CASSANDRA-14462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14462
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Fabien Rousseau
>Assignee: Fabien Rousseau
>Priority: Major
> Attachments: 14662-2.1-2.2.patch
>
>
> Issue CASSANDRA-12127 changed the way the reversed comparator behaves. Until 
> tables with reversed clustering keys are scrubbed, CAS requests won't apply 
> (even if the condition is true).
> Below is a simple scenario to reproduce it:
> - use C* 2.1.14/2.2.6
> - create the schema
> {code:java}
> CREATE KEYSPACE IF NOT EXISTS test_ks WITH replication = {'class': 
> 'SimpleStrategy', 'replication_factor': 1};
> USE test_ks;
> CREATE TABLE IF NOT EXISTS test_cf (
>  pid text,
>  total int static,
>  sid uuid,
>  amount int,
>  PRIMARY KEY ((pid), sid)
> ) WITH CLUSTERING ORDER BY (sid DESC);
> {code}
>  
> - insert data
> {code:java}
> INSERT INTO test_cf (pid, sid, amount) VALUES ('1', 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab, 2);
> INSERT INTO test_cf (pid, total) VALUES ('1', 2);{code}
>  
> - nodetool flush (this is necessary for the scenario to show the problem)
> - upgrade to C* 2.1.20/2.2.12
> - execute the following queries:
> {code:java}
> UPDATE test_cf SET total = 3 WHERE pid = '1' IF total = 2;
> UPDATE test_cf SET amount = 3 WHERE pid = '1' AND sid = 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab IF amount = 2;{code}
>  
> Neither statement is applied, although both should be.
> It seems related to the min/max column sstable checks (before the scrub, the 
> min is an empty array; afterwards it is not), which filter out too many 
> sstables: the SliceQueryFilter.shouldInclude method excludes sstables that it 
> should include.
> Note: a simple "SELECT total FROM test_cf WHERE pid = '1';" works well 
> because the selected slices are different (and thus do not filter the 
> sstables).
> Note: This does not seem to affect the 3.0.X versions



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-14462) CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X

2018-06-08 Thread Fabien Rousseau (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506104#comment-16506104
 ] 

Fabien Rousseau edited comment on CASSANDRA-14462 at 6/8/18 2:52 PM:
-

I propose the patch in attachments.

It works with the scenario described in the ticket.

This might not be the right way to fix the issue, but it has the advantage of 
being rather simple and safe, at the expense of a performance penalty before 
scrubbing. (Once scrubbed, there is no more penalty).

The patch works by disabling the min/max column check for tables whose first 
clustering component uses a reversed comparator, and automatically including 
the sstable instead.

There are no unit tests because I don't know how to automatically generate 
SSTables of a previous version and then use them (but please let me know if it 
is possible).

 

Note: the patch is simple enough to apply cleanly to both 2.1 & 2.2 branches.


was (Author: frousseau):
I propose the following patch.

It works with the scenario described in the ticket.

This might not be the right way to fix the issue, but it has the advantage of 
being rather simple and safe, at the expense of a performance penalty before 
scrubbing. (Once scrubbed, there is no more penalty).

The patch works by disabling the min/max column check for tables whose first 
clustering component uses a reversed comparator, and automatically including 
the sstable instead.

There are no unit tests because I don't know how to automatically generate 
SSTables of a previous version and then use them (but please let me know if it 
is possible).

 

Note: the patch is simple enough to apply cleanly to both 2.1 & 2.2 branches.

> CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X
> ---
>
> Key: CASSANDRA-14462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14462
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Fabien Rousseau
>Priority: Major
> Attachments: 14662-2.1-2.2.patch
>
>
> Issue CASSANDRA-12127 changed the way the reversed comparator behaves. Until 
> tables with reversed clustering keys are scrubbed, CAS requests won't apply 
> (even if the condition is true).
> Below is a simple scenario to reproduce it:
> - use C* 2.1.14/2.2.6
> - create the schema
> {code:java}
> CREATE KEYSPACE IF NOT EXISTS test_ks WITH replication = {'class': 
> 'SimpleStrategy', 'replication_factor': 1};
> USE test_ks;
> CREATE TABLE IF NOT EXISTS test_cf (
>  pid text,
>  total int static,
>  sid uuid,
>  amount int,
>  PRIMARY KEY ((pid), sid)
> ) WITH CLUSTERING ORDER BY (sid DESC);
> {code}
>  
> - insert data
> {code:java}
> INSERT INTO test_cf (pid, sid, amount) VALUES ('1', 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab, 2);
> INSERT INTO test_cf (pid, total) VALUES ('1', 2);{code}
>  
> - nodetool flush (this is necessary for the scenario to show the problem)
> - upgrade to C* 2.1.20/2.2.12
> - execute the following queries:
> {code:java}
> UPDATE test_cf SET total = 3 WHERE pid = '1' IF total = 2;
> UPDATE test_cf SET amount = 3 WHERE pid = '1' AND sid = 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab IF amount = 2;{code}
>  
> Neither statement is applied, although both should be.
> It seems related to the min/max column sstable checks (before the scrub, the 
> min is an empty array; afterwards it is not), which filter out too many 
> sstables: the SliceQueryFilter.shouldInclude method excludes sstables that it 
> should include.
> Note: a simple "SELECT total FROM test_cf WHERE pid = '1';" works well 
> because the selected slices are different (and thus do not filter the 
> sstables).
> Note: This does not seem to affect the 3.0.X versions



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14462) CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X

2018-06-08 Thread Fabien Rousseau (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506104#comment-16506104
 ] 

Fabien Rousseau commented on CASSANDRA-14462:
-

I propose the following patch.

It works with the scenario described in the ticket.

This might not be the right way to fix the issue, but it has the advantage of 
being rather simple and safe, at the expense of a performance penalty before 
scrubbing. (Once scrubbed, there is no more penalty).

The patch works by disabling the min/max column check for tables whose first 
clustering component uses a reversed comparator, and automatically including 
the sstable instead.

There are no unit tests because I don't know how to automatically generate 
SSTables of a previous version and then use them (but please let me know if it 
is possible).

 

Note: the patch is simple enough to apply cleanly to both 2.1 & 2.2 branches.
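
To make the described guard concrete, a hedged sketch with hypothetical 
stand-in parameters (the real change would live in 
SliceQueryFilter.shouldInclude on the 2.1/2.2 branches):

{code:java}
public final class ShouldIncludeSketch
{
    static boolean shouldInclude(boolean firstClusteringReversed,
                                 boolean sliceIntersectsMinMax)
    {
        // Pre-scrub sstables can carry an empty min column name for reversed
        // clusterings, so the min/max check may wrongly exclude sstables.
        if (firstClusteringReversed)
            return true; // skip the check; costs extra reads until scrubbed
        return sliceIntersectsMinMax; // normal fast-path filtering
    }
}
{code}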

> CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X
> ---
>
> Key: CASSANDRA-14462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14462
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Fabien Rousseau
>Priority: Major
> Attachments: 14662-2.1-2.2.patch
>
>
> Issue CASSANDRA-12127 changed the way the reversed comparator behaves. Until 
> tables with reversed clustering keys are scrubbed, CAS requests won't apply 
> (even if the condition is true).
> Below is a simple scenario to reproduce it:
> - use C* 2.1.14/2.2.6
> - create the schema
> {code:java}
> CREATE KEYSPACE IF NOT EXISTS test_ks WITH replication = {'class': 
> 'SimpleStrategy', 'replication_factor': 1};
> USE test_ks;
> CREATE TABLE IF NOT EXISTS test_cf (
>  pid text,
>  total int static,
>  sid uuid,
>  amount int,
>  PRIMARY KEY ((pid), sid)
> ) WITH CLUSTERING ORDER BY (sid DESC);
> {code}
>  
> - insert data
> {code:java}
> INSERT INTO test_cf (pid, sid, amount) VALUES ('1', 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab, 2);
> INSERT INTO test_cf (pid, total) VALUES ('1', 2);{code}
>  
> - nodetool flush (this is necessary for the scenario to show the problem)
> - upgrade to C* 2.1.20/2.2.12
> - execute the following queries:
> {code:java}
> UPDATE test_cf SET total = 3 WHERE pid = '1' IF total = 2;
> UPDATE test_cf SET amount = 3 WHERE pid = '1' AND sid = 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab IF amount = 2;{code}
>  
> Neither statement is applied, although both should be.
> It seems related to the min/max column sstable checks (before the scrub, the 
> min is an empty array; afterwards it is not), which filter out too many 
> sstables: the SliceQueryFilter.shouldInclude method excludes sstables that it 
> should include.
> Note: a simple "SELECT total FROM test_cf WHERE pid = '1';" works well 
> because the selected slices are different (and thus do not filter the 
> sstables).
> Note: This does not seem to affect the 3.0.X versions



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14462) CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X

2018-06-08 Thread Fabien Rousseau (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabien Rousseau updated CASSANDRA-14462:

Attachment: 14662-2.1-2.2.patch

> CAS temporarily broken on reversed tables after upgrading on 2.1.X or 2.2.X
> ---
>
> Key: CASSANDRA-14462
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14462
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Fabien Rousseau
>Priority: Major
> Attachments: 14662-2.1-2.2.patch
>
>
> Issue CASSANDRA-12127 changed the way the reversed comparator behaves. Until 
> tables with reversed clustering keys are scrubbed, CAS requests won't apply 
> (even if the condition is true).
> Below is a simple scenario to reproduce it:
> - use C* 2.1.14/2.2.6
> - create the schema
> {code:java}
> CREATE KEYSPACE IF NOT EXISTS test_ks WITH replication = {'class': 
> 'SimpleStrategy', 'replication_factor': 1};
> USE test_ks;
> CREATE TABLE IF NOT EXISTS test_cf (
>  pid text,
>  total int static,
>  sid uuid,
>  amount int,
>  PRIMARY KEY ((pid), sid)
> ) WITH CLUSTERING ORDER BY (sid DESC);
> {code}
>  
> - insert data
> {code:java}
> INSERT INTO test_cf (pid, sid, amount) VALUES ('1', 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab, 2);
> INSERT INTO test_cf (pid, total) VALUES ('1', 2);{code}
>  
> - nodetool flush (this is necessary for the scenario to show the problem)
> - upgrade to C* 2.1.20/2.2.12
> - execute the following queries:
> {code:java}
> UPDATE test_cf SET total = 3 WHERE pid = '1' IF total = 2;
> UPDATE test_cf SET amount = 3 WHERE pid = '1' AND sid = 
> b2495ad2-9b64-4aab-b000-2ed20dda60ab IF amount = 2;{code}
>  
> Neither statement is applied, although both should be.
> It seems related to the min/max column sstable checks (before the scrub, the 
> min is an empty array; afterwards it is not), which filter out too many 
> sstables: the SliceQueryFilter.shouldInclude method excludes sstables that it 
> should include.
> Note: a simple "SELECT total FROM test_cf WHERE pid = '1';" works well 
> because the selected slices are different (and thus do not filter the 
> sstables).
> Note: This does not seem to affect the 3.0.X versions



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Deleted] (CASSANDRA-14506) Cassandra has a serious bug

2018-06-08 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa deleted CASSANDRA-14506:
---


> Cassandra has a serious bug
> ---
>
> Key: CASSANDRA-14506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14506
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Priority: Critical
>
> TBA



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14506) Cassandra has a serious bug

2018-06-08 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14506:
---
Summary: Cassandra has a serious bug  (was: Cassandra is an idiot)

> Cassandra has a serious bug
> ---
>
> Key: CASSANDRA-14506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14506
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> TBA



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2018-06-08 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14459:
---
Fix Version/s: 4.x

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.x
>
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory

2018-06-08 Thread Jason Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-13929:

Status: Ready to Commit  (was: Patch Available)

> BTree$Builder / io.netty.util.Recycler$Stack leaking memory
> ---
>
> Key: CASSANDRA-13929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13929
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Thomas Steinmaurer
>Assignee: Jay Zhuang
>Priority: Major
> Fix For: 3.11.3
>
> Attachments: cassandra_3.11.0_min_memory_utilization.jpg, 
> cassandra_3.11.1_NORECYCLE_memory_utilization.jpg, 
> cassandra_3.11.1_mat_dominator_classes.png, 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, 
> cassandra_3.11.1_vs_3.11.2recyclernullingpatch.png, 
> cassandra_heapcpu_memleak_patching_test_30d.png, 
> dtest_example_80_request.png, dtest_example_80_request_fix.png, 
> dtest_example_heap.png, memleak_heapdump_recyclerstack.png
>
>
> Different to CASSANDRA-13754, there seems to be another memory leak in 
> 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
> * heap utilization increase after upgrading to 3.11.0 => 
> cassandra_3.11.0_min_memory_utilization.jpg
> * No difference after upgrading to 3.11.1 (snapshot build) => 
> cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing 
> CASSANDRA-13754, more visible now
> * MAT shows io.netty.util.Recycler$Stack as top contributing class => 
> cassandra_3.11.1_mat_dominator_classes.png
> * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart 
> after ~ 72 hours
> Verified the following fix, namely explicitly unreferencing the 
> _recycleHandle_ member (making it non-final). In 
> _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_
> {code}
> public void recycle()
> {
>     if (recycleHandle != null)
>     {
>         this.cleanup();
>         builderRecycler.recycle(this, recycleHandle);
>         recycleHandle = null; // ADDED
>     }
> }
> {code}
> Patched a single node in our loadtest cluster with this change and after ~ 10 
> hours uptime, no sign of the previously offending class in MAT anymore => 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png
> Can't say if this has any other side effects, but I doubt it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory

2018-06-08 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505984#comment-16505984
 ] 

Jason Brown commented on CASSANDRA-13929:
-

[~jay.zhuang] +1

> BTree$Builder / io.netty.util.Recycler$Stack leaking memory
> ---
>
> Key: CASSANDRA-13929
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13929
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Thomas Steinmaurer
>Assignee: Jay Zhuang
>Priority: Major
> Fix For: 3.11.3
>
> Attachments: cassandra_3.11.0_min_memory_utilization.jpg, 
> cassandra_3.11.1_NORECYCLE_memory_utilization.jpg, 
> cassandra_3.11.1_mat_dominator_classes.png, 
> cassandra_3.11.1_mat_dominator_classes_FIXED.png, 
> cassandra_3.11.1_snapshot_heaputilization.png, 
> cassandra_3.11.1_vs_3.11.2recyclernullingpatch.png, 
> cassandra_heapcpu_memleak_patching_test_30d.png, 
> dtest_example_80_request.png, dtest_example_80_request_fix.png, 
> dtest_example_heap.png, memleak_heapdump_recyclerstack.png
>
>
> Separate from CASSANDRA-13754, there seems to be another memory leak in 
> 3.11.0+, in BTree$Builder / io.netty.util.Recycler$Stack.
> * heap utilization increase after upgrading to 3.11.0 => 
> cassandra_3.11.0_min_memory_utilization.jpg
> * No difference after upgrading to 3.11.1 (snapshot build) => 
> cassandra_3.11.1_snapshot_heaputilization.png; thus most likely the same leak, 
> now more visible after the CASSANDRA-13754 fix
> * MAT shows io.netty.util.Recycler$Stack as top contributing class => 
> cassandra_3.11.1_mat_dominator_classes.png
> * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart 
> after ~ 72 hours
> Verified the following fix, namely explicitly unreferencing the 
> _recycleHandle_ member (making it non-final). In 
> _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_
> {code}
> public void recycle()
> {
>     if (recycleHandle != null)
>     {
>         this.cleanup();
>         builderRecycler.recycle(this, recycleHandle);
>         recycleHandle = null; // ADDED
>     }
> }
> {code}
> Patched a single node in our loadtest cluster with this change and, after ~ 10 
> hours of uptime, there is no sign of the previously offending class in MAT 
> anymore => cassandra_3.11.1_mat_dominator_classes_FIXED.png
> Can't say if this has any other side effects, but I doubt it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[1/2] cassandra-builds git commit: Update centos docker file to avoid py ssl warnings

2018-06-08 Thread spod
Repository: cassandra-builds
Updated Branches:
  refs/heads/master 8f796c668 -> fb5df10b6


Update centos docker file to avoid py ssl warnings


Project: http://git-wip-us.apache.org/repos/asf/cassandra-builds/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-builds/commit/a317ef0c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-builds/tree/a317ef0c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-builds/diff/a317ef0c

Branch: refs/heads/master
Commit: a317ef0c79c6c38f2ed7627482a609cf9c7bc4e7
Parents: 8f796c6
Author: Stefan Podkowinski 
Authored: Fri Jun 8 14:32:24 2018 +0200
Committer: Stefan Podkowinski 
Committed: Fri Jun 8 14:32:24 2018 +0200

--
 docker/centos7-image.docker | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-builds/blob/a317ef0c/docker/centos7-image.docker
--
diff --git a/docker/centos7-image.docker b/docker/centos7-image.docker
index c082939..90d7b28 100644
--- a/docker/centos7-image.docker
+++ b/docker/centos7-image.docker
@@ -24,11 +24,17 @@ RUN yum -y install \
 # via epel-releases
 RUN yum -y install python2-pip
 
+# install ssl enabled urllib for retrieving python packages
+# this will produce some ssl related warnings, which will be resolved once the package has been installed
+RUN pip install urllib3[secure]
+
+# upgrade to modern pip version
+RUN pip install --upgrade pip
+
 # install Sphinx to generate docs
 RUN pip install \
Sphinx \
-   sphinx_rtd_theme \
-   urllib3
+   sphinx_rtd_theme
 
 # create and change to build user
 RUN adduser build


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[2/2] cassandra-builds git commit: README: on updating rpm/deb repositories

2018-06-08 Thread spod
README: on updating rpm/deb repositories


Project: http://git-wip-us.apache.org/repos/asf/cassandra-builds/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-builds/commit/fb5df10b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-builds/tree/fb5df10b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-builds/diff/fb5df10b

Branch: refs/heads/master
Commit: fb5df10b6341b82fc887c7b60109b1a25f485334
Parents: a317ef0
Author: Stefan Podkowinski 
Authored: Fri Jun 8 14:34:14 2018 +0200
Committer: Stefan Podkowinski 
Committed: Fri Jun 8 14:34:14 2018 +0200

--
 README.md | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-builds/blob/fb5df10b/README.md
--
diff --git a/README.md b/README.md
index 8bb85ee..14faed1 100644
--- a/README.md
+++ b/README.md
@@ -59,7 +59,32 @@ Once the RPM is signed, both the import key and verification steps should take p
 
 See use of `debsign` in `cassandra-release/prepare_release.sh`.
 
+## Updating package repositories
 
-## Publishing packages
+### Prerequisites
 
-TODO
+Artifacts for RPM and Debian package repositories, as well as tar archives, are kept in a single SVN repository. You need to have your own local copy for adding new packages:
+
+```
+svn co --config-option 'config:miscellany:use-commit-times=yes' https://dist.apache.org/repos/dist/release/cassandra
+```
+
+(you may also want to set `use-commit-times = yes` in your local svn config)
+
+We'll further refer to the local directory created by the svn command as `$artifacts_svn_dir`.
+
+Required build tools:
+* [createrepo](https://packages.ubuntu.com/bionic/createrepo) (RPMs)
+* [reprepro](https://packages.ubuntu.com/bionic/reprepro) (Debian)
+
+### RPM
+
+Adding new packages to the official repository starts by copying the RPMs to `$artifacts_svn_dir/redhat/`. Afterwards, recreate the metadata by executing `createrepo -v .` in that directory. Finally, sign the generated metadata files in the `repodata` sub-directory:
+
+```
+for i in `ls *.bz2 *.gz *.xml`; do gpg -sba --local-user MyAlias $i; done;
+```
+
+### Debian
+
+See `finish_release.sh`


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14507) OutboundMessagingConnection backlog is not fully written in case of race conditions

2018-06-08 Thread Sergio Bossa (JIRA)
Sergio Bossa created CASSANDRA-14507:


 Summary: OutboundMessagingConnection backlog is not fully written 
in case of race conditions
 Key: CASSANDRA-14507
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14507
 Project: Cassandra
  Issue Type: Bug
  Components: Streaming and Messaging
Reporter: Sergio Bossa


The {{OutboundMessagingConnection}} writes into a backlog queue before the 
connection handshake is successfully completed, and then writes that backlog to 
the channel as soon as the successful handshake moves the channel state to 
{{READY}}.
This is unfortunately race-prone, as the following could happen:
1) One or more writer threads see the channel state as {{NOT_READY}} in 
{{#sendMessage()}} and are about to enqueue to the backlog, but they get 
descheduled by the OS.
2) The handshake thread is scheduled by the OS and moves the channel state to 
{{READY}}, emptying the backlog.
3) The writer threads are scheduled back and add to the backlog, but the 
channel state is {{READY}} at this point, so those writes would sit in the 
backlog and expire.

Please note that a similar race condition exists between 
{{OutboundMessagingConnection#sendMessage()}} and 
{{MessageOutHandler#channelWritabilityChanged()}}, which would be far more 
serious, as the channel writability could change frequently; luckily, it looks 
like {{ChannelWriter#write()}} never gets invoked with {{checkWritability}} set 
to {{true}}, so writes never go to the backlog when the channel is not writable.
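For illustration, a minimal sketch of the classic re-check-after-enqueue 
pattern that closes this kind of race (class and method names here are 
hypothetical, not Cassandra's actual internals):

{code}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

final class OutboundConnectionSketch
{
    enum State { NOT_READY, READY }

    private final AtomicReference<State> state = new AtomicReference<>(State.NOT_READY);
    private final Queue<String> backlog = new ConcurrentLinkedQueue<>();

    void sendMessage(String message)
    {
        if (state.get() == State.READY)
        {
            writeToChannel(message);
            return;
        }
        backlog.add(message);
        // Re-check after enqueueing: if the handshake thread moved the state to
        // READY (and drained the backlog) between our first check and the add
        // above, this message would otherwise sit in the backlog and expire.
        if (state.get() == State.READY)
            drainBacklog();
    }

    void onHandshakeSuccess()
    {
        state.set(State.READY);
        drainBacklog();
    }

    private void drainBacklog()
    {
        String m;
        while ((m = backlog.poll()) != null)
            writeToChannel(m);
    }

    private void writeToChannel(String message) { /* e.g. channel.writeAndFlush(message) */ }
}
{code}

Because both the writer and the handshake thread drain after observing 
{{READY}}, a message enqueued inside the race window is picked up by one of the 
two drains instead of expiring in the backlog.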



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14506) Cassandra is an idiot

2018-06-08 Thread Aleksey Yeschenko (JIRA)
Aleksey Yeschenko created CASSANDRA-14506:
-

 Summary: Cassandra is an idiot
 Key: CASSANDRA-14506
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14506
 Project: Cassandra
  Issue Type: Bug
Reporter: Aleksey Yeschenko
 Fix For: 3.0.x, 3.11.x, 4.0.x


TBA



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-14502) toDate() CQL function is instantiated for wrong argument type

2018-06-08 Thread Benjamin Lerer (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer reassigned CASSANDRA-14502:
--

Assignee: Benjamin Lerer

> toDate() CQL function is instantiated for wrong argument type
> -
>
> Key: CASSANDRA-14502
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14502
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Piotr Sarna
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 4.0.x
>
>
> The {{toDate()}} function is instantiated to accept {{timeuuid}} and {{date}} 
> argument types, instead of {{timeuuid}} and {{timestamp}} as stated in the 
> documentation: 
> [http://cassandra.apache.org/doc/latest/cql/functions.html#datetime-functions]
> As a result, it is possible to convert a {{date}} into a {{date}}, but not a 
> {{timestamp}} into a {{date}}, which is probably what was meant.
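For illustration, a toy Java sketch of the class of bug being described: a 
function registered under the wrong argument type succeeds for that type and 
fails for the documented one (entirely hypothetical code, not Cassandra's 
actual function machinery):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class FunctionRegistrySketch
{
    // Functions are keyed by name plus argument type, as a toy stand-in for
    // CQL's overload resolution.
    static final Map<String, Function<Object, Object>> REGISTRY = new HashMap<>();

    static void register(String name, Class<?> argType, Function<Object, Object> fn)
    {
        REGISTRY.put(name + "(" + argType.getSimpleName() + ")", fn);
    }

    public static void main(String[] args)
    {
        // Suspected: registered for a date-like argument (mirroring the return type)
        register("toDate", java.time.LocalDate.class, x -> x);

        System.out.println(REGISTRY.containsKey("toDate(LocalDate)")); // true:  date -> date works
        System.out.println(REGISTRY.containsKey("toDate(Instant)"));   // false: timestamp -> date fails
    }
}
{code}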



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14459) DynamicEndpointSnitch should never prefer latent nodes

2018-06-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505784#comment-16505784
 ] 

Joseph Lynch commented on CASSANDRA-14459:
--

Ok, I've pushed another version of the patch to that branch which:
 # Adds a guaranteed {{EchoMessage}} to live hosts, in addition to the GossipSyn 
request, during the first step of gossip, so that we can get latency 
measurements even from latent nodes at some point. Increasing the gossip 
messaging slightly concerns me, so the other option is to have the DES send 
explicit {{EchoMessages}} when it notices that a host doesn't have any data in 
{{reset}}. That is more deterministic (we can guarantee that after 2 reset 
intervals we'll probe the host), but it also has the DES actively sending 
messages...
 # Creates a JMX method on {{DynamicEndpointSnitchMBean}} to allow users to 
force timing resets (if someone wants the old behavior back they can just call 
it on a cron ;)).

I've been playing around with a local CCM cluster, using `netem` to delay 
traffic to a particular localhost node with a small (~5s) reset interval, and 
the reset logic appears to work well. The only issue I ran into is that if a 
node is really fast once and then becomes slow, it will get some traffic after 
every reset, because we reset to the fast measurement. This is no worse than 
the status quo, but I tried to mitigate it by special-casing a host that has 
only two measurements (a fast one and a subsequent slow one) to use the mean 
instead of the minimum, which eventually converges either up or down to the 
new RTT.
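For illustration, a minimal sketch of the reset heuristic described above, with 
hypothetical names (this is not the actual {{DynamicEndpointSnitch}} code):

{code}
import java.util.List;

final class ResetHeuristicSketch
{
    /**
     * On a timing reset, seed the host with a single value instead of clearing
     * its samples entirely: normally the minimum observed latency, but the mean
     * when there are exactly two samples (e.g. one fast outlier followed by a
     * slow measurement), so the estimate can converge up or down to the new RTT.
     */
    static double seedAfterReset(List<Double> latenciesMillis)
    {
        if (latenciesMillis.isEmpty())
            return 0.0; // no data yet; ping/echo latencies will seed this host
        if (latenciesMillis.size() == 2)
            return (latenciesMillis.get(0) + latenciesMillis.get(1)) / 2.0;
        return latenciesMillis.stream().mapToDouble(Double::doubleValue).min().getAsDouble();
    }
}
{code}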

> DynamicEndpointSnitch should never prefer latent nodes
> --
>
> Key: CASSANDRA-14459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> the status quo of zero).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-14505) Removal of last element on a List deletes the entire row

2018-06-08 Thread Benjamin Lerer (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer reassigned CASSANDRA-14505:
--

Assignee: Benjamin Lerer

> Removal of last element on a List deletes the entire row
> 
>
> Key: CASSANDRA-14505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14505
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: * Java: 1.8.0_171
>  * OS: Ubuntu 18.04 LTS
>  * Cassandra: 3.11.2 
>Reporter: André Paris
>Assignee: Benjamin Lerer
>Priority: Major
>
> The behavior of removing an element from a list via an UPDATE differs 
> depending on how the row was created:
> Given the table
> {code}
> CREATE TABLE table_test (
>     id int PRIMARY KEY,
>     list list<text>
> )
> {code}
> If the row is created by an INSERT, the row remains after the UPDATE that 
> removes the last element of the list:
> {code}
> cqlsh:ks_test> INSERT INTO table_test (id, list) VALUES (1, ['foo']);
> cqlsh:ks_test> SELECT * FROM table_test;
> 
>  id | list
> ----+---------
>   1 | ['foo']
> 
> (1 rows)
> 
> cqlsh:ks_test> UPDATE table_test SET list = list - ['foo'] WHERE id=1;
> cqlsh:ks_test> SELECT * FROM table_test;
> 
>  id | list
> ----+------
>   1 | null
> 
> (1 rows)
> {code}
> But, if the row is created by an UPDATE, the whole row is deleted after the 
> UPDATE that removes the last element of the list:
> {code}
> cqlsh:ks_test> UPDATE table_test SET list = list + ['foo'] WHERE id=2;
> cqlsh:ks_test> SELECT * FROM table_test;
> 
>  id | list
> ----+---------
>   2 | ['foo']
> 
> (1 rows)
> 
> cqlsh:ks_test> UPDATE table_test SET list = list - ['foo'] WHERE id=2;
> cqlsh:ks_test> SELECT * FROM table_test;
> 
>  id | list
> ----+------
> 
> (0 rows)
> {code}
>  
> Thanks in advance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org