[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-10-09 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197195#comment-16197195
 ] 

Christian Esken commented on CASSANDRA-13265:
-

PR closed: https://github.com/apache/cassandra/pull/95

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: cassandra-13265-2.2-dtest_stdout.txt, 
> cassandra-13265-trun-dtest_stdout.txt, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate with the other nodes. This can happen at any time, during peak 
> load or low load. Restarting that single node fixes the issue.
> Before going into details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so far:
> - A thread dump in this situation showed 324 threads in the 
> OutboundTcpConnection class waiting to lock the backlog queue for expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of this? As soon as the Cassandra node has reached a 
> certain number of queued messages, it starts thrashing itself to death. Each 
> of the threads fully locks the queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operations can a thread progress with 
> actually writing to the queue.
> - Reading: Is also blocked, as 324 threads try to do iterator.next() and 
> fully lock the queue.
> This means: writing blocks the queue for reading, and readers might even be 
> starved, which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - consistency LOCAL_ONE
>  - no remote DCs
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  
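
For illustration, here is a minimal, self-contained sketch of the pathological pattern described above. This is not the Cassandra implementation; the class and member names (QueuedMessage, backlog, enqueue, dequeue) are made up for this example, and a plain java.util.concurrent.LinkedBlockingQueue stands in for the real backlog. It only shows why "every producer scans the whole queue for expired messages, and every scan step takes the queue's locks" starves both the other producers and the single reader.

{code}
import java.util.Iterator;
import java.util.concurrent.LinkedBlockingQueue;

public class ExpirationContentionSketch {
    static final class QueuedMessage {
        final long createdNanos = System.nanoTime();
        boolean isExpired(long timeoutNanos) {
            return System.nanoTime() - createdNanos > timeoutNanos;
        }
    }

    private final LinkedBlockingQueue<QueuedMessage> backlog = new LinkedBlockingQueue<>();
    private final long timeoutNanos = 2_000_000_000L; // 2 s, arbitrary

    // Every producer scans the whole backlog to drop expired messages before
    // enqueueing its own. With N producers and M queued messages this is
    // O(N * M) work, and LinkedBlockingQueue's iterator.remove() takes both
    // internal locks, so producers and the consumer all serialize here.
    void enqueue(QueuedMessage msg) {
        for (Iterator<QueuedMessage> it = backlog.iterator(); it.hasNext(); ) {
            if (it.next().isExpired(timeoutNanos))
                it.remove();
        }
        backlog.offer(msg);
    }

    // The single consumer competes for the same locks while draining.
    QueuedMessage dequeue() throws InterruptedException {
        return backlog.take();
    }

    public static void main(String[] args) throws Exception {
        ExpirationContentionSketch sketch = new ExpirationContentionSketch();
        // A few producers doing the scan-then-offer pattern ...
        for (int i = 0; i < 4; i++) {
            Thread producer = new Thread(() -> {
                for (int j = 0; j < 10_000; j++)
                    sketch.enqueue(new QueuedMessage());
            });
            producer.setDaemon(true);
            producer.start();
        }
        // ... and one consumer that has to fight them for the queue.
        for (int i = 0; i < 1_000; i++)
            sketch.dequeue();
        System.out.println("drained 1000 messages");
    }
}
{code}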



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-05-05 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15998277#comment-15998277
 ] 

Christian Esken commented on CASSANDRA-13265:
-

It is fine; I do not use 2.2. I was just wondering because you asked me to 
start with 2.2, which required more effort and made things a bit more 
complicated. If you hadn't asked, I would have done patches just for the HEAD 
of version 3 (cassandra-3.0) and version 4 (trunk).

So, mission complete. Thanks, Ariel, for guiding me through my first Cassandra 
patch. :-)




[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-05-04 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996322#comment-15996322
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Thanks. I have seen your commit in three branches. I did not yet see the 
changes in cassandra-2.2 when looking at 
https://github.com/apache/cassandra/commits/cassandra-2.2 . Is this an 
omission, or is the GitHub repo not current?




[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-05-03 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986426#comment-15986426
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 5/3/17 1:05 PM:
-

I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | yes | (/) / (?) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 . CircleCI still kicks off a 4th test, which fails but can likely be ignored for now. |
| cassandra-13265-3.11 | yes | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | yes | CircleCI (/) / (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |



was (Author: cesken):
I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | yes | (/) / (?) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 . CircleCI still kicks off a 4th test, which fails but can likely be ignored for now. |
| cassandra-13265-3.11 | yes | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | yes | CircleCI (/) / (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |





[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-05-03 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15994820#comment-15994820
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Done. Squashed and pushed. 

I also removed my "stress-test" patch in the 3.0 branch, as it is not related 
and also does not look like a proper fix. As a reference, here is the patch:
{code}
-- case $CIRCLE_NODE_INDEX in 0) ant eclipse-warnings; ant test ;; 1) ant long-test ;; 2) ant test-compression ;; 3) ant stress-test ;;esac:
+- case $CIRCLE_NODE_INDEX in 0) ant eclipse-warnings; ant test ;; 1) ant long-test ;; 2) ant test-compression ;;esac:
{code}





[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-05-03 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986426#comment-15986426
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 5/3/17 1:01 PM:
-

I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | yes | (/) / (?) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 . CircleCI still kicks off a 4th test, which fails but can likely be ignored for now. |
| cassandra-13265-3.11 | yes | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | yes | CircleCI (/) / (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |



was (Author: cesken):
I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (/) / (?) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 . CircleCI still kicks off a 4th test, which fails but can likely be ignored for now. |
| cassandra-13265-3.11 | yes | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | yes | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |





[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-05-03 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986426#comment-15986426
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 5/3/17 12:02 PM:
--

I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (/) / (?) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 . CircleCI still kicks off a 4th test, which fails but can likely be ignored for now. |
| cassandra-13265-3.11 | yes | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | yes | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |



was (Author: cesken):
I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (/) / (?) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 . CircleCI still kicks off a 4th test, which fails but can likely be ignored for now. |
| cassandra-13265-3.11 | no | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | no | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |





[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-27 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986426#comment-15986426
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/27/17 1:24 PM:
--

I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (/) / (?) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 . CircleCI still kicks off a 4th test, which fails but can likely be ignored for now. |
| cassandra-13265-3.11 | no | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | no | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |



was (Author: cesken):
I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (CircleCI currently running) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 |
| cassandra-13265-3.11 | no | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | no | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |





[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-27 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986426#comment-15986426
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/27/17 11:53 AM:
---

I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (CircleCI currently running) | No stress-test in build.xml. I patched circle.yml to match that: https://github.com/christian-esken/cassandra/commit/1a776e299c76093eb3edf20e0d9054e14549a667 |
| cassandra-13265-3.11 | no | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | no | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |



was (Author: cesken):
I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (CircleCI currently running) | |
| cassandra-13265-3.11 | no | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | no | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |





[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-27 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15986426#comment-15986426
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I am fixing the branches while you work on the dtests. I will keep updating this comment as long as I work on it.
|| branch || squashed? || Unit Tests OK? || comment ||
| cassandra-13265-3.0 | no | (CircleCI currently running) | |
| cassandra-13265-3.11 | no | CircleCI (/) | |
| cassandra-13265-2.2 | yes | ant test (/) | CircleCI hasn't kicked off tests for the branch |
| cassandra-13265-trunk | no | CircleCI (?) | My unit test works. But there is a strange unrelated unit test failure: ClassNotFoundException: org.apache.cassandra.stress.CompactionStress |





[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-24 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981236#comment-15981236
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/24/17 2:32 PM:
--

First here is a summary and the question I have: The tests work if I add 
"DatabaseDescriptor.daemonInitialization();" to the unit test of the affected 
branches. Is this a good idea, [~aweisberg]?

Now the long story:

This is the status for branch cassandra-13265-3.0:
- (/) Running unit tests in Eclipse: works
- (/)/(?) CircleCI: All normal tests work fine ("Your build ran 4754 tests in junit with 0 failures"). The build fails for me with: Target "stress-test" does not exist in the project "apache-cassandra". As "ant test" worked, I would guess that the patch is fine. I will re-verify the specific unit test locally.


This is the status for branches cassandra-13265-3.11 and cassandra-13265-trunk:
- (/) Running unit tests in Eclipse: works
- (x) Running unit tests with CircleCI or "ant test" fails due to a non-initialized DatabaseDescriptor.
  When I add the following to the unit test of cassandra-13265-3.11, the unit test works:
{code}
   DatabaseDescriptor.daemonInitialization();
{code}

Without it, the test fails with:
{code}
[junit] Null Test:  Caused an ERROR
[junit] null
[junit] java.lang.ExceptionInInitializerError
[junit] at java.lang.Class.forName0(Native Method)
[junit] at java.lang.Class.forName(Class.java:264)
[junit] Caused by: java.lang.NullPointerException
[junit] at 
org.apache.cassandra.config.DatabaseDescriptor.getWriteRpcTimeout(DatabaseDescriptor.java:1400)
[junit] at 
org.apache.cassandra.net.MessagingService$Verb$1.getTimeout(MessagingService.java:121)
[junit] at 
org.apache.cassandra.net.OutboundTcpConnectionTest.<clinit>(OutboundTcpConnectionTest.java:43)
{code}
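
As a reference, here is a minimal sketch of how such an initialization can be wired into a test class. This is only an illustration in JUnit 4 style; the class name and test body are placeholders, not the actual OutboundTcpConnectionTest code.

{code}
import org.apache.cassandra.config.DatabaseDescriptor;
import org.junit.BeforeClass;
import org.junit.Test;

public class DaemonInitializationSketchTest
{
    @BeforeClass
    public static void setUpDatabaseDescriptor()
    {
        // Load the test configuration before anything touches
        // DatabaseDescriptor (e.g. via MessagingService.Verb timeouts);
        // otherwise getWriteRpcTimeout() fails with the NPE shown above.
        DatabaseDescriptor.daemonInitialization();
    }

    @Test
    public void writeRpcTimeoutIsAvailable()
    {
        // Placeholder assertion: after initialization the write RPC timeout
        // can be read without a NullPointerException.
        assert DatabaseDescriptor.getWriteRpcTimeout() > 0;
    }
}
{code}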


was (Author: cesken):
First here is the summary: The tests work if I add 
"DatabaseDescriptor.daemonInitialization();" to the unit test of the affected 
branches. Is this a good idea, [~aweisberg]?

Now the long story:

This is the status for branch cassandra-13265-3.0:
- (/) Running unit tests in Eclipse: Works
 - (/)/(?) CircleCI: All normal tests work fine. "Your build ran 4754 tests in 
junit with 0 failures".  The build fails for me with: Target "stress-test" does 
not exist in the project "apache-cassandra". As "ant test" worked, I would 
guess that the patch is fine. I will reverify the specific unit test locally


This is the status for branch cassandra-13265-3.11 and cassandra-13265-trunk:
- (/) Running unit tests in Eclipse: Works
- (x) Running unit tests with CircleCI or "ant test" fails, due to 
non-initialized DatabaseDescriptor.
  When I add the following to the unit test of cassandra-13265-3.11, the unit 
test works. 
{code}
   DatabaseDescriptor.daemonInitialization();
{code}

{code}
[junit] Null Test:  Caused an ERROR
[junit] null
[junit] java.lang.ExceptionInInitializerError
[junit] at java.lang.Class.forName0(Native Method)
[junit] at java.lang.Class.forName(Class.java:264)
[junit] Caused by: java.lang.NullPointerException
[junit] at 
org.apache.cassandra.config.DatabaseDescriptor.getWriteRpcTimeout(DatabaseDescriptor.java:1400)
[junit] at 
org.apache.cassandra.net.MessagingService$Verb$1.getTimeout(MessagingService.java:121)
[junit] at 
org.apache.cassandra.net.OutboundTcpConnectionTest.<clinit>(OutboundTcpConnectionTest.java:43)
{code}


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-24 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981236#comment-15981236
 ] 

Christian Esken commented on CASSANDRA-13265:
-

First here is the summary: The tests work if I add 
"DatabaseDescriptor.daemonInitialization();" to the unit test of the affected 
branches. Is this a good idea, [~aweisberg]?

Now the long story:

This is the status for branch cassandra-13265-3.0:
- (/) Running unit tests in Eclipse: works
- (/)/(?) CircleCI: All normal tests work fine ("Your build ran 4754 tests in junit with 0 failures"). The build fails for me with: Target "stress-test" does not exist in the project "apache-cassandra". As "ant test" worked, I would guess that the patch is fine. I will re-verify the specific unit test locally.


This is the status for branches cassandra-13265-3.11 and cassandra-13265-trunk:
- (/) Running unit tests in Eclipse: works
- (x) Running unit tests with CircleCI or "ant test" fails due to a non-initialized DatabaseDescriptor.
  When I add the following to the unit test of cassandra-13265-3.11, the unit test works:
{code}
   DatabaseDescriptor.daemonInitialization();
{code}

Without it, the test fails with:
{code}
[junit] Null Test:  Caused an ERROR
[junit] null
[junit] java.lang.ExceptionInInitializerError
[junit] at java.lang.Class.forName0(Native Method)
[junit] at java.lang.Class.forName(Class.java:264)
[junit] Caused by: java.lang.NullPointerException
[junit] at 
org.apache.cassandra.config.DatabaseDescriptor.getWriteRpcTimeout(DatabaseDescriptor.java:1400)
[junit] at 
org.apache.cassandra.net.MessagingService$Verb$1.getTimeout(MessagingService.java:121)
[junit] at 
org.apache.cassandra.net.OutboundTcpConnectionTest.<clinit>(OutboundTcpConnectionTest.java:43)
{code}



[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-21 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978748#comment-15978748
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I pulled the changes, fixed the CHANGES.txt and pushed everything again. Now 
CircleCI is kicking off the builds for the branches. Looks like we are getting 
somewhere. :-)



[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-21 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978730#comment-15978730
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Hmm, I rebased on 2.2 and trunk. I am surprised that 3.11 is not current, as 
3.11 is not that old. I will now clean my repo, including deleting the bad 
"13625" branches and rebasing 3.0 and 3.11. 



[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-20 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976279#comment-15976279
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Unfortunately some tests failed, not because of bugs but due to technical 
issues, mostly with 
"com.datastax.driver.core.exceptions.NoHostAvailableException". Are these the 
"dtest" issues in CircleCI you mentioned?

I tried to run the tests locally, but even "ant test" runs for more than an 
hour and keeps failing with Timeout, NoHostAvailableException, or similar. I 
don't know why the tests fail, as my laptop should be capable of running them. 
I frequently run a 3-node Cassandra cluster on it via ccm and that works properly.

Currently I think I have done all I can do. Let me know if I can check something 
else. What is your proposal for how we continue here?



[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-19 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974885#comment-15974885
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/19/17 3:14 PM:
--

No problem. I was away for Easter, so I did not even notice you being busy. 
I just started my CircleCI test for the first time. It has been working on the 
first branch (trunk) for an hour and is not complete yet, so I guess with all 
the branches it can take a day to complete. I have restarted the build with more 
parallelism and hopefully that will give a more acceptable turnaround time. I 
will send an update whenever it is complete.  
https://circleci.com/gh/christian-esken/cassandra/3


was (Author: cesken):
No problem. I was away for Easter, so I did not even notice you being busy. 
I just started my CircleCI test for the first time. It has been working on the 
first branch (trunk) for an hour and is not complete yet, so I guess with all 
the branches it can take a day to complete. I have restarted the build with more 
parallelism and will send an update whenever it is complete. Hopefully that 
will give a more acceptable turnaround time. 
https://circleci.com/gh/christian-esken/cassandra/3



[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-19 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974885#comment-15974885
 ] 

Christian Esken commented on CASSANDRA-13265:
-

No problem. I was away for Easter, so I did not even notice you being busy. 
I just started my CircleCI test for the first time. It has been working on the 
first branch (trunk) for an hour and is not complete yet, so I guess with all 
the branches it can take a day to complete. I have restarted the build with more 
parallelism and will send an update whenever it is complete. Hopefully that 
will give a more acceptable turnaround time. 
https://circleci.com/gh/christian-esken/cassandra/3

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-19 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974590#comment-15974590
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq. For CHANGES.TXT the entry should go at the top of the list of entries for 
the version the change is for. I don't know why.
I also haven't seen this mentioned. Someone could probably add it to 
https://wiki.apache.org/cassandra/HowToContribute or 
http://cassandra.apache.org/doc/latest/development/how_to_commit.html . Anyhow, 
I have fixed that.

bq. set up with CircleCI [...] Also you transposed 13625 and 13265 
I changed the branches to correct the transposed 13625 and 13265. I did not find 
the transposition anywhere other than in the branch names. I will find out how to 
do the CircleCI setup. Meanwhile, here are the updated links:

https://github.com/christian-esken/cassandra/commits/cassandra-13265-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13265-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13265-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13265-trunk

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-13 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967484#comment-15967484
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/13/17 11:59 AM:
---

The build failed for several reasons, e.g. somehow Eclipse did not pick up the 
build parameters for 2.2 after "ant generate-eclipse-files", so the build was done 
at the Java 8 language level (lambdas). Building and testing in Eclipse alone is 
apparently not enough, so I redid everything manually in the console and fixed the 
issues. As you recommended, I have created branches that follow your naming 
(cassandra-13625-3.0) with squashed commits. The new branches are:

https://github.com/christian-esken/cassandra/commits/cassandra-13625-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13625-trunk

About CHANGES.TXT: I added the change under the "matching" release version listed 
in each branch. Please check, as the naming conventions within Cassandra are still 
not clear to me (e.g. there exists a 3.11 branch, a 3.0.11 release and a 3.11.0 
changelog entry).


was (Author: cesken):
There were different reasons why the build failed, e.g. somehow Eclipse did not 
pick up the build parameters for 2.2 after "ant generate-eclipse-files" and the 
build was done with Java 8 language level (lambdas). Looks like building and 
testing in Eclipse alone is not enough, so I redid everything manually in the 
console and fixed the issues. As you recommended, I have created branches that 
follow your naming  (cassandra-13625-3.0) with squashed commits. The new 
branches are:

https://github.com/christian-esken/cassandra/commits/cassandra-13625-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13625-trunk

About CHANGES.TXT: I added the change to all branches in the appropriate 
versions. Please check, as the naming conventions within Cassandra are still 
not clear to me (e.g. there exists a 3.11 branch, a 3.0.11 release and a 3.11.0 
changelog entry).

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-13 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967484#comment-15967484
 ] 

Christian Esken commented on CASSANDRA-13265:
-

The build failed for several reasons, e.g. somehow Eclipse did not pick up the 
build parameters for 2.2 after "ant generate-eclipse-files", so the build was done 
at the Java 8 language level (lambdas). Building and testing in Eclipse alone is 
apparently not enough, so I redid everything manually in the console and fixed the 
issues. As you recommended, I have created branches that follow your naming 
(cassandra-13625-3.0) with squashed commits. The new branches are:

https://github.com/christian-esken/cassandra/commits/cassandra-13625-2.2
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.11
https://github.com/christian-esken/cassandra/commits/cassandra-13625-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-13625-trunk

About CHANGES.TXT: I added the change to all branches in the appropriate 
versions. Please check, as the naming conventions within Cassandra are still 
not clear to me (e.g. there exists a 3.11 branch, a 3.0.11 release and a 3.11.0 
changelog entry).

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-11 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964522#comment-15964522
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Done. Two organizational topics are left:

I will add the required line to the commit message. Does this look OK?
 bq. patch by Christian Esken; reviewed by Ariel Weisberg and Jason Brown for 
CASSANDRA-13265
My proposal for the CHANGES.txt entry is the following text. Can you add it, 
Ariel? I do not know which versions to add it to, as they are upcoming versions.
 bq. Expire OutboundTcpConnection messages by a single Thread 

Here are the branches. The cassandra-3.0 branch is already squashed. If that 
branch is OK, I will also squash the other three branches.

https://github.com/christian-esken/cassandra/commits/cassandra-3.0
https://github.com/christian-esken/cassandra/commits/cassandra-3.11
https://github.com/christian-esken/cassandra/commits/trunk
https://github.com/christian-esken/cassandra/commits/cassandra-2.2

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963095#comment-15963095
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 4/10/17 4:20 PM:
--

Done. My highest priority is the 3.0 branch. I created a patch (single file, 
squashed) for 3.0, which I also applied to my GitHub fork 
https://github.com/christian-esken/cassandra/commits/cassandra-3.0 . I attached 
the patch using the Submit Patch button at the top.


was (Author: cesken):
Done. My highest priority is the 3.0 branch. I created a patch (single file, 
squashed) for 3.0, that I also applied to my Github fork 
https://github.com/christian-esken/cassandra/commits/cassandra-3.0 . Please 
have a look at the attached file 
0001-3.0-Expire-OTC-messages-by-a-single-Thread.patch .

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-10 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Status: Patch Available  (was: Open)

>From 6bd3f3fc3b2da3a66b53a94a819446a9ea8ea2cf Mon Sep 17 00:00:00 2001
From: Christian Esken 
Date: Wed, 1 Mar 2017 15:56:36 +0100
Subject: [PATCH] Expire OTC messages by a single Thread

This patch consists of the following aspects related to OutboundTcpConnection:
- Backlog queue expiration by a single Thread
- Drop count statistics
- QueuedMessage.isTimedOut() fix

When backlog queue expiration is done, a single Thread is elected to do the
work. Previously, all Threads would go in and do the same work, producing
high lock contention. The Thread reading from the Queue could even be
starved by not being able to acquire the read lock.
The backlog queue is inspected every otc_backlog_expiration_interval_ms
milliseconds if its size exceeds BACKLOG_PURGE_SIZE. Unit tests for
OutboundTcpConnection were added.

Timed-out messages are counted in the dropped statistics. Additionally,
messages are counted as dropped when it is not possible to write to the
socket, e.g. if there is no connection because a target node is down.

Fix QueuedMessage.isTimedOut(), which used an "a < b" comparison on
nanoTime values; this can be wrong due to wrapping of System.nanoTime().

CASSANDRA-13265
---
 conf/cassandra.yaml|   9 ++
 src/java/org/apache/cassandra/config/Config.java   |   6 +
 .../cassandra/config/DatabaseDescriptor.java   |  10 ++
 .../cassandra/net/OutboundTcpConnection.java   | 113 +++---
 .../org/apache/cassandra/service/StorageProxy.java |  10 +-
 .../cassandra/service/StorageProxyMBean.java   |   3 +
 .../cassandra/net/OutboundTcpConnectionTest.java   | 170 +
 7 files changed, 294 insertions(+), 27 deletions(-)
 create mode 100644 
test/unit/org/apache/cassandra/net/OutboundTcpConnectionTest.java

diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 790dfd743b..9c1510b66a 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -985,3 +985,12 @@ windows_timer_interval: 1
 
 # Do not try to coalesce messages if we already got that many messages. This 
should be more than 2 and less than 128.
 # otc_coalescing_enough_coalesced_messages: 8
+
+# How many milliseconds to wait between two expiration runs on the backlog 
(queue) of the OutboundTcpConnection.
+# Expiration is done if messages are piling up in the backlog. Droppable 
messages are expired to free the memory
+# taken by expired messages. The interval should be between 0 and 1000, and in 
most installations the default value
+# will be appropriate. A smaller value could potentially expire messages 
slightly sooner at the expense of more CPU
+# time and queue contention while iterating the backlog of messages.
+# An interval of 0 disables any wait time, which is the behavior of former 
Cassandra versions.
+#
+# otc_backlog_expiration_interval_ms: 200
diff --git a/src/java/org/apache/cassandra/config/Config.java 
b/src/java/org/apache/cassandra/config/Config.java
index 9aaf7ae33e..6a99cd3cbd 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -298,6 +298,12 @@ public class Config
 public int otc_coalescing_window_us = otc_coalescing_window_us_default;
 public int otc_coalescing_enough_coalesced_messages = 8;
 
+/**
+ * Backlog expiration interval in milliseconds for the 
OutboundTcpConnection.
+ */
+public static final int otc_backlog_expiration_interval_ms_default = 200;
+public volatile int otc_backlog_expiration_interval_ms = 
otc_backlog_expiration_interval_ms_default;
+ 
 public int windows_timer_interval = 0;
 
 public boolean enable_user_defined_functions = false;
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 602214f3c6..e9e54c3e20 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -1967,6 +1967,16 @@ public class DatabaseDescriptor
 conf.otc_coalescing_enough_coalesced_messages = 
otc_coalescing_enough_coalesced_messages;
 }
 
+public static int getOtcBacklogExpirationInterval()
+{
+return conf.otc_backlog_expiration_interval_ms;
+}
+
+public static void setOtcBacklogExpirationInterval(int intervalInMillis)
+{
+conf.otc_backlog_expiration_interval_ms = intervalInMillis;
+}
+ 
 public static int getWindowsTimerInterval()
 {
 return conf.windows_timer_interval;
diff --git a/src/java/org/apache/cassandra/net/OutboundTcpConnection.java 
b/src/java/org/apache/cassandra/net/OutboundTcpConnection.java
index 46083994df..99ad194b94 100644
--- 
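
As an illustration of the QueuedMessage.isTimedOut() fix described in the commit 
message above: comparing two System.nanoTime() readings directly with "a < b" can 
give the wrong answer if the counter wraps, while comparing the elapsed difference 
against the timeout is wrap-safe (this is also the idiom the System.nanoTime() 
Javadoc recommends). The following is only a minimal sketch of that idiom; the 
class and field names are hypothetical and not the actual code from the patch.

{code:java}
import java.util.concurrent.TimeUnit;

// Sketch only: illustrates the wrap-safe nanoTime comparison idiom.
// Names are hypothetical, not the actual OutboundTcpConnection code.
final class TimedMessageSketch
{
    private final long createdAtNanos = System.nanoTime();
    private final long timeoutNanos;

    TimedMessageSketch(long timeoutMillis)
    {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    // Wrong: an absolute "a < b" comparison breaks when System.nanoTime() wraps.
    boolean isTimedOutNaive(long nowNanos)
    {
        return createdAtNanos + timeoutNanos < nowNanos;
    }

    // Correct: compare the elapsed difference, which stays valid across a wrap.
    boolean isTimedOut(long nowNanos)
    {
        return nowNanos - createdAtNanos > timeoutNanos;
    }
}
{code}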

[jira] [Updated] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-10 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Status: Open  (was: Patch Available)

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-10 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Fix Version/s: (was: 3.11.x)
   (was: 4.x)
   (was: 2.2.x)
   Status: Patch Available  (was: Reopened)

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-04-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963095#comment-15963095
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Done. My highest priority is the 3.0 branch. I created a patch (single file, 
squashed) for 3.0, which I also applied to my GitHub fork 
https://github.com/christian-esken/cassandra/commits/cassandra-3.0 . Please 
have a look at the attached file 
0001-3.0-Expire-OTC-messages-by-a-single-Thread.patch .

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-15 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926348#comment-15926348
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq.  I still think it's a good idea to avoid hard coding this kind of value so 
operators have options without recompiling. [...] A java property. 
(/) static final int BACKLOG_PURGE_SIZE = 
Integer.getInteger("OTC_BACKLOG_PURGE_SIZE", 1024);

bq. I think we should log the drops especially due to timeouts as they happen 
rather than at the end.
(/) I agree, and I did not change that behavior. After the loop I simply add the 
unprocessed messages that remain due to the {{break inner;}}. I'll push today, so 
you have a chance to see it. 
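
For context on the {{Integer.getInteger(...)}} line earlier in this comment: it 
reads a JVM system property and falls back to the supplied default when the 
property is unset or not a valid integer, so an operator can override the purge 
threshold at startup (e.g. with a -DOTC_BACKLOG_PURGE_SIZE=2048 JVM flag) without 
recompiling. Below is a minimal sketch of the idiom; only the property name and 
the 1024 default come from the comment above, the surrounding class is 
illustrative.

{code:java}
// Sketch: system property with a compile-time default, as used for BACKLOG_PURGE_SIZE.
public final class PurgeThresholdExample
{
    // Integer.getInteger returns the parsed value of the named system property,
    // or the given default (1024) if the property is missing or malformed.
    static final int BACKLOG_PURGE_SIZE = Integer.getInteger("OTC_BACKLOG_PURGE_SIZE", 1024);

    public static void main(String[] args)
    {
        // e.g. java -DOTC_BACKLOG_PURGE_SIZE=2048 PurgeThresholdExample
        System.out.println("backlog purge threshold = " + BACKLOG_PURGE_SIZE);
    }
}
{code}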

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-15 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925815#comment-15925815
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq. it's very low effort to add a property
I see. I thought it would just clutter cassandra.yaml, as nobody would ever 
change the value. But if you feel it is important enough, I can do so.

bq. Not a huge deal, but I think we should log the drops especially due to 
timeouts as they happen
OK. I can follow your argument. I will rewrite it, also adding comments with 
explanation. 

PS: I will soon be on vacation for two weeks, so please don't wonder if you do 
not see any updates from me.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:55 PM:
--

Done. Note: not everything is committed yet, as I still have to remove my debug 
code.
- (/) A smaller value could potentially ...
- (/) You shouldn't need the check for null? Usually we "just" make sure it's 
not null
  OK. I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?) I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
  I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
hard coded there for quite some time, and in the long term I do not think it 
should be kept as-is anyway. For example, it would be better to purge based on 
the number of actually DROPPABLE messages in the queue (or their weight, if you 
want to extend it even further).
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
  Yes, I am aware of that and use the technique often. Here I did not like it, 
as the visibility effects would not be obvious unless explicitly documented. 
You are probably aware of what Brian Goetz says about piggybacking in his JCIP 
book. BTW, a more obvious use for me is in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so I am marking this as 
done (a sketch of the overall expiration pattern follows after this list).
- (?) Breaking out the uber bike shedding this could be maybeExpireMessages.
  Nope, I am not going down that road. I had expireMessagesConditionally() 
before and changed it on request. By that logic, a Set should not have an add() 
method but only a maybeAdd(), because it might not add the entry. Also, I added 
clear documentation, so it should be fine.
- (/) Swap the order of these two stores so it doesn't do extra expirations.
  Ouch, that hurts. I wanted to protect against Exceptions inside the block 
that could throw, which would otherwise disable expiration indefinitely. I was 
quite tired yesterday. I am swapping it back; TimeUnit conversions never throw 
Exceptions, so it is safe. :-|
- (/) This is not quite correct: you can't count drainCount as dropped, because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering whether we should include fixing the drop count 
in this patch, as it will likely create even more conflicts. OTOH, I have to 
touch some related methods anyhow. I will think about it.
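
To make the discussion above easier to follow, here is a minimal sketch of the 
single-thread expiration election the patch introduces. The field names 
(backlogExpirationActive, backlogNextExpirationTime) and the purge threshold come 
from the comments in this ticket; the types, the AtomicBoolean choice and the 
method bodies are assumptions for illustration, not the exact patch code.

{code:java}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: field names follow the discussion above, everything else is illustrative.
class BacklogExpirationSketch
{
    static final int BACKLOG_PURGE_SIZE = Integer.getInteger("OTC_BACKLOG_PURGE_SIZE", 1024);
    static final long EXPIRATION_INTERVAL_NANOS = TimeUnit.MILLISECONDS.toNanos(200);

    private final AtomicBoolean backlogExpirationActive = new AtomicBoolean(false);
    private volatile long backlogNextExpirationTime = System.nanoTime();

    void maybeExpireMessages(int backlogSize)
    {
        long now = System.nanoTime();
        // Cheap checks first: only expire when the backlog is large and the interval has elapsed.
        if (backlogSize < BACKLOG_PURGE_SIZE || now - backlogNextExpirationTime < 0)
            return;

        // Elect exactly one thread; all others return immediately instead of
        // contending on the backlog iterator as every caller did before the patch.
        if (!backlogExpirationActive.compareAndSet(false, true))
            return;

        try
        {
            expireMessages(now); // walk the backlog once and drop timed-out, droppable messages
        }
        finally
        {
            // Store the next allowed expiration time before clearing the flag, so a
            // concurrent caller cannot start an extra expiration run in between.
            backlogNextExpirationTime = now + EXPIRATION_INTERVAL_NANOS;
            backlogExpirationActive.set(false);
        }
    }

    private void expireMessages(long now)
    {
        // placeholder: iterate the backlog queue and remove expired messages
    }
}
{code}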



was (Author: cesken):
Done. Hint: Not everything is committed yet, as I have to remove my debug code 
from it.
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (?)Breaking out the uber bike shedding this could be maybeExpireMessages.
Nope, I am not going back that road. I had expireMessagesConditionally() before 
and changed it on request. If we do this, then a Set should not have an add() 
method but only a maybeAdd(), because it might not add the entry. Also I added 
clear documentation, so it should be fine. 
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to 

[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:54 PM:
--

Done. Hint: Not everything is committed yet, as I have to remove my debug code 
from it.
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (?)Breaking out the uber bike shedding this could be maybeExpireMessages.
Nope, I am not going back that road. I had expireMessagesConditionally() before 
and changed it on request. If we do this, then a Set should not have an add() 
method but only a maybeAdd(), because it might not add the entry. Also I added 
clear documentation, so it should be fine. 
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (/)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.



was (Author: cesken):
Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (?)Breaking out the uber bike shedding this could be maybeExpireMessages.
Nope, I am not going back that road. I had expireMessagesConditionally() before 
and changed it on request. If we do this, then a Set should not have an add() 
method but only a maybeAdd(), because it might not add the entry. Also I added 
clear documentation, so it should be fine. 
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable 

[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:52 PM:
--

Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (?)Breaking out the uber bike shedding this could be maybeExpireMessages.
Nope, I am not going back that road. I had expireMessagesConditionally() before 
and changed it on request. If we do this, then a Set should not have an add() 
method but only a maybeAdd(), because it might not add the entry. Also I added 
clear documentation, so it should be fine. 
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.



was (Author: cesken):
Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (?)Breaking out the uber bike shedding this could be maybeExpireMessages.
Nope, I am not going down that road. I had expireMessagesConditionally() before 
and changed it on request. This would be like saying a Set should not have an add() 
method, but only a maybeAdd(), because it might not add the entry. Also, I added 
clear documentation, so it should be fine. 
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping 

[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:51 PM:
--

Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (?)Breaking out the uber bike shedding this could be maybeExpireMessages.
Nope, I am not going down that road. I had expireMessagesConditionally() before 
and changed it on request. This would be like saying a Set should not have an add() 
method, but only a maybeAdd(), because it might not add the entry. Also, I added 
clear documentation, so it should be fine. 
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.



was (Author: cesken):
Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the 

[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:40 PM:
--

Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE hard coded. It has been there for quite some 
time hard coded, and in the long term I do not think it should be kept. Fox 
example it would better to purge on the number of actually DROPPABLE messages 
in the queue (or their weight if you want to extend even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.



was (Author: cesken):
Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (x)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>   

[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:42 PM:
--

Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE to be kept hard coded for now. It has been 
there for quite some time hard coded, and in the long term I do not think it 
should be kept as-is. For example it would better to purge on the number of 
actually DROPPABLE messages in the queue (or their weight if you want to extend 
even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.



was (Author: cesken):
Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (?)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
 I would like BACKLOG_PURGE_SIZE hard coded. It has been there for quite some 
time hard coded, and in the long term I do not think it should be kept. Fox 
example it would better to purge on the number of actually DROPPABLE messages 
in the queue (or their weight if you want to extend even further)
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: 

[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/14/17 1:30 PM:
--

Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (x)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I wanted to protect from Exceptions inside the throw-Block 
which would disable expiration infinitely. I was quite tired yesterday. I am 
swapping it back, TimeUnit conversions never throw Exceptions, so it is safe. 
:-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.



was (Author: cesken):
Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (x)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I was quite tired yesterday. :-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I 

[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924135#comment-15924135
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Marking off feedback:
- (/) A smaller value could potentially ...
- (/)  You shouldn't need the check for null? Usually we "just" make sure its 
not null
OK.  I thought it might be possible to set this to null, but even JConsole 
refuses it.
- (/) Using a boxed integer makes it a bit confusing ...
  ACK. Happily changed that. Looks like I followed bad examples.
- (/) Avoid unrelated whitespace changes.
  OK. I missed that after moving the field.
- (x)  I still think it's a good idea to avoid hard coding this kind of value 
so operators have options without recompiling.
- (/) Fun fact. You don't need backlogNextExpirationTime to be volatile. You 
can piggyback on backlogExpirationActive to get the desired effects from the 
Java memory model. [...] I wouldn't change it ...
Yes.  I am  aware of that and using that technique often. Here I did not like 
it as visibility effects would not be obvious, unless explicitly documented. 
You are probably aware what Brian Goetz says about piggybacking in his JCIP 
book. BTW: A more obvious usage is for me in status fields, e.g. to make the 
results of a Future visible. I won't change it either, so marking this as done.
- (x)Breaking out the uber bike shedding this could be maybeExpireMessages.
- (x)Swap the order of these two stores so it doesn't do extra expirations.
  Ouch. That hurts. I was quite tired yesterday. :-|
- (?)  This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
  In progress. I am wondering if we should include fixing the drop count it in 
this patch, as it will likely create even more conflicts. OTOH I have to touch 
some related methods anyhow. I will think about it.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-14 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15923915#comment-15923915
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq. This is not quite correct you can't count drainCount as dropped because 
some of the drained messages may have been sent during iteration.
I looked in more detail, and I think a flaw in the original code "suggested" 
this to me: {{drainedMessages.clear()}} is called twice, and one call would be 
enough. IMO it would be better to keep only the one at the end of the method 
and also do the drop-counting for the drained messages there. This would also 
cover the rather exotic case of the {{catch (Exception e)}} in the {{run()}} 
method: if an Exception is thrown, there is a danger of nothing being counted. 
(A rough sketch of this idea follows below.)
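
(Illustration, not the actual OutboundTcpConnection code: a rough sketch of the 
idea above, where drained-but-unsent messages are counted as dropped exactly 
once, in a single place at the end of the method, so that an Exception aborting 
the send loop is covered as well. All names here are illustrative.)
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class DrainSketch
{
    // Stand-in for OutboundTcpConnection$QueuedMessage
    static class QueuedMessage { void send() { /* write to the socket */ } }

    private final AtomicLong droppedMessages = new AtomicLong();

    void drainAndSend(BlockingQueue<QueuedMessage> backlog)
    {
        List<QueuedMessage> drainedMessages = new ArrayList<>(128);
        backlog.drainTo(drainedMessages, 128);
        int sent = 0;
        try
        {
            for (QueuedMessage qm : drainedMessages)
            {
                qm.send();
                sent++;
            }
        }
        finally
        {
            // Single place that counts drops: whatever was drained but never
            // sent, including the case where send() threw and aborted the loop.
            droppedMessages.addAndGet(drainedMessages.size() - sent);
            drainedMessages.clear();
        }
    }
}
{code}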

bq. Using a boxed integer
bq. You shouldn't need the check for null?
From a brief check, this refers to a similar point. I saw many configuration 
options that allow null and followed that route. I am absolutely happy to make 
it non-boxed.

bq. The right way to do it is create a branch for all the versions where this 
is going to be fixed. Start at 2.2, merge to 3.0, merge to 3.11, then merge to 
trunk. 
On GitHub? I can do so. But no PR, right? I saw it mentioned that one should 
not open PRs for Cassandra on GitHub, as they cannot be handled (it's just a 
mirror).


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-13 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907493#comment-15907493
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/13/17 2:33 PM:
--

The change is now finished, including configuration, MBean and testing. I also 
tested an interval of 0 ms, which is close to what Cassandra does today.

Please have  a look. What is the recommended way of getting this into the 
official Cassandra repo?  A patch, or would someone with write access take it 
directly from Github? This also has impact on how the merge conflicts will be 
resolved, as there are now 3 files with merge conflicts according to 
https://github.com/apache/cassandra/pull/95. I did not want to rebase my branch 
without asking.


was (Author: cesken):
The change is now finished, including configuration, MBean and testing. I also 
tested an interval of 0 ms, which is close to what Cassandra does today.

Please have  a look. What is the recommended way of getting this into the 
official Cassandra repo?  A patch, or would someone with write access take it 
directly from Github? This also has impact on how the merge conflicts will be 
resolved, as there are now 3 files merge conflicts according to 
https://github.com/apache/cassandra/pull/95. I did not want to rebase my branch 
without asking.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-13 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907493#comment-15907493
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/13/17 1:56 PM:
--

The change is now finished, including configuration, MBean and testing. I also 
tested an interval of 0 ms, which is close to what Cassandra does today.

Please have  a look. What is the recommended way of getting this into the 
official Cassandra repo?  A patch, or would someone with write access take it 
directly from Github? This also has impact on how the merge conflicts will be 
resolved, as there are now 3 files with merge conflicts according to 
https://github.com/apache/cassandra/pull/95. I did not want to rebase my branch 
without asking.


was (Author: cesken):
The change is now finished, including configuration, MBean and testing. I also 
tested an interval of 0 ms, which is close to what Cassandra does today.

Please have  a look. What is the recommended way of getting this into the 
official Cassandra repo?  A patch, or would someone with write access take it 
directly from Github? This also has impact on how the merge conflicts will be 
resolved, as there are now merge conflicts on 3 files according to 
https://github.com/apache/cassandra/pull/95. 
I did not want to rebase my branch without asking.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-13 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907493#comment-15907493
 ] 

Christian Esken commented on CASSANDRA-13265:
-

The change is now finished, including configuration, MBean and testing. I also 
tested an interval of 0 ms, which is close to what Cassandra does today.

Please have  a look. What is the recommended way of getting this into the 
official Cassandra repo?  A patch, or would someone with write access take it 
directly from Github? This also has impact on how the merge conflicts will be 
resolved, as there are now merge conflicts on 3 files according to 
https://github.com/apache/cassandra/pull/95. 
I did not want to rebase my branch without asking.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905409#comment-15905409
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/10/17 5:05 PM:
--

I committed the change containing the configuration, as I would like some 
feedback whether I am on the right path. Please note that I did not yet have 
time for tests (planned for next Monday), but I thought it is better to give a 
chance to review the current changes.

I also had to add back "AtomicBoolean backlogExpirationActive". Otherwise I 
cannot guarantee that only a single Thread iterates the Queue, especially if a 
small expiration interval (1 ms or 0 ms) is configured. The "AtomicLong 
backlogNextExpirationTime" could now be a "volatile long". A minimal sketch of 
that guard is shown below.


was (Author: cesken):
I committed the change containing the configuration, as I would like some 
feedback whether I am on the right path. Please note that I did not yet have 
time for tests (planned for next Monday), but I thought it is better to give a 
chance to review the current changes.

Please note that I had to add back "AtomicBoolean backlogExpirationActive" 
Otherwise I cannot guarantee that only a single Thread is iterating the Queue, 
especially if a small expiration interval (1ms, or 0ms) is configured. The 
"AtomicLong backlogNextExpirationTime" could now be "volatile long".

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905409#comment-15905409
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I committed the change containing the configuration, as I would like some 
feedback whether I am on the right path. Please note that I did not yet have 
time for tests (planned for next Monday), but I thought it is better to give a 
chance to review the current changes.

Please note that I had to add back "AtomicBoolean backlogExpirationActive" 
Otherwise I cannot guarantee that only a single Thread is iterating the Queue, 
especially if a small expiration interval (1ms, or 0ms) is configured. The 
"AtomicLong backlogNextExpirationTime" could now be "volatile long".

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904949#comment-15904949
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/10/17 3:48 PM:
--

I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally, in the getter of DatabaseDescriptor I will handle null values 
that might have been set via JMX:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
    Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
    return confValue != null ? confValue : Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}
Is that OK? Should I also handle other illegal values (negative values) in 
that getter, or reject them in the setter? I have not found a code example in 
Cassandra that handles bad values uniformly for both the MBean and Config. (A 
possible shape for such a setter is sketched after question 2 below.)

2. How to read the config value? I am seeing some 
{{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I 
think changes from JMX would not even be reflected. Thus I am calling the 
getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is 
the latter OK?
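
(Illustration, not the committed patch: one possible answer to question 1 
above, rejecting clearly illegal values in a JMX-facing setter so the getter 
only has to deal with null. The setter name is hypothetical, and the snippet 
assumes the Config field and default shown above.)
{code}
public static Integer getOtcBacklogExpirationInterval()
{
    Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
    return confValue != null ? confValue : Config.otc_backlog_expiration_interval_in_ms_default;
}

public static void setOtcBacklogExpirationInterval(int intervalInMillis)
{
    if (intervalInMillis < 0)
        throw new IllegalArgumentException(
            "otc_backlog_expiration_interval_in_ms must be >= 0, got " + intervalInMillis);
    conf.otc_backlog_expiration_interval_in_ms = intervalInMillis;
}
{code}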



was (Author: cesken):
I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values, that might have been set via JMX in the 
getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
return confValue != null ? confValue : 
Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some 
{{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I 
think changes from JMX would not even be reflected. Thus I am calling the 
getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is 
the latter OK?


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904949#comment-15904949
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/10/17 3:42 PM:
--

I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values, that might have been set via JMX in the 
getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
return confValue != null ? confValue : 
Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some 
{{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I 
think changes from JMX would not even be reflected. Thus I am calling the 
getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is 
the latter OK?



was (Author: cesken):
I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values, that might have been set via JMX in the 
getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
return confValue != null ? confValue : 
Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some 
{{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I 
think changes from JMX would not even be reflected. Thus I am calling the 
getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is 
hte latter OK?


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904949#comment-15904949
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/10/17 11:46 AM:
---

I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values, that might have been set via JMX in the 
getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
return confValue != null ? confValue : 
Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some 
{{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I 
think changes from JMX would not even be reflected. Thus I am calling the 
getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is 
hte latter OK?



was (Author: cesken):
I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values, that might have been set via JMX in the 
getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
return confValue != null ? confValue : 
Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some Integer.getInteger(propName, 
defaultValue), but this looks strange to me. I think changes from JMX would not 
even be reflected. Thus I am calling the getter from above: 
{{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is thte latter OK?


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904949#comment-15904949
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/10/17 11:46 AM:
---

I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values, that might have been set via JMX in the 
getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
return confValue != null ? confValue : 
Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some Integer.getInteger(propName, 
defaultValue), but this looks strange to me. I think changes from JMX would not 
even be reflected. Thus I am calling the getter from above: 
{{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is thte latter OK?



was (Author: cesken):
I am nearly done with the configuration, and have two questions about it:

1.  How to handle the default value? My approach is to pre-configure the 
default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = 
otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values, that might come in via MBean in the 
getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
return confValue != null ? confValue : 
Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some Integer.getInteger(propName, 
defaultValue), but this looks strange to me. I think changes from JMX would not 
even be reflected. Thus I am calling the getter from above: 
{{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is thte latter OK?


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904949#comment-15904949
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/10/17 11:45 AM:
---

I am nearly done with the configuration, and have two questions about it:

1. How to handle the default value? My approach is to pre-configure the default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values that might come in via MBean in the getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
    Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
    return confValue != null ? confValue : Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some {{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I think changes from JMX would not even be reflected. Thus I am calling the getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is the latter OK?



was (Author: cesken):
I am nearly done with the configuration, and have two questions about it:

1. How to handle the default value? My approach is to pre-configure the default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values that might come in via MBean in the getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
    Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
    return confValue != null ? confValue : Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some {{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I think changes from JMX would not even be reflected. Thus I am calling the getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is the latter OK?


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-10 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904949#comment-15904949
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I am nearly done with the configuration, and have two questions about it:

1. How to handle the default value? My approach is to pre-configure the default value in Config:
{code}
public static final int otc_backlog_expiration_interval_in_ms_default = 200;
public volatile Integer otc_backlog_expiration_interval_in_ms = otc_backlog_expiration_interval_in_ms_default;
{code}

Additionally I will handle null values that might come in via MBean in the getter of DatabaseDescriptor:
{code}
public static Integer getOtcBacklogExpirationInterval()
{
    Integer confValue = conf.otc_backlog_expiration_interval_in_ms;
    return confValue != null ? confValue : Config.otc_backlog_expiration_interval_in_ms_default;
}
{code}

2. How to read the config value? I am seeing some {{Integer.getInteger(propName, defaultValue)}}, but this looks strange to me. I think changes from JMX would not even be reflected. Thus I am calling the getter from above: {{DatabaseDescriptor.getOtcBacklogExpirationInterval()}}. Is the latter OK?


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-08 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901495#comment-15901495
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I pushed the changes noted in the previous comment.
I am planning to do "Configurability and default value" tomorrow.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-08 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901414#comment-15901414
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/8/17 4:12 PM:
-

I will update the status while working on the individual topics:
(x)This needs to be configurable from the YAML and via JMX. 
(/)It should include drained messages as well.
(/)Typo, "thus letting"
(/)Extra line break
(/)We don't do/allow author tags
(/)Use TimeUnit
  => Additionally I am now determining the timeout value automatically
(/)It isn't using the constant.
(/)Just in case maybe assert the droppable/non-droppable status of the 
verbs. Or does it not matter since the tests will fail anyways?
   => It wouldn't matter, but I added a check to make it more explicit.



was (Author: cesken):
I will update the status while working on the individual topics:
(x)This needs to be configurable from the YAML and via JMX. 
(/)It should include drained messages as well.
(/)Typo, "thus letting"
(/)Extra line break
(/)We don't do/allow author tags
(/)Use TimeUnit
(/)It isn't using the constant.
(x)Just in case maybe assert the droppable/non-droppable status of the 
verbs. Or does it not matter since the tests will fail anyways?



> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-08 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901414#comment-15901414
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/8/17 3:36 PM:
-

I will update the status while working on the individual topics:
(x)This needs to be configurable from the YAML and via JMX. 
(/)It should include drained messages as well.
(/)Typo, "thus letting"
(/)Extra line break
(/)We don't do/allow author tags
(/)Use TimeUnit
(/)It isn't using the constant.
(x)Just in case maybe assert the droppable/non-droppable status of the 
verbs. Or does it not matter since the tests will fail anyways?




was (Author: cesken):
I will update the status while working on the individual topics:
(x)This needs to be configurable from the YAML and via JMX. 
(/)It should include drained messages as well.
(/)Typo, "thus letting"
(/)Extra line break
(/)We don't do/allow author tags
(x)Use TimeUnit
(x)It isn't using the constant.
(x)Just in case maybe assert the droppable/non-droppable status of the 
verbs. Or does it not matter since the tests will fail anyways?



> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-08 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901414#comment-15901414
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I will update the status while working on the individual topics:
(x)This needs to be configurable from the YAML and via JMX. 
(/)It should include drained messages as well.
(/)Typo, "thus letting"
(/)Extra line break
(/)We don't do/allow author tags
(x)Use TimeUnit
(x)It isn't using the constant.
(x)Just in case maybe assert the droppable/non-droppable status of the 
verbs. Or does it not matter since the tests will fail anyways?



> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-08 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901047#comment-15901047
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/8/17 10:36 AM:
--

bq. It should include drained messages as well
OMG, right. I thought about that, and concluded it is correct to exclude. But 
obviously the messages are already drained from the queue, so they must be 
added.

bq. We don't do/allow author tags
OOPS, always this oversmart IDE :-)

bq. This needs to be configurable
Oh. Even more work. This is getting bigger than anticipated, but I am happy to do it. Thanks for the hints on how to do the Configuration. I will work on it later today or tomorrow.


was (Author: cesken):
bq. It should include drained messages as well
OMG, right. I thought about that and concluded it is correct to exclude. But obviously the messages are already drained from the queue, so they must be added.

bq. We don't do/allow author tags
OOPS, always this oversmart IDE :-)

bq. This needs to be configurable
Oh. Even more work. This is getting bigger than anticipated, but I am happy to do it. Thanks for the hints on how to do the Configuration. I will work on it later today or tomorrow.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-08 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901047#comment-15901047
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq. It should include drained messages as well
OMG, right. I thought about that and concluded it is correct to exclude. But obviously the messages are already drained from the queue, so they must be added.

bq. We don't do/allow author tags
OOPS, always this oversmart IDE :-)

bq. This needs to be configurable
Oh. Even more work. This is getting bigger than anticipated, but I am happy to do it. Thanks for the hints on how to do the Configuration. I will work on it later today or tomorrow.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-07 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899773#comment-15899773
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I added three changes:
- Implemented unit test
- Count dropped messages if Cassandra cannot write to the socket 
- Fix QueuedMessage.isTimedOut(), which was prone to a System.nanoTime() wrap bug, as it used a comparison of the form {{aNanos < bNanos}} (see the sketch below)
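A minimal sketch of the wrap-safe comparison (illustrative only; parameter names are placeholders, not the committed patch):
{code}
// Wrap-safe elapsed-time check for System.nanoTime() values: compare the signed
// difference instead of the absolute values (see the System.nanoTime() javadoc).
static boolean isTimedOut(long nowNanos, long enqueuedNanos, long timeoutNanos)
{
    return nowNanos - enqueuedNanos >= timeoutNanos;
    // unsafe variant: nowNanos >= enqueuedNanos + timeoutNanos (the addition can overflow)
}
{code}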

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-06 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897641#comment-15897641
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I have one question about a code fragment. When the socket is not available the backlog is cleared, but no drops are counted. Looks like an omission to me, or is it intentional? {{dropped.addAndGet(backlog.size());}} would be an approximation. We likely cannot get closer, as {{backlog.clear();}} does not tell how many elements were removed (one alternative is sketched below the code).

{code}
if (qm.isTimedOut())
    dropped.incrementAndGet();
else if (socket != null || connect())
    writeConnected(qm, count == 1 && backlog.isEmpty());
else
{
    // clear out the queue, else gossip messages back up.
    drainedMessages.clear();
    // dropped.addAndGet(backlog.size()); // TODO Should dropped statistics be counted in this case?
    backlog.clear();
    break inner;
}
{code}
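If drops should be counted there, one possibility (a rough sketch, assuming the backlog behaves as a java.util.Queue; not necessarily the right fix) is to drain and count instead of calling clear():
{code}
// Drain the backlog element by element so the exact number of dropped messages is known.
int clearedCount = 0;
while (backlog.poll() != null)
    clearedCount++;
dropped.addAndGet(clearedCount);
{code}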

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-06 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897121#comment-15897121
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/6/17 11:23 AM:
--

I was already looking into doing a unit test, but it requires access to the queue, which means giving it package-level access and using {{@VisibleForTesting}}. I will do that tomorrow, unless there are arguments against it. I will also check alternatives.
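For illustration, the kind of test hook I have in mind (a sketch; the class and field here are stand-ins, not the real code):
{code}
import com.google.common.annotations.VisibleForTesting;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class OutboundConnectionSketch
{
    private final BlockingQueue<Object> backlog = new LinkedBlockingQueue<>();

    // Package-private accessor so a unit test in the same package can inspect the backlog.
    @VisibleForTesting
    int backlogSize()
    {
        return backlog.size();
    }
}
{code}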


was (Author: cesken):
I was already looking into doing a unit test, but it requires access to the queue, which means giving it package-level access and using {{@VisibleForTesting}}. I will do that tomorrow, unless there are arguments against it.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-06 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897121#comment-15897121
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I was already looking into doing a unit test, but it requires access to the queue, which means giving it package-level access and using {{@VisibleForTesting}}. I will do that tomorrow, unless there are arguments against it.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892471#comment-15892471
 ] 

Christian Esken commented on CASSANDRA-13265:
-

The change to System.nanoTime() is done. I kept the logging, but stripped it down and guarded it with an {{isTraceEnabled()}} check.
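The guard follows the usual pattern (a sketch; the logger name and message are illustrative):
{code}
// Only build and emit the trace message when TRACE logging is actually enabled.
if (logger.isTraceEnabled())
    logger.trace("Expired {} messages from the backlog", expiredCount);
{code}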

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892332#comment-15892332
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/2/17 2:36 PM:
-

bq. use System.nanoTime() instead of System.currentTimeMillis().
Agreed, {{System.nanoTime()}} is slightly better here. In real life it won't make a terrific difference even with the worst clocks, but "_let's do things right_". :-) I never looked up the native code for nanoTime(), but I bet on Unix it uses the POSIX {{clock_gettime(CLOCK_MONOTONIC, ...)}}.

bq. I don't think we want to traverse the entire backlog. [...]
Your argument "reasonably in ascending timestamp order" would make sense if all entries had the same expiration time. But the Verbs have different timeouts, with defaults ranging from 2 to 60 seconds. Thus the whole Queue should be iterated, as in the worst case we would otherwise remove nothing even though most entries are timed out.
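To make that concrete, a full traversal would look roughly like this (a sketch, not the exact patch):
{code}
// Every message carries its own verb-specific timeout, so the whole backlog is scanned;
// stopping at the first non-expired entry could leave many expired entries behind.
Iterator<QueuedMessage> iter = backlog.iterator();
while (iter.hasNext())
{
    if (iter.next().isTimedOut())
    {
        iter.remove();
        dropped.incrementAndGet();
    }
}
{code}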



was (Author: cesken):
bq. use System.nanoTime() instead of System.currentTimeMillis().
Agreed, {{System.nanoTime()}} is slightly better here. In real life it won't make a terrific difference even with the worst clocks, but "_let's do things right_". :-) I never looked up the native code for nanoTime(), but I bet on Unix it uses the POSIX {{clock_gettime(CLOCK_MONOTONIC, ...)}}.

bq. I don't think we want to traverse the entire backlog. [...]
Your argument "reasonably in ascending timestamp order" would make sense if all entries had the same expiration time. But the Verbs have different timeouts. Thus the whole Queue should be iterated, as in the worst case we would otherwise remove nothing even though most entries are timed out.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892332#comment-15892332
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq. use System.nanoTime() instead of System.currentTimeMillis().
Agreed, {{System.nanoTime()}} is slightly better here. In real life it won't make a terrific difference even with the worst clocks, but "_let's do things right_". :-) I never looked up the native code for nanoTime(), but I bet on Unix it uses the POSIX {{clock_gettime(CLOCK_MONOTONIC, ...)}}.

bq. I don't think we want to traverse the entire backlog. [...]
Your argument "reasonably in ascending timestamp order" would make sense if all entries had the same expiration time. But the Verbs have different timeouts. Thus the whole Queue should be iterated, as in the worst case we would otherwise remove nothing even though most entries are timed out.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892332#comment-15892332
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/2/17 2:34 PM:
-

bq. use System.nanoTime() instead of System.currentTimeMillis().
Agreed, {{System.nanoTime()}} is slightly better here. In real life it won't make a terrific difference even with the worst clocks, but "_let's do things right_". :-) I never looked up the native code for nanoTime(), but I bet on Unix it uses the POSIX {{clock_gettime(CLOCK_MONOTONIC, ...)}}.

bq. I don't think we want to traverse the entire backlog. [...]
Your argument "reasonably in ascending timestamp order" would make sense if all entries had the same expiration time. But the Verbs have different timeouts. Thus the whole Queue should be iterated, as in the worst case we would otherwise remove nothing even though most entries are timed out.



was (Author: cesken):
bq. use System.nanoTime() instead of System.currentTimeMillis().
Agreed, {{System.nanoTime()}} is slightly better here. In real life it won't make a terrific difference even with the worst clocks, but "_let's do things right_". :-) I never looked up the native code for nanoTime(), but I bet on Unix it uses the POSIX {{clock_gettime(CLOCK_MONOTONIC, ...)}}.

bq. I don't think we want to traverse the entire backlog. [...]
Your argument "reasonably in ascending timestamp order" would make sense if all entries had the same expiration time. But the Verbs have different timeouts. Thus the whole Queue should be iterated, as in the worst case we would otherwise remove nothing even though most entries are timed out.


> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891905#comment-15891905
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/2/17 12:36 PM:
--

Ariel wrote:
{quote}
Expiration is based on time. There is no point in attempting expiration again 
immediately because almost nothing will have expired. It allows one bad 
connection to consume resources it shouldn't in the form of hijacking a thread 
to iterate a list.

I don't see the downside of switching from a boolean to a long and CASing that 
instead. If we aren't confident in it we can set a small interval so that it 
still checks for expiration often although I think that just generates useless 
work. We can't make timeouts pass faster.
{quote}

[~aweisberg], I understand that you want to CAS on "lastExpirationTime", right? I am also for doing this. It fits better and still keeps the change simple. In that case the Thread should iterate the whole Queue, and not bail out on the first hit. I will change it in the PR.
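For illustration, CASing on a last-expiration timestamp could look roughly like this (a minimal sketch; the names and the 200 ms interval are placeholders, not the final patch):
{code}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class ExpirationGate
{
    // Nanos timestamp of the last expiration run; only one thread wins the CAS per interval.
    private final AtomicLong lastExpirationNanos = new AtomicLong(System.nanoTime());
    private final long intervalNanos = TimeUnit.MILLISECONDS.toNanos(200);

    // Returns true for at most one caller per interval; that caller performs the backlog scan.
    boolean shouldExpireNow()
    {
        long last = lastExpirationNanos.get();
        long now = System.nanoTime();
        return now - last >= intervalNanos && lastExpirationNanos.compareAndSet(last, now);
    }
}
{code}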


was (Author: cesken):
Ariel wrote:
{quote}
Expiration is based on time. There is no point in attempting expiration again 
immediately because almost nothing will have expired. It allows one bad 
connection to consume resources it shouldn't in the form of hijacking a thread 
to iterate a list.

I don't see the downside of switching from a boolean to a long and CASing that 
instead. If we aren't confident in it we can set a small interval so that it 
still checks for expiration often although I think that just generates useless 
work. We can't make timeouts pass faster.
{quote}

[~aweisberg]: I understand that you want to CAS on "lastExpirationTime", right? I am also for doing this. It fits better and still keeps the change simple. In that case the Thread should iterate the whole Queue, and not bail out on the first hit. I will change it in the PR.

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891905#comment-15891905
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/2/17 12:35 PM:
--

Ariel wrote:
{quote}
Expiration is based on time. There is no point in attempting expiration again 
immediately because almost nothing will have expired. It allows one bad 
connection to consume resources it shouldn't in the form of hijacking a thread 
to iterate a list.

I don't see the downside of switching from a boolean to a long and CASing that 
instead. If we aren't confident in it we can set a small interval so that it 
still checks for expiration often although I think that just generates useless 
work. We can't make timeouts pass faster.
{quote}

[~aweisberg]: I understand that you want to CAS on "lastExpirationTime", right? 
I am also for doing this. It fits better and still keeps the change simple. 
In that case the Thread should iterate the whole Queue and not bail out on the 
first hit. I will change it in the PR.


was (Author: cesken):
Ariel wrote:
{quote}
Expiration is based on time. There is no point in attempting expiration again 
immediately because almost nothing will have expired. It allows one bad 
connection to consume resources it shouldn't in the form of hijacking a thread 
to iterate a list.

I don't see the downside of switching from a boolean to a long and CASing that 
instead. If we aren't confident in it we can set a small interval so that it 
still checks for expiration often although I think that just generates useless 
work. We can't make timeouts pass faster.
{quote}

I understand that you want to CAS on "lastExpirationTime", right? I am also for 
doing this. It fits better and still keeps the change simple. In that case 
the Thread should iterate the whole Queue and not bail out on the first hit. I 
will change it in the PR.

> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892114#comment-15892114
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/2/17 11:47 AM:
--

I updated the PR with the following changes:
 - Variable names / modifiers (static)
 - Expiration is based on time
 - Expiration inspects the whole Queue (no bailing out)

This is really hard to reproduce and test, which is why I have not yet removed 
BACKLOG_EXPIRATION_DEBUG. If you have a hint about how to test this, let me 
know.
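
As an illustration of the "no bailing out" behaviour, a simplified sketch of the intended loop could look like the following. QueuedMessageSketch and its field names are invented for this example; the real QueuedMessage class differs:

{code}
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;

// Hypothetical message shape, for illustration only.
final class QueuedMessageSketch
{
    final boolean droppable;
    final long createdAtNanos;

    QueuedMessageSketch(boolean droppable, long createdAtNanos)
    {
        this.droppable = droppable;
        this.createdAtNanos = createdAtNanos;
    }
}

final class BacklogExpirationSketch
{
    /** Walks the whole queue: skips non-droppable entries, removes droppable ones that timed out. */
    static void expireMessages(BlockingQueue<QueuedMessageSketch> backlog, long nowNanos, long timeoutNanos)
    {
        for (Iterator<QueuedMessageSketch> it = backlog.iterator(); it.hasNext(); )
        {
            QueuedMessageSketch qm = it.next();
            if (!qm.droppable)
                continue;                                  // never drop non-droppable messages
            if (nowNanos - qm.createdAtNanos >= timeoutNanos)
                it.remove();                               // expired: remove it and keep scanning (no bail-out)
        }
    }
}
{code}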


was (Author: cesken):
I updated the PR with the following changes:
 - Variable names / modifiers (static)
 - Expiration is based on time
 - Expiration inspects the whole Queue (no bailing out)


> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892114#comment-15892114
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I updated the PR with the following changes:
 - Variable names / modifiers (static)
 - Expiration is based on time
 - Expiration inspects the whole Queue (no bailing out)


> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891896#comment-15891896
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/2/17 9:26 AM:
-

bq. How often does this issue occur?

Not very often, but it happens. I assume that special scenarios trigger this:
- High write-throughput (especially when you have write spikes, it is easy to 
get above 1024 messages)
- A long Stop-the-World GC phase (because then even more Threads could start to 
write and iterate the Queue)
- Temporary network overload to the target node (because nothing is taken from 
the Queue in that case).
- Many non-droppable entries in the Queue (because then the loop does not bail 
out:  "if (! qm.droppable)  continue;" )

Temporary overloads usually resolve themselves, but in this case they do not. 
As soon as the Queue has reached a certain size limit, most time is spent 
iterating the Queue, and the reader is starved (1 reader Thread fights against 
324 Threads that fully lock the Queue by calling iterator.next()).



was (Author: cesken):
bq. How often does this issue occur?

Not very often, but it happens. I assume that special scenarios trigger this:
- High write-throughput (especially when you have write spikes, it is easy to 
get above 1024 messages)
- A long Stop-the-World GC phase (because then even more Threads could start to 
write and iterate the Queue)
- Temporary network overload to the target node (because nothing is taken from 
the Queue in that case).
- Many non-droppable entries in the Queue (because then the loop does not bail 
out:  "if (! qm.droppable)  continue;" )

Usually temporary overloads resolve themselves, but in this case they do not. 
As soon as the Queue has reached a certain size limit, most time is spent 
iterating the Queue, and the reader is starved (1 reader Thread fights against 
324 Threads that fully lock the Queue by calling iterator.next()).


> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891905#comment-15891905
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 3/2/17 9:25 AM:
-

Ariel wrote:
{quote}
Expiration is based on time. There is no point in attempting expiration again 
immediately because almost nothing will have expired. It allows one bad 
connection to consume resources it shouldn't in the form of hijacking a thread 
to iterate a list.

I don't see the downside of switching from a boolean to a long and CASing that 
instead. If we aren't confident in it we can set a small interval so that it 
still checks for expiration often although I think that just generates useless 
work. We can't make timeouts pass faster.
{quote}

I understand that you want to CAS on "lastExpirationTime", right? I am also for 
doing this. It fits better and still keeps the change simple. In that case 
the Thread should iterate the whole Queue and not bail out on the first hit. I 
will change it in the PR.


was (Author: cesken):
Ariel wrote:
{quote}
Expiration is based on time. There is no point in attempting expiration again 
immediately because almost nothing will have expired. It allows one bad 
connection to consume resources it shouldn't in the form of hijacking a thread 
to iterate a list.

I don't see the downside of switching from a boolean to a long and CASing that 
instead. If we aren't confident in it we can set a small interval so that it 
still checks for expiration often although I think that just generates useless 
work. We can't make timeouts pass faster.
{quote}

I understand that you want to CAS on "lastExpirationTime", right? I am also for 
doing this. It fits better and still keeps the change simple. In that case 
the Thread should iterate the whole Queue and not bail out on the first hit. I 
will change it in the PR.

> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891905#comment-15891905
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Ariel wrote:
{quote}
Expiration is based on time. There is no point in attempting expiration again 
immediately because almost nothing will have expired. It allows one bad 
connection to consume resources it shouldn't in the form of hijacking a thread 
to iterate a list.

I don't see the downside of switching from a boolean to a long and CASing that 
instead. If we aren't confident in it we can set a small interval so that it 
still checks for expiration often although I think that just generates useless 
work. We can't make timeouts pass faster.
{quote}

I understand that you want to CAS on "lastExpirationTime", right? I am also for 
doing this. It fits better and still keeps the change simple. In that case 
the Thread should iterate the whole Queue and not bail out on the first hit. I 
will change it in the PR.

> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-02 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891896#comment-15891896
 ] 

Christian Esken commented on CASSANDRA-13265:
-

bq. How often does this issue occur?

Not very often, but it happens. I assume that special scenarios trigger this:
- High write-throughput (especially when you have write spikes, it is easy to 
get above 1024 messages)
- A long Stop-the-World GC phase (because then even more Threads could start to 
write and iterate the Queue)
- Temporary network overload to the target node (because nothing is taken from 
the Queue in that case).
- Many non-droppable entries in the Queue (because then the loop does not bail 
out:  "if (! qm.droppable)  continue;" )

Usually temporary overloads resolve themselves, but in this case they do not. 
As soon as the Queue has reached a certain size limit, most time is spent 
iterating the Queue, and the reader is starved (1 reader Thread fights against 
324 Threads that fully lock the Queue by calling iterator.next()).


> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-01 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890316#comment-15890316
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I created a fresh branch to have a clean commit. The pull request is open: 
 https://github.com/apache/cassandra/pull/95

> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-01 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890199#comment-15890199
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Will do. Thanks for your quick feedback.

> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-01 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890155#comment-15890155
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Here is the patch for reducing contention in the queue expiration. The patch 
wraps expireMessages() in expireMessagesConditionally(), which ensures that 
only a single Thread performs expiration at a time:
  
https://github.com/apache/cassandra/compare/trunk...christian-esken:13265-3.0?expand=1

PS: Commits in this patch are not yet squashed. If the patch is good, I will 
create a proper branch to have a cleaner history.
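
As a rough sketch of the idea (not the patch itself), the conditional wrapper boils down to a single atomic flag; ConditionalExpirationSketch and expirationInProgress below are placeholder names:

{code}
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: the Thread that wins the flag runs the expensive queue walk, everyone else skips it.
final class ConditionalExpirationSketch
{
    private final AtomicBoolean expirationInProgress = new AtomicBoolean(false);

    void expireMessagesConditionally(Runnable expireMessages)
    {
        if (!expirationInProgress.compareAndSet(false, true))
            return;                          // another Thread is already expiring; don't pile up on the queue lock
        try
        {
            expireMessages.run();            // the real method iterates the backlog queue here
        }
        finally
        {
            expirationInProgress.set(false); // allow the next expiration run
        }
    }
}
{code}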

> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13265) Epxiration in OutboundTcpConnection can block the reader Thread

2017-03-01 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Summary: Epxiration in OutboundTcpConnection can block the reader Thread  
(was: Communication breakdown in OutboundTcpConnection)

> Epxiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-03-01 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken reopened CASSANDRA-13265:
-

Reopening.

While the "averageGap == 0" issue has been fixed, I still want to fix the issue 
from the description: multiple Threads perform the expiration, which leads to 
unnecessary locking, more CPU usage and possible starvation of the reader 
Thread.

I will prepare a patch that fixes that.

> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-28 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15888439#comment-15888439
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Here is one possibly very important observation. It looks like the coalescing 
strategy is stuck in an endless loop inside maybeSleep(). I checked 10 Thread 
dumps, and in each of them the Thread was at the same location. Is it possible 
that averageGap is 0? That would lead to an infinite loop.
{code}
private static boolean maybeSleep(int messages, long averageGap, long maxCoalesceWindow, Parker parker)
{
    // only sleep if we can expect to double the number of messages we're sending in the time interval
    long sleep = messages * averageGap; // TODO can averageGap be 0 ?
    if (sleep > maxCoalesceWindow)
        return false;

    // assume we receive as many messages as we expect; apply the same logic to the future batch:
    // expect twice as many messages to consider sleeping for "another" interval; this basically translates
    // to doubling our sleep period until we exceed our max sleep window
    while (sleep * 2 < maxCoalesceWindow)
        sleep *= 2; //  CoalescingStrategies:106
    parker.park(sleep);
    return true;
}
{code}

If sum is bigger than MEASURED_INTERVAL, then averageGap() returns 0 due to 
integer division. I am aware that this is highly unlikely, but otherwise I 
cannot explain the apparent hang at maybeSleep() line 106.
{code}
private long averageGap()
{
    if (sum == 0)
        return Integer.MAX_VALUE;
    return MEASURED_INTERVAL / sum;
}
{code}
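
A tiny standalone demonstration (my own reproduction attempt, not code from the repository) shows why a zero averageGap would keep line 106 spinning forever:

{code}
// Demonstrates the suspected hang: with averageGap == 0, sleep stays 0
// and the doubling loop can never reach maxCoalesceWindow.
public class MaybeSleepHangDemo
{
    public static void main(String[] args)
    {
        long averageGap = 0;                  // the suspected pathological value
        long maxCoalesceWindow = 200_000;     // nanoseconds, arbitrary positive value
        long sleep = 10 * averageGap;         // messages * averageGap == 0

        long iterations = 0;
        while (sleep * 2 < maxCoalesceWindow) // 0 * 2 == 0, so the condition never becomes false
        {
            sleep *= 2;
            if (++iterations > 1_000_000)     // safety guard so this demo terminates
            {
                System.out.println("still looping after 1,000,000 iterations -> would hang in production");
                return;
            }
        }
        System.out.println("loop exited normally");
    }
}
{code}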

> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Assignee: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-28 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887685#comment-15887685
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 2/28/17 10:10 AM:
---

I see your argument. On larger clusters this may get problematic. I will try to 
summarize the alternative solutions:
- Offload expiration to a "random" regular Thread, but only a single one. If 
one Thread already expires ...
 -- ... let the other Threads continue  (1)
 -- ... let the other Threads wait  (2)
- Use an "Expiration Thread Pool" (3). I am not (currently) in favor of it, 
and if I understood you correctly, it is not your preference either.

I will implement option (1) today.


Please see the attached Thread Dump to see which Threads are blocking. Here are 
two examples from the Thread Dumps. Mainly they are SharedPool-Worker threads 
that either call iterator.remove() or iterator.next(). I think the Thread dump 
also contains a HintDispatcher Thread that is parking on the same lock.

java.util.concurrent.LinkedBlockingQueue$Itr.remove:
{code}
"SharedPool-Worker-294" #587 daemon prio=5 os_prio=0 tid=0x7fb69b11e260 
nid=0x6090 waiting on condition [0x7fb162c0e000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00023a426218> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
java.util.concurrent.LinkedBlockingQueue.fullyLock(LinkedBlockingQueue.java:225)
at 
java.util.concurrent.LinkedBlockingQueue$Itr.remove(LinkedBlockingQueue.java:840)
at 
org.apache.cassandra.net.OutboundTcpConnection.expireMessages(OutboundTcpConnection.java:555)
at 
org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:165)
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:771)
at 
org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:744)
at 
org.apache.cassandra.hints.HintVerbHandler.reply(HintVerbHandler.java:99)
at 
org.apache.cassandra.hints.HintVerbHandler.doVerb(HintVerbHandler.java:94)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
at java.lang.Thread.run(Thread.java:745)
{code}

java.util.concurrent.LinkedBlockingQueue$Itr.next:
{code}
"SharedPool-Worker-295" #590 daemon prio=5 os_prio=0 tid=0x7fb69b1135b0 
nid=0x608d waiting on condition [0x7fb162cd1000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00023a426218> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
java.util.concurrent.LinkedBlockingQueue.fullyLock(LinkedBlockingQueue.java:225)
at 
java.util.concurrent.LinkedBlockingQueue$Itr.next(LinkedBlockingQueue.java:823)
at 
org.apache.cassandra.net.OutboundTcpConnection.expireMessages(OutboundTcpConnection.java:550)
at 
org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:165)
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:771)
at 

[jira] [Comment Edited] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-28 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887685#comment-15887685
 ] 

Christian Esken edited comment on CASSANDRA-13265 at 2/28/17 9:53 AM:
--

I see your argument. On larger clusters this may get problematic. Let's evaluate 
different solutions:
- Offload expiration to a "random" regular Thread, but only a single one. If 
one Thread already expires ...
 --- ... let the other Threads continue
 --- ... let the other Threads wait
- Go with your idea of an "Expiration Thread Pool"

Please see the attached Thread Dump to see which Threads are blocking. Here are 
two examples from the Thread Dumps. Mainly they are SharedPool-Worker threads 
that either call iterator.remove() or iterator.next(). I think the Thread dump 
also contains a HintDispatcher Thread that is parking on the same lock.

java.util.concurrent.LinkedBlockingQueue$Itr.remove:
{code}
"SharedPool-Worker-294" #587 daemon prio=5 os_prio=0 tid=0x7fb69b11e260 
nid=0x6090 waiting on condition [0x7fb162c0e000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00023a426218> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
java.util.concurrent.LinkedBlockingQueue.fullyLock(LinkedBlockingQueue.java:225)
at 
java.util.concurrent.LinkedBlockingQueue$Itr.remove(LinkedBlockingQueue.java:840)
at 
org.apache.cassandra.net.OutboundTcpConnection.expireMessages(OutboundTcpConnection.java:555)
at 
org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:165)
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:771)
at 
org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:744)
at 
org.apache.cassandra.hints.HintVerbHandler.reply(HintVerbHandler.java:99)
at 
org.apache.cassandra.hints.HintVerbHandler.doVerb(HintVerbHandler.java:94)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
at java.lang.Thread.run(Thread.java:745)
{code}

java.util.concurrent.LinkedBlockingQueue$Itr.next:
{code}
"SharedPool-Worker-295" #590 daemon prio=5 os_prio=0 tid=0x7fb69b1135b0 
nid=0x608d waiting on condition [0x7fb162cd1000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00023a426218> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
java.util.concurrent.LinkedBlockingQueue.fullyLock(LinkedBlockingQueue.java:225)
at 
java.util.concurrent.LinkedBlockingQueue$Itr.next(LinkedBlockingQueue.java:823)
at 
org.apache.cassandra.net.OutboundTcpConnection.expireMessages(OutboundTcpConnection.java:550)
at 
org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:165)
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:771)
at 
org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:744)
at 
org.apache.cassandra.hints.HintVerbHandler.reply(HintVerbHandler.java:99)
at 

[jira] [Commented] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-28 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887685#comment-15887685
 ] 

Christian Esken commented on CASSANDRA-13265:
-

I see your argument. On larger clusters this may get problematic. Let's evaluate 
different solutions:
- Offload expiration to a "random" regular Thread, but only a single one. If 
one Thread already expires ...
 --- ... let the other Threads continue
 --- ... let the other Threads wait
- Go with your idea of an "Expiration Thread Pool"

Please see the attached Thread Dump to see which Threads are blocking. Mainly 
they are SharedPool-Worker threads that either call iterator.remove() or 
iterator.next().
{code}
"SharedPool-Worker-294" #587 daemon prio=5 os_prio=0 tid=0x7fb69b11e260 
nid=0x6090 waiting on condition [0x7fb162c0e000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00023a426218> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
java.util.concurrent.LinkedBlockingQueue.fullyLock(LinkedBlockingQueue.java:225)
at 
java.util.concurrent.LinkedBlockingQueue$Itr.remove(LinkedBlockingQueue.java:840)
at 
org.apache.cassandra.net.OutboundTcpConnection.expireMessages(OutboundTcpConnection.java:555)
at 
org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:165)
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:771)
at 
org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:744)
at 
org.apache.cassandra.hints.HintVerbHandler.reply(HintVerbHandler.java:99)
at 
org.apache.cassandra.hints.HintVerbHandler.doVerb(HintVerbHandler.java:94)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
at java.lang.Thread.run(Thread.java:745)
{code}

{code}
"SharedPool-Worker-295" #590 daemon prio=5 os_prio=0 tid=0x7fb69b1135b0 
nid=0x608d waiting on condition [0x7fb162cd1000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00023a426218> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
java.util.concurrent.LinkedBlockingQueue.fullyLock(LinkedBlockingQueue.java:225)
at 
java.util.concurrent.LinkedBlockingQueue$Itr.next(LinkedBlockingQueue.java:823)
at 
org.apache.cassandra.net.OutboundTcpConnection.expireMessages(OutboundTcpConnection.java:550)
at 
org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:165)
at 
org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:771)
at 
org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:744)
at 
org.apache.cassandra.hints.HintVerbHandler.reply(HintVerbHandler.java:99)
at 
org.apache.cassandra.hints.HintVerbHandler.doVerb(HintVerbHandler.java:94)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at 

[jira] [Commented] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-24 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883172#comment-15883172
 ] 

Christian Esken commented on CASSANDRA-13265:
-

Link to the current patch: 
https://github.com/apache/cassandra/compare/trunk...christian-esken:13265-3.0?expand=1

> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-24 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15883047#comment-15883047
 ] 

Christian Esken commented on CASSANDRA-13265:
-

The Thread dumps show that several Threads park on the same objects:
- 324 Threads are waiting on the same object, trying to iterate over the Queue 
(expiration)
- 24 Threads wait on a different object; as far as we can see, they try to read 
from the Queue

{code}
--- cassandra.pb-cache4-dus.2017-02-20-01-41-14.td ---
  1 - parking to wait for  <0x0001c04b1748> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c056d4f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c0579c60> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 24 - parking to wait for  <0x0001c058ce50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c058e520> (a java.util.concurrent.Semaphore$NonfairSync)
  1 - parking to wait for  <0x0001c058ee50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c0592bc0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c0593058> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c0593ae0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c05958d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c059f788> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  4 - parking to wait for  <0x0001c07f5ea8> (a java.util.concurrent.SynchronousQueue$TransferStack)
  1 - parking to wait for  <0x0001c0df0548> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c4b52790> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c56a7ca8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c56beea8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c56bf2d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
324 - parking to wait for  <0x0001c5d5a150> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
  1 - parking to wait for  <0x0001c628edb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c6290b78> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c62958a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c6295b08> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c72343a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c7581d58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001c8dd5738> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001ccdc3b80> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001cd22e1b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001f3c39428> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0001fb43f5d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  1 - parking to wait for  <0x0002003b6018> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
{code}
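
A minimal, self-contained demo of the contention pattern visible above (not taken from the
Cassandra code base; the thread counts and queue sizes are arbitrary): iterating a
LinkedBlockingQueue calls fullyLock() in Itr.next(), which takes both the putLock and the
takeLock, so many iterating threads stall both writers and readers of the queue.

{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Demo of the thread-dump pattern: threads iterating a LinkedBlockingQueue block
// writers, because Itr.next() calls fullyLock(), taking BOTH putLock and takeLock.
public class QueueIterationContention
{
    public static void main(String[] args) throws InterruptedException
    {
        LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 200_000; i++)
            queue.add(i);

        // "Expiration" threads: walk the whole queue over and over.
        for (int t = 0; t < 32; t++)
        {
            Thread iterator = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted())
                    for (Integer ignored : queue) { /* simulate an expiration scan */ }
            });
            iterator.setDaemon(true);
            iterator.start();
        }

        // A single writer: measure how long simple offers take under contention.
        long start = System.nanoTime();
        for (int i = 0; i < 10_000; i++)
            queue.offer(i);
        long micros = TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - start);
        System.out.println("10000 offers took " + micros + " us with 32 iterating threads");
    }
}
{code}

The exact timing will vary by machine; the point is only that offers slow down as more threads
iterate, which is the same interaction as the 324 expiring Threads queued up behind the single
ReentrantLock in the dump above.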


> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 

[jira] [Updated] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-24 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Attachment: cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz

Thread Dump

> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-24 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Attachment: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz

Class Histogram

> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, 
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-24 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Description: 
I observed that sometimes a single node in a Cassandra cluster fails to 
communicate with the other nodes. This can happen at any time, during peak load 
or low load. Restarting that single node fixes the issue.

Before going into details, I want to state that I have analyzed the situation 
and am already developing a possible fix. Here is the analysis so far:

- A Thread dump in this situation showed 324 Threads in the 
OutboundTcpConnection class that want to lock the backlog queue for doing 
expiration.
- A class histogram shows 262508 instances of 
OutboundTcpConnection$QueuedMessage.

What is the effect of it? As soon as the Cassandra node has reached a certain 
amount of queued messages, it starts thrashing itself to death. Each of the 
Threads fully locks the Queue for reading and writing by calling 
iterator.next(), making the situation worse and worse.
- Writing: Only after 262508 locking operations can it progress with actually 
writing to the Queue.
- Reading: Is also blocked, as 324 Threads try to do iterator.next() and fully 
lock the Queue.

This means: Writing blocks the Queue for reading, and readers might even be 
starved, which makes the situation even worse.

-
The setup is:
 - 3-node cluster
 - replication factor 2
 - Consistency LOCAL_ONE
 - No remote DC's
 - high write throughput (10 INSERT statements per second and more during 
peak times).
 

  was:
I observed that sometimes a single node in a Cassandra cluster fails to 
communicate to the other nodes. This can happen at any time, during peak load 
or low load. Restarting that single node from the cluster fixes the issue.

Before going in to details, I want to state that I have analyzed the situation 
and am already developing a possible fix. Here is the analysis so far:

- A Threaddump in this situation showed  324 Threads in the 
OutboundTcpConnection class that want to lock the backlog queue for doing 
expiration.
- A class histogram shows 262508 instances of 
OutboundTcpConnection$QueuedMessage.

What is the effect of it? As soon as the Cassandra node has reached that state, 
it never gets out of it by itself, it is thrashing itself to death instead, as 
each of the Thread fully locks the Queue for reading and writing by calling 
iterator.next().
- Writing: Only after 262508 locking operation it can progress with actually 
writing to the Queue.
- Reading: Is also blocked, as 324 Threads try to do iterator.next(), and fully 
lock the Queue

This means: Writing blocks the Queue for reading, and readers might even be 
starved which makes the situation even worse.

-
The setup is:
 - 3-node cluster
 - replication factor 2
 - Consistency LOCAL_ONE
 - No remote DC's
 - high write throughput (10 INSERT statements per second and more during 
peak times).
 


> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain 
> amount of queued messages, it starts thrashing itself to death. Each of the 
> Thread fully locks the Queue for reading and writing by calling 
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by 

[jira] [Updated] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-24 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13265:

Description: 
I observed that sometimes a single node in a Cassandra cluster fails to 
communicate to the other nodes. This can happen at any time, during peak load 
or low load. Restarting that single node from the cluster fixes the issue.

Before going in to details, I want to state that I have analyzed the situation 
and am already developing a possible fix. Here is the analysis so far:

- A Threaddump in this situation showed  324 Threads in the 
OutboundTcpConnection class that want to lock the backlog queue for doing 
expiration.
- A class histogram shows 262508 instances of 
OutboundTcpConnection$QueuedMessage.

What is the effect of it? As soon as the Cassandra node has reached that state, 
it never gets out of it by itself, it is thrashing itself to death instead, as 
each of the Thread fully locks the Queue for reading and writing by calling 
iterator.next().
- Writing: Only after 262508 locking operation it can progress with actually 
writing to the Queue.
- Reading: Is also blocked, as 324 Threads try to do iterator.next(), and fully 
lock the Queue

This means: Writing blocks the Queue for reading, and readers might even be 
starved which makes the situation even worse.

-
The setup is:
 - 3-node cluster
 - replication factor 2
 - Consistency LOCAL_ONE
 - No remote DC's
 - high write throughput (10 INSERT statements per second and more during 
peak times).
 

  was:
I observed that sometimes a single node in a Cassandra cluster fails to 
communicate to the other nodes. This can happen at any time, during peak load 
or low load. Restarting that single node from the cluster fixes the issue.

Before going in to details, I want to state that I have analyzed the situation 
and am already developing a possible fix. Here is the analysis so far:

- A Threaddump in this situation showed that 324 Threads in the 
OutboundTcpConnection class wanted to lock the backlog queue for doing 
expiration.
- A class histogram shows 262508 instances of 
OutboundTcpConnection$QueuedMessage.

What is the effect of it? As soon as the Cassandra node has reached that state, 
it never gets out of it by itself, it is thrashing itself to death instead, as 
each of the Thread fully locks the Queue for reading and writing by calling 
iterator.next().
- Writing: Only after 262508 locking operation it can progress with actually 
writing to the Queue.
- Reading: Is also blocked, as 324 Threads try to do iterator.next(), and fully 
lock the Queue

This means: Writing blocks the Queue for reading, and readers might even be 
starved which makes the situation even worse.

-
The setup is:
 - 3-node cluster
 - replication factor 2
 - Consistency LOCAL_ONE
 - No remote DC's
 - high write throughput (10 INSERT statements per second and more during 
peak times).
 


> Communication breakdown in OutboundTcpConnection
> 
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>
> I observed that sometimes a single node in a Cassandra cluster fails to 
> communicate to the other nodes. This can happen at any time, during peak load 
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going in to details, I want to state that I have analyzed the 
> situation and am already developing a possible fix. Here is the analysis so 
> far:
> - A Threaddump in this situation showed  324 Threads in the 
> OutboundTcpConnection class that want to lock the backlog queue for doing 
> expiration.
> - A class histogram shows 262508 instances of 
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached that 
> state, it never gets out of it by itself, it is thrashing itself to death 
> instead, as each of the Thread fully locks the Queue for reading and writing 
> by calling iterator.next().
> - Writing: Only after 262508 locking operation it can progress with actually 
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and 
> fully lock the Queue
> This means: Writing blocks the Queue for reading, and readers might even be 
> starved which makes the situation even worse.
> -
> The setup is:
>  - 3-node cluster
>  - replication factor 2
>  - Consistency LOCAL_ONE
>  - No remote DC's
>  - high write throughput (10 INSERT statements per second and more during 
> peak times).
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13265) Communication breakdown in OutboundTcpConnection

2017-02-24 Thread Christian Esken (JIRA)
Christian Esken created CASSANDRA-13265:
---

 Summary: Communication breakdown in OutboundTcpConnection
 Key: CASSANDRA-13265
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 3.0.9
Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
1.8.0_112-b15)
Linux 3.16
Reporter: Christian Esken


I observed that sometimes a single node in a Cassandra cluster fails to 
communicate with the other nodes. This can happen at any time, during peak load 
or low load. Restarting that single node fixes the issue.

Before going into details, I want to state that I have analyzed the situation 
and am already developing a possible fix. Here is the analysis so far:

- A Thread dump in this situation showed that 324 Threads in the 
OutboundTcpConnection class wanted to lock the backlog queue for doing 
expiration.
- A class histogram shows 262508 instances of 
OutboundTcpConnection$QueuedMessage.

What is the effect of it? As soon as the Cassandra node has reached that state, 
it never gets out of it by itself; it thrashes itself to death instead, as each 
of the Threads fully locks the Queue for reading and writing by calling 
iterator.next().
- Writing: Only after 262508 locking operations can it progress with actually 
writing to the Queue.
- Reading: Is also blocked, as 324 Threads try to do iterator.next() and fully 
lock the Queue.

This means: Writing blocks the Queue for reading, and readers might even be 
starved, which makes the situation even worse.

-
The setup is:
 - 3-node cluster
 - replication factor 2
 - Consistency LOCAL_ONE
 - No remote DC's
 - high write throughput (10 INSERT statements per second and more during 
peak times).
 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13264) NullPointerException in sstabledump

2017-02-24 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13264:

Description: 
sstabledump can fail with a NullPointerException when not all files are present 
on disk. This frequently happens on Cassandra nodes with high write throughput 
and heavy disk I/O. It looks like Cassandra writes the Data file first, and the 
other files (Statistics, ...) later when it has time to do so; this can even be 
a minute later. Technically this is OK for the Cassandra DB itself, but the 
sstabledump tool does not handle it gracefully.

Current behavior:
 - NullPointerException (see below)

Expected behavior:
 - More graceful behavior, e.g. a message to STDERR


PS: This is minor priority, I may pick up the ticket myself if nobody else is 
faster. 

-
{code}
~/apache-cassandra-3.9/tools/bin/sstabledump /appdata/mc-53346-big-Data.db
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:429)
at 
org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
{code}

FBUtilities.java:429 is the following line, and validationMetadata == null:
bq. if (validationMetadata.partitioner.endsWith("LocalPartitioner"))
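
A hedged sketch of what the requested graceful behaviour could look like. The class and method
names below are illustrative only and do not match the real SSTableExport/FBUtilities code; the
point is just the null check and a message to STDERR instead of an NPE.

{code}
import java.io.File;

// Sketch of "fail gracefully": if the validation metadata (normally read from the
// Statistics.db component) is missing, print a message to STDERR and exit instead
// of dereferencing null. Names here are illustrative, not the actual tool code.
public class SSTableDumpGuardSketch
{
    static final class ValidationMetadata
    {
        final String partitioner;
        ValidationMetadata(String partitioner) { this.partitioner = partitioner; }
    }

    static void checkMetadata(File dataFile, ValidationMetadata validationMetadata)
    {
        if (validationMetadata == null)
        {
            System.err.println("Cannot dump " + dataFile
                               + ": the Statistics.db component is missing or not yet written; "
                               + "retry once the sstable has been fully flushed.");
            System.exit(1);
        }
        // safe to inspect only after the null check
        if (validationMetadata.partitioner.endsWith("LocalPartitioner"))
            System.err.println(dataFile + " uses a local partitioner and cannot be dumped");
    }
}
{code}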


  was:
sstabledump can fail with NullPointerException, when not all files are present 
on Disk. This behavior is frequently present on Cassandra nodes with a high 
write-throughput and much disk I/O. It looks like Cassandra writes the Data 
file fast, and the other files (Statistics, ...) when it has time to do so 
(e.g. a minute later). This is OK, but sstabledump does not handle this 
gracefully.

Current behavior:
 - NullPointerException (see below)

Expected behavior:
 - More graceful behavior, e.g. a message to STDERR


PS: This is minor priority, I may pick up the ticket myself if nobody else is 
faster. 

-
{code}
~/apache-cassandra-3.9/tools/bin/sstabledump /appdata/mc-53346-big-Data.db
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:429)
at 
org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
{code}

FBUtilities.java:429 is the following line, and validationMetadata == null:
bq. if (validationMetadata.partitioner.endsWith("LocalPartitioner"))



> NullPointerException in sstabledump
> ---
>
> Key: CASSANDRA-13264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13264
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Priority: Minor
>  Labels: twcs
>
> sstabledump can fail with NullPointerException, when not all files are 
> present on Disk. This behavior is frequently present on Cassandra nodes with 
> a high write-throughput and much disk I/O. It looks like Cassandra writes the 
> Data file fast, and the other files (Statistics, ...) later when it has time 
> to do so. This can be even a minute later. Technically it is OK for the 
> Cassandra DB itself, but the sstabledump tool does not handle this gracefully.
> Current behavior:
>  - NullPointerException (see below)
> Expected behavior:
>  - More graceful behavior, e.g. a message to STDERR
> PS: This is minor priority, I may pick up the ticket myself if nobody else is 
> faster. 
> -
> {code}
> ~/apache-cassandra-3.9/tools/bin/sstabledump /appdata/mc-53346-big-Data.db
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:429)
> at 
> org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
> at 
> org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
> {code}
> FBUtilities.java:429 is the following line, and validationMetadata == null:
> bq. if (validationMetadata.partitioner.endsWith("LocalPartitioner"))



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13264) NullPointerException in sstabledump

2017-02-24 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13264:

Description: 
sstabledump can fail with NullPointerException, when not all files are present 
on Disk. This behavior is frequently present on Cassandra nodes with a high 
write-throughput and much disk I/O. It looks like Cassandra writes the Data 
file fast, and the other files (Statistics, ...) when it has time to do so 
(e.g. a minute later). This is OK, but sstabledump does not handle this 
gracefully.

Current behavior:
 - NullPointerException (see below)

Expected behavior:
 - More graceful behavior, e.g. a message to STDERR


PS: This is minor priority, I may pick up the ticket myself if nobody else is 
faster. 

-
{code}
~/apache-cassandra-3.9/tools/bin/sstabledump /appdata/mc-53346-big-Data.db
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:429)
at 
org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
{code}

FBUtilities.java:429 is the following line, and validationMetadata == null:
bq. if (validationMetadata.partitioner.endsWith("LocalPartitioner"))


  was:
I have a table where all columns are stored with TTL of maximum 4 hours. 
Usually TWCS compaction properly removes  expired data via tombstone compaction 
and also removes fully expired tables. The number of SSTables is nearly 
constant since weeks. Good.

The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
being recreated frequently (judging form the file creation timestamp), but the 
number of tables is growing. Analysis and actions take so far:
- sstablemetadata shows strange data, as if the table is completely empty.
- sstabledump throws an Exception when running it on such a SSTable
- Even triggering a manual major compaction will not remove the old SSTable's. 
To be more precise: They are recreated with new id and timestamp (not sure 
whether they are identical as I cannot inspect content due to the sstabledump 
crash)

{color:blue}edit 2017-01-19: This ticket may be obsolete. See the later 
comments for more information.{color}



> NullPointerException in sstabledump
> ---
>
> Key: CASSANDRA-13264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13264
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Priority: Minor
>  Labels: twcs
>
> sstabledump can fail with NullPointerException, when not all files are 
> present on Disk. This behavior is frequently present on Cassandra nodes with 
> a high write-throughput and much disk I/O. It looks like Cassandra writes the 
> Data file fast, and the other files (Statistics, ...) when it has time to do 
> so (e.g. a minute later). This is OK, but sstabledump does not handle this 
> gracefully.
> Current behavior:
>  - NullPointerException (see below)
> Expected behavior:
>  - More graceful behavior, e.g. a message to STDERR
> PS: This is minor priority, I may pick up the ticket myself if nobody else is 
> faster. 
> -
> {code}
> ~/apache-cassandra-3.9/tools/bin/sstabledump /appdata/mc-53346-big-Data.db
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.cassandra.utils.FBUtilities.newPartitioner(FBUtilities.java:429)
> at 
> org.apache.cassandra.tools.SSTableExport.metadataFromSSTable(SSTableExport.java:104)
> at 
> org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:180)
> {code}
> FBUtilities.java:429 is the following line, and validationMetadata == null:
> bq. if (validationMetadata.partitioner.endsWith("LocalPartitioner"))



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CASSANDRA-13264) NullPointerException in sstabledump

2017-02-24 Thread Christian Esken (JIRA)
Christian Esken created CASSANDRA-13264:
---

 Summary: NullPointerException in sstabledump
 Key: CASSANDRA-13264
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13264
 Project: Cassandra
  Issue Type: Bug
  Components: Compaction
 Environment: Cassandra 3.0.9
Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
1.8.0_112-b15)
Linux 3.16
Reporter: Christian Esken
Priority: Minor


I have a table where all columns are stored with TTL of maximum 4 hours. 
Usually TWCS compaction properly removes  expired data via tombstone compaction 
and also removes fully expired tables. The number of SSTables is nearly 
constant since weeks. Good.

The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
being recreated frequently (judging form the file creation timestamp), but the 
number of tables is growing. Analysis and actions take so far:
- sstablemetadata shows strange data, as if the table is completely empty.
- sstabledump throws an Exception when running it on such a SSTable
- Even triggering a manual major compaction will not remove the old SSTable's. 
To be more precise: They are recreated with new id and timestamp (not sure 
whether they are identical as I cannot inspect content due to the sstabledump 
crash)

{color:blue}edit 2017-01-19: This ticket may be obsolete. See the later 
comments for more information.{color}




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2017-02-24 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken resolved CASSANDRA-13005.
-
Resolution: Cannot Reproduce

I can no longer reproduce this issue.

As explained in the last comment, it might not actually be a real bug, except 
for the NPE in sstabledump. For the latter I will create a follow-up ticket, 
and I will close this one.

> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Priority: Minor
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)
> {color:blue}edit 2017-01-19: This ticket may be obsolete. See the later 
> comments for more information.{color}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2017-01-19 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13005:

Description: 
I have a table where all columns are stored with a TTL of at most 4 hours. 
Usually TWCS compaction properly removes expired data via tombstone compaction 
and also removes fully expired tables. The number of SSTables has been nearly 
constant for weeks. Good.

The problem: Suddenly TWCS does not remove old SSTables any longer. They are 
being recreated frequently (judging from the file creation timestamps), but the 
number of tables is growing. Analysis and actions taken so far:
- sstablemetadata shows strange data, as if the table is completely empty.
- sstabledump throws an Exception when run on such an SSTable.
- Even triggering a manual major compaction will not remove the old SSTables. 
To be more precise: they are recreated with a new id and timestamp (not sure 
whether they are identical, as I cannot inspect the content due to the 
sstabledump crash).

{color:blue}edit 2017-01-19: This ticket may be obsolete. See the later 
comments for more information.{color}


  was:
I have a table where all columns are stored with TTL of maximum 4 hours. 
Usually TWCS compaction properly removes  expired data via tombstone compaction 
and also removes fully expired tables. The number of SSTables is nearly 
constant since weeks. Good.

The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
being recreated frequently (judging form the file creation timestamp), but the 
number of tables is growing. Analysis and actions take so far:
- sstablemetadata shows strange data, as if the table is completely empty.
- sstabledump throws an Exception when running it on such a SSTable
- Even triggering a manual major compaction will not remove the old SSTable's. 
To be more precise: They are recreated with new id and timestamp (not sure 
whether they are identical as I cannot inspect content due to the sstabledump 
crash)

{color:blue}edit 2017-0-19: This ticket may be obsolete. See the later comments 
for more information.{color}



> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Priority: Minor
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)
> {color:blue}edit 2017-01-19: This ticket may be obsolete. See the later 
> comments for more information.{color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2017-01-19 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13005:

Description: 
I have a table where all columns are stored with TTL of maximum 4 hours. 
Usually TWCS compaction properly removes  expired data via tombstone compaction 
and also removes fully expired tables. The number of SSTables is nearly 
constant since weeks. Good.

The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
being recreated frequently (judging form the file creation timestamp), but the 
number of tables is growing. Analysis and actions take so far:
- sstablemetadata shows strange data, as if the table is completely empty.
- sstabledump throws an Exception when running it on such a SSTable
- Even triggering a manual major compaction will not remove the old SSTable's. 
To be more precise: They are recreated with new id and timestamp (not sure 
whether they are identical as I cannot inspect content due to the sstabledump 
crash)

{color:blue}edit 2017-0-19: This ticket may be obsolete. See the later comments 
for more information.{color}


  was:
I have a table where all columns are stored with TTL of maximum 4 hours. 
Usually TWCS compaction properly removes  expired data via tombstone compaction 
and also removes fully expired tables. The number of SSTables is nearly 
constant since weeks. Good.

The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
being recreated frequently (judging form the file creation timestamp), but the 
number of tables is growing. Analysis and actions take so far:
- sstablemetadata shows strange data, as if the table is completely empty.
- sstabledump throws an Exception when running it on such a SSTable
- Even triggering a manual major compaction will not remove the old SSTable's. 
To be more precise: They are recreated with new id and timestamp (not sure 
whether they are identical as I cannot inspect content due to the sstabledump 
crash)




> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Priority: Minor
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)
> {color:blue}edit 2017-0-19: This ticket may be obsolete. See the later 
> comments for more information.{color}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2017-01-19 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829587#comment-15829587
 ] 

Christian Esken commented on CASSANDRA-13005:
-

I am no longer convinced that my bug report is fully accurate.
- It is true that files are missing.
- It is true that a NullPointerException happens within sstabledump.
- OTOH, missing files get created after some time. It may be that this always 
happens, even though it sometimes takes a long time (I have not yet measured 
this, but I encountered cases where the files were still not there after 1 
minute).
- I inspected some of the data files after the missing files were created. At 
that point in time they were correct and contained non-expired data.

Thus, this may not be a bug. I will lower the priority, but will keep this bug 
report open for some time.


> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2017-01-19 Thread Christian Esken (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Esken updated CASSANDRA-13005:

Priority: Minor  (was: Major)

> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>Priority: Minor
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2017-01-16 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823645#comment-15823645
 ] 

Christian Esken commented on CASSANDRA-13005:
-

Adding an additional observation on the event of Jan 13: most or all "missing" 
files appeared after some time. It seems like some background process creates 
those files, but sometimes it takes quite a while, and it happens for multiple 
tables. For example, at one point I checked and files from 13 SSTables were 
missing (78 files overall).

> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2017-01-12 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821041#comment-15821041
 ] 

Christian Esken commented on CASSANDRA-13005:
-

While the issue had not occurred during the last 4 weeks, it has now happened 
again. I am adding the sstableexpiredblockers output, as requested:

{code}
# sstableexpiredblockers cachestore bookinglinkentries 
[BigTableReader(path='/data/cassandra/data/cachestore/bookinglinkentries-a2502c60bba511e6917fcda6eb6df2bb/mc-204995-big-Data.db')
 (minTS = 1484213395148001, maxTS = 1484214299952743, maxLDT = 1484228699)],  
blocks 1 expired sstables from getting dropped: 
[BigTableReader(path='/data/cassandra/data/cachestore/bookinglinkentries-a2502c60bba511e6917fcda6eb6df2bb/mc-205731-big-Data.db')
 (minTS = 1484212495197503, maxTS = 1484213395210562, maxLDT = 1484227795)],
{code}

The broken SSTables do not appear in the sstableexpiredblockers output. As last 
time, the number of SSTables keeps increasing once the problem has occurred 
initially.
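
For context, a simplified, hedged sketch of the rule that sstableexpiredblockers reports on.
This only approximates the real compaction logic (it ignores memtables, gc_grace details and
range overlap), and the class and method names are made up: an sstable whose data is entirely
past its TTL still cannot be dropped while a live sstable holds data older than the expired
sstable's newest timestamp, because dropping it could resurrect shadowed data.

{code}
import java.util.ArrayList;
import java.util.List;

// Rough model of the "fully expired but blocked" rule reported by sstableexpiredblockers.
// Not a copy of the real CompactionController logic; field names mirror the tool's output.
public class FullyExpiredSketch
{
    static final class SSTable
    {
        final String name;
        final long minTimestamp;          // microseconds, like minTS in the output
        final long maxTimestamp;          // microseconds, like maxTS
        final int maxLocalDeletionTime;   // seconds, like maxLDT

        SSTable(String name, long minTimestamp, long maxTimestamp, int maxLocalDeletionTime)
        {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.maxLocalDeletionTime = maxLocalDeletionTime;
        }

        boolean isExpired(int gcBefore) { return maxLocalDeletionTime < gcBefore; }
    }

    /** Returns the sstables that block 'candidate' from being dropped. */
    static List<SSTable> blockersOf(SSTable candidate, List<SSTable> others, int gcBefore)
    {
        List<SSTable> blockers = new ArrayList<>();
        for (SSTable other : others)
        {
            if (other == candidate || other.isExpired(gcBefore))
                continue;
            if (other.minTimestamp < candidate.maxTimestamp)
                blockers.add(other);      // live sstable with older data overlaps the expired one
        }
        return blockers;
    }
}
{code}

Plugging in the minTS/maxTS values from the output above, mc-204995 comes back as a blocker of
mc-205731 because its minTS (1484213395148001) is older than the expired sstable's maxTS 
(1484213395210562).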


> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2016-12-09 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735819#comment-15735819
 ] 

Christian Esken edited comment on CASSANDRA-13005 at 12/9/16 5:13 PM:
--

I have imported some of the old defective SSTables in a test installation via 
sstableloader:

{code}
# sstableloader -d 127.0.0.1 cachestore/entries
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /home/cesken/cachestore/entries/mc-50789-big-Data.db 
/home/cesken/cachestore/entries/mc-51223-big-Data.db 
/home/cesken/cachestore/entries/mc-51351-big-Data.db to [/127.0.0.1]
progress: [/127.0.0.1]0:0/3 0  % total: 0% 3,152MiB/s (avg: 3,152MiB/s)
progress: [/127.0.0.1]0:0/3 0  % total: 0% 1,908GiB/s (avg: 6,294MiB/s)
progress: [/127.0.0.1]0:0/3 0  % total: 0% 1,599GiB/s (avg: 9,423MiB/s)
[...]
progress: [/127.0.0.1]0:2/3 99 % total: 99% 3,177MiB/s (avg: 6,227MiB/s)
progress: [/127.0.0.1]0:3/3 100% total: 100% 3,436MiB/s (avg: 6,214MiB/s)
progress: [/127.0.0.1]0:3/3 100% total: 100% 0,000KiB/s (avg: 6,102MiB/s)

Summary statistics: 
   Connections per host: 1 
   Total files transferred : 3 
   Total bytes transferred : 3,783GiB  
   Total duration  : 634779 ms 
   Average transfer rate   : 6,102MiB/s
   Peak transfer rate  : 9,423MiB/s
{code}


As seen above, the 3 files were loaded, but Cassandra did not import any rows. 
Probably because the files are defective, or because everything in there is 
expired. A SELECT on the table also does not return any data.

{code}
# sstableexpiredblockers cachestore entries
No sstables for cachestore.entries
{code}



was (Author: cesken):
I have imported some of the old defective SSTables in a test installation via 
sstableloader:

{code}
# sstableloader -d 127.0.0.1 cachestore/entries
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /home/cesken/cachestore/entries/mc-50789-big-Data.db 
/home/cesken/cachestore/entries/mc-51223-big-Data.db 
/home/cesken/cachestore/entries/mc-51351-big-Data.db to [/127.0.0.1]
progress: [/127.0.0.1]0:0/3 0  % total: 0% 3,152MiB/s (avg: 3,152MiB/s)
progress: [/127.0.0.1]0:0/3 0  % total: 0% 1,908GiB/s (avg: 6,294MiB/s)
progress: [/127.0.0.1]0:0/3 0  % total: 0% 1,599GiB/s (avg: 9,423MiB/s)
[...]
progress: [/127.0.0.1]0:2/3 99 % total: 99% 3,177MiB/s (avg: 6,227MiB/s)
progress: [/127.0.0.1]0:3/3 100% total: 100% 3,436MiB/s (avg: 6,214MiB/s)
progress: [/127.0.0.1]0:3/3 100% total: 100% 0,000KiB/s (avg: 6,102MiB/s)

Summary statistics: 
   Connections per host: 1 
   Total files transferred : 3 
   Total bytes transferred : 3,783GiB  
   Total duration  : 634779 ms 
   Average transfer rate   : 6,102MiB/s
   Peak transfer rate  : 9,423MiB/s
{code}


As seen above, the 3 files were loaded, but Cassandra did not import any rows. 
Probably because the files are defective, of because everything in there is 
expired. A SELECT on the table also does not return any data.

{code}
# sstableexpiredblockers cachestore entries
No sstables for cachestore.entries
{code}


> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2016-12-09 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735820#comment-15735820
 ] 

Christian Esken commented on CASSANDRA-13005:
-

Is there anything else that I could try?

> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with TTL of maximum 4 hours. 
> Usually TWCS compaction properly removes  expired data via tombstone 
> compaction and also removes fully expired tables. The number of SSTables is 
> nearly constant since weeks. Good.
> The problem:  Suddenly TWCS does not remove old SSTables any longer. They are 
> being recreated frequently (judging form the file creation timestamp), but 
> the number of tables is growing. Analysis and actions take so far:
> - sstablemetadata shows strange data, as if the table is completely empty.
> - sstabledump throws an Exception when running it on such a SSTable
> - Even triggering a manual major compaction will not remove the old 
> SSTable's. To be more precise: They are recreated with new id and timestamp 
> (not sure whether they are identical as I cannot inspect content due to the 
> sstabledump crash)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2016-12-09 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735819#comment-15735819
 ] 

Christian Esken commented on CASSANDRA-13005:
-

I have imported some of the old defective SSTables in a test installation via 
sstableloader:

{code}
# sstableloader -d 127.0.0.1 cachestore/entries
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /home/cesken/cachestore/entries/mc-50789-big-Data.db 
/home/cesken/cachestore/entries/mc-51223-big-Data.db 
/home/cesken/cachestore/entries/mc-51351-big-Data.db to [/127.0.0.1]
progress: [/127.0.0.1]0:0/3 0  % total: 0% 3,152MiB/s (avg: 3,152MiB/s)
progress: [/127.0.0.1]0:0/3 0  % total: 0% 1,908GiB/s (avg: 6,294MiB/s)
progress: [/127.0.0.1]0:0/3 0  % total: 0% 1,599GiB/s (avg: 9,423MiB/s)
[...]
progress: [/127.0.0.1]0:2/3 99 % total: 99% 3,177MiB/s (avg: 6,227MiB/s)
progress: [/127.0.0.1]0:3/3 100% total: 100% 3,436MiB/s (avg: 6,214MiB/s)
progress: [/127.0.0.1]0:3/3 100% total: 100% 0,000KiB/s (avg: 6,102MiB/s)

Summary statistics: 
   Connections per host: 1 
   Total files transferred : 3 
   Total bytes transferred : 3,783GiB  
   Total duration  : 634779 ms 
   Average transfer rate   : 6,102MiB/s
   Peak transfer rate  : 9,423MiB/s
{code}


As seen above, the 3 files were streamed, but Cassandra did not import any rows,
probably because the files are defective or because everything in them is
expired. A SELECT on the table also does not return any data.
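
As a sketch, the kind of query meant here; the table name is taken from the sstableloader invocation above, and the LIMIT is added only to keep the check cheap:

{code}
cqlsh -e "SELECT * FROM cachestore.entries LIMIT 10;"
{code}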

{code}
# sstableexpiredblockers cachestore entries
No sstables for cachestore.entries
{code}
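
A sketch of one further check that could be run on the affected files; the path is taken from the sstableloader run above, and the grep pattern assumes the usual sstablemetadata output fields (minimum/maximum timestamp, estimated droppable tombstones):

{code}
# Inspect the timestamp range and droppable-tombstone estimate of one
# affected SSTable (adjust the path as needed):
sstablemetadata /home/cesken/cachestore/entries/mc-50789-big-Data.db \
  | egrep -i 'timestamp|droppable'
{code}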


> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with a TTL of at most 4 hours.
> Usually TWCS compaction properly removes expired data via tombstone
> compaction and also removes fully expired SSTables. The number of SSTables
> has been nearly constant for weeks. Good.
> The problem: Suddenly TWCS no longer removes old SSTables. They are being
> recreated frequently (judging from the file creation timestamps), but the
> number of SSTables keeps growing. Analysis and actions taken so far:
> - sstablemetadata shows strange data, as if the table were completely empty.
> - sstabledump throws an Exception when run on such an SSTable.
> - Even triggering a manual major compaction does not remove the old
> SSTables. To be more precise: they are recreated with a new id and timestamp
> (not sure whether they are identical, as I cannot inspect the content due to
> the sstabledump crash).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13005) Cassandra TWCS is not removing fully expired tables

2016-12-08 Thread Christian Esken (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732007#comment-15732007
 ] 

Christian Esken commented on CASSANDRA-13005:
-

I can do that. I could provide a patch for it. The HowToContribute page mentions
that I can also do it via GitHub, which I would prefer.

> Cassandra TWCS is not removing fully expired tables
> ---
>
> Key: CASSANDRA-13005
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13005
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 
> 1.8.0_112-b15)
> Linux 3.16
>Reporter: Christian Esken
>  Labels: twcs
> Attachments: sstablemetadata-empty-type-that-is-3GB.txt
>
>
> I have a table where all columns are stored with a TTL of at most 4 hours.
> Usually TWCS compaction properly removes expired data via tombstone
> compaction and also removes fully expired SSTables. The number of SSTables
> has been nearly constant for weeks. Good.
> The problem: Suddenly TWCS no longer removes old SSTables. They are being
> recreated frequently (judging from the file creation timestamps), but the
> number of SSTables keeps growing. Analysis and actions taken so far:
> - sstablemetadata shows strange data, as if the table were completely empty.
> - sstabledump throws an Exception when run on such an SSTable.
> - Even triggering a manual major compaction does not remove the old
> SSTables. To be more precise: they are recreated with a new id and timestamp
> (not sure whether they are identical, as I cannot inspect the content due to
> the sstabledump crash).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

