[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238639#comment-16238639
 ] 

Jason Brown commented on CASSANDRA-12182:
-

bq. You're right (obviously).

Actually, your code was right, I was just pointing out how it was correct ;)

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Wei Deng
>Assignee: Michał Szczygieł
>Priority: Minor
>  Labels: lhf
> Fix For: 4.0
>
> Attachments: 12182-trunk.txt, 12182-trunk.txt, 12182-trunk.txt
>
>
> I was stress testing a C* 3.0 environment and it appears that when the CPU is 
> running low, HINT and MUTATION messages will start to get dropped, and the GC 
> thread can also get some really long-running GC, and I'd get some redundant 
> log entries in system.log like the following:
> {noformat}
> WARN  [Service Thread] 2016-07-12 22:48:45,748  GCInspector.java:282 - G1 
> Young Generation GC in 522ms.  G1 Eden Space: 68157440 -> 0; G1 Old Gen: 
> 3376113224 -> 3468387912; G1 Survivor Space: 24117248 -> 0; 
> INFO  [Service Thread] 2016-07-12 22:48:45,763  StatusLogger.java:52 - Pool 
> NameActive   Pending  Completed   Blocked  All Time 
> Blocked
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,775  MessagingService.java:983 - 
> MUTATION messages were dropped in last 5000 ms: 419 for internal timeout and 
> 0 for cross node timeout
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,776  MessagingService.java:983 - 
> HINT messages were dropped in last 5000 ms: 89 for internal timeout and 0 for 
> cross node timeout
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,776  StatusLogger.java:52 - Pool 
> NameActive   Pending  Completed   Blocked  All Time 
> Blocked
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,798  StatusLogger.java:56 - 
> MutationStage32  4194   32997234 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,798  StatusLogger.java:56 - 
> ViewMutationStage 0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,799  StatusLogger.java:56 - 
> ReadStage 0 0940 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,800  StatusLogger.java:56 - 
> MutationStage32  4363   32997333 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,801  StatusLogger.java:56 - 
> ViewMutationStage 0 0  0 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,801  StatusLogger.java:56 - 
> ReadStage 0 0940 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,802  StatusLogger.java:56 - 
> RequestResponseStage  0 0   11094437 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,802  StatusLogger.java:56 - 
> ReadRepairStage   0 0  5 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,803  StatusLogger.java:56 - 
> RequestResponseStage  4 0   11094509 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,807  StatusLogger.java:56 - 
> ReadRepairStage   0 0  5 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,808  StatusLogger.java:56 - 
> CounterMutationStage  0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,809  StatusLogger.java:56 - 
> MiscStage 0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,809  StatusLogger.java:56 - 
> CompactionExecutor262   1234 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,810  StatusLogger.java:56 - 
> MemtableReclaimMemory 0 0 79 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,810  StatusLogger.java:56 - 
> PendingRangeCalculator0 0  3 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,819  StatusLogger.java:56 - 
> GossipStage   0 0   5214 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,820  StatusLogger.java:56 - 
> SecondaryIndexManagement  0 

[jira] [Updated] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-12182:

   Resolution: Fixed
 Reviewer: Jason Brown
Fix Version/s: 4.0
   Status: Resolved  (was: Patch Available)

committed as sha {{5b09543f64eafb1344f7814a80b73d312d5bbc37}}

Thanks for the patch, and thanks for indulging me on concurrency issues :)


cassandra git commit: Allow only one concurrent call to StatusLogger

2017-11-03 Thread jasobrown
Repository: cassandra
Updated Branches:
  refs/heads/trunk 260846685 -> 5b09543f6


Allow only one concurrent call to StatusLogger

patch by mszczygiel; reviewed by jasobrown for CASSANDRA-12182
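
For readers skimming the archive, the idea behind this commit (see the StatusLogger.java hunk in the diff) can be tried in isolation. The following is only a minimal sketch, not the committed code: the class SingleFlightLogger and its methods log and demo are hypothetical names made up for illustration, and only ReentrantLock#tryLock is taken from the patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for StatusLogger; only the tryLock pattern mirrors the patch.
final class SingleFlightLogger
{
    private static final ReentrantLock busy = new ReentrantLock();

    // Runs the report unless another thread is already running one; never blocks on the lock.
    static boolean log(Runnable report)
    {
        if (!busy.tryLock())
            return false; // a concurrent caller is already logging, so drop this attempt
        try
        {
            report.run();
            return true;
        }
        finally
        {
            busy.unlock();
        }
    }

    // Returns { result of a call made while another thread is logging, result of a later call }.
    static boolean[] demo()
    {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> log(() -> {
            started.countDown(); // the lock is already held at this point
            await(release);      // keep "logging" until the main thread has tried
        }));
        holder.start();
        await(started);
        boolean whileBusy = log(() -> {});
        release.countDown();
        try
        {
            holder.join();
        }
        catch (InterruptedException e)
        {
            throw new IllegalStateException(e);
        }
        boolean afterwards = log(() -> {});
        return new boolean[]{ whileBusy, afterwards };
    }

    private static void await(CountDownLatch latch)
    {
        try
        {
            latch.await();
        }
        catch (InterruptedException e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        boolean[] result = demo();
        System.out.println("while busy: " + result[0] + ", afterwards: " + result[1]);
    }
}
```

Because tryLock returns immediately instead of queueing, a second caller is dropped rather than delayed, which matches the commit's intent of discarding concurrent attempts. Note that ReentrantLock is reentrant, so the guard only applies across threads.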


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5b09543f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5b09543f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5b09543f

Branch: refs/heads/trunk
Commit: 5b09543f64eafb1344f7814a80b73d312d5bbc37
Parents: 2608466
Author: mszczygiel 
Authored: Tue Oct 31 22:46:56 2017 +0100
Committer: Jason Brown 
Committed: Fri Nov 3 17:28:05 2017 -0700

--
 CHANGES.txt |   1 +
 .../apache/cassandra/utils/StatusLogger.java|  25 ++-
 .../cassandra/utils/StatusLoggerTest.java   | 160 +++
 3 files changed, 185 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5b09543f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 71f4b1d..e214177 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Allow only one concurrent call to StatusLogger (CASSANDRA-12182)
  * Refactoring to specialised functional interfaces (CASSANDRA-13982)
  * Speculative retry should allow more friendly params (CASSANDRA-13876)
  * Throw exception if we send/receive repair messages to incompatible nodes (CASSANDRA-13944)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5b09543f/src/java/org/apache/cassandra/utils/StatusLogger.java
--
diff --git a/src/java/org/apache/cassandra/utils/StatusLogger.java b/src/java/org/apache/cassandra/utils/StatusLogger.java
index c33190b..9f9d869 100644
--- a/src/java/org/apache/cassandra/utils/StatusLogger.java
+++ b/src/java/org/apache/cassandra/utils/StatusLogger.java
@@ -19,11 +19,13 @@ package org.apache.cassandra.utils;
 
 import java.lang.management.ManagementFactory;
 import java.util.Map;
+import java.util.concurrent.locks.ReentrantLock;
 import javax.management.*;
 
 import org.apache.cassandra.cache.*;
 
 import org.apache.cassandra.metrics.ThreadPoolMetrics;
+
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -37,10 +39,31 @@ import org.apache.cassandra.service.CacheService;
 public class StatusLogger
 {
     private static final Logger logger = LoggerFactory.getLogger(StatusLogger.class);
-
+    private static final ReentrantLock busyMonitor = new ReentrantLock();
 
     public static void log()
     {
+        // avoid logging more than once at the same time. throw away any attempts to log concurrently, as it would be
+        // confusing and noisy for operators - and don't bother logging again, immediately as it'll just be the same data
+        if (busyMonitor.tryLock())
+        {
+            try
+            {
+                logStatus();
+            }
+            finally
+            {
+                busyMonitor.unlock();
+            }
+        }
+        else
+        {
+            logger.trace("StatusLogger is busy");
+        }
+    }
+
+    private static void logStatus()
+    {
         MBeanServer server = ManagementFactory.getPlatformMBeanServer();
 
         // everything from o.a.c.concurrent

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5b09543f/test/unit/org/apache/cassandra/utils/StatusLoggerTest.java
--
diff --git a/test/unit/org/apache/cassandra/utils/StatusLoggerTest.java b/test/unit/org/apache/cassandra/utils/StatusLoggerTest.java
new file mode 100644
index 000..878e6e8
--- /dev/null
+++ b/test/unit/org/apache/cassandra/utils/StatusLoggerTest.java
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.utils;
+
+import java.util.Comparator;
+import java.util.List;
+import java.util.Map;
+import 

[jira] [Comment Edited] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238548#comment-16238548
 ] 

Michał Szczygieł edited comment on CASSANDRA-12182 at 11/3/17 11:09 PM:


You're right (obviously). Feel free to remove {{InMemoryAppender#getEvents}} 
from my latest patch and replace it with direct field access. But please don't 
use the previous patch, because the test would incorrectly fail when the time 
ranges are not connected ({{Range#intersect}} would throw 
{{IllegalArgumentException}} in that case). Unit test run for the latest patch: 
[https://circleci.com/gh/mszczygiel/cassandra/4] 


was (Author: mychal):
You're right (obviously). Feel free to remove {{InMemoryAppender#getEvents}} 
from my latest patch and replace it with direct field access. But please don't 
use the previous patch, because the test would incorrectly fail when the time 
ranges are not connected ({{Range.intersect}} would throw 
{{IllegalArgumentException}} in that case). Unit test run for the latest patch: 
[https://circleci.com/gh/mszczygiel/cassandra/4] 


[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238548#comment-16238548
 ] 

Michał Szczygieł commented on CASSANDRA-12182:
--

You're right (obviously). Feel free to remove {{InMemoryAppender#getEvents}} 
from my latest patch and replace it with direct field access. But please don't 
use the previous patch, because the test would incorrectly fail when the time 
ranges are not connected ({{Range.intersect}} would throw 
{{IllegalArgumentException}} in that case). Unit test run for the latest patch: 
[https://circleci.com/gh/mszczygiel/cassandra/4] 


[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238521#comment-16238521
 ] 

Jason Brown commented on CASSANDRA-12182:
-

{{#shutdown()}} guarantees that (from the javadoc):

{blockquote}
 * Initiates an orderly shutdown in which previously submitted
 * tasks are executed, but no new tasks will be accepted.
 * Invocation has no additional effect if already shut down.
 *
 * This method does not wait for previously submitted tasks to
 * complete execution.  Use {@link #awaitTermination awaitTermination}
 * to do that.
{blockquote}

You are calling {{#awaitTermination}}, which blocks the test thread until the 
tasks complete. I'm pretty sure the entries in {{InMemoryAppender#events}} 
a) will have been written (because we've waited until the tasks are complete) 
and b) will be visible (because {{AppenderBase#doAppend}} is synchronized). 
wdyt?
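
The pattern under discussion can be reproduced outside the test. This is a rough sketch only, under stated assumptions: AwaitThenRead, append and run are hypothetical names, not code from the patch, and the plain list stands in for {{InMemoryAppender#events}}. It shows the shape of the argument (synchronized writers, then shutdown and awaitTermination, then a direct field read); it does not by itself prove the visibility guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the appender used by the test.
final class AwaitThenRead
{
    // Plain list; writers only touch it through the synchronized append below.
    private final List<String> events = new ArrayList<>();

    synchronized void append(String event)
    {
        events.add(event);
    }

    // Submits two writers, waits for the pool to terminate, then reads the field directly.
    static int run()
    {
        AwaitThenRead sink = new AwaitThenRead();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> sink.append("first"));
        pool.execute(() -> sink.append("second"));
        pool.shutdown(); // previously submitted tasks still run, no new ones are accepted
        try
        {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS))
                throw new IllegalStateException("tasks did not finish in time");
        }
        catch (InterruptedException e)
        {
            throw new IllegalStateException(e);
        }
        return sink.events.size(); // direct field access, no accessor
    }

    public static void main(String[] args)
    {
        System.out.println("events recorded: " + run());
    }
}
```

The 10 second timeout is an arbitrary choice for the sketch. Checking the boolean returned by awaitTermination matters, since a timeout would otherwise look like a successful wait.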




[jira] [Created] (CASSANDRA-13992) Don't send new_metadata_id for conditional updates

2017-11-03 Thread Olivier Michallat (JIRA)
Olivier Michallat created CASSANDRA-13992:
-

 Summary: Don't send new_metadata_id for conditional updates
 Key: CASSANDRA-13992
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13992
 Project: Cassandra
  Issue Type: Bug
Reporter: Olivier Michallat
Priority: Minor


This is a follow-up to CASSANDRA-10786.

Given the table
{code}
CREATE TABLE foo (k int PRIMARY KEY)
{code}
And the prepared statement
{code}
INSERT INTO foo (k) VALUES (?) IF NOT EXISTS
{code}

The result set metadata changes depending on the outcome of the update:
* if the row didn't exist, there is only a single column \[applied] = true
* if it did, the result contains \[applied] = false, plus the current value of 
column k.

The way this was handled so far is that the PREPARED response contains no 
result set metadata, and therefore all EXECUTE messages have SKIP_METADATA = 
false, and the responses always include the full (and correct) metadata.

CASSANDRA-10786 still sends the PREPARED response with no metadata, *but the 
response to EXECUTE now contains a {{new_metadata_id}}*. The driver thinks it 
is because of a schema change, and updates its local copy of the prepared 
statement's result metadata.

The next EXECUTE is sent with SKIP_METADATA = true, but the server appears to 
ignore that, and still sends the metadata in the response. So each response 
includes the correct metadata, the driver uses it, and there is no visible 
issue for client code.

The only drawback is that the driver updates its local copy of the metadata 
unnecessarily, every time. We can work around that by only updating if we had 
metadata before, at the cost of an extra volatile read. But I think the best 
thing to do would be to never send a {{new_metadata_id}} for a conditional 
update.
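
The driver-side workaround mentioned above could look roughly like the sketch below. Everything in it is an assumption made for illustration: PreparedResultMetadata, onNewMetadata and cached are hypothetical names, not part of any real driver API, and column names stand in for full result set metadata.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical driver-side holder for a prepared statement's result metadata.
final class PreparedResultMetadata
{
    // null means the PREPARED response carried no result set metadata
    private volatile List<String> columns;

    PreparedResultMetadata(List<String> columnsFromPrepared)
    {
        this.columns = columnsFromPrepared;
    }

    // Called when a response carries a new_metadata_id; returns true if the local copy was replaced.
    boolean onNewMetadata(List<String> columnsFromResponse)
    {
        if (columns == null) // the extra volatile read
            return false;    // nothing was cached, so keep relying on per-response metadata
        columns = columnsFromResponse;
        return true;
    }

    List<String> cached()
    {
        return columns;
    }

    public static void main(String[] args)
    {
        PreparedResultMetadata conditional = new PreparedResultMetadata(null);
        System.out.println(conditional.onNewMetadata(Arrays.asList("[applied]")));

        PreparedResultMetadata regular = new PreparedResultMetadata(Arrays.asList("k"));
        System.out.println(regular.onNewMetadata(Arrays.asList("k", "v")));
    }
}
```

With this guard a conditional update never populates the cached copy, so later EXECUTE messages keep behaving as they did before CASSANDRA-10786, while a genuine schema change on a regular statement still refreshes it.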



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michał Szczygieł updated CASSANDRA-12182:
-
Status: Patch Available  (was: In Progress)


[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238426#comment-16238426
 ] 

Michał Szczygieł commented on CASSANDRA-12182:
--

But {{AppenderBase#doAppend}} is called only from the 2 threads created by the 
{{ExecutorService}} in {{submitTwoLogRequestsConcurrently}}. The thread the 
test runs on never uses the {{InMemoryAppender}} instance as a monitor when it 
reads {{InMemoryAppender#events}}, so unless I'm totally missing something, 
we're not guaranteed that {{AppenderBase#doAppend}} happens-before the read of 
{{events}}.
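
One conservative way to address this concern is to have the reader take the same monitor as the writers. The sketch below assumes a hypothetical class MonitorGuardedEvents with append and getEvents methods; it is not the InMemoryAppender from the patch, only an illustration of reading under the writers' lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical appender-like class: writers and readers share one monitor (this).
final class MonitorGuardedEvents
{
    private final List<String> events = new ArrayList<>();

    // Writers lock the instance, as a synchronized doAppend would.
    synchronized void append(String event)
    {
        events.add(event);
    }

    // Readers lock the same instance, so every append that completed earlier is visible here.
    synchronized List<String> getEvents()
    {
        return new ArrayList<>(events); // defensive copy, safe to inspect outside the lock
    }

    public static void main(String[] args) throws InterruptedException
    {
        MonitorGuardedEvents sink = new MonitorGuardedEvents();
        Thread writer = new Thread(() -> sink.append("from another thread"));
        writer.start();
        writer.join();
        System.out.println(sink.getEvents());
    }
}
```

The cost is one uncontended lock acquisition per read, which is negligible in a unit test.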

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Wei Deng
>Assignee: Michał Szczygieł
>Priority: Minor
>  Labels: lhf
> Attachments: 12182-trunk.txt, 12182-trunk.txt, 12182-trunk.txt
>
>

[jira] [Updated] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Michał Szczygieł (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michał Szczygieł updated CASSANDRA-12182:
-
Attachment: 12182-trunk.txt

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
> Issue Type: Bug
> Reporter: Wei Deng
> Assignee: Michał Szczygieł
> Priority: Minor
> Labels: lhf
> Attachments: 12182-trunk.txt, 12182-trunk.txt, 12182-trunk.txt
>
>

[jira] [Updated] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-03 Thread Loic Lambiel (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Loic Lambiel updated CASSANDRA-13948:
-
Attachment: threaddump.txt

> Reload compaction strategies when JBOD disk boundary changes
> 
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Major
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, threaddump.txt, trace.log
>
>
> The thread dump below shows a race between an sstable replacement by the 
> {{IndexSummaryRedistribution}} and 
> {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information 
> may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
> line=175 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
> @bci=1, line=836 (Compiled frame)
>  - 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
>  int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) 
> @bci=17, line=1199 (Compiled frame)
>  - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, 
> line=943 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable,
>  java.lang.Iterable) @bci=359, line=483 (Interpreted frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification,
>  java.lang.Object) @bci=53, line=555 (Interpreted frame)
>  - 
> org.apache.cassandra.db.lifecycle.Tracker.notifySSTablesChanged(java.util.Collection,
>  java.util.Collection, org.apache.cassandra.db.compaction.OperationType, 
> java.lang.Throwable) @bci=50, line=409 (Interpreted frame)
>  - 
> org.apache.cassandra.db.lifecycle.LifecycleTransaction.doCommit(java.lang.Throwable)
>  @bci=157, line=227 (Interpreted frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit(java.lang.Throwable)
>  @bci=61, line=116 (Compiled frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit()
>  @bci=2, line=200 (Interpreted frame)
>  - 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish()
>  @bci=5, line=185 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries()
>  @bci=559, line=130 (Interpreted frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
>  @bci=9, line=1420 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(org.apache.cassandra.io.sstable.IndexSummaryRedistribution)
>  @bci=4, line=250 (Interpreted frame)
>  - 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries() 
> @bci=30, line=228 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow() 
> @bci=4, line=125 (Interpreted frame)
>  - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 
> (Interpreted frame)
>  - 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run()
>  @bci=4, line=118 (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 
> (Compiled frame)
>  - java.util.concurrent.FutureTask.runAndReset() @bci=47, line=308 (Compiled 
> frame)
>  - 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
>  @bci=1, line=180 (Compiled frame)
>  - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() 
> @bci=37, line=294 (Compiled frame)
>  - 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
>  @bci=95, line=1149 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 
> (Interpreted frame)
>  - 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable)
>  @bci=1, line=81 (Interpreted frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$8.run() @bci=4 
> (Interpreted frame)
>  - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
> {noformat}
> {noformat}
> Thread 94573: (state = IN_JAVA)
>  - 

[jira] [Commented] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-03 Thread Loic Lambiel (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238384#comment-16238384
 ] 

Loic Lambiel commented on CASSANDRA-13948:
--

I've deployed the patch on a few big nodes. I've not seen the error popping up 
so far.

However, I'm still facing issues with compactions. These are big nodes with 
a big CF, holding many SSTables and pending compactions. According to the thread 
dump, it seems to be stuck around getNextBackgroundTask. Compactions are still 
being processed for the other keyspace. Besides that, the node is running 
normally. Some nodetool commands, such as compactionstats, take a long time to 
complete. The debug log doesn't show any errors.

{code:java}
CREATE TABLE blobstore.block (
    inode uuid,
    version timeuuid,
    block bigint,
    offset bigint,
    chunksize int,
    payload blob,
    PRIMARY KEY ((inode, version, block), offset)
) WITH CLUSTERING ORDER BY (offset ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'enabled': 'true', 'tombstone_compaction_interval': '60', 'tombstone_threshold': '0.2', 'unchecked_tombstone_compaction': 'false'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 172000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
{code}


{code:java}
Keyspace : blobstore
    Read Count: 97019
    Read Latency: 2.4842547026871027 ms.
    Write Count: 472590
    Write Latency: 0.060107954040500226 ms.
    Pending Flushes: 0
        Table: block
        SSTable count: 43373
        SSTables in each level: [18890/4, 115/10, 198/100, 1905/1000, 9451, 12814, 0, 0, 0]
        Space used (live): 4839933810943
        Space used (total): 4839933815913
        Space used by snapshots (total): 0
        Off heap memory used (total): 3273703284
        SSTable Compression Ratio: 0.9416884172984209
        Number of partitions (estimate): 2925826
        Memtable cell count: 41542
        Memtable data size: 2631688187
        Memtable off heap memory used: 2638649871
        Memtable switch count: 7
        Local read count: 87281
        Local read latency: 2.186 ms
        Local write count: 465591
        Local write latency: 0.124 ms
        Pending flushes: 0
        Percent repaired: 4.01
        Bloom filter false positives: 297882
        Bloom filter false ratio: 0.69198
        Bloom filter space used: 5111208
        Bloom filter off heap memory used: 4764232
        Index summary off heap memory used: 3360917
        Compression metadata off heap memory used: 626928264
        Compacted partition minimum bytes: 61
        Compacted partition maximum bytes: 186563160
        Compacted partition mean bytes: 1797922
        Average live cells per slice (last five minutes): 8.641592920353983
        Maximum live cells per slice (last five minutes): 258
        Average tombstones per slice (last five minutes): 1.0
        Maximum tombstones per slice (last five minutes): 1
        Dropped Mutations: 0

{code}

{code:java}
nodetool compactionstats
pending tasks: 3362
- blobstore.block: 3362
{code}
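
The thread dump in the issue description shows a thread parked in {{ReentrantReadWriteLock$WriteLock.lock()}} inside {{CompactionStrategyManager.handleListChangedNotification}}. As a rough illustration of that kind of stall, the sketch below assumes the other side is holding the matching read lock for a long time. That assumption is not confirmed by the dump excerpt, and the class and method names are invented for the example. It only demonstrates the lock semantics: a writer cannot acquire the write lock while any reader still holds the read lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class WriteLockStallSketch {
    /** Returns whether a writer could take the lock while a reader held it. */
    static boolean writerAcquiresWhileReaderHolds() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch readerHasLock = new CountDownLatch(1);
        CountDownLatch releaseReader = new CountDownLatch(1);

        // Hypothetical long-running reader, e.g. a scan over many sstables.
        Thread reader = new Thread(() -> {
            lock.readLock().lock();
            try {
                readerHasLock.countDown();
                releaseReader.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        reader.start();

        try {
            readerHasLock.await();
            // A plain lock() would park here until the reader finishes;
            // tryLock with a timeout lets the sketch observe the stall.
            boolean acquired = lock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            releaseReader.countDown();
        }
    }

    public static void main(String[] args) {
        System.out.println(writerAcquiresWhileReaderHolds());
    }
}
```

With tens of thousands of SSTables in one table, any work done under the read lock could plausibly take long enough for writers to queue up like this, but that would need to be checked against the full thread dump.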

> Reload compaction strategies when JBOD disk boundary changes
> 
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Major
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, trace.log
>
>

[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238351#comment-16238351
 ] 

Jason Brown commented on CASSANDRA-12182:
-

Well, it always helps to commit the changes before you push up a branch 
(facepalm).  My branch is updated now.

I do not believe you necessarily need to do anything wrt visibility, as that is 
handled by {{AppenderBase#doAppend}} being {{synchronized}}. 

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
> Issue Type: Bug
> Reporter: Wei Deng
> Assignee: Michał Szczygieł
> Priority: Minor
> Labels: lhf
> Attachments: 12182-trunk.txt, 12182-trunk.txt
>
>

[jira] [Comment Edited] (CASSANDRA-13663) Cassandra 3.10 crashes without dump

2017-11-03 Thread Ricardo Bartolome (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238184#comment-16238184
 ] 

Ricardo Bartolome edited comment on CASSANDRA-13663 at 11/3/17 7:24 PM:


UPDATE: in our case (not the case of the author of the ticket) the JVM was 
crashing. We realised this by enabling the Oracle JVM ErrorFile option and 
kernel core dumps.
{code}
-XX:ErrorFile=/var/lib/cassandra/heapdump/cassandra-jvm-file-error-1509734684-pid31745.log
{code}

This is Cassandra 3.9. It happens with OracleJDK 1.8.0_112 and 1.8.0_131, with 
kernel 4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64.

We'll try to share the crash log, but since we are not familiar with its 
contents, we are checking that it does not contain any sensitive information.



was (Author: ricbartm):
UPDATE: in our case (not the case of the author of the ticket) the JVM was 
crashing. We realised this by enabling the Oracle JVM ErrorFile option and 
kernel core dumps.
{code}
-XX:ErrorFile=/var/lib/cassandra/heapdump/cassandra-jvm-file-error-1509734684-pid31745.log
{code}

It happens with OracleJDK 1.8.0_112 and 1.8.0_131, with kernel 
4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64.

We'll try to share the crash log, but since we are not familiar with its 
contents, we are checking that it does not contain any sensitive information.


> Cassandra 3.10 crashes without dump
> ---
>
> Key: CASSANDRA-13663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13663
> Project: Cassandra
> Issue Type: Bug
> Reporter: Matthias Otto
> Priority: Minor
> Attachments: 2017-07-04 10_48_34-CloudWatch Management Console.png, 
> RamUsageExamle1.png, RamUsageExample2.png, cassandra debug.log, cassandra 
> system.log
>
>
> Hello. My company runs a 5 node Cassandra cluster. For the last few weeks, we 
> have had a sporadic issue where one of the servers crashes without creating a 
> dump file and without any error messages in the logs. If one restarts the 
> service (which we have by now scripted to happen automatically), the server 
> resumes work with no complaint.
> Log files from the time of the last crash are attached, though again they do 
> not log any crash happening.
> Regarding our setup, we are running these servers on Amazon AWS, with 3 
> volumes per server, one for the system, one for data and one for the 
> commitlog. When a crash happens, we can observe a sudden spike of read 
> activity on the commitlog volume. All of these have ample free space. 
> In particular, the system volume has more than enough free space for a dump 
> to be written.
> The servers are Ubuntu 16.04 servers and Cassandra is installed from the 
> apt-get package for version 3.10.
> It is worth noting that these crashes happen more often when nodetool is 
> running either a repair job or a backup job, but this is by no means always 
> the case. As for frequency, we have had about 1-2 crashes per week for the 
> last month.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13663) Cassandra 3.10 crashes without dump

2017-11-03 Thread Ricardo Bartolome (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238205#comment-16238205
 ] 

Ricardo Bartolome commented on CASSANDRA-13663:
---

We can't easily share the JVM crash log because it contains a lot of 
information about keyspaces that exposes customer names. So far we can share 
this:
{code}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f8e9fcf75ac, pid=16151, tid=0x7f8b7aa93700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J 21889 C2 org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(Lorg/apache/cassandra/db/rows/UnfilteredRowIterator;)Lorg/apache/cassandra/db/RowIndexEntry; (361 bytes) @ 0x7f8e9fcf75ac [0x7f8e9fcf42c0+0x32ec]
#
# Core dump written. Default location: //core or core.16151
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---  T H R E A D  ---

Current thread (0x0cd77920):  JavaThread "PerDiskMemtableFlushWriter_0:140" daemon [_thread_in_Java, id=28952, stack(0x7f8b7aa53000,0x7f8b7aa94000)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x2fbb07de
{code}

> Cassandra 3.10 crashes without dump
> ---
>
> Key: CASSANDRA-13663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13663
> Project: Cassandra
> Issue Type: Bug
> Reporter: Matthias Otto
> Priority: Minor
> Attachments: 2017-07-04 10_48_34-CloudWatch Management Console.png, 
> RamUsageExamle1.png, RamUsageExample2.png, cassandra debug.log, cassandra 
> system.log
>
>

[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Michał Szczygieł (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238190#comment-16238190
 ] 

Michał Szczygieł commented on CASSANDRA-12182:
--

[~jasobrown] Thank you for the feedback; however, I'm afraid I can't see your 
updates in the branch you've linked. I agree that {{trace}} level is the better 
option. I think you've actually spotted a visibility issue, because I did not 
do anything that guarantees visibility of {{InMemoryAppender#events}} in the 
thread that the test runs on (which only reads {{events}}). I'm going to adjust 
the test and submit a new patch.

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
> Issue Type: Bug
> Reporter: Wei Deng
> Assignee: Michał Szczygieł
> Priority: Minor
> Labels: lhf
> Attachments: 12182-trunk.txt, 12182-trunk.txt
>
>

[jira] [Updated] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Michał Szczygieł (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michał Szczygieł updated CASSANDRA-12182:
-
Status: In Progress  (was: Patch Available)

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
> Issue Type: Bug
> Reporter: Wei Deng
> Assignee: Michał Szczygieł
> Priority: Minor
> Labels: lhf
> Attachments: 12182-trunk.txt, 12182-trunk.txt
>
>

[jira] [Commented] (CASSANDRA-13663) Cassandra 3.10 crashes without dump

2017-11-03 Thread Ricardo Bartolome (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238184#comment-16238184
 ] 

Ricardo Bartolome commented on CASSANDRA-13663:
---

UPDATE: in our case (not the case of the author of the ticket) the JVM was 
crashing. We realised this by enabling the Oracle JVM ErrorFile option and 
kernel core dumps.
{code}
-XX:ErrorFile=/var/lib/cassandra/heapdump/cassandra-jvm-file-error-1509734684-pid31745.log
{code}
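
For anyone wanting to confirm that a running JVM actually picked up the flag, here is a minimal sketch. The class and method names (ErrorFileCheck, errorFilePath) are made up for illustration; only RuntimeMXBean.getInputArguments() is a real JDK call. It reports on the JVM it runs in, so the helper takes the argument list as a parameter and can equally be fed arguments copied from a process listing.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

public class ErrorFileCheck {
    private static final String FLAG = "-XX:ErrorFile=";

    // Hypothetical helper: returns the path given to -XX:ErrorFile, if the flag is present.
    static Optional<String> errorFilePath(List<String> jvmArgs) {
        return jvmArgs.stream()
                      .filter(arg -> arg.startsWith(FLAG))
                      .map(arg -> arg.substring(FLAG.length()))
                      .findFirst();
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excludes main-class arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(errorFilePath(jvmArgs)
                .map(path -> "ErrorFile set to " + path)
                .orElse("ErrorFile not set; HotSpot falls back to hs_err_pid<pid>.log in the working directory"));
    }
}
```

If the flag is absent, the fatal error log goes to the process working directory, which may not be where one expects for a packaged service.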

It happens with OracleJDK 1.8.0_112 and 1.8.0_131, with kernels 
4.9.43-17.38.amzn1.x86_64 and 3.14.35-28.38.amzn1.x86_64.

We'll try to share the crash log, but since we are not familiar with its 
contents, we are checking that it does not contain any sensitive information.


> Cassandra 3.10 crashes without dump
> ---
>
> Key: CASSANDRA-13663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13663
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Matthias Otto
>Priority: Minor
> Attachments: 2017-07-04 10_48_34-CloudWatch Management Console.png, 
> RamUsageExamle1.png, RamUsageExample2.png, cassandra debug.log, cassandra 
> system.log
>
>
> Hello. My company runs a 5 node Cassandra cluster. For the last few weeks, we 
> have had a sporadic issue where one of the servers crashes without creating a 
> dump file and without any error messages in the logs. If one restarts the 
> service (which we have by now scripted to happen automatically), the server 
> resumes work with no complaint.
> Log files from the time of the last crash are attached, though again they do 
> not log any crash happening.
> Regarding our setup, we are running these servers on Amazon AWS, with 3 
> volumes per server, one for the system, one for data and one for the 
> commitlog. When a crash happens, we can observe a sudden spike of read 
> activity on the commitlog volume. All of these have ample free space. 
> In particular, the system volume has more than enough free space for a dump 
> to be written.
> The servers are Ubuntu 16.04 servers and Cassandra is installed from the 
> apt-get package for version 3.10.
> It is worth noting that these crashes happen more often when nodetool is 
> running either a repair job or a backup job, but this is by no means always the 
> case. As for frequency, we have had about 1-2 crashes per week for the last 
> month.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13957) upgradesstables fails after upgrading from 2.1.x to 3.0.14

2017-11-03 Thread Dan Priscornic (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238127#comment-16238127
 ] 

Dan Priscornic commented on CASSANDRA-13957:


Thank you. I will try to reproduce on Monday and get back to you.

> upgradesstables fails after upgrading from 2.1.x to 3.0.14
> --
>
> Key: CASSANDRA-13957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13957
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dan Priscornic
>Priority: Major
>
> After upgrading DSE from 4.8.14 (cassandra 2.1.18.1463) to 5.0.10 (cassandra 
> 3.0.14.1862) I ran nodetool upgradesstables and it fails with the following 
> stack trace:
> {code:java}
> # nodetool -u cassandra -pwf /etc/dse/cassandra/jmxremote.password 
> upgradesstables
> error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.Rows.collectStats(Rows.java:70)
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter$StatsCollector.applyToRow(BigTableWriter.java:197)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:137)
>   at 
> org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:111)
>   at 
> org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:52)
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:149)
>   at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:125)
>   at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
>   at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:205)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:99)
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:427)
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:314)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>   at 
> org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$6/61137731.run(Unknown
>  Source)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The bug seems similar to CASSANDRA-13320 which says it should be fixed in 
> cassandra 3.0.13 but does not look fixed in 3.0.14






[jira] [Commented] (CASSANDRA-13957) upgradesstables fails after upgrading from 2.1.x to 3.0.14

2017-11-03 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238086#comment-16238086
 ] 

Jeff Jirsa commented on CASSANDRA-13957:


The stack you're posting is from the client side - on the server there's also 
going to be a stack, and it'll likely have a thread name associated with it 
(likely CompactionExecutor:NN ). Just above it in the logs, you'll likely see a 
line about that same thread upgrading a specific sstable. In 3.0, this may be 
in a debug log instead of the primary system.log.
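
A rough sketch of that search, assuming the server log uses the usual "LEVEL [ThreadName] timestamp ..." layout. The class name, method name, and the exact wording of the lines it matches are illustrative assumptions, not taken from Cassandra's code:

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindUpgradingSSTable {
    // Matches e.g. "ERROR [CompactionExecutor:12] ..." and captures the level and thread name.
    private static final Pattern THREAD =
            Pattern.compile("^(\\w+)\\s+\\[(CompactionExecutor:\\d+)\\]");

    // Finds the first ERROR line logged by a CompactionExecutor thread and returns
    // the closest earlier line written by that same thread, if there is one.
    static Optional<String> lineBeforeError(List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            Matcher m = THREAD.matcher(lines.get(i));
            if (m.find() && m.group(1).equals("ERROR")) {
                String tag = "[" + m.group(2) + "]";
                for (int j = i - 1; j >= 0; j--) {
                    if (lines.get(j).contains(tag)) {
                        return Optional.of(lines.get(j));
                    }
                }
                return Optional.empty();
            }
        }
        return Optional.empty();
    }
}
```

Feeding it the lines of system.log (or debug.log on 3.0) should surface the candidate line naming the sstable, provided the layout assumption holds.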




[jira] [Commented] (CASSANDRA-13957) upgradesstables fails after upgrading from 2.1.x to 3.0.14

2017-11-03 Thread Dan Priscornic (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238077#comment-16238077
 ] 

Dan Priscornic commented on CASSANDRA-13957:


[~jjirsa], thank you for replying.
How do I detect the sstable that triggers it? It is not in the stack trace.




[jira] [Issue Comment Deleted] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-12182:
---
Comment: was deleted

(was: I realize it’s a pretty simple patch, but couldn’t NoSpamLogger give us 
this same behavior?)
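
For context, a minimal sketch of one way to avoid two threads printing the same status dump at once: skip the dump if another thread is already in the middle of one. This is an illustration only, not the attached patch and not how NoSpamLogger works; the class and method names are made up.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class GuardedStatusDump {
    // True while some thread is in the middle of printing the status table.
    private static final AtomicBoolean busy = new AtomicBoolean(false);

    // Runs the dump unless another dump is already in progress.
    // Returns true if the dump ran, false if it was skipped.
    static boolean logIfIdle(Runnable dump) {
        if (!busy.compareAndSet(false, true)) {
            return false; // another thread is already printing; skip the redundant copy
        }
        try {
            dump.run();
            return true;
        } finally {
            busy.set(false);
        }
    }
}
```

A time-based limiter would suppress repeats within an interval instead; the guard above only suppresses overlapping dumps, which is the interleaving visible in the log excerpt.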


[jira] [Comment Edited] (CASSANDRA-13991) NullPointerException when querying a table with a previous state

2017-11-03 Thread Chris mildebrandt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237987#comment-16237987
 ] 

Chris mildebrandt edited comment on CASSANDRA-13991 at 11/3/17 5:03 PM:


With Cassandra 2.1.19 and 2.2.11, I get this error instead:

{noformat}
ERROR 16:59:56 Unexpected exception during request
java.lang.IllegalArgumentException: Not enough bytes. Offset: 2. Length: 25194. 
Buffer size: 4
at 
org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:362)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:98)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.RangeSliceQueryPager.<init>(RangeSliceQueryPager.java:60)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.QueryPagers.pager(QueryPagers.java:115) 
~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.QueryPagers.pager(QueryPagers.java:126) 
~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:178)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:492)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:469)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:142)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
 [apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
 [apache-cassandra-2.2.11.jar:2.2.11]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473) 
[na:1.7.0_151]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
 [apache-cassandra-2.2.11.jar:2.2.11]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.2.11.jar:2.2.11]
at java.lang.Thread.run(Thread.java:748) [na:1.7.0_151]
{noformat}


was (Author: mildebrandt):
With Cassandra 2.2.11, I get this error instead:

{noformat}
ERROR 16:59:56 Unexpected exception during request
java.lang.IllegalArgumentException: Not enough bytes. Offset: 2. Length: 25194. 
Buffer size: 4
at 
org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:362)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:98)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.RangeSliceQueryPager.<init>(RangeSliceQueryPager.java:60)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.QueryPagers.pager(QueryPagers.java:115) 
~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.QueryPagers.pager(QueryPagers.java:126) 
~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:178)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
 

[jira] [Commented] (CASSANDRA-13991) NullPointerException when querying a table with a previous state

2017-11-03 Thread Chris mildebrandt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237987#comment-16237987
 ] 

Chris mildebrandt commented on CASSANDRA-13991:
---

With Cassandra 2.2.11, I get this error instead:

{noformat}
ERROR 16:59:56 Unexpected exception during request
java.lang.IllegalArgumentException: Not enough bytes. Offset: 2. Length: 25194. 
Buffer size: 4
at 
org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:362)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:98)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.RangeSliceQueryPager.<init>(RangeSliceQueryPager.java:60)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.QueryPagers.pager(QueryPagers.java:115) 
~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.service.pager.QueryPagers.pager(QueryPagers.java:126) 
~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:178)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:226)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:492)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:469)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:142)
 ~[apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
 [apache-cassandra-2.2.11.jar:2.2.11]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
 [apache-cassandra-2.2.11.jar:2.2.11]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473) 
[na:1.7.0_151]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
 [apache-cassandra-2.2.11.jar:2.2.11]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.2.11.jar:2.2.11]
at java.lang.Thread.run(Thread.java:748) [na:1.7.0_151]
{noformat}
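
A guess at what the numbers in that message mean, as a self-contained sketch: if the decoder reads a 2-byte length prefix and then checks that many bytes remain, a 4-byte buffer starting with 0x62 0x6A produces exactly "Offset: 2. Length: 25194. Buffer size: 4". The code below is not Cassandra's; the class, the method, and the byte values are assumptions chosen to reproduce the message.

```java
import java.nio.ByteBuffer;

public class LengthPrefixSketch {
    // Reads one length-prefixed component: an unsigned 16-bit length, then that many bytes.
    static ByteBuffer readComponent(ByteBuffer buf, int offset) {
        int length = buf.getShort(offset) & 0xFFFF; // big-endian unsigned short
        int start = offset + 2;                     // payload starts after the prefix
        if (start + length > buf.limit()) {
            throw new IllegalArgumentException(String.format(
                    "Not enough bytes. Offset: %d. Length: %d. Buffer size: %d",
                    start, length, buf.limit()));
        }
        ByteBuffer view = buf.duplicate();
        view.position(start);
        view.limit(start + length);
        return view.slice();
    }

    public static void main(String[] args) {
        // 0x626A == 25194; the two trailing bytes are arbitrary filler.
        ByteBuffer bogus = ByteBuffer.wrap(new byte[] { 0x62, 0x6A, 0x00, 0x00 });
        try {
            readComponent(bogus, 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that reading is right, the saved page state is being decoded as a cell name it was never meant to be, which would fit a state captured under one page size being replayed under another.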

> NullPointerException when querying a table with a previous state
> 
>
> Key: CASSANDRA-13991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13991
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Chris mildebrandt
>Priority: Major
> Attachments: CASSANDRA-13991.log
>
>
> Performing the following steps (using the gocql library) results in an NPE:
> * With a table of 12 entries, read all rows.
> * Set the page size to 1 and read the first row. Save the query state.
> * Read all the rows again.
> * Set the page size to 5 and the page state to the previous state. (This is 
> where the NPE occurs).
> This can be reproduced with the following project:
> https://github.com/eyeofthefrog/CASSANDRA-13991






[jira] [Commented] (CASSANDRA-13957) upgradesstables fails after upgrading from 2.1.x to 3.0.14

2017-11-03 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237983#comment-16237983
 ] 

Jeff Jirsa commented on CASSANDRA-13957:


The assertion 
[is|https://github.com/apache/cassandra/blob/cassandra-3.0.14/src/java/org/apache/cassandra/db/rows/Rows.java#L61-L70]:

{code}
/**
 * Collect statistics on a given row.
 *
 * @param row the row for which to collect stats.
 * @param collector the stats collector.
 * @return the total number of cells in {@code row}.
 */
public static int collectStats(Row row, PartitionStatisticsCollector collector)
{
    assert !row.isEmpty();

{code}

Something in your sstable is in a state we don't expect, or don't handle 
properly. There were a bunch of fixes that went into 3.0.15, first step may be 
to try that instead of 3.0.14 (except you're on DSE, so I guess the real first 
step is probably to open a DSE support ticket, if I'm being honest, because we 
can't really be certain what's going on in the DSE version). If the newest 
3.0.15 doesn't work, the next option would be to upload the sstable that 
triggers this bug, and the schema. For some people that's really hard to do 
(privacy, company IP, personal info, etc.), so I don't know if that's an option 
for you, but we need more information to debug it.
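
As an aside on why the client only shows "error: null": an assert with no detail expression throws an AssertionError whose message is null, so anything that prints the message has nothing to show. A small sketch follows; the names are hypothetical, and the explicit throw stands in for the assert so that it behaves the same whether or not the JVM runs with -ea.

```java
import java.util.Collections;
import java.util.List;

public class AssertMessageSketch {
    // Stand-in for a check like `assert !row.isEmpty();`.
    // A failing assert without a detail expression throws new AssertionError().
    static void collectStats(List<String> cells) {
        if (cells.isEmpty()) {
            throw new AssertionError();
        }
    }

    // Formats a throwable the way a tool that prints only the message would.
    static String describe(Throwable t) {
        return "error: " + t.getMessage();
    }

    public static void main(String[] args) {
        try {
            collectStats(Collections.emptyList());
        } catch (AssertionError e) {
            System.out.println(describe(e)); // prints "error: null"
        }
    }
}
```

So the useful information is in the stack trace and the surrounding server log lines rather than in the message itself.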




[jira] [Updated] (CASSANDRA-13991) NullPointerException when querying a table with a previous state

2017-11-03 Thread Chris mildebrandt (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris mildebrandt updated CASSANDRA-13991:
--
Attachment: CASSANDRA-13991.log

> NullPointerException when querying a table with a previous state
> 
>
> Key: CASSANDRA-13991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13991
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Chris mildebrandt
>Priority: Major
> Attachments: CASSANDRA-13991.log
>
>
> Performing the following steps (using the gocql library) results in an NPE:
> * With a table of 12 entries, read all rows.
> * Set the page size to 1 and read the first row. Save the query state.
> * Read all the rows again.
> * Set the page size to 5 and the page state to the previous state. (This is 
> where the NPE occurs).
> This can be reproduced with the following project:
> https://github.com/eyeofthefrog/CASSANDRA-13991
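
The reproduction steps above can be sketched as a sequence of calls. This is a toy in-memory pager, not gocql and not Cassandra: `fetch_page`, `read_all`, and the base64 offset used as page state are all hypothetical stand-ins, chosen only to show the order of operations and what a well-behaved server would be expected to return at the last step.

```python
import base64
from typing import List, Optional, Tuple

ROWS = [f"row-{i}" for i in range(12)]  # stand-in for the 12-entry table


def fetch_page(page_size: int,
               page_state: Optional[bytes] = None) -> Tuple[List[str], Optional[bytes]]:
    """Return one page of rows plus an opaque state for the next page.

    The state here is just a base64-encoded offset (an assumption for
    illustration); a real paging state is an opaque server-side format.
    """
    start = 0 if page_state is None else int(base64.b64decode(page_state).decode())
    page = ROWS[start:start + page_size]
    next_start = start + len(page)
    if next_start >= len(ROWS):
        return page, None  # no more pages
    return page, base64.b64encode(str(next_start).encode())


def read_all(page_size: int) -> List[str]:
    """Follow page states until the result set is exhausted."""
    rows: List[str] = []
    state: Optional[bytes] = None
    while True:
        page, state = fetch_page(page_size, state)
        rows.extend(page)
        if state is None:
            return rows


all_rows = read_all(100)           # step 1: read all rows
first, saved = fetch_page(1)       # step 2: page size 1, save the state
read_all(100)                      # step 3: read all rows again
resumed, _ = fetch_page(5, saved)  # step 4: page size 5 with the saved state
```

In this toy model the last call simply resumes after the first row; the report is that the equivalent call against Cassandra fails with an NPE instead.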






[jira] [Commented] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-11-03 Thread Chris mildebrandt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237969#comment-16237969
 ] 

Chris mildebrandt commented on CASSANDRA-13592:
---

Sure, https://issues.apache.org/jira/browse/CASSANDRA-13991
Thanks

> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>Priority: Major
>  Labels: beginner
> Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0
>
> Attachments: system.log
>
>
> A NullPointerException appears when running the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> The error appears after the ---MORE--- line.
> The field "wert" has a JSON formatted string.






[jira] [Updated] (CASSANDRA-13991) NullPointerException when querying a table with a previous state

2017-11-03 Thread Chris mildebrandt (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris mildebrandt updated CASSANDRA-13991:
--
Description: 
Performing the following steps (using the gocql library) results in an NPE:
* With a table of 12 entries, read all rows.
* Set the page size to 1 and read the first row. Save the query state.
* Read all the rows again.
* Set the page size to 5 and the page state to the previous state. (This is 
where the NPE occurs).

This can be reproduced with the following project:
https://github.com/eyeofthefrog/CASSANDRA-13991

  was:
Performing the following steps (using the gocql library) results in an NPE:
* With a table of 12 entries, read all rows.
* Set the page size to 1 and read the first row. Save the query state.
* Read all the row again.
* Set the page size to 5 and the page state to the previous state. (This is 
where the NPE occurs).


> NullPointerException when querying a table with a previous state
> 
>
> Key: CASSANDRA-13991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13991
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Chris mildebrandt
>Priority: Major
>
> Performing the following steps (using the gocql library) results in an NPE:
> * With a table of 12 entries, read all rows.
> * Set the page size to 1 and read the first row. Save the query state.
> * Read all the rows again.
> * Set the page size to 5 and the page state to the previous state. (This is 
> where the NPE occurs).
> This can be reproduced with the following project:
> https://github.com/eyeofthefrog/CASSANDRA-13991






[jira] [Created] (CASSANDRA-13991) NullPointerException when querying a table with a previous state

2017-11-03 Thread Chris mildebrandt (JIRA)
Chris mildebrandt created CASSANDRA-13991:
-

 Summary: NullPointerException when querying a table with a 
previous state
 Key: CASSANDRA-13991
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13991
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
Reporter: Chris mildebrandt
Priority: Major


Performing the following steps (using the gocql library) results in an NPE:
* With a table of 12 entries, read all rows.
* Set the page size to 1 and read the first row. Save the query state.
* Read all the rows again.
* Set the page size to 5 and the page state to the previous state. (This is 
where the NPE occurs).






[jira] [Comment Edited] (CASSANDRA-13475) First version of pluggable storage engine API.

2017-11-03 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237818#comment-16237818
 ] 

Blake Eggleston edited comment on CASSANDRA-13475 at 11/3/17 4:32 PM:
--

[~dikanggu], please, slow down!

Before getting into more details, we need to have a discussion about this. 
This is going to be a long process, and without a high-level plan or strategy, 
I don’t think it will get very far.

Each point in the plan proposal I posted is going to end up being its own 
lengthy discussion. A few sentences in a quip doc about a major Cassandra 
component is a good start, but doesn’t really count as a plan.

So, back to the plan, each of my points (well pair of points, 
discussion/implementation are split up chronologically) is a component that’s 
going to need to be individually refactored. As an incremental approach to 
abstracting the storage layer, does this make sense?

edit: reworded to not come across as a dick. Sorry [~dikanggu] :)


was (Author: bdeggleston):
[~dikanggu] SLOW DOWN!

I’m not asking you questions, or asking for more details, I’m trying to have a 
discussion with you. This is going to be a long process, and without a high 
level plan or strategy, it’s not going anywhere.

Each point in the plan proposal I posted is going to end up being it’s own 
lengthy discussion. A few sentences in a quip doc about a major cassandra 
component is nearly meaningless, and is just noise at this point. Honestly, I’d 
suggest you edit out the pasted doc here just to remove an unnecessary wall of 
text which is linked elsewhere.

Now look, each of my points (well pair of points, discussion/implementation are 
split up chronologically) is a component that’s going to need to be 
individually refactored. As an incremental approach to abstracting the storage 
layer, does this make sense?

> First version of pluggable storage engine API.
> --
>
> Key: CASSANDRA-13475
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13475
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
>
> In order to support a pluggable storage engine, we need to define a unified 
> interface/API, which can allow us to plug in different storage engines for 
> different requirements. 
> Here is a design quip we are currently working on:  
> https://quip.com/bhw5ABUCi3co
> At a very high level, the storage engine interface should include APIs to:
> 1. Apply updates to the engine.
> 2. Query data from the engine.
> 3. Stream data in/out to/from the engine.
> 4. Perform table operations, like create/drop/truncate a table, etc.
> 5. Expose various stats about the engine.
> I created this ticket to start the discussions about the interface.
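
To make the five API groups in the ticket description concrete, here is a minimal sketch of what such an interface might look like. It is written in Python purely for brevity; every name in it (`StorageEngine`, `apply`, `stream_out`, `InMemoryEngine`, and so on) is hypothetical and is not taken from the quip doc or from any Cassandra patch. The row model is also deliberately simplified to string key/value pairs.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, Optional, Tuple

Row = Tuple[str, str]  # (key, value); a deliberately simplified row model


class StorageEngine(ABC):
    """Hypothetical interface mirroring the five API groups in the ticket."""

    @abstractmethod
    def apply(self, table: str, key: str, value: str) -> None:
        """1. Apply an update to the engine."""

    @abstractmethod
    def query(self, table: str, key: str) -> Optional[str]:
        """2. Query data from the engine."""

    @abstractmethod
    def stream_out(self, table: str) -> Iterator[Row]:
        """3. Stream data out of the engine."""

    @abstractmethod
    def stream_in(self, table: str, rows: Iterable[Row]) -> None:
        """3. Stream data into the engine."""

    @abstractmethod
    def create_table(self, table: str) -> None:
        """4. Table operation: create."""

    @abstractmethod
    def drop_table(self, table: str) -> None:
        """4. Table operation: drop."""

    @abstractmethod
    def truncate_table(self, table: str) -> None:
        """4. Table operation: truncate."""

    @abstractmethod
    def stats(self) -> Dict[str, int]:
        """5. Various stats about the engine."""


class InMemoryEngine(StorageEngine):
    """Toy engine backed by dictionaries, only to exercise the interface."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, str]] = {}
        self._writes = 0

    def apply(self, table: str, key: str, value: str) -> None:
        self._tables[table][key] = value
        self._writes += 1

    def query(self, table: str, key: str) -> Optional[str]:
        return self._tables[table].get(key)

    def stream_out(self, table: str) -> Iterator[Row]:
        return iter(sorted(self._tables[table].items()))

    def stream_in(self, table: str, rows: Iterable[Row]) -> None:
        for key, value in rows:
            self.apply(table, key, value)

    def create_table(self, table: str) -> None:
        self._tables.setdefault(table, {})

    def drop_table(self, table: str) -> None:
        self._tables.pop(table, None)

    def truncate_table(self, table: str) -> None:
        self._tables[table].clear()

    def stats(self) -> Dict[str, int]:
        return {"tables": len(self._tables), "writes": self._writes}
```

A sketch like this mostly shows how coarse the five bullet points are: each method hides decisions (consistency, schema, streaming format) that would need their own discussion.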






[jira] [Commented] (CASSANDRA-13475) First version of pluggable storage engine API.

2017-11-03 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237818#comment-16237818
 ] 

Blake Eggleston commented on CASSANDRA-13475:
-

[~dikanggu] SLOW DOWN!

I’m not asking you questions, or asking for more details, I’m trying to have a 
discussion with you. This is going to be a long process, and without a high 
level plan or strategy, it’s not going anywhere.

Each point in the plan proposal I posted is going to end up being its own 
lengthy discussion. A few sentences in a quip doc about a major cassandra 
component is nearly meaningless, and is just noise at this point. Honestly, I’d 
suggest you edit out the pasted doc here just to remove an unnecessary wall of 
text which is linked elsewhere.

Now look, each of my points (well pair of points, discussion/implementation are 
split up chronologically) is a component that’s going to need to be 
individually refactored. As an incremental approach to abstracting the storage 
layer, does this make sense?

> First version of pluggable storage engine API.
> --
>
> Key: CASSANDRA-13475
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13475
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
>
> In order to support a pluggable storage engine, we need to define a unified 
> interface/API, which can allow us to plug in different storage engines for 
> different requirements. 
> Here is a design quip we are currently working on:  
> https://quip.com/bhw5ABUCi3co
> At a very high level, the storage engine interface should include APIs to:
> 1. Apply updates to the engine.
> 2. Query data from the engine.
> 3. Stream data in/out to/from the engine.
> 4. Perform table operations, like create/drop/truncate a table, etc.
> 5. Expose various stats about the engine.
> I created this ticket to start the discussions about the interface.






[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237736#comment-16237736
 ] 

Jeff Jirsa commented on CASSANDRA-12182:


I realize it’s a pretty simple patch, but couldn’t NoSpamLogger give us this 
same behavior?
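
For readers unfamiliar with the idea behind the question, a rate-limited logger suppresses repeats of the same message within a time window. The sketch below illustrates that general technique only; `RateLimitedLog` and its methods are hypothetical names, it is not Cassandra's NoSpamLogger API, and it is not thread-safe as written.

```python
import time
from typing import Callable, Dict, List


class RateLimitedLog:
    """Emit a message for a given key at most once per interval."""

    def __init__(self, min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock                      # injectable for testing
        self._last_emit: Dict[str, float] = {}   # key -> time of last emit
        self.emitted: List[str] = []             # stands in for the real log sink

    def info(self, key: str, message: str) -> bool:
        """Record the message unless the key was logged too recently."""
        now = self._clock()
        last = self._last_emit.get(key)
        if last is not None and now - last < self._min_interval_s:
            return False  # suppressed
        self._last_emit[key] = now
        self.emitted.append(message)
        return True
```

Note that time-based suppression only drops repeats of a message; it does not by itself stop two threads from interleaving their output, which is the symptom in the quoted log.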

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Wei Deng
>Assignee: Michał Szczygieł
>Priority: Minor
>  Labels: lhf
> Attachments: 12182-trunk.txt, 12182-trunk.txt
>
>
> I was stress testing a C* 3.0 environment and it appears that when the CPU is 
> running low, HINT and MUTATION messages will start to get dropped, and the GC 
> thread can also get some really long-running GC, and I'd get some redundant 
> log entries in system.log like the following:
> {noformat}
> WARN  [Service Thread] 2016-07-12 22:48:45,748  GCInspector.java:282 - G1 
> Young Generation GC in 522ms.  G1 Eden Space: 68157440 -> 0; G1 Old Gen: 
> 3376113224 -> 3468387912; G1 Survivor Space: 24117248 -> 0; 
> INFO  [Service Thread] 2016-07-12 22:48:45,763  StatusLogger.java:52 - Pool 
> NameActive   Pending  Completed   Blocked  All Time 
> Blocked
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,775  MessagingService.java:983 - 
> MUTATION messages were dropped in last 5000 ms: 419 for internal timeout and 
> 0 for cross node timeout
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,776  MessagingService.java:983 - 
> HINT messages were dropped in last 5000 ms: 89 for internal timeout and 0 for 
> cross node timeout
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,776  StatusLogger.java:52 - Pool 
> NameActive   Pending  Completed   Blocked  All Time 
> Blocked
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,798  StatusLogger.java:56 - 
> MutationStage32  4194   32997234 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,798  StatusLogger.java:56 - 
> ViewMutationStage 0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,799  StatusLogger.java:56 - 
> ReadStage 0 0940 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,800  StatusLogger.java:56 - 
> MutationStage32  4363   32997333 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,801  StatusLogger.java:56 - 
> ViewMutationStage 0 0  0 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,801  StatusLogger.java:56 - 
> ReadStage 0 0940 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,802  StatusLogger.java:56 - 
> RequestResponseStage  0 0   11094437 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,802  StatusLogger.java:56 - 
> ReadRepairStage   0 0  5 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,803  StatusLogger.java:56 - 
> RequestResponseStage  4 0   11094509 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,807  StatusLogger.java:56 - 
> ReadRepairStage   0 0  5 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,808  StatusLogger.java:56 - 
> CounterMutationStage  0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,809  StatusLogger.java:56 - 
> MiscStage 0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,809  StatusLogger.java:56 - 
> CompactionExecutor262   1234 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,810  StatusLogger.java:56 - 
> MemtableReclaimMemory 0 0 79 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,810  StatusLogger.java:56 - 
> PendingRangeCalculator0 0  3 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,819  StatusLogger.java:56 - 
> GossipStage   0 0   5214 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,820  StatusLogger.java:56 - 
> SecondaryIndexManagement  0 0  3 0
>  0
> INFO  

[jira] [Updated] (CASSANDRA-13990) Remove OldNetworkTopologyStrategy

2017-11-03 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-13990:
-
Component/s: Configuration

> Remove OldNetworkTopologyStrategy
> -
>
> Key: CASSANDRA-13990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13990
> Project: Cassandra
>  Issue Type: Wish
>  Components: Configuration
>Reporter: Jeremy Hanna
>
> RackAwareStrategy was renamed OldNetworkTopologyStrategy back in 0.7 
> (CASSANDRA-1392) and it's still around.  Is there any reason to keep this 
> relatively dead code in the codebase at this point?  I'm not aware of its use 
> and it sometimes confuses users.






[jira] [Created] (CASSANDRA-13990) Remove OldNetworkTopologyStrategy

2017-11-03 Thread Jeremy Hanna (JIRA)
Jeremy Hanna created CASSANDRA-13990:


 Summary: Remove OldNetworkTopologyStrategy
 Key: CASSANDRA-13990
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13990
 Project: Cassandra
  Issue Type: Wish
Reporter: Jeremy Hanna


RackAwareStrategy was renamed OldNetworkTopologyStrategy back in 0.7 
(CASSANDRA-1392) and it's still around.  Is there any reason to keep this 
relatively dead code in the codebase at this point?  I'm not aware of its use 
and it sometimes confuses users.






[jira] [Commented] (CASSANDRA-12182) redundant StatusLogger print out when both dropped message and long GC event happen

2017-11-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237572#comment-16237572
 ] 

Jason Brown commented on CASSANDRA-12182:
-

Nice patch, [~mychal].

wrt the {{"StatusLogger is busy"}} message, I don't think we should bother 
logging that at {{info}}, or actually at any level visible to an operator. When 
someone is digging into a prod problem, the last thing we want to do is throw 
more information at them that is potentially confusing. Hence, at best we 
should log at {{trace}}. I updated the class and test here: 

||12182||
|[branch|https://github.com/jasobrown/cassandra/tree/12182]|
|[utests|https://circleci.com/gh/jasobrown/cassandra/tree/12182]|

I also fixed a naming thing in the test. wdyt?

I was worried about the thread safety and visibility effects of 
{{StatusLoggerTest.InMemoryAppender#events}}, but since the parent class 
({{AppenderBase}}) calls {{#append}} from within a {{synchronized}} method, we 
should be fine. (I also ran this test a few hundred times on my laptop.)
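
As a rough illustration of the "skip if busy" behavior discussed in this comment, the following sketch uses a non-blocking lock acquire so that a second caller returns immediately instead of producing a second, interleaved dump. It is an assumption about the general shape of such a guard, not the attached patch; `BusyGuard` and `run` are hypothetical names.

```python
import threading
from typing import Callable


class BusyGuard:
    """Run a task unless another invocation is already in progress."""

    def __init__(self, task: Callable[[], None]) -> None:
        self._task = task
        self._busy = threading.Lock()

    def run(self) -> bool:
        # Non-blocking acquire: returns False at once if the lock is held.
        if not self._busy.acquire(blocking=False):
            # This is where a low-level ("trace") busy message could go.
            return False
        try:
            self._task()
            return True
        finally:
            self._busy.release()
```

Because `threading.Lock` is not reentrant, a nested call to `run()` from inside the task is also skipped, which makes the behavior easy to exercise deterministically without spawning threads.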

> redundant StatusLogger print out when both dropped message and long GC event 
> happen
> ---
>
> Key: CASSANDRA-12182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12182
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Wei Deng
>Assignee: Michał Szczygieł
>Priority: Minor
>  Labels: lhf
> Attachments: 12182-trunk.txt, 12182-trunk.txt
>
>
> I was stress testing a C* 3.0 environment and it appears that when the CPU is 
> running low, HINT and MUTATION messages will start to get dropped, and the GC 
> thread can also get some really long-running GC, and I'd get some redundant 
> log entries in system.log like the following:
> {noformat}
> WARN  [Service Thread] 2016-07-12 22:48:45,748  GCInspector.java:282 - G1 
> Young Generation GC in 522ms.  G1 Eden Space: 68157440 -> 0; G1 Old Gen: 
> 3376113224 -> 3468387912; G1 Survivor Space: 24117248 -> 0; 
> INFO  [Service Thread] 2016-07-12 22:48:45,763  StatusLogger.java:52 - Pool 
> NameActive   Pending  Completed   Blocked  All Time 
> Blocked
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,775  MessagingService.java:983 - 
> MUTATION messages were dropped in last 5000 ms: 419 for internal timeout and 
> 0 for cross node timeout
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,776  MessagingService.java:983 - 
> HINT messages were dropped in last 5000 ms: 89 for internal timeout and 0 for 
> cross node timeout
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,776  StatusLogger.java:52 - Pool 
> NameActive   Pending  Completed   Blocked  All Time 
> Blocked
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,798  StatusLogger.java:56 - 
> MutationStage32  4194   32997234 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,798  StatusLogger.java:56 - 
> ViewMutationStage 0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,799  StatusLogger.java:56 - 
> ReadStage 0 0940 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,800  StatusLogger.java:56 - 
> MutationStage32  4363   32997333 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,801  StatusLogger.java:56 - 
> ViewMutationStage 0 0  0 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,801  StatusLogger.java:56 - 
> ReadStage 0 0940 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,802  StatusLogger.java:56 - 
> RequestResponseStage  0 0   11094437 0
>  0
> INFO  [Service Thread] 2016-07-12 22:48:45,802  StatusLogger.java:56 - 
> ReadRepairStage   0 0  5 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,803  StatusLogger.java:56 - 
> RequestResponseStage  4 0   11094509 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,807  StatusLogger.java:56 - 
> ReadRepairStage   0 0  5 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,808  StatusLogger.java:56 - 
> CounterMutationStage  0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,809  StatusLogger.java:56 - 
> MiscStage 0 0  0 0
>  0
> INFO  [ScheduledTasks:1] 2016-07-12 22:48:45,809  StatusLogger.java:56 

[jira] [Commented] (CASSANDRA-10404) Node to Node encryption transitional mode

2017-11-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237524#comment-16237524
 ] 

Jason Brown commented on CASSANDRA-10404:
-

ccm PR was merged, and I've pushed the code to trunk as sha 
{{260846685b6129a324a7cb7396da135fee85ec04}} and to dtests repo as sha 
{{7cc06a086f89ed76499837558ff263d84337acba}}

> Node to Node encryption transitional mode
> -
>
> Key: CASSANDRA-10404
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10404
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Tom Lewis
>Assignee: Jason Brown
>Priority: Major
> Fix For: 4.0
>
>
> Create a transitional mode for encryption that allows encrypted and 
> unencrypted traffic node-to-node during a change over to encryption from 
> unencrypted. This alleviates downtime during the switch.
>  This is similar to CASSANDRA-10559 which is intended for client-to-node
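
A transitional mode like the one described needs some way to tell encrypted from unencrypted connections when both may arrive at the same listener. One common approach, shown here as an assumption rather than as what the patch does, is to peek at the first bytes of a new connection and check whether they look like a TLS handshake record. The function name and the three-byte heuristic are illustrative only.

```python
from typing import Optional

TLS_HANDSHAKE = 0x16   # TLS record content type for handshake messages
TLS_MAJOR = 0x03       # major version byte shared by SSL 3.0 through TLS 1.3


def looks_like_tls(first_bytes: bytes) -> Optional[bool]:
    """Guess whether a connection starts with a TLS handshake record.

    Returns None when fewer than three bytes are available, so the caller
    can wait for more data before deciding.
    """
    if len(first_bytes) < 3:
        return None
    return (first_bytes[0] == TLS_HANDSHAKE
            and first_bytes[1] == TLS_MAJOR
            and first_bytes[2] <= 0x04)
```

A listener built this way would route a connection through a TLS handler when the check returns True and treat it as plaintext otherwise. The heuristic is simplified: a production implementation would normally rely on the networking library's own detection rather than hand-rolled byte checks.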






[jira] [Updated] (CASSANDRA-10404) Node to Node encryption transitional mode

2017-11-03 Thread Jason Brown (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Brown updated CASSANDRA-10404:

   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Ready to Commit)

> Node to Node encryption transitional mode
> -
>
> Key: CASSANDRA-10404
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10404
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Tom Lewis
>Assignee: Jason Brown
>Priority: Major
> Fix For: 4.0
>
>
> Create a transitional mode for encryption that allows encrypted and 
> unencrypted traffic node-to-node during a change over to encryption from 
> unencrypted. This alleviates downtime during the switch.
>  This is similar to CASSANDRA-10559 which is intended for client-to-node






cassandra-dtest git commit: Node to Node encryption transitional mode

2017-11-03 Thread jasobrown
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 957ae2bc4 -> 7cc06a086


Node to Node encryption transitional mode

patch by jasobrown; reviewed by Stefan Podkowinski for CASSANDRA-10404


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/7cc06a08
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/7cc06a08
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/7cc06a08

Branch: refs/heads/master
Commit: 7cc06a086f89ed76499837558ff263d84337acba
Parents: 957ae2b
Author: Jason Brown 
Authored: Thu May 25 03:57:54 2017 -0700
Committer: Jason Brown 
Committed: Fri Nov 3 05:09:36 2017 -0700

--
 requirements.txt   |  2 +-
 sslnodetonode_test.py  | 87 +
 upgrade_tests/upgrade_through_versions_test.py |  8 +-
 3 files changed, 62 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/7cc06a08/requirements.txt
--
diff --git a/requirements.txt b/requirements.txt
index a939dcd..2832ff1 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -4,7 +4,7 @@
 futures
 six
 -e 
git+https://github.com/datastax/python-driver.git@cassandra-test#egg=cassandra-driver
-ccm==2.8.4
+ccm==3.1.0
 cql
 decorator
 docopt

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/7cc06a08/sslnodetonode_test.py
--
diff --git a/sslnodetonode_test.py b/sslnodetonode_test.py
index a675985..d498b0f 100644
--- a/sslnodetonode_test.py
+++ b/sslnodetonode_test.py
@@ -31,7 +31,7 @@ class TestNodeToNodeSSLEncryption(Tester):
 credNode1 = sslkeygen.generate_credentials("127.0.0.1")
 credNode2 = sslkeygen.generate_credentials("127.0.0.2", 
credNode1.cakeystore, credNode1.cacert)
 
-self.setup_nodes(credNode1, credNode2, endpointVerification=True)
+self.setup_nodes(credNode1, credNode2, endpoint_verification=True)
 self.allow_log_errors = False
 self.cluster.start()
 time.sleep(2)
@@ -43,7 +43,7 @@ class TestNodeToNodeSSLEncryption(Tester):
 credNode1 = sslkeygen.generate_credentials("127.0.0.80")
 credNode2 = sslkeygen.generate_credentials("127.0.0.81", 
credNode1.cakeystore, credNode1.cacert)
 
-self.setup_nodes(credNode1, credNode2, endpointVerification=False)
+self.setup_nodes(credNode1, credNode2, endpoint_verification=False)
 self.cluster.start()
 time.sleep(2)
 self.cql_connection(self.node1)
@@ -54,7 +54,7 @@ class TestNodeToNodeSSLEncryption(Tester):
 credNode1 = sslkeygen.generate_credentials("127.0.0.80")
 credNode2 = sslkeygen.generate_credentials("127.0.0.81", 
credNode1.cakeystore, credNode1.cacert)
 
-self.setup_nodes(credNode1, credNode2, endpointVerification=True)
+self.setup_nodes(credNode1, credNode2, endpoint_verification=True)
 
 self.allow_log_errors = True
 self.cluster.start(no_wait=True)
@@ -66,7 +66,6 @@ class TestNodeToNodeSSLEncryption(Tester):
 self.assertTrue(found)
 
 self.cluster.stop()
-self.assertTrue(found)
 
 def ssl_client_auth_required_fail_test(self):
 """peers need to perform mutual auth (cient auth required), but do not 
supply the local cert"""
@@ -117,15 +116,41 @@ class TestNodeToNodeSSLEncryption(Tester):
 self.cluster.stop()
 self.assertTrue(found)
 
+def optional_outbound_tls_test(self):
+"""listen on TLS port, but optionally connect using TLS. this supports 
the upgrade case of starting with a non-encrypted cluster and then upgrading 
each node to use encryption."""
+credNode1 = sslkeygen.generate_credentials("127.0.0.1")
+credNode2 = sslkeygen.generate_credentials("127.0.0.2", 
credNode1.cakeystore, credNode1.cacert)
+
+# first, start cluster without TLS (either listening or connecting
+self.setup_nodes(credNode1, credNode2, internode_encryption='none', 
encryption_enabled=False)
+self.cluster.start()
+self.cql_connection(self.node1)
+
+# next bounce the cluster to listen on both plain/secure sockets (do 
not connect secure port, yet, though)
+self.bounce_node_with_updated_config(credNode1, self.node1, 'none', 
True, True)
+self.bounce_node_with_updated_config(credNode2, self.node2, 'none', 
True, True)
+
+# next connect with TLS for the outbound connections
+self.bounce_node_with_updated_config(credNode1, self.node1, 'all', 
True, True)
+self.bounce_node_with_updated_config(credNode2, self.node2, 'all', 
True, True)
+
+# 

cassandra git commit: Node to Node encryption transitional mode

2017-11-03 Thread jasobrown
Repository: cassandra
Updated Branches:
  refs/heads/trunk 87962dcf3 -> 260846685


Node to Node encryption transitional mode

patch by jasobrown; reviewed by Stefan Podkowinski for CASSANDRA-10404


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/26084668
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/26084668
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/26084668

Branch: refs/heads/trunk
Commit: 260846685b6129a324a7cb7396da135fee85ec04
Parents: 87962dc
Author: Jason Brown 
Authored: Wed Feb 15 05:41:30 2017 -0800
Committer: Jason Brown 
Committed: Fri Nov 3 05:06:38 2017 -0700

--
 NEWS.txt|   5 +-
 conf/cassandra.yaml |  23 +-
 .../org/apache/cassandra/config/Config.java |   4 +-
 .../cassandra/config/DatabaseDescriptor.java|  33 ++-
 .../cassandra/config/EncryptionOptions.java |  43 +++-
 .../locator/ReconnectableSnitchHelper.java  |   2 +-
 .../apache/cassandra/net/MessagingService.java  | 124 +++---
 .../cassandra/net/async/NettyFactory.java   | 117 ++
 .../cassandra/net/async/OptionalSslHandler.java |  67 ++
 .../net/async/OutboundConnectionIdentifier.java |   6 +
 .../net/async/OutboundMessagingConnection.java  |  27 +++
 .../cassandra/streaming/StreamSession.java  |   2 +-
 .../org/apache/cassandra/tools/BulkLoader.java  |   2 +-
 .../apache/cassandra/tools/LoaderOptions.java   |   6 +-
 .../org/apache/cassandra/transport/Client.java  |   7 +-
 .../org/apache/cassandra/transport/Server.java  |   2 +-
 .../cassandra/transport/SimpleClient.java   |  15 +-
 .../org/apache/cassandra/utils/FBUtilities.java |   2 +-
 .../cassandra/net/MessagingServiceTest.java | 228 ---
 .../cassandra/net/async/NettyFactoryTest.java   |  51 -
 .../async/OutboundMessagingConnectionTest.java  |  45 
 .../service/ProtocolBetaVersionTest.java|   4 +-
 .../cassandra/transport/MessagePayloadTest.java |   4 +-
 .../stress/settings/SettingsTransport.java  |   5 +-
 .../stress/settings/StressSettings.java |   2 +-
 .../cassandra/stress/util/JavaDriverClient.java |   6 +-
 26 files changed, 657 insertions(+), 175 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/26084668/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 7a133b8..09a9a7b 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -39,7 +39,7 @@ Upgrading
   4.0 and the legacy tables must have been removed. See the 'Upgrading' 
section
   for version 2.2 for migration instructions.
 - Cassandra 4.0 removed support for the deprecated Thrift interface. 
Amongst
-  Tother things, this imply the removal of all yaml option related to 
thrift
+  other things, this implies the removal of all yaml options related to 
thrift
   ('start_rpc', rpc_port, ...).
 - Cassandra 4.0 removed support for any pre-3.0 format. This means you
   cannot upgrade from a 2.x version to 4.0 directly, you have to upgrade to
@@ -67,6 +67,9 @@ Upgrading
- the miniumum value for internode message timeouts is 10ms. 
Previously, any
  positive value was allowed. See cassandra.yaml entries like
  read_request_timeout_in_ms for more details.
+   - Cassandra 4.0 allows a single port to be used for both secure and 
insecure
+ connections between cassandra nodes (CASSANDRA-10404). See the yaml 
for
+ specific property changes, and see the security doc for full details.
 
 Materialized Views
 ---

http://git-wip-us.apache.org/repos/asf/cassandra/blob/26084668/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index ef94613..e41af17 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -570,9 +570,10 @@ trickle_fsync_interval_in_kb: 10240
 # For security reasons, you should not expose this port to the internet.  Firewall it if needed.
 storage_port: 7000
 
-# SSL port, for encrypted communication.  Unused unless enabled in
-# encryption_options
-# For security reasons, you should not expose this port to the internet.  Firewall it if needed.
+# SSL port, for legacy encrypted communication. This property is unused unless enabled in
+# server_encryption_options (see below). As of cassandra 4.0, this property is deprecated
+# as a single port can be used for either/both secure and insecure connections.
+# For security reasons, you should not expose this port to the internet. Firewall it if needed.
 ssl_storage_port: 7001
 
 # Address or interface to bind to and tell other Cassandra nodes to connect to.
@@ 

[jira] [Commented] (CASSANDRA-10404) Node to Node encryption transitional mode

2017-11-03 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237480#comment-16237480
 ] 

Jason Brown commented on CASSANDRA-10404:
-

I've created a [ccm PR|https://github.com/pcmanus/ccm/pull/639]. After that is 
committed, I'll commit the dtest and trunk patches.

CASSANDRA-13989 is the ticket for updating the security docs.

> Node to Node encryption transitional mode
> -
>
> Key: CASSANDRA-10404
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10404
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Tom Lewis
>Assignee: Jason Brown
>Priority: Major
> Fix For: 4.x
>
>
> Create a transitional mode for encryption that allows encrypted and 
> unencrypted traffic node-to-node during a change over to encryption from 
> unencrypted. This alleviates downtime during the switch.
>  This is similar to CASSANDRA-10559 which is intended for client-to-node
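
One way a transitional mode like this can work on a single port is to peek at the first bytes of each inbound connection and branch on whether they look like a TLS handshake. The sketch below is an illustration only and is not taken from the Cassandra patch; the class and method names are hypothetical. It relies on the TLS record layout, in which a handshake record starts with content type 0x16 followed by major version 0x03.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: guesses whether an inbound connection starts with a TLS handshake. */
public final class TransitionalModeSniffer {
    private static final int TLS_HANDSHAKE_CONTENT_TYPE = 0x16; // TLS record type "handshake"
    private static final int TLS_MAJOR_VERSION = 0x03;          // SSL 3.0 and all TLS versions

    private TransitionalModeSniffer() {}

    /** Returns true if the first readable bytes look like a TLS handshake record header. */
    public static boolean looksLikeTls(ByteBuffer firstBytes) {
        if (firstBytes.remaining() < 2)
            return false; // not enough data to decide; a real server would wait for more bytes
        int pos = firstBytes.position();
        int contentType = firstBytes.get(pos) & 0xFF;      // absolute read, does not move position
        int majorVersion = firstBytes.get(pos + 1) & 0xFF;
        return contentType == TLS_HANDSHAKE_CONTENT_TYPE && majorVersion == TLS_MAJOR_VERSION;
    }
}
```

A server could route connections for which `looksLikeTls` returns true into its TLS pipeline and treat the rest as plaintext while the cluster is being switched over.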



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-13989) Update security docs for 4.0

2017-11-03 Thread Jason Brown (JIRA)
Jason Brown created CASSANDRA-13989:
---

 Summary: Update security docs for 4.0
 Key: CASSANDRA-13989
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13989
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jason Brown
Assignee: Jason Brown
Priority: Minor
 Fix For: 4.x


CASSANDRA-8457 and CASSANDRA-10404 have brought changes to the way SSL works 
for both internode messaging and the native protocol. Update the docs to 
reflect information that is important to users/operators.






[jira] [Commented] (CASSANDRA-13475) First version of pluggable storage engine API.

2017-11-03 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237367#comment-16237367
 ] 

Stefan Podkowinski commented on CASSANDRA-13475:


This sounds... **very** ambitious. ;) 

> First version of pluggable storage engine API.
> --
>
> Key: CASSANDRA-13475
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13475
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
>
> In order to support a pluggable storage engine, we need to define a unified 
> interface/API that allows us to plug in different storage engines for 
> different requirements. 
> Here is a design quip we are currently working on:  
> https://quip.com/bhw5ABUCi3co
> At a very high level, the storage engine interface should include APIs to:
> 1. Apply updates to the engine.
> 2. Query data from the engine.
> 3. Stream data in/out to/from the engine.
> 4. Perform table operations, like create/drop/truncate a table, etc.
> 5. Report various stats about the engine.
> I created this ticket to start the discussion about the interface.
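
To make the five API groups in the ticket concrete, here is a minimal sketch of what such an interface could look like. It is not the design from the linked quip document; the `StorageEngine` interface, its method names, and the string-based key/value model are all assumptions chosen to keep the example small. A toy in-memory implementation is included only to show that the interface can be implemented.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical interface mirroring the five API groups listed in the ticket. */
interface StorageEngine {
    void apply(String table, String key, String value);     // 1. apply an update
    String query(String table, String key);                 // 2. query data
    Map<String, String> streamOut(String table);            // 3. stream data out
    void streamIn(String table, Map<String, String> rows);  // 3. stream data in
    void createTable(String table);                         // 4. table operations
    void dropTable(String table);
    void truncateTable(String table);
    Map<String, Long> stats();                              // 5. engine stats
}

/** Toy in-memory engine, for illustration only. */
final class InMemoryEngine implements StorageEngine {
    private final Map<String, Map<String, String>> tables = new ConcurrentHashMap<>();
    private final AtomicLong writes = new AtomicLong();
    private final AtomicLong reads = new AtomicLong();

    // Looks up a table and fails loudly if it was never created.
    private Map<String, String> table(String name) {
        Map<String, String> t = tables.get(name);
        if (t == null)
            throw new IllegalArgumentException("unknown table: " + name);
        return t;
    }

    public void apply(String table, String key, String value) {
        table(table).put(key, value);
        writes.incrementAndGet();
    }

    public String query(String table, String key) {
        reads.incrementAndGet();
        return table(table).get(key);
    }

    // Streaming is modelled as copying a sorted snapshot of the table.
    public Map<String, String> streamOut(String table) { return new TreeMap<>(table(table)); }
    public void streamIn(String table, Map<String, String> rows) { table(table).putAll(rows); }

    public void createTable(String table) { tables.putIfAbsent(table, new ConcurrentHashMap<>()); }
    public void dropTable(String table) { tables.remove(table); }
    public void truncateTable(String table) { table(table).clear(); }

    public Map<String, Long> stats() {
        Map<String, Long> s = new TreeMap<>();
        s.put("reads", reads.get());
        s.put("writes", writes.get());
        return s;
    }
}
```

Keeping the table operations and stats on the same interface as the read/write path means a caller only needs one handle per engine, at the cost of a wider interface; a real design might split these concerns.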






[jira] [Commented] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X

2017-11-03 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237338#comment-16237338
 ] 

Sylvain Lebresne commented on CASSANDRA-10857:
--

Ok thanks. I'm +1 on the 3.0 and 3.11 patches, though I trust you'll run CI 
before committing.

Also +1 on the trunk minimal patch, with one caveat. I believe the 
`COMPACT_STORAGE_DEPRECATION_MESSAGE` message would primarily be shown when 
something tries to create a compact table in a mixed-version cluster. (The other 
case is a user hacking the schema tables manually; it's great that we handle 
that properly, but I'm less worried about the clarity of the message there 
because you are clearly messing around in the first place.) So I'd add some 
explanation to help users understand why they are getting this, something along 
the lines of "this can happen if you just created a COMPACT STORAGE table in a 
mixed-version cluster, which is not supported". I'm fine having you do so on 
commit, though. Same remark as above: I'm assuming you made or will make sure CI 
looks good before committing. Lastly, we should also create a follow-up to clean 
up compact storage code internally.

> Allow dropping COMPACT STORAGE flag from tables in 3.X
> --
>
> Key: CASSANDRA-10857
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10857
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL, Distributed Metadata
>Reporter: Aleksey Yeschenko
>Assignee: Alex Petrov
>Priority: Blocker
>  Labels: client-impacting
> Fix For: 3.0.x, 3.11.x
>
>
> Thrift allows users to define flexible mixed column families - where certain 
> columns would have explicitly pre-defined names, potentially non-default 
> validation types, and be indexed.
> Example:
> {code}
> create column family foo
> and default_validation_class = UTF8Type
> and column_metadata = [
> {column_name: bar, validation_class: Int32Type, index_type: KEYS},
> {column_name: baz, validation_class: UUIDType, index_type: KEYS}
> ];
> {code}
> Columns named {{bar}} and {{baz}} will be validated as {{Int32Type}} and 
> {{UUIDType}}, respectively, and be indexed. Columns with any other name will 
> be validated by {{UTF8Type}} and will not be indexed.
> With CASSANDRA-8099, {{bar}} and {{baz}} would be mapped to static columns 
> internally. However, being {{WITH COMPACT STORAGE}}, the table will only 
> expose {{bar}} and {{baz}} columns. Accessing any dynamic columns (any column 
> not named {{bar}} and {{baz}}) right now requires going through Thrift.
> This is blocking Thrift -> CQL migration for users who have mixed 
> dynamic/static column families. That said, it *shouldn't* be hard to allow 
> users to drop the {{compact}} flag to expose the table as it is internally 
> now, and be able to access all columns.
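
The validation rule described for the `foo` column family can be sketched as a lookup with a default. This is only an illustration of the rule as stated in the ticket, not Cassandra's schema code; the class and method names are hypothetical, and validator types are represented as plain strings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a mixed static/dynamic column family, as in the ticket's example. */
final class MixedColumnFamilySketch {
    // default_validation_class from the example definition
    static final String DEFAULT_VALIDATOR = "UTF8Type";

    // column_metadata from the example: explicitly declared (and indexed) columns
    static final Map<String, String> COLUMN_METADATA = new HashMap<>();
    static {
        COLUMN_METADATA.put("bar", "Int32Type");
        COLUMN_METADATA.put("baz", "UUIDType");
    }

    private MixedColumnFamilySketch() {}

    /** Declared columns use their own validator; any other (dynamic) column uses the default. */
    static String validatorFor(String columnName) {
        return COLUMN_METADATA.getOrDefault(columnName, DEFAULT_VALIDATOR);
    }

    /** In the example, only the declared columns carry an index. */
    static boolean isIndexed(String columnName) {
        return COLUMN_METADATA.containsKey(columnName);
    }
}
```

Under this model, the dynamic columns are exactly the names for which `isIndexed` returns false, which are the ones the ticket says are only reachable through Thrift.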






[jira] [Commented] (CASSANDRA-13957) upgradesstables fails after upgrading from 2.1.x to 3.0.14

2017-11-03 Thread Dan Priscornic (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237328#comment-16237328
 ] 

Dan Priscornic commented on CASSANDRA-13957:


This is a pretty big blocker for us. Is there a way to have someone look over 
this bug?

> upgradesstables fails after upgrading from 2.1.x to 3.0.14
> --
>
> Key: CASSANDRA-13957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13957
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dan Priscornic
>Priority: Major
>
> After upgrading DSE from 4.8.14 (cassandra 2.1.18.1463) to 5.0.10 (cassandra 
> 3.0.14.1862) I ran nodetool upgradesstables and it fails with the following 
> stack trace:
> {code:java}
> # nodetool -u cassandra -pwf /etc/dse/cassandra/jmxremote.password upgradesstables
> error: null
> -- StackTrace --
> java.lang.AssertionError
>   at org.apache.cassandra.db.rows.Rows.collectStats(Rows.java:70)
>   at org.apache.cassandra.io.sstable.format.big.BigTableWriter$StatsCollector.applyToRow(BigTableWriter.java:197)
>   at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:137)
>   at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:111)
>   at org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:52)
>   at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:149)
>   at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:125)
>   at org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
>   at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
>   at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:205)
>   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>   at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:99)
>   at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>   at org.apache.cassandra.db.compaction.CompactionManager$5.execute(CompactionManager.java:427)
>   at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:314)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>   at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$6/61137731.run(Unknown Source)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The bug seems similar to CASSANDRA-13320 which says it should be fixed in 
> cassandra 3.0.13 but does not look fixed in 3.0.14






[jira] [Commented] (CASSANDRA-13810) Overload because of hint pressure + MVs

2017-11-03 Thread Tom van der Woerdt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237327#comment-16237327
 ] 

Tom van der Woerdt commented on CASSANDRA-13810:


It's still happening, but we (mostly) worked around it by lowering 
hinted_handoff_throttle_in_kb 100x. :)

> Overload because of hint pressure + MVs
> ---
>
> Key: CASSANDRA-13810
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13810
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: Tom van der Woerdt
>Priority: Major
>  Labels: materializedviews
>
> Cluster setup: 3 DCs, 20 Cassandra nodes each, all 3.0.14, with approx. 200GB 
> data per machine. Many tables have MVs associated.
> During some maintenance we did a rolling restart of all nodes in the cluster. 
> This caused a buildup of hints/batches, as expected. Most nodes came back 
> just fine, except for two nodes.
> These two nodes came back with a loadavg of >100, and 'nodetool tpstats' 
> showed a million (not exaggerating) MutationStage tasks per second(!). It was 
> clear that these were mostly (all?) mutations coming from hints, as indicated 
> by thousands of log entries per second in debug.log :
> {noformat}
> DEBUG [SharedPool-Worker-107] 2017-08-27 13:16:51,098 HintVerbHandler.java:95 - Failed to apply hint
> java.util.concurrent.CompletionException: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
> at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[na:1.8.0_144]
> at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[na:1.8.0_144]
> at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647) ~[na:1.8.0_144]
> at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632) ~[na:1.8.0_144]
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[na:1.8.0_144]
> at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[na:1.8.0_144]
> at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:481) ~[apache-cassandra-3.0.14.jar:3.0.14]
> at org.apache.cassandra.db.Keyspace.lambda$applyInternal$0(Keyspace.java:495) ~[apache-cassandra-3.0.14.jar:3.0.14]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_144]
> at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.0.14.jar:3.0.14]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) ~[apache-cassandra-3.0.14.jar:3.0.14]
> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_144]
> Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out - received only 0 responses.
> ... 6 common frames omitted
> {noformat}
> After reading the relevant code, it seems that a hint is considered 
> droppable, and in the mutation path when the table contains a MV and the lock 
> fails to acquire and the mutation is droppable, it throws a WTE without 
> waiting until the timeout expires. This explains why Cassandra is able to 
> process a million mutations per second without actually considering them 
> 'dropped' in the 'nodetool tpstats' output.
> I managed to recover the two nodes by stopping handoffs on all nodes in the 
> cluster and reenabling them one at a time. It's likely that the hint/batchlog 
> settings were sub-optimal on this cluster, but I think that the retry 
> behavior(?) of hints should be improved as it's hard to express hint 
> throughput in kb/s when the mutations can involve MVs.
> More data available upon request -- I'm not sure which bits are relevant and 
> which aren't.
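
The fail-fast behaviour the reporter describes can be sketched as follows. This is a simplified model of the description above, not the code in Cassandra's `Keyspace` class; `ViewLockSketch`, `WriteTimeout`, and `applyWithViewLock` are hypothetical names, and a plain `java.util.concurrent.locks.Lock` stands in for the per-partition view lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

/** Hypothetical model of "droppable mutations fail immediately when the view lock is busy". */
final class ViewLockSketch {
    /** Stand-in for Cassandra's WriteTimeoutException (WTE). */
    static final class WriteTimeout extends RuntimeException {
        WriteTimeout(String message) { super(message); }
    }

    private ViewLockSketch() {}

    /**
     * Applies a mutation under the lock. A droppable mutation (such as a hint) gives up
     * at once if the lock is contended; a non-droppable one waits up to timeoutMillis.
     */
    static void applyWithViewLock(Lock lock, boolean droppable, long timeoutMillis, Runnable mutation) {
        boolean acquired;
        try {
            acquired = droppable
                     ? lock.tryLock()                                      // no waiting at all
                     : lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS); // bounded wait
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new WriteTimeout("interrupted while waiting for lock");
        }
        if (!acquired)
            throw new WriteTimeout("Operation timed out - received only 0 responses.");
        try {
            mutation.run();
        } finally {
            lock.unlock();
        }
    }
}
```

Because the droppable path never blocks, a contended lock turns each hint into an almost free failure, which is consistent with the very high task rate and the flood of "Failed to apply hint" log entries reported above.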






[jira] [Comment Edited] (CASSANDRA-13810) Overload because of hint pressure + MVs

2017-11-03 Thread Samo Gabrovec (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237322#comment-16237322
 ] 

Samo Gabrovec edited comment on CASSANDRA-13810 at 11/3/17 8:56 AM:


I'm experiencing the same problem on a 4-node 3.11.1 cluster.
[~tvdw] what did you change in the configs? Is it still happening to you?



was (Author: samek):
I'm experiencing the same problem on 4 node 3.11.1 cluster.
[~TomVanWemmel] what did you change in configs ? Is it still happening to you ? 








[jira] [Comment Edited] (CASSANDRA-13810) Overload because of hint pressure + MVs

2017-11-03 Thread Samo Gabrovec (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237322#comment-16237322
 ] 

Samo Gabrovec edited comment on CASSANDRA-13810 at 11/3/17 8:55 AM:


I'm experiencing the same problem on a 4-node 3.11.1 cluster.
[~TomVanWemmel] what did you change in the configs? Is it still happening to you?



was (Author: samek):
I'm experiencing the same problem on 4 node 3.11.1 cluster.
[~tomvandenberge] what did you change in configs ? Is it still happening to you 
? 








[jira] [Commented] (CASSANDRA-13810) Overload because of hint pressure + MVs

2017-11-03 Thread Samo Gabrovec (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237322#comment-16237322
 ] 

Samo Gabrovec commented on CASSANDRA-13810:
---

I'm experiencing the same problem on a 4-node 3.11.1 cluster.
[~tomvandenberge] what did you change in the configs? Is it still happening to you?








[jira] [Commented] (CASSANDRA-13592) Null Pointer exception at SELECT JSON statement

2017-11-03 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237281#comment-16237281
 ] 

Benjamin Lerer commented on CASSANDRA-13592:


It is a different issue. Could you open another ticket?


> Null Pointer exception at SELECT JSON statement
> ---
>
> Key: CASSANDRA-13592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13592
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Debian Linux
>Reporter: Wyss Philipp
>Assignee: ZhaoYang
>Priority: Major
>  Labels: beginner
> Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0
>
> Attachments: system.log
>
>
> A null pointer exception appears when running the command
> {code}
> SELECT JSON * FROM examples.basic;
> ---MORE---
>  message="java.lang.NullPointerException">
> Examples.basic has the following description (DESC examples.basic;):
> CREATE TABLE examples.basic (
> key frozen> PRIMARY KEY,
> wert text
> ) WITH bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}
> The error appears after the ---MORE--- line.
> The field "wert" has a JSON formatted string.






[jira] [Commented] (CASSANDRA-10404) Node to Node encryption transitional mode

2017-11-03 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237270#comment-16237270
 ] 

Stefan Podkowinski commented on CASSANDRA-10404:


+1
Please don't forget the separate ticket on docs and let me know if you need 
help on that.







[jira] [Updated] (CASSANDRA-10404) Node to Node encryption transitional mode

2017-11-03 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-10404:
---
Status: Ready to Commit  (was: Patch Available)







[jira] [Updated] (CASSANDRA-13948) Reload compaction strategies when JBOD disk boundary changes

2017-11-03 Thread Loic Lambiel (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Loic Lambiel updated CASSANDRA-13948:
-
Attachment: trace.log

OK, I was able to reproduce it on one node and have attached the trace log. 
It's unfiltered, since I didn't manage to restrict it to 
org.apache.cassandra.db.compaction.

> Reload compaction strategies when JBOD disk boundary changes
> 
>
> Key: CASSANDRA-13948
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13948
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Major
> Fix For: 3.11.x, 4.x
>
> Attachments: debug.log, trace.log
>
>
> The thread dump below shows a race between an sstable replacement by the 
> {{IndexSummaryRedistribution}} and 
> {{AbstractCompactionTask.getNextBackgroundTask}}:
> {noformat}
> Thread 94580: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=836 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, int) @bci=67, line=870 (Compiled frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, line=1199 (Compiled frame)
>  - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, line=943 (Compiled frame)
>  - org.apache.cassandra.db.compaction.CompactionStrategyManager.handleListChangedNotification(java.lang.Iterable, java.lang.Iterable) @bci=359, line=483 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionStrategyManager.handleNotification(org.apache.cassandra.notifications.INotification, java.lang.Object) @bci=53, line=555 (Interpreted frame)
>  - org.apache.cassandra.db.lifecycle.Tracker.notifySSTablesChanged(java.util.Collection, java.util.Collection, org.apache.cassandra.db.compaction.OperationType, java.lang.Throwable) @bci=50, line=409 (Interpreted frame)
>  - org.apache.cassandra.db.lifecycle.LifecycleTransaction.doCommit(java.lang.Throwable) @bci=157, line=227 (Interpreted frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit(java.lang.Throwable) @bci=61, line=116 (Compiled frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.commit() @bci=2, line=200 (Interpreted frame)
>  - org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish() @bci=5, line=185 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryRedistribution.redistributeSummaries() @bci=559, line=130 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager.runIndexSummaryRedistribution(org.apache.cassandra.io.sstable.IndexSummaryRedistribution) @bci=9, line=1420 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(org.apache.cassandra.io.sstable.IndexSummaryRedistribution) @bci=4, line=250 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries() @bci=30, line=228 (Interpreted frame)
>  - org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow() @bci=4, line=125 (Interpreted frame)
>  - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Interpreted frame)
>  - org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run() @bci=4, line=118 (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
>  - java.util.concurrent.FutureTask.runAndReset() @bci=47, line=308 (Compiled frame)
>  - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) @bci=1, line=180 (Compiled frame)
>  - java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run() @bci=37, line=294 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1149 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Interpreted frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable) @bci=1, line=81 (Interpreted frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$8.run() @bci=4 (Interpreted frame)
>