[jira] [Created] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Alexander Dejanovski (JIRA)
Alexander Dejanovski created CASSANDRA-14326:


 Summary: Handle verbose logging at a different level than DEBUG
 Key: CASSANDRA-14326
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14326
 Project: Cassandra
  Issue Type: Improvement
Reporter: Alexander Dejanovski
 Fix For: 4.0


CASSANDRA-10241 introduced debug logging turned on by default to act as a 
verbose system.log and help troubleshoot production issues. 

One of the consequences was a severe impact on read performance in 2.2, as 
contributors weren't all up to speed on how to use logging levels 
(CASSANDRA-14318).

As the DEBUG level has a very specific meaning in dev, it is confusing to use 
it for always-on verbose logging, and it should probably not be used this way 
in Cassandra.

Options so far are:
 # Move common log messages (compactions, flushes, etc.) back to INFO level and 
disable debug logging by default
 # Use files named verbose-system.log instead of debug.log and use a custom 
logging level instead of DEBUG for verbose tracing, which would be enabled by 
default. Debug logging would still exist but would be disabled by default at 
the root logger level (not just filtered at the appender level).
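
Option 2 can be pictured with a small sketch. It uses Python's standard 
logging module rather than the logback setup Cassandra ships with, and the 
level name VERBOSE, its numeric value 15 and the logger name are all 
assumptions made purely for illustration.

```python
import io
import logging

# Hypothetical custom level between DEBUG (10) and INFO (20).
VERBOSE = 15
logging.addLevelName(VERBOSE, "VERBOSE")


def build_logger(system_stream, verbose_stream):
    """Return a logger writing INFO+ to system_stream and VERBOSE+ to verbose_stream."""
    logger = logging.getLogger("cassandra.sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    # DEBUG is cut off at the logger itself, not merely filtered by a handler.
    logger.setLevel(VERBOSE)

    system_handler = logging.StreamHandler(system_stream)
    system_handler.setLevel(logging.INFO)
    verbose_handler = logging.StreamHandler(verbose_stream)
    verbose_handler.setLevel(VERBOSE)

    for handler in (system_handler, verbose_handler):
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


system_log, verbose_log = io.StringIO(), io.StringIO()
log = build_logger(system_log, verbose_log)
log.info("Compacted 4 sstables")
log.log(VERBOSE, "Flushing memtable")
log.debug("per-read detail")  # dropped before reaching any handler
```

Here the "system" stream only receives the INFO line, the "verbose" stream 
receives the INFO and VERBOSE lines, and the DEBUG call is rejected by the 
logger's own level check.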



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14323) Same timestamp insert conflict resolution breaks row-level data consistency

2018-03-20 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405968#comment-16405968
 ] 

Benjamin Lerer commented on CASSANDRA-14323:


If you look only at your example, it is true that the result is surprising.

Now, if you perform the following 2 queries one after the other, what result do 
you expect?
 # {{insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk1','nk2') USING 
TIMESTAMP 1521080773000;}}
 # {{insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk2','nk1') USING 
TIMESTAMP 1521080773000;}}

One or the other? The problem here is that, due to the distributed nature of 
C*, a node can receive the inserts in any order, and 2 nodes of the same 
cluster can receive them in different orders. Because of that, C* cannot rely 
on arrival order to determine which value it should keep, and needs a 
predictable way to merge those 2 inserts into one row. Unfortunately, there is 
no perfect way to do that, so an arbitrary rule had to be chosen.

Batches behave in the same way, which makes them {{consistent}} with normal 
inserts and other operations. Changing the behavior for batches would make them 
inconsistent with the rest of the application.

Multiple collection updates within the same query also follow the same rules.

I hope this clarifies things.
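
To make the per-cell merge concrete, here is a minimal sketch. It is not the 
code in Conflicts.java: it assumes that a timestamp tie is broken by comparing 
the cell values and keeping the larger one, and it ignores tombstones, TTLs 
and counters. The function names are made up for the example.

```python
def reconcile_cell(left, right):
    """Pick the winning (timestamp, value) cell; ties fall back to the larger value."""
    if left[0] != right[0]:
        return left if left[0] > right[0] else right
    # Equal timestamps: arrival order is not reliable, so compare the values.
    return left if left[1] >= right[1] else right


def merge_rows(row_a, row_b):
    """Merge two versions of the same row column by column."""
    return {col: reconcile_cell(row_a[col], row_b[col]) for col in row_a}


ts = 1521080773000
first = {"nk1": (ts, b"nk1"), "nk2": (ts, b"nk2")}
second = {"nk1": (ts, b"nk2"), "nk2": (ts, b"nk1")}

merged_ab = merge_rows(first, second)
merged_ba = merge_rows(second, first)  # same inserts, opposite arrival order
values = {col: cell[1].decode() for col, cell in merged_ab.items()}
```

Under these assumptions both arrival orders produce the same row, and each 
column independently keeps 'nk2', which matches the (2, nk2, nk2) result in 
the report.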

> Same timestamp insert conflict resolution breaks row-level data consistency
> ---
>
> Key: CASSANDRA-14323
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14323
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Rishi Kathera
>Priority: Minor
>
> When inserting multiple rows with the same primary key and timestamp, 
> memtable update logic does not maintain row-level consistency for the key 
> inserted. For example,
> {code:java}
> create table test.consistency(pk int PRIMARY KEY , nk1 text, nk2 text);
> BEGIN UNLOGGED BATCH USING TIMESTAMP 1521080773000 
> insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk1','nk2'); 
> insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk2','nk1'); 
> APPLY BATCH; 
> select * from test.consistency;
> {code}
> In this case, I would expect either one row overwrites the other so the 
> result of the read would be either
> {code:java}
> 2, nk1, nk2{code}
> or
> {code:java}
> 2, nk2, nk1{code}
> but the row retrieved is
> {code:java}
> 2, nk2, nk2{code}
>  which breaks consistency of the writes. This behavior comes from this logic, 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Conflicts.java#L45]
> where it appears that the value of the cell itself is used to resolve 
> overwrite conflict which I don't think is the correct way of handling the 
> situation. Shouldn't it either be overwrite or not overwrite for all cases?






[jira] [Resolved] (CASSANDRA-14323) Same timestamp insert conflict resolution breaks row-level data consistency

2018-03-20 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer resolved CASSANDRA-14323.

   Resolution: Won't Fix
Reproduced In: 3.11.2, 2.1.13  (was: 2.1.13, 3.11.2)

> Same timestamp insert conflict resolution breaks row-level data consistency
> ---
>
> Key: CASSANDRA-14323
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14323
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Rishi Kathera
>Priority: Minor
>
> When inserting multiple rows with the same primary key and timestamp, 
> memtable update logic does not maintain row-level consistency for the key 
> inserted. For example,
> {code:java}
> create table test.consistency(pk int PRIMARY KEY , nk1 text, nk2 text);
> BEGIN UNLOGGED BATCH USING TIMESTAMP 1521080773000 
> insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk1','nk2'); 
> insert into test.consistency (pk,nk1,nk2) VALUES (2,'nk2','nk1'); 
> APPLY BATCH; 
> select * from test.consistency;
> {code}
> In this case, I would expect either one row overwrites the other so the 
> result of the read would be either
> {code:java}
> 2, nk1, nk2{code}
> or
> {code:java}
> 2, nk2, nk1{code}
> but the row retrieved is
> {code:java}
> 2, nk2, nk2{code}
>  which breaks consistency of the writes. This behavior comes from this logic, 
> [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Conflicts.java#L45]
> where it appears that the value of the cell itself is used to resolve 
> overwrite conflict which I don't think is the correct way of handling the 
> situation. Shouldn't it either be overwrite or not overwrite for all cases?






[jira] [Updated] (CASSANDRA-14322) Cassandra NodeTool clientstats should show SSL Cipher

2018-03-20 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14322:
-
Labels: security  (was: )

> Cassandra NodeTool clientstats should show SSL Cipher
> -
>
> Key: CASSANDRA-14322
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14322
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Minor
>  Labels: security
> Fix For: 4.0
>
>
> Currently nodetool prints out some information that does not include the SSL 
> Cipher being used by the client. It would be helpful to add this in for 
> better visibility.






[jira] [Commented] (CASSANDRA-14316) Read repair mutations should be sent to pending nodes

2018-03-20 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406189#comment-16406189
 ] 

Jeremiah Jordan commented on CASSANDRA-14316:
-

Agreed. Sounds like this is an issue.

> Read repair mutations should be sent to pending nodes
> -
>
> Key: CASSANDRA-14316
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14316
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Blake Eggleston
>Priority: Major
>
> Since read repair doesn't mirror mutations to pending endpoints, it seems 
> likely that there's an edge case that can break the monotonic quorum read 
> guarantee blocking read repair is supposed to provide.
> Assuming there are 3 nodes (A, B, & C) which replicate a token range. A new 
> node D is added, which will take over some of A's token range. During the 
> bootstrap of D, if there's a failed write that only makes it to a single node 
> (A) after bootstrap has started, then there's a quorum read including A & B, 
> which replicates that value to B. If A is removed when D finishes 
> bootstrapping, a quorum read including node C & D will not see the value 
> returned in the last quorum read which queried A & B. 
> Table to illustrate:
> |state | A | B | C | D|
> |1 begin |  | | | pending|
> |2 write |1 | | | pending|
> |3 repair|1|1| | pending|
> |4 joined| n/a |1| | |
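
The sequence in the table can be replayed with a toy model. This is only a 
sketch under simplifying assumptions: each replica holds a single integer (0 
meaning "no value"), node D receives nothing for this key during bootstrap, 
and the function names and the repair_pending switch are hypothetical.

```python
def quorum_read(replicas, contacted, pending=(), repair_pending=False):
    """Read from the contacted replicas, then repair them with the newest value seen."""
    newest = max(replicas[name] for name in contacted)
    targets = list(contacted) + (list(pending) if repair_pending else [])
    for name in targets:
        replicas[name] = newest
    return newest


def scenario(repair_pending):
    """Replay states 1-4 and return the results of the two quorum reads."""
    replicas = {"A": 0, "B": 0, "C": 0, "D": 0}  # state 1: D is pending
    replicas["A"] = 1                            # state 2: failed write reaches only A
    first = quorum_read(replicas, ["A", "B"], pending=["D"],
                        repair_pending=repair_pending)  # state 3: read repair
    del replicas["A"]                            # state 4: D has joined, A left the range
    second = quorum_read(replicas, ["C", "D"])
    return first, second
```

In this model, scenario(False) returns (1, 0): the second quorum read no 
longer sees the value returned by the first one. scenario(True), where the 
repair is also mirrored to the pending node, returns (1, 1).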






[jira] [Commented] (CASSANDRA-9608) Support Java 9

2018-03-20 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406190#comment-16406190
 ] 

Robert Stupp commented on CASSANDRA-9608:
-

This patch won't make it into any current release - i.e. none of 2.1, 2.2, 3.0, 
3.11. The changes are too intrusive.

Given that Java 9 will be EOL when Java 10 is released, and 10 will be EOL when 
11 is released (6 months in between), the earliest newer version that will be 
supported is the one that's current when 4.0 goes GA.

I'm even unsure whether it makes sense to support Java 8 after this patch, as 
Java 8 will have reached EOL by end of this year. [~jasobrown], wdyt about 
removing support for Java 8 with this one?

CI needs to be set up for Java 10/11/42/whatever, too.

> Support Java 9
> --
>
> Key: CASSANDRA-9608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9608
> Project: Cassandra
>  Issue Type: Task
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>
> This ticket is intended to group all issues found to support Java 9 in the 
> future.
> From what I've found out so far:
> * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved. 
> It can be easily solved using this patch:
> {code}
> - artifactId="cobertura"/>
> + artifactId="cobertura">
> +  
> +
> {code}
> * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods 
> {{monitorEnter}} + {{monitorExit}}. These methods are used by 
> {{o.a.c.utils.concurrent.Locks}} which is only used by 
> {{o.a.c.db.AtomicBTreeColumns}}.
> I don't intend to start working on this yet, since Java 9 is still at too 
> early a development phase.






[jira] [Updated] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-14326:
---
Fix Version/s: (was: 4.0)
   4.x

> Handle verbose logging at a different level than DEBUG
> --
>
> Key: CASSANDRA-14326
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14326
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Alexander Dejanovski
>Priority: Major
> Fix For: 4.x
>
>
> CASSANDRA-10241 introduced debug logging turned on by default to act as a 
> verbose system.log and help troubleshoot production issues. 
> One of the consequences was a severe impact on read performance in 2.2, as 
> contributors weren't all up to speed on how to use logging levels 
> (CASSANDRA-14318).
> As the DEBUG level has a very specific meaning in dev, it is confusing to use 
> it for always-on verbose logging, and it should probably not be used this way 
> in Cassandra.
> Options so far are:
>  # Move common log messages (compactions, flushes, etc.) back to INFO level 
> and disable debug logging by default
>  # Use files named verbose-system.log instead of debug.log and use a 
> custom logging level instead of DEBUG for verbose tracing, which would be 
> enabled by default. Debug logging would still exist but would be disabled by 
> default at the root logger level (not just filtered at the appender level).






[jira] [Updated] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2018-03-20 Thread Eric Hubert (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hubert updated CASSANDRA-13396:

Status: Patch Available  (was: Open)

Hi [~jasobrown]! Today, I took some time to prepare a patch against the 
Cassandra 3.11 branch which basically:
 * bundles all logback implementation-specific functionality in one class 
(this required a bit of code reorganization)
 * extracts an interface in order to a) minimize the use of reflection and b) 
allow alternative implementations (the patch itself only provides a no-op 
fallback implementation)
 * loads and instantiates the logging-implementation-specific extension via 
reflection, according to the slf4j binding in use (Cassandra code only works 
against the new interface, which has no Java class dependencies on specific 
implementations)
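
The general shape of this approach (pick an implementation by name at 
runtime, fall back to a no-op) can be sketched as follows. This is a Python 
analogue using importlib, not the Java patch itself; the class name, function 
name and the binding table are hypothetical.

```python
import importlib


class NoOpLoggingSupport:
    """Fallback used when no implementation-specific support is available."""

    def on_startup(self):
        return "no-op"


def load_logging_support(binding_name, known_bindings):
    """Instantiate the class registered for binding_name, or fall back to no-op.

    known_bindings maps a detected binding name to a dotted "module.Class" path.
    """
    target = known_bindings.get(binding_name)
    if target is None:
        return NoOpLoggingSupport()
    module_name, class_name = target.rsplit(".", 1)
    try:
        # Resolved by name only, so the caller has no hard dependency on it.
        module = importlib.import_module(module_name)
        return getattr(module, class_name)()
    except (ImportError, AttributeError):
        return NoOpLoggingSupport()
```

The caller only ever sees the returned object, so an unknown or unloadable 
implementation degrades to the no-op fallback instead of failing at startup.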

So far there are no new (integration) tests, which would likely also require 
some classpath / ClassLoader magic.

I tested the change using "a neutral" application use case by utilizing 
[Cassandra Unit|https://github.com/jsevellec/cassandra-unit].

The "test" involved adjusting log4j config from Cassandra Unit test resources, 
changing the used cassandra-all version in parent pom, excluding logback deps 
from the pom and executing any of the tests.

With stock Cassandra 3.11.2 we see:
{code:java}
2018-03-20 10:51:43,753 [pool-2-thread-1] ERROR 
cassandra.service.CassandraDaemon - Exception encountered during startup
java.lang.NoClassDefFoundError: ch/qos/logback/classic/Logger
    at 
org.apache.cassandra.cql3.functions.ThreadAwareSecurityManager.install(ThreadAwareSecurityManager.java:92)
    at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:192)
    at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:602)
    at 
org.cassandraunit.utils.EmbeddedCassandraServerHelper.lambda$startEmbeddedCassandra$1(EmbeddedCassandraServerHelper.java:144)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: ch.qos.logback.classic.Logger
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
{code}
Using the patch [^CASSANDRA-13396_ehubert_1.patch] I provided we have a 
successful server startup with a warning:
{code:java}
2018-03-20 13:47:32,688 [pool-2-thread-1] WARN  
utils.logging.LoggingSupportFactory - You are using Cassandra with an 
unsupported deployment. The intended logging implementation library logback is 
not used by slf4j. Detected slf4j binding: org.slf4j.impl.Log4jLoggerFactory. 
You will not be able to dynamically manage log levels via JMX and may have 
performance or other issues.
{code}

Please consider this an initial patch suggestion to gather quick feedback on 
the approach! I'm willing to adjust things according to your requirements, and 
am happy if you would like to tweak it to your requirements.

Would be great to see this in Cassandra 3.11.3 if possible.

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Edward Capriolo
>Assignee: Eugene Fedotov
>Priority: Minor
> Attachments: CASSANDRA-13396_ehubert_1.patch
>
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html






[jira] [Commented] (CASSANDRA-9608) Support Java 9

2018-03-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406268#comment-16406268
 ] 

Jason Brown commented on CASSANDRA-9608:


>>  I'm even unsure whether it makes sense to support Java 8 after this patch, 
>> as Java 8 will have reached EOL by end of this year. Jason Brown, wdyt about 
>> removing support for Java 8 with this one?

Ugh, I knew we were gonna get to this point :). I have some thoughts, and I'd 
like to start a dev list email thread - this ticket isn't the right place for 
that conversation.

wrt this patch specifically, I think we should hold onto it until we 
determine the way forward for java releases.

> Support Java 9
> --
>
> Key: CASSANDRA-9608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9608
> Project: Cassandra
>  Issue Type: Task
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>
> This ticket is intended to group all issues found to support Java 9 in the 
> future.
> From what I've found out so far:
> * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved. 
> It can be easily solved using this patch:
> {code}
> - artifactId="cobertura"/>
> + artifactId="cobertura">
> +  
> +
> {code}
> * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods 
> {{monitorEnter}} + {{monitorExit}}. These methods are used by 
> {{o.a.c.utils.concurrent.Locks}} which is only used by 
> {{o.a.c.db.AtomicBTreeColumns}}.
> I don't intend to start working on this yet, since Java 9 is still at too 
> early a development phase.






[jira] [Commented] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406317#comment-16406317
 ] 

Paulo Motta commented on CASSANDRA-14326:
-

I think we should go with 2, otherwise we lose the ability introduced by 
CASSANDRA-10241 to have a human-readable system.log alongside a verbose 
troubleshooting log. I think this has been useful for troubleshooting 
hard-to-reproduce issues after the fact without affecting performance 
(best-effort async-appender logging) or cluttering system.log (separation of 
system vs debug log). I'd be interested in hearing more on what operators of 
3.0+ clusters think of the current separation of system.log and debug.log.

In any case, the first step of this ticket is to go through all messages being 
logged at DEBUG and reclassify important ones as INFO; we can then decide 
whether to go with 1 or 2 later.

> Handle verbose logging at a different level than DEBUG
> --
>
> Key: CASSANDRA-14326
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14326
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Alexander Dejanovski
>Priority: Major
> Fix For: 4.0
>
>
> CASSANDRA-10241 introduced debug logging turned on by default to act as a 
> verbose system.log and help troubleshoot production issues. 
> One of the consequences was a severe impact on read performance in 2.2, as 
> contributors weren't all up to speed on how to use logging levels 
> (CASSANDRA-14318).
> As the DEBUG level has a very specific meaning in dev, it is confusing to use 
> it for always-on verbose logging, and it should probably not be used this way 
> in Cassandra.
> Options so far are:
>  # Move common log messages (compactions, flushes, etc.) back to INFO level 
> and disable debug logging by default
>  # Use files named verbose-system.log instead of debug.log and use a 
> custom logging level instead of DEBUG for verbose tracing, which would be 
> enabled by default. Debug logging would still exist but would be disabled by 
> default at the root logger level (not just filtered at the appender level).






[jira] [Updated] (CASSANDRA-13396) Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager

2018-03-20 Thread Eric Hubert (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hubert updated CASSANDRA-13396:

Attachment: CASSANDRA-13396_ehubert_1.patch

> Cassandra 3.10: ClassCastException in ThreadAwareSecurityManager
> 
>
> Key: CASSANDRA-13396
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13396
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Edward Capriolo
>Assignee: Eugene Fedotov
>Priority: Minor
> Attachments: CASSANDRA-13396_ehubert_1.patch
>
>
> https://www.mail-archive.com/user@cassandra.apache.org/msg51603.html






[jira] [Assigned] (CASSANDRA-6719) redesign loadnewsstables

2018-03-20 Thread Marcus Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson reassigned CASSANDRA-6719:
--

Assignee: Marcus Eriksson  (was: Bhaskar Muppana)

> redesign loadnewsstables
> 
>
> Key: CASSANDRA-6719
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6719
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: 6719.patch
>
>
> CFSMBean.loadNewSSTables scans data directories for new sstables dropped 
> there by an external agent.  This is dangerous because of possible filename 
> conflicts with existing or newly generated sstables.
> Instead, we should support leaving the new sstables in a separate directory 
> (specified by a parameter, or configured as a new location in yaml) and take 
> care of renaming as necessary automagically.






[jira] [Created] (CASSANDRA-14327) Compact new sstables together before importing in nodetool refresh

2018-03-20 Thread Marcus Eriksson (JIRA)
Marcus Eriksson created CASSANDRA-14327:
---

 Summary: Compact new sstables together before importing in 
nodetool refresh
 Key: CASSANDRA-14327
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14327
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
 Fix For: 4.x


{{nodetool refresh}} just randomly puts sstables in a directory, then relies on 
compaction to move the tokens to their correct directory. Instead we should 
consider adding an option to compact all new sstables together and put the 
tokens in the correct places.






[jira] [Updated] (CASSANDRA-14314) Fix argument passing for SSLContext in trunk

2018-03-20 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14314:
-
Labels: security  (was: )

> Fix argument passing for SSLContext in trunk
> 
>
> Key: CASSANDRA-14314
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14314
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: security
>
> Argument passing has a minor bug while creating the SSLContext. Audit and 
> make sure that the client & server SSL contexts are created at appropriate 
> locations.






[jira] [Updated] (CASSANDRA-14295) no ssl hostname validation in cqlsh

2018-03-20 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14295:
-
Labels: Security  (was: )

> no ssl hostname validation in cqlsh
> ---
>
> Key: CASSANDRA-14295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14295
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Christian Becker
>Priority: Major
>  Labels: Security
>
> In order to validate certificates properly the python driver requires 
> {{check_hostname}} to be set.
> [https://github.com/datastax/python-driver/blob/master/cassandra/cluster.py#L558-L562]
> However it is not available as a setting in cqlsh:
> [https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/sslhandling.py#L86-L89]
> I noticed this because cqlsh connects to 127.0.0.1 by default, but the 
> configured certificate contains only the hostname and the local IP. The 
> connection was always successful. But when adding {{check_hostname}} to 
> {{cqlshlib/sslhandling.py}}, the validation works as expected:
> current behaviour:
> {code:java}
> # cqlsh --ssl
> Connected to -cassandra at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
> Use HELP for help.
> @cqlsh>{code}
> expected:
> {code:java}
> # cqlsh --ssl
> Connection error: ('Unable to connect to any servers', {'127.0.0.1': 
> CertificateError("hostname '127.0.0.1' doesn't match ''",)}){code}
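
For reference, the effect of the flag can be seen with Python's standard ssl 
module alone. This is a minimal sketch and not the cqlsh or driver code; the 
helper name build_client_context is made up for the example.

```python
import ssl


def build_client_context(check_hostname=True):
    """Build a client-side TLS context with hostname checking on or off."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # PROTOCOL_TLS_CLIENT enables check_hostname and CERT_REQUIRED by default.
    if not check_hostname:
        context.check_hostname = False
    return context


strict = build_client_context()
lenient = build_client_context(check_hostname=False)
```

With check_hostname disabled, the certificate chain is still verified 
(verify_mode stays CERT_REQUIRED), but a certificate issued for a different 
name is accepted, which is consistent with the connection to 127.0.0.1 
succeeding in the report above.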
>  
>  






[jira] [Commented] (CASSANDRA-14197) SSTable upgrade should be automatic

2018-03-20 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406068#comment-16406068
 ] 

Marcus Eriksson commented on CASSANDRA-14197:
-

pushed a new version here: 
https://github.com/krummas/cassandra/commits/marcuse/autoupgrade-v2

The idea is that if there is no other compaction to do for a table, we will 
pick the oldest non-upgraded sstable and upgrade it. The patch also adds 
old-sstable-count to {{nodetool tablestats}} and a metric so that users can 
know when they are fully upgraded. There will be no starvation of tables that 
we don't write anything to since we have 
[this|https://github.com/krummas/cassandra/blob/marcuse/autoupgrade-v2/src/java/org/apache/cassandra/service/CassandraDaemon.java#L408]
 which checks every minute whether there are compactions to do. The only time 
we won't run an upgrade for a long time is if we are way behind on compactions, 
and in this case it is probably better to just let standard compactions run the 
upgrades until we catch up. Users can enable this feature either in 
cassandra.yaml or via JMX.
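
The selection rule described above can be sketched roughly as follows. This 
is not the code from the branch: the SSTable fields, the use of a creation 
timestamp to decide which sstable is "oldest", and the function names are all 
assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    name: str
    version: str     # on-disk format version (placeholder values below)
    created_at: int  # hypothetical creation timestamp


def next_background_task(pending_compactions, sstables, current_version):
    """Prefer regular compactions; otherwise upgrade the oldest outdated sstable."""
    if pending_compactions:
        return ("compact", pending_compactions[0])
    outdated = [s for s in sstables if s.version != current_version]
    if not outdated:
        return None
    oldest = min(outdated, key=lambda s: s.created_at)
    return ("upgrade", oldest.name)


def old_sstable_count(sstables, current_version):
    """Number of sstables still on an older format version."""
    return sum(1 for s in sstables if s.version != current_version)


tables = [SSTable("a", "old", 30), SSTable("b", "old", 10), SSTable("c", "new", 5)]
task = next_background_task([], tables, "new")
```

With no compaction pending, the sketch picks sstable "b" (the oldest one 
still on the old version), and the count of old sstables drops to zero once 
every sstable has been rewritten.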

[~KurtG] [~aweisberg] wdyt?

> SSTable upgrade should be automatic
> ---
>
> Key: CASSANDRA-14197
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14197
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
> Fix For: 4.x
>
>
> Upgradesstables should run automatically on node upgrade






[jira] [Commented] (CASSANDRA-9608) Support Java 9

2018-03-20 Thread Kamil (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406122#comment-16406122
 ] 

Kamil commented on CASSANDRA-9608:
--

Any plans to release the fix? 

This issue affects both SpringBoot 
[https://github.com/spring-projects/spring-boot/issues/10453] and 
CassandraUnit: [https://github.com/jsevellec/cassandra-unit/issues/249] 

> Support Java 9
> --
>
> Key: CASSANDRA-9608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9608
> Project: Cassandra
>  Issue Type: Task
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>
> This ticket is intended to group all issues found to support Java 9 in the 
> future.
> From what I've found out so far:
> * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved. 
> It can be easily solved using this patch:
> {code}
> - artifactId="cobertura"/>
> + artifactId="cobertura">
> +  
> +
> {code}
> * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods 
> {{monitorEnter}} + {{monitorExit}}. These methods are used by 
> {{o.a.c.utils.concurrent.Locks}} which is only used by 
> {{o.a.c.db.AtomicBTreeColumns}}.
> I don't intend to start working on this yet, since Java 9 is still at too 
> early a development phase.






[jira] [Comment Edited] (CASSANDRA-12700) During writing data into Cassandra 3.7.0 using Python driver 3.7 sometimes Connection get lost, because of Server NullPointerException

2018-03-20 Thread Pranav Jindal (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406360#comment-16406360
 ] 

Pranav Jindal edited comment on CASSANDRA-12700 at 3/20/18 1:59 PM:


Using Cassandra 3.10, I am facing the issue below.

[~jjirsa] [~jasonstack]
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:310)
 ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ClientState.login(ClientState.java:271) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:80)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517)
 [apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
 [apache-cassandra-3.10.jar:3.10]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.lang.NullPointerException: null
at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getBoolean(UntypedResultSet.java:273)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:88)
 ~[apache-cassandra-3.10.jar:3.10]
... 16 common frames omitted

{code}


was (Author: prnvjndl):
Using cassandra 3.10, facing below issue.

[~jjirsa]
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:310)
 ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ClientState.login(ClientState.java:271) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:80)
 ~[apache-cassandra-3.10.jar:3.10]
at 

[jira] [Assigned] (CASSANDRA-14329) Update logging guidelines and move in-tree

2018-03-20 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski reassigned CASSANDRA-14329:
--

Assignee: Stefan Podkowinski

> Update logging guidelines and move in-tree
> --
>
> Key: CASSANDRA-14329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14329
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
>
> We should update the existing [logging 
> guidelines|https://wiki.apache.org/cassandra/LoggingGuidelines] while moving 
> them in-tree. Maybe also split up between "Configuring Cassandra" and 
> "Contributing to Cassandra".






[jira] [Commented] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406387#comment-16406387
 ] 

Paulo Motta commented on CASSANDRA-14326:
-

bq. The only thing I can think of that will be a bit odd, is to have two 
different logs, both having INFO messages with one being a superset of the 
other.

We could add a new marker, INFO-VERBOSE, that is logged asynchronously to 
verbose-system.log (formerly debug.log). Admins would typically look only at 
system.log and go to verbose-system.log only when facing problems, 
troubleshooting issues, or performing advanced tuning. Ultimately, advanced 
operators could still disable verbose-system.log if/when they're not 
interested in it.

> Handle verbose logging at a different level than DEBUG
> --
>
> Key: CASSANDRA-14326
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14326
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Alexander Dejanovski
>Priority: Major
> Fix For: 4.x
>
>
> CASSANDRA-10241 introduced debug logging, turned on by default, to act as a 
> verbose system.log and help troubleshoot production issues. 
> One of the consequences was a severe impact on read performance in 2.2, as 
> contributors weren't all up to speed on how to use logging levels 
> (CASSANDRA-14318).
> As the DEBUG level has a very specific meaning in development, it is 
> confusing to use it for always-on verbose logging, and it should probably 
> not be used this way in Cassandra.
> Options so far are:
>  # Bring common log messages (compactions, flushes, etc.) back to INFO level 
> and disable debug logging by default
>  # Name the file verbose-system.log instead of debug.log and use a custom 
> logging level instead of DEBUG for verbose tracing, which would be enabled 
> by default. Debug logging would still exist and be disabled by default at 
> the root logger level (not just filtered at the appender level).






[jira] [Commented] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406447#comment-16406447
 ] 

Jeremiah Jordan commented on CASSANDRA-14326:
-

One thing I would suggest is to keep DEBUG filtered to the asynchronous log 
even after this change, so that enabling debug logging has less impact in the 
default logging setup.
I think it is useful to keep all messages in the verbose log; otherwise you 
need to somehow merge or otherwise correlate things between the two files.







[jira] [Created] (CASSANDRA-14328) Invalid metadata has been detected for role

2018-03-20 Thread Pranav Jindal (JIRA)
Pranav Jindal created CASSANDRA-14328:
-

 Summary: Invalid metadata has been detected for role
 Key: CASSANDRA-14328
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14328
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
Reporter: Pranav Jindal


Cassandra Version : 3.10

One node was replaced and came up successfully, but CQL-SH fails with an 
error.
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:310)
 ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ClientState.login(ClientState.java:271) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:80)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517)
 [apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
 [apache-cassandra-3.10.jar:3.10]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.lang.NullPointerException: null
at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getBoolean(UntypedResultSet.java:273)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:88)
 ~[apache-cassandra-3.10.jar:3.10]
... 16 common frames omitted
{code}
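
The trace above shows the NullPointerException surfacing in UntypedResultSet$Row.getBoolean, called while the role manager reads the role's row. One possible reading (an assumption, not confirmed in this report) is that a boolean column of that row has no value on the replaced node. The sketch below is not Cassandra code: the class RoleRowSketch, its methods, and the use of a plain Map as a row are hypothetical stand-ins, and the column names can_login and is_superuser are assumed. It only illustrates how reading a boolean from a missing value throws, and how a check before the read avoids that.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a role row; not the Cassandra implementation. */
final class RoleRowSketch {

    /** Unguarded read: throws NullPointerException when the column has no value. */
    static boolean getBoolean(Map<String, ByteBuffer> row, String column) {
        ByteBuffer value = row.get(column);
        return value.get(value.position()) != 0;
    }

    /** Guard: true only if both (assumed) boolean columns carry a value. */
    static boolean isComplete(Map<String, ByteBuffer> row) {
        return row.get("can_login") != null && row.get("is_superuser") != null;
    }

    public static void main(String[] args) {
        Map<String, ByteBuffer> complete = new HashMap<>();
        complete.put("can_login", ByteBuffer.wrap(new byte[] {1}));
        complete.put("is_superuser", ByteBuffer.wrap(new byte[] {0}));

        Map<String, ByteBuffer> incomplete = new HashMap<>();
        incomplete.put("is_superuser", ByteBuffer.wrap(new byte[] {0}));

        System.out.println("complete row readable: " + isComplete(complete));
        System.out.println("incomplete row readable: " + isComplete(incomplete));
        try {
            getBoolean(incomplete, "can_login");
        } catch (NullPointerException e) {
            System.out.println("unguarded read of a missing value throws NPE");
        }
    }
}
```

If this reading is right, inspecting the role's row for empty boolean columns would be a reasonable first check, in line with the WARN message's advice to confirm the values in the roles table.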






[jira] [Commented] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406374#comment-16406374
 ] 

Stefan Podkowinski commented on CASSANDRA-14326:


I'd prefer option #2 over what we have now, if we can use a SLF4J marker for 
filtering messages in a practical way, as suggested by [~jjordan]. The only 
thing I can think of that will be a bit odd, is to have two different logs, 
both having INFO messages with one being a superset of the other. Although I'm 
aware of the reasons behind this, it's probably not that obvious why you need 
to keep system.log around with verbose-system.log enabled, too. But this is 
really more a minor issue, compared to the benefit of preventing performance 
regressions by just adding a debug statement in hot code paths. 
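
To make the hot-path concern concrete, here is a minimal sketch. It uses the JDK's java.util.logging (FINE standing in for DEBUG) rather than the SLF4J/logback setup Cassandra uses, and the class and method names are hypothetical. It counts how many debug messages get built during a loop of simulated reads, depending on whether the debug level is enabled on the logger.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; JDK logging stands in for SLF4J/logback. */
final class HotPathLoggingSketch {

    /** Runs `reads` simulated reads and returns how many debug messages were built. */
    static int debugMessagesBuilt(Level loggerLevel, int reads) {
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // no console output for the sketch
        log.setLevel(loggerLevel);

        AtomicInteger built = new AtomicInteger();
        for (int i = 0; i < reads; i++) {
            final int row = i;
            // The supplier runs only if FINE is enabled on this logger.
            log.fine(() -> {
                built.incrementAndGet();
                return "read row " + row;
            });
        }
        return built.get();
    }

    public static void main(String[] args) {
        System.out.println("debug enabled:  " + debugMessagesBuilt(Level.FINE, 1000));
        System.out.println("debug disabled: " + debugMessagesBuilt(Level.INFO, 1000));
    }
}
```

With the debug level on by default, every such statement in a read path does its work on every read; with it off at the logger, the message is never built.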







[jira] [Comment Edited] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406387#comment-16406387
 ] 

Paulo Motta edited comment on CASSANDRA-14326 at 3/20/18 2:14 PM:
--

bq. The only thing I can think of that will be a bit odd, is to have two 
different logs, both having INFO messages with one being a superset of the 
other.

We could add a new marker, INFO-VERBOSE, that is logged asynchronously to 
verbose-system.log (formerly debug.log); all other INFO/WARN/ERROR logs would 
go only to system.log. Admins would typically look only at system.log and go 
to verbose-system.log only when facing problems, troubleshooting issues, or 
performing advanced tuning. Ultimately, advanced operators could still 
disable verbose-system.log if/when they're not interested in it.
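
The routing described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the JDK's java.util.logging with a hypothetical custom VERBOSE level in place of an SLF4J marker, in-memory handlers in place of system.log and verbose-system.log, and it leaves out the asynchronous part. All class and field names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch; JDK logging stands in for SLF4J/logback. */
final class VerboseRoutingSketch {

    /** Assumed custom level, placed between INFO (800) and CONFIG (700). */
    static final class VerboseLevel extends Level {
        private static final long serialVersionUID = 1L;
        VerboseLevel() { super("VERBOSE", 750); }
    }

    static final Level VERBOSE = new VerboseLevel();

    /** In-memory stand-in for a log file. */
    static final class ListHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            if (isLoggable(record)) { // applies this handler's level and filter
                lines.add(record.getLevel().getName() + " " + record.getMessage());
            }
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    final ListHandler systemLog = new ListHandler();   // stands in for system.log
    final ListHandler verboseLog = new ListHandler();  // stands in for verbose-system.log
    final Logger logger = Logger.getAnonymousLogger();

    VerboseRoutingSketch() {
        systemLog.setLevel(Level.INFO);                           // INFO and above only
        verboseLog.setLevel(VERBOSE);
        verboseLog.setFilter(r -> r.getLevel().equals(VERBOSE));  // VERBOSE records only
        logger.setUseParentHandlers(false);
        logger.setLevel(VERBOSE);                                 // FINE (debug) stays off
        logger.addHandler(systemLog);
        logger.addHandler(verboseLog);
    }

    public static void main(String[] args) {
        VerboseRoutingSketch s = new VerboseRoutingSketch();
        s.logger.info("Node started");
        s.logger.log(VERBOSE, "Compacted 4 sstables");
        s.logger.fine("per-read detail");
        System.out.println("system.log: " + s.systemLog.lines);
        System.out.println("verbose-system.log: " + s.verboseLog.lines);
    }
}
```

Dropping the filter on the verbose handler would give the superset behaviour discussed earlier in the thread, where the verbose log also contains everything written to system.log.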


was (Author: pauloricardomg):
bq. The only thing I can think of that will be a bit odd, is to have two 
different logs, both having INFO messages with one being a superset of the 
other.

We could add a new marker INFO-VERBOSE, that is logged asynchronously to 
verbose-system.log (former debug.log). Admins would typically only look at 
system.log, and only go to verbose-system.log when facing problems, 
troubleshooting issues or wanting to perform advanced tuning, etc. Ultimately 
advanced operators could still disable the system-verbose.log if/when they're 
not interested in that.







[jira] [Comment Edited] (CASSANDRA-12700) During writing data into Cassandra 3.7.0 using Python driver 3.7 sometimes Connection get lost, because of Server NullPointerException

2018-03-20 Thread Pranav Jindal (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406360#comment-16406360
 ] 

Pranav Jindal edited comment on CASSANDRA-12700 at 3/20/18 1:58 PM:


Using Cassandra 3.10, I am facing the issue below.

[~jjirsa] 
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:310)
 ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ClientState.login(ClientState.java:271) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:80)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517)
 [apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
 [apache-cassandra-3.10.jar:3.10]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.lang.NullPointerException: null
at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getBoolean(UntypedResultSet.java:273)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:88)
 ~[apache-cassandra-3.10.jar:3.10]
... 16 common frames omitted

{code}


was (Author: prnvjndl):
Using cassandra 3.10, facing below issue.

 
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:310)
 ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ClientState.login(ClientState.java:271) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:80)
 ~[apache-cassandra-3.10.jar:3.10]
at 

[jira] [Commented] (CASSANDRA-12700) During writing data into Cassandra 3.7.0 using Python driver 3.7 sometimes Connection get lost, because of Server NullPointerException

2018-03-20 Thread Pranav Jindal (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406360#comment-16406360
 ] 

Pranav Jindal commented on CASSANDRA-12700:
---

Using Cassandra 3.10, I am facing the issue below.

 
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:310)
 ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ClientState.login(ClientState.java:271) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:80)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517)
 [apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
 [apache-cassandra-3.10.jar:3.10]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.lang.NullPointerException: null
at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getBoolean(UntypedResultSet.java:273)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:88)
 ~[apache-cassandra-3.10.jar:3.10]
... 16 common frames omitted

{code}

> During writing data into Cassandra 3.7.0 using Python driver 3.7 sometimes 
> Connection get lost, because of Server NullPointerException
> --
>
> Key: CASSANDRA-12700
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12700
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra cluster with two nodes running C* version 
> 3.7.0 and Python Driver 3.7 using Python 2.7.11. 
> OS: Red Hat Enterprise Linux 6.x x64, 
> RAM :8GB
> DISK :210GB
> Cores: 2
> Java 1.8.0_73 JRE
>Reporter: Rajesh Radhakrishnan
>Assignee: Jeff Jirsa
>Priority: Major
> Fix For: 2.2.9, 3.0.10, 3.10
>
>
> In our C* cluster we are using the latest Cassandra 3.7.0 (datastax-ddc.3.70) 
> with Python driver 3.7. When we try to insert 2 million or more rows into the 
> database, we sometimes get a "Null pointer Exception". 
> We are using Python 2.7.11 and Java 1.8.0_73 on the Cassandra nodes, and 
> Python 2.7.12 on the client.
> {code:title=cassandra server log}
> ERROR [SharedPool-Worker-6] 2016-09-23 09:42:55,002 Message.java:611 - 
> Unexpected exception during request; channel = [id: 0xc208da86, 
> L:/IP1.IP2.IP3.IP4:9042 - R:/IP5.IP6.IP7.IP8:58418]
> java.lang.NullPointerException: null
> at 
> org.apache.cassandra.serializers.BooleanSerializer.deserialize(BooleanSerializer.java:33)
>  ~[apache-cassandra-3.7.0.jar:3.7.0]
> at 
> 

[jira] [Updated] (CASSANDRA-14328) Invalid metadata has been detected for role

2018-03-20 Thread Pranav Jindal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pranav Jindal updated CASSANDRA-14328:
--
Description: 
Cassandra Version : 3.10

One node was replaced and came back up successfully, but CQL-SH fails with an 
error.

 

CQL-SH error:

 
{code:java}
Connection error: ('Unable to connect to any servers', {'10.180.0.150': 
AuthenticationFailed('Failed to authenticate to 10.180.0.150: Error from 
server: code= [Server error] message="java.lang.RuntimeException: Invalid 
metadata has been detected for role utorjwcnruzzlzafxffgyqmlvkxiqcgb"',)})
{code}
 

Cassandra server ERROR:
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:310)
 ~[apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.service.ClientState.login(ClientState.java:271) 
~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:80)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:517)
 [apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
 [apache-cassandra-3.10.jar:3.10]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 [apache-cassandra-3.10.jar:3.10]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
Caused by: java.lang.NullPointerException: null
at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getBoolean(UntypedResultSet.java:273)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:88)
 ~[apache-cassandra-3.10.jar:3.10]
... 16 common frames omitted
{code}
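
The "Caused by" frames above suggest that the NullPointerException is raised while a boolean column of the role's row is being read, which would happen if that cell has no stored value. The following is only a minimal sketch of that failure mode and of a guarded read. It is not Cassandra's code: {{RoleRowSketch}} and its methods are hypothetical, and the column names {{is_superuser}} and {{can_login}} used in the usage example are assumptions.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a result row: column name -> raw bytes (null when the cell is missing).
final class RoleRowSketch {
    private final Map<String, ByteBuffer> cells = new HashMap<>();

    // Stores a one-byte boolean encoding, or a null buffer to model a missing cell.
    RoleRowSketch put(String column, Boolean value) {
        cells.put(column, value == null ? null : ByteBuffer.wrap(new byte[] { (byte) (value ? 1 : 0) }));
        return this;
    }

    // Unguarded read: dereferences the buffer directly, so a missing cell throws NullPointerException.
    boolean getBooleanUnchecked(String column) {
        ByteBuffer raw = cells.get(column);
        return raw.get(raw.position()) != 0;
    }

    // Guarded read: a missing or empty cell falls back to the supplied default instead of throwing.
    boolean getBooleanOrDefault(String column, boolean fallback) {
        ByteBuffer raw = cells.get(column);
        if (raw == null || !raw.hasRemaining()) {
            return fallback;
        }
        return raw.get(raw.position()) != 0;
    }
}
```

For example, a row built with {{new RoleRowSketch().put("is_superuser", true).put("can_login", null)}} throws on {{getBooleanUnchecked("can_login")}} but returns the fallback from {{getBooleanOrDefault("can_login", false)}}.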

  was:
Cassandra Version : 3.10

One node was replaced and was successfully up and working but CQL-SH fails with 
error.
{code:java}
WARN [Native-Transport-Requests-1] 2018-03-20 13:37:17,894 
CassandraRoleManager.java:96 - An invalid value has been detected in the roles 
table for role utorjwcnruzzlzafxffgyqmlvkxiqcgb. If you are unable to login, 
you may need to disable authentication and confirm that values in that table 
are accurate
ERROR [Native-Transport-Requests-1] 2018-03-20 13:37:17,895 Message.java:623 - 
Unexpected exception during request; channel = [id: 0xdfc3604f, 
L:/10.180.0.150:9042 - R:/10.180.0.150:51668]
java.lang.RuntimeException: Invalid metadata has been detected for role 
utorjwcnruzzlzafxffgyqmlvkxiqcgb
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:99)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager$1.apply(CassandraRoleManager.java:82)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:528)
 ~[apache-cassandra-3.10.jar:3.10]
at 
org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:503)
 ~[apache-cassandra-3.10.jar:3.10]

[jira] [Created] (CASSANDRA-14329) Update logging guidelines and move in-tree

2018-03-20 Thread Stefan Podkowinski (JIRA)
Stefan Podkowinski created CASSANDRA-14329:
--

 Summary: Update logging guidelines and move in-tree
 Key: CASSANDRA-14329
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14329
 Project: Cassandra
  Issue Type: Improvement
  Components: Documentation and Website
Reporter: Stefan Podkowinski


We should update the existing [logging 
guidelines|https://wiki.apache.org/cassandra/LoggingGuidelines] while moving 
them in-tree. Maybe also split up between "Configuring Cassandra" and 
"Contributing to Cassandra".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi commented on CASSANDRA-13853:


[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
 Live: 1
 Joining: 0
 Moving: 0
 Leaving: 0
 Unreachable: 0
Data Centers: 
 datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
 system_schema -> Replication class: LocalStrategy {}
 system -> Replication class: LocalStrategy {}
 system_auth -> Replication class: SimpleStrategy {replication_factor=1}
 system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
 system_traces -> Replication class: SimpleStrategy {replication_factor=2}
 whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
 Name: Test Cluster
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 DynamicEndPointSnitch: enabled
 Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 Schema versions:
 7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)
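
The per-datacenter line in the proposed output ({{datacenter1 #Nodes: 1 #Down: 0}}) could be produced along the following lines. This is a rough sketch only, not the attached patch: {{ClusterSummarySketch}} and its {{Node}} type are hypothetical, and a real implementation would take node state from the cluster rather than from a list passed in.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClusterSummarySketch {
    // Hypothetical node record: datacenter name plus whether the node is currently reachable.
    static final class Node {
        final String datacenter;
        final boolean up;

        Node(String datacenter, boolean up) {
            this.datacenter = datacenter;
            this.up = up;
        }
    }

    // Returns one line per datacenter, in first-seen order: "<dc> #Nodes: <n> #Down: <d>".
    static List<String> summarize(List<Node> nodes) {
        Map<String, int[]> counts = new LinkedHashMap<>(); // [0] = total nodes, [1] = down nodes
        for (Node node : nodes) {
            int[] c = counts.computeIfAbsent(node.datacenter, k -> new int[2]);
            c[0]++;
            if (!node.up) {
                c[1]++;
            }
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            lines.add(e.getKey() + " #Nodes: " + e.getValue()[0] + " #Down: " + e.getValue()[1]);
        }
        return lines;
    }
}
```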






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 7:06 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
 Live: 1
 Joining: 0
 Moving: 0
 Leaving: 0
 Unreachable: 0
Data Centers: 
 datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
 system_schema -> Replication class: LocalStrategy {}
 system -> Replication class: LocalStrategy {}
 system_auth -> Replication class: SimpleStrategy {replication_factor=1}
 system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
 system_traces -> Replication class: SimpleStrategy {replication_factor=2}
 whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
 Name: Test Cluster
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 DynamicEndPointSnitch: enabled
 Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 Schema versions:
 7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
 Live: 1
 Joining: 0
 Moving: 0
 Leaving: 0
 Unreachable: 0
Data Centers: 
 datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
 system_schema -> Replication class: LocalStrategy {}
 system -> Replication class: LocalStrategy {}
 system_auth -> Replication class: SimpleStrategy {replication_factor=1}
 system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
 system_traces -> Replication class: SimpleStrategy {replication_factor=2}
 whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
 Name: Test Cluster
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 DynamicEndPointSnitch: enabled
 Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 Schema versions:
 7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 8:49 PM:
-

[~rustyrazorblade]

Does the below output look okay?

{{Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
system_traces -> Replication class: SimpleStrategy 
{replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]}}


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
system_traces -> Replication class: SimpleStrategy 
{replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 8:49 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
system_traces -> Replication class: SimpleStrategy 
{replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

{{Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
system_traces -> Replication class: SimpleStrategy 
{replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]}}

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 8:58 PM:
-

[~rustyrazorblade]

Does the below output look okay?

{{Stats for all nodes:}}
{{    Live: 1}}
{{    Joining: 0}}
{{    Moving: 0}}
{{    Leaving: 0}}
{{    Unreachable: 0}}
{{Data Centers:}}
{{    datacenter1 #Nodes: 1 #Down: 0}}
{{Keyspaces:}}
{{    system_schema -> Replication class: LocalStrategy {}}}
{{    system -> Replication class: LocalStrategy {}}}
{{    system_auth -> Replication class: SimpleStrategy {replication_factor=1}}}
{{    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}}}
{{    system_traces -> Replication class: SimpleStrategy {replication_factor=2}}}
{{    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}}}
{{Cluster Information:}}
{{    Name: Test Cluster}}
{{    Snitch: org.apache.cassandra.locator.SimpleSnitch}}
{{    DynamicEndPointSnitch: enabled}}
{{    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner}}
{{    Schema versions:}}
{{        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]}}


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

{{Stats for all nodes:}}
{{    Live: 1}}
{{    Joining: 0}}
{{    Moving: 0}}
{{    Leaving: 0}}
{{    Unreachable: 0}}
{{Data Centers:}}
{{    datacenter1 #Nodes: 1 #Down: 0}}
{{Keyspaces:}}
{{    system_schema -> Replication class: LocalStrategy {}}}
{{    system -> Replication class: LocalStrategy {}}}
{{    system_auth -> Replication class: SimpleStrategy {replication_factor=1}}}
{{    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}}}
{{    system_traces -> Replication class: SimpleStrategy {replication_factor=2}}}
{{    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}}}
{{Cluster Information:}}
{{    Name: Test Cluster}}
{{    Snitch: org.apache.cassandra.locator.SimpleSnitch}}
{{    DynamicEndPointSnitch: enabled}}
{{    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner}}
{{    Schema versions:}}
{{        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]}}

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Comment Edited] (CASSANDRA-14260) Refactor pair to avoid boxing longs/ints

2018-03-20 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407223#comment-16407223
 ] 

Nate McCall edited comment on CASSANDRA-14260 at 3/20/18 11:32 PM:
---

-Nit on the {{equals()}} and {{hashCode()}} impls - those would be cleaner 
using {{Objects.equals}} and {{Objects.hash}} (JDK version).- We use these in a 
lot of other places in the code already, and it would be nice to keep new 
contributions in line with that. 

Update: [~jjirsa] just pointed out that {{Objects}} autoboxes primitives, which 
was news to me and sort of makes sense when I think about it. 


was (Author: zznate):
Nit on the {{equals()}} and {{hashCode()}} impls - those would be cleaner using 
{{Objects.equals}} and {{Objects.hash}} (JDK version). We use these in a lot of 
other places in the code already and would be nice to keep new contributions in 
line with such. 

> Refactor pair to avoid boxing longs/ints
> 
>
> Key: CASSANDRA-14260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14260
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
> Fix For: 4.x
>
>
> We use Pair all over the place, and in many cases either/both of X and 
> Y are primitives (ints, longs), and we end up boxing them into Integers and 
> Longs. We should have specialized versions that take primitives. 
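
As an illustration of the boxing concern discussed in this thread, a primitive-specialized pair might look like the sketch below. This is not the patch for this ticket: {{LongIntPairSketch}} and its field names are hypothetical. The point it shows is that {{equals()}} and {{hashCode()}} can be written against the primitive fields directly, whereas {{Objects.hash(Object...)}} would box each argument.

```java
// Hypothetical primitive-specialized pair; the name and layout are illustrative only.
final class LongIntPairSketch {
    final long left;
    final int right;

    LongIntPairSketch(long left, int right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof LongIntPairSketch)) {
            return false;
        }
        LongIntPairSketch that = (LongIntPairSketch) o;
        // Direct primitive comparison: no Long or Integer objects are created.
        return left == that.left && right == that.right;
    }

    @Override
    public int hashCode() {
        // Long.hashCode(long) is a static helper that takes a primitive, unlike
        // Objects.hash(Object...), which boxes each argument into an Object[].
        return 31 * Long.hashCode(left) + right;
    }
}
```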






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407060#comment-16407060
 ] 

Jon Haddad edited comment on CASSANDRA-13853 at 3/20/18 9:13 PM:
-

JIRA code blocks use the following format:
 {  code  } 
my code here
 {  code  } 

(spacing added to avoid it getting parsed)


was (Author: rustyrazorblade):
JIRA code blocks use {  code  } (spacing added to avoid it getting parsed)

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407062#comment-16407062
 ] 

Preetika Tyagi commented on CASSANDRA-13853:


OK. I was able to figure out the formatting. I have updated the above comment 
with the output.

I will upload the patch soon.

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-14319) nodetool rebuild from DC lets you pass invalid datacenters

2018-03-20 Thread Vinay Chella (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407169#comment-16407169
 ] 

Vinay Chella commented on CASSANDRA-14319:
--

Hi Jon,

Are you looking for something like the output below when an invalid datacenter 
is passed as an option to rebuild?
{code:java}
$ bin/nodetool rebuild no_dc
nodetool: Provided datacenter: no_dc is not a valid datacenter, available 
datacenters are: datacenter1, datacenter2
See 'nodetool help' or 'nodetool help '.
$ 
{code}
And for
{quote}3. Ideally, we indicate which keyspaces are set to replicate to this DC 
and which aren't
{quote}
By {{this}}, do you mean the datacenter from which the {{rebuild}} command 
is being executed, or the one provided as an option (e.g., {{no_dc}} in the 
example above)? If it is the latter, an invalid DC would not have any 
keyspaces, so what is expected for #3 in this scenario?

 

I am interested in working on this ticket. Can you assign this ticket to me? 

> nodetool rebuild from DC lets you pass invalid datacenters 
> ---
>
> Key: CASSANDRA-14319
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14319
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Priority: Major
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> If you pass an invalid datacenter to nodetool rebuild, you'll get an error 
> like this:
> {code}
> Unable to find sufficient sources for streaming range 
> (3074457345618258602,-9223372036854775808] in keyspace system_distributed
> {code}
> Unfortunately, this is a rabbit hole of frustration if you are using caps for 
> your DC names and you pass in a lowercase DC name, or you just typo the DC.  
> Let's do the following:
> # Check the DC name that's passed in against the list of DCs we know about
> # If we don't find it, let's output a reasonable error, and list all the DCs 
> someone could put in.
> # Ideally we indicate which keyspaces are set to replicate to this DC and 
> which aren't
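
The first two steps of the list above (check the name against the known datacenters, then report the valid ones) can be sketched as follows. This is an assumption-laden illustration, not nodetool's code: {{RebuildDcCheckSketch}} and {{validate}} are hypothetical names, the set of known datacenters is passed in rather than looked up from the cluster, and the message wording simply mirrors the example in the comment above. The comparison is deliberately exact, so a lowercase name does not match an uppercase datacenter.

```java
import java.util.Collection;
import java.util.TreeSet;

final class RebuildDcCheckSketch {
    // Returns null when the name is known, otherwise an error message listing the valid names.
    static String validate(String requestedDc, Collection<String> knownDcs) {
        if (knownDcs.contains(requestedDc)) {
            return null;
        }
        // A TreeSet gives a stable, sorted listing for the message.
        TreeSet<String> sorted = new TreeSet<>(knownDcs);
        return "Provided datacenter: " + requestedDc
                + " is not a valid datacenter, available datacenters are: "
                + String.join(", ", sorted);
    }
}
```

Step 3 (reporting which keyspaces replicate to the datacenter) is not covered here, since it depends on keyspace metadata that this sketch does not model.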






[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-14252:
---
Reviewer: Jay Zhuang

> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is that in our deployment, one slow but alive 
> data node can slow down the whole cluster and even cause our requests to 
> time out. 
> We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect the 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data 
> node's latency is too high.
> I added some debug logging and found that in a lot of cases the score 
> from the remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data node 
> can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try the remote data node if the local one is slow. 
> I tested it in our test cluster, and it significantly improved client latency 
> in the single slow data node case.  
> I flagged this as a Bug because it has caused problems for our use cases 
> multiple times.
>   logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [0.0]_
>  _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  
>  
>  
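
The effect of a default score described above can be illustrated with a deliberately simplified model. This is not the DynamicEndpointSnitch algorithm: {{BadnessSortSketch}}, {{order}}, and the exact threshold comparison are assumptions made for the example. Replicas stay in proximity order unless the closest one scores worse than the best replica by more than the badness threshold, and a replica with no recorded score is treated as having {{defaultScore}}.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class BadnessSortSketch {
    // Simplified model: keep proximity order unless the closest replica's score exceeds
    // the lowest score in the list by more than badnessThreshold (relative).
    // A replica with no recorded score is treated as having defaultScore.
    static List<String> order(List<String> byProximity, Map<String, Double> scores,
                              double badnessThreshold, double defaultScore) {
        if (byProximity.size() < 2) {
            return new ArrayList<>(byProximity);
        }
        double first = scores.getOrDefault(byProximity.get(0), defaultScore);
        double best = first;
        for (String replica : byProximity) {
            best = Math.min(best, scores.getOrDefault(replica, defaultScore));
        }
        if (first <= best * (1.0 + badnessThreshold)) {
            return new ArrayList<>(byProximity);
        }
        List<String> byScore = new ArrayList<>(byProximity);
        // List.sort is stable, so replicas with equal scores keep their proximity order.
        byScore.sort(Comparator.comparingDouble(r -> scores.getOrDefault(r, defaultScore)));
        return byScore;
    }
}
```

With proximity order {{[ip1, ip2]}}, a recorded score of 1.0 for {{ip1}}, no score for {{ip2}}, a threshold of 0.1 and a default of 0.0, this model moves {{ip2}} to the front, which is the behaviour the description asks for when the local node is slow.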






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 8:50 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
system_traces -> Replication class: SimpleStrategy 
{replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]



was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy 
{replication_factor=3}
system_traces -> Replication class: SimpleStrategy 
{replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407165#comment-16407165
 ] 

Jay Zhuang commented on CASSANDRA-14252:


Hi [~dikanggu], instead of committing the change to each branch separately, I 
think it would be better to merge up to later branches:
https://cassandra.apache.org/doc/latest/development/how_to_commit.html

> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is one I found in our deployment: a single slow 
> but alive data node can slow down the whole cluster and even cause our 
> requests to time out.
> We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect the 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data 
> node's latency is too high.
> I added some debug logging and found that in many cases the score for the 
> remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data node 
> can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try the remote data node if the local one is slow.
> I tested it in our test cluster, and it significantly improved client latency 
> in the single slow data node case.
> I flagged this as a Bug because it has caused problems for our use cases 
> multiple times.
>   logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [0.0]_
>  _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  
>  
>  






[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406947#comment-16406947
 ] 

Jay Zhuang commented on CASSANDRA-14252:


+1

Good catch. I created a dtest to reproduce the problem: 
[14252|https://github.com/cooldoger/cassandra-dtest/tree/14252]
Also when comparing 2 versions, the existing code uses {{0.0}} as default 
value: 
[{{DynamicEndpointSnitch.java:267}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java#L267]
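
To illustrate why a missing score matters, here is a minimal, self-contained sketch of a badness check. It is not the code in DynamicEndpointSnitch: the class name BadnessCheckSketch, the method shouldResortByScore, and the defaultToZero switch are hypothetical names used only for illustration, and the comparison is a simplified reading of the behaviour described in the ticket (assuming lower scores are better).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Cassandra implementation.
public class BadnessCheckSketch
{
    /**
     * Returns true when the proximity ordering looks bad enough that
     * the caller should re-sort the replicas by score instead.
     */
    static boolean shouldResortByScore(List<String> proximityOrdered,
                                       Map<String, Double> scores,
                                       double badnessThreshold,
                                       boolean defaultToZero)
    {
        List<Double> inProximityOrder = new ArrayList<>(proximityOrdered.size());
        for (String host : proximityOrdered)
        {
            Double score = scores.get(host);
            if (score == null)
            {
                if (!defaultToZero)
                    return false; // behaviour described in the ticket: give up, keep proximity order
                score = 0.0;      // proposed behaviour: treat an unknown score as zero
            }
            inProximityOrder.add(score);
        }

        List<Double> sorted = new ArrayList<>(inProximityOrder);
        Collections.sort(sorted);

        // Compare each score in proximity order with the score that would
        // occupy the same position if the hosts were ordered by score.
        for (int i = 0; i < sorted.size(); i++)
        {
            if (inProximityOrder.get(i) > sorted.get(i) * (1.0 + badnessThreshold))
                return true;
        }
        return false;
    }

    public static void main(String[] args)
    {
        Map<String, Double> scores = new HashMap<>();
        scores.put("ip1", 1.0); // slow local node; nothing recorded yet for ip2

        List<String> order = Arrays.asList("ip1", "ip2");
        System.out.println(shouldResortByScore(order, scores, 0.1, false));
        System.out.println(shouldResortByScore(order, scores, 0.1, true));
    }
}
```

With the missing score left unpopulated the check bails out early and the slow node stays first; with a default of zero the comparison runs and the re-sort is triggered.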

> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is one I found in our deployment: a single slow 
> but alive data node can slow down the whole cluster and even cause our 
> requests to time out.
> We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect the 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data 
> node's latency is too high.
> I added some debug logging and found that in many cases the score for the 
> remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data node 
> can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try the remote data node if the local one is slow.
> I tested it in our test cluster, and it significantly improved client latency 
> in the single slow data node case.
> I flagged this as a Bug because it has caused problems for our use cases 
> multiple times.
>   logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [0.0]_
>  _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  
>  
>  






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 9:00 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407065#comment-16407065
 ] 

Preetika Tyagi commented on CASSANDRA-13853:


Oh OK. I ended up using "{noformat}". Will try "{ code }" as well next time. 
Thanks! :)

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407141#comment-16407141
 ] 

Dikang Gu commented on CASSANDRA-14252:
---

Thanks  [~jay.zhuang], committed as *f109f200a3a7f6002d7e1f6cc67e9ef5bf5cb2df*



> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is one I found in our deployment: a single slow 
> but alive data node can slow down the whole cluster and even cause our 
> requests to time out.
> We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect the 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data 
> node's latency is too high.
> I added some debug logging and found that in many cases the score for the 
> remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data node 
> can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try the remote data node if the local one is slow.
> I tested it in our test cluster, and it significantly improved client latency 
> in the single slow data node case.
> I flagged this as a Bug because it has caused problems for our use cases 
> multiple times.
>   logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [0.0]_
>  _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  
>  
>  






[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-14252:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is one I found in our deployment: a single slow 
> but alive data node can slow down the whole cluster and even cause our 
> requests to time out.
> We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect the 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data 
> node's latency is too high.
> I added some debug logging and found that in many cases the score for the 
> remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data node 
> can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try the remote data node if the local one is slow.
> I tested it in our test cluster, and it significantly improved client latency 
> in the single slow data node case.
> I flagged this as a Bug because it has caused problems for our use cases 
> multiple times.
>   logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [0.0]_
>  _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  
>  
>  






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 8:53 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
system_traces -> Replication class: SimpleStrategy {replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-14260) Refactor pair to avoid boxing longs/ints

2018-03-20 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407223#comment-16407223
 ] 

Nate McCall commented on CASSANDRA-14260:
-

Nit on the {{equals()}} and {{hashCode()}} impls: those would be cleaner using 
{{Objects.equals}} and {{Objects.hash}} (JDK version). We already use these in 
many other places in the code, and it would be nice to keep new contributions 
in line with that.
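
As a rough illustration of the suggestion, the sketch below shows a primitive-specialised pair whose equals() and hashCode() lean on java.util.Objects. The class name ObjectLongPair and its fields are hypothetical and are not taken from the patch under review.

```java
import java.util.Objects;

// Hypothetical primitive-specialised pair; name and fields are illustrative only.
public final class ObjectLongPair<T>
{
    public final T left;
    public final long right; // stored unboxed

    public ObjectLongPair(T left, long right)
    {
        this.left = left;
        this.right = right;
    }

    @Override
    public boolean equals(Object o)
    {
        if (this == o)
            return true;
        if (!(o instanceof ObjectLongPair))
            return false;
        ObjectLongPair<?> that = (ObjectLongPair<?>) o;
        // Objects.equals is null-safe for the reference field;
        // the primitive field is compared directly.
        return right == that.right && Objects.equals(left, that.left);
    }

    @Override
    public int hashCode()
    {
        // Objects.hash takes varargs, so 'right' is boxed for this call.
        return Objects.hash(left, right);
    }
}
```

One trade-off to keep in mind: Objects.hash allocates a varargs array and boxes the long on each call, so a hot path might prefer combining Objects.hashCode(left) with Long.hashCode(right) by hand.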

> Refactor pair to avoid boxing longs/ints
> 
>
> Key: CASSANDRA-14260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14260
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
> Fix For: 4.x
>
>
> We use Pair all over the place, and in many cases either or both of X and 
> Y are primitives (ints, longs), so we end up boxing them into Integers and 
> Longs. We should have specialized versions that take primitives. 






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 9:12 PM:
-

[~rustyrazorblade]

Does the below output look okay?
{noformat}
Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
system_traces -> Replication class: SimpleStrategy {replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]{noformat}
 


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407060#comment-16407060
 ] 

Jon Haddad commented on CASSANDRA-13853:


JIRA code blocks use {  code  } (spacing added to avoid it getting parsed).

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Assigned] (CASSANDRA-14319) nodetool rebuild from DC lets you pass invalid datacenters

2018-03-20 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang reassigned CASSANDRA-14319:
--

Assignee: Vinay Chella

> nodetool rebuild from DC lets you pass invalid datacenters 
> ---
>
> Key: CASSANDRA-14319
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14319
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Assignee: Vinay Chella
>Priority: Major
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> If you pass an invalid datacenter to nodetool rebuild, you'll get an error 
> like this:
> {code}
> Unable to find sufficient sources for streaming range 
> (3074457345618258602,-9223372036854775808] in keyspace system_distributed
> {code}
> Unfortunately, this is a rabbit hole of frustration if you are using caps for 
> your DC names and you pass in a lowercase DC name, or you just typo the DC.  
> Let's do the following:
> # Check the DC name that's passed in against the list of DCs we know about
> # If we don't find it, let's output a reasonable error, and list all the DCs 
> someone could put in.
> # Ideally we indicate which keyspaces are set to replicate to this DC and 
> which aren't
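
The first two steps above could be sketched roughly as follows. This is not nodetool's implementation: the class DatacenterCheckSketch, the method validateDatacenter, and the wording of the error message are hypothetical, and the third step (reporting which keyspaces replicate to the DC) is left out.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the proposed validation; not nodetool's implementation.
public class DatacenterCheckSketch
{
    /**
     * Throws if the requested datacenter is not one of the known ones.
     * The match is case-sensitive, so "dc1" does not match "DC1".
     */
    static void validateDatacenter(String requested, Set<String> knownDatacenters)
    {
        if (knownDatacenters.contains(requested))
            return;

        // Sort the names so the error message is stable and easy to read.
        Set<String> sorted = new TreeSet<>(knownDatacenters);
        throw new IllegalArgumentException(
            "Unknown datacenter '" + requested + "'. Known datacenters: " + String.join(", ", sorted));
    }

    public static void main(String[] args)
    {
        Set<String> known = new TreeSet<>(Arrays.asList("DC1", "DC2"));
        validateDatacenter("DC1", known); // passes silently
        try
        {
            validateDatacenter("dc1", known); // wrong case, rejected
        }
        catch (IllegalArgumentException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early with the list of valid names spares the operator from having to decode the streaming-range error quoted in the ticket.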






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 8:54 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 9:03 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0
Data Centers:
    datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0

Data Centers:
    datacenter1 #Nodes: 1 #Down: 0

Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}

Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407237#comment-16407237
 ] 

Dikang Gu commented on CASSANDRA-14252:
---

[~jay.zhuang], sure, almost forgot that wiki.

> Use zero as default score in DynamicEndpointSnitch
> --
>
> Key: CASSANDRA-14252
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14252
> Project: Cassandra
>  Issue Type: Bug
>  Components: Coordination
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
> Fix For: 4.0, 3.0.17, 3.11.3
>
> Attachments: IMG_3180.jpg
>
>
> The problem I want to solve is one we found in our deployment: one slow but 
> alive data node can slow down the whole cluster and even cause our requests 
> to time out.
> We are using DynamicEndpointSnitch with a badness_threshold of 0.1. I expect 
> DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data 
> node's latency is too high.
> I added some debug logging and found that in many cases the score from the 
> remote data node was not populated, so the fallback to 
> sortByProximityWithScore never happened. That is why a single slow data node 
> can cause huge problems for the whole cluster.
> In this jira, I'd like to use zero as the default score, so that we get a 
> chance to try the remote data node if the local one is slow.
> I tested it in our test cluster, and it significantly improved client latency 
> in the single slow data node case.
> I flagged this as a Bug because it has caused problems for our use cases 
> multiple times.
>   logs ===
> _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [0.0]_
>  _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: 
> sortByProximityWithBadness: after sorting by proximity, addresses order 
> change to [ip1, ip2], with scores [1.0]_
>  
>  
>  
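
The failure mode described above can be sketched in a few lines of Python. This is a simplified model of the badness check, not Cassandra's Java code: the function name should_resort, the default parameter and the host names are made up for illustration, and only the 0.1 threshold and the lone score of 1.0 are taken from the report.

```python
BADNESS_THRESHOLD = 0.1  # value quoted in the report


def should_resort(ordered_hosts, scores, default=None):
    """Return True if score-based ordering should replace the proximity order.

    ordered_hosts: hosts in the order chosen by the underlying snitch.
    scores: mapping of host -> latency score; hosts may be missing.
    default: score used for hosts without one (None means "skip the host",
             which models the behaviour before the patch).
    """
    ordered_scores = []
    for host in ordered_hosts:
        score = scores.get(host, default)
        if score is None:
            continue  # unscored hosts are ignored entirely
        ordered_scores.append(score)

    # Compare each score, position by position, with the best possible order.
    best_first = sorted(ordered_scores)
    return any(
        actual > best * (1.0 + BADNESS_THRESHOLD)
        for actual, best in zip(ordered_scores, best_first)
    )


scores = {"ip1": 1.0}  # ip2 has not been scored yet, as in the logs above

before_patch = should_resort(["ip1", "ip2"], scores)
after_patch = should_resort(["ip1", "ip2"], scores, default=0.0)
print(before_patch, after_patch)
```

With only one populated score there is nothing to compare against, so the model never falls back to score ordering. Treating the missing score as zero gives the slow local node something to lose to.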






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 8:48 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
Live: 1
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers: 
datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
system_schema -> Replication class: LocalStrategy {}
system -> Replication class: LocalStrategy {}
system_auth -> Replication class: SimpleStrategy {replication_factor=1}
system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
system_traces -> Replication class: SimpleStrategy {replication_factor=2}
whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
 Live: 1
 Joining: 0
 Moving: 0
 Leaving: 0
 Unreachable: 0
Data Centers: 
 datacenter1 #Nodes: 1 #Down: 0
Keyspaces:
 system_schema -> Replication class: LocalStrategy {}
 system -> Replication class: LocalStrategy {}
 system_auth -> Replication class: SimpleStrategy {replication_factor=1}
 system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
 system_traces -> Replication class: SimpleStrategy {replication_factor=2}
 whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}
Cluster Information:
 Name: Test Cluster
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 DynamicEndPointSnitch: enabled
 Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 Schema versions:
 7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406872#comment-16406872
 ] 

Preetika Tyagi edited comment on CASSANDRA-13853 at 3/20/18 9:02 PM:
-

[~rustyrazorblade]

Does the below output look okay?

Stats for all nodes:
    Live: 1
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0

Data Centers:
    datacenter1 #Nodes: 1 #Down: 0

Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: SimpleStrategy {replication_factor=1}
    system_distributed -> Replication class: SimpleStrategy {replication_factor=3}
    system_traces -> Replication class: SimpleStrategy {replication_factor=2}
    whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}

Cluster Information:
    Name: Test Cluster
    Snitch: org.apache.cassandra.locator.SimpleSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]


was (Author: preetikatyagi):
[~rustyrazorblade]

Does the below output look okay?

{{Stats for all nodes:}}
{{        Live: 1}}
{{        Joining: 0}}
{{        Moving: 0}}
{{        Leaving: 0}}
{{        Unreachable: 0}}
{{ Data Centers:}}
{{     datacenter1 #Nodes: 1 #Down: 0}}
{{ Keyspaces:}}
{{     system_schema -> Replication class: LocalStrategy {}}}
{{     system -> Replication class: LocalStrategy {}}}
{{     system_auth -> Replication class: SimpleStrategy {replication_factor=1}}}
{{     system_distributed -> Replication class: SimpleStrategy {replication_factor=3}}}
{{     system_traces -> Replication class: SimpleStrategy {replication_factor=2}}}
{{     whatever -> Replication class: NetworkTopologyStrategy {datacenter1=3}}}

{{Cluster Information:}}
{{     Name: Test Cluster}}
{{     Snitch: org.apache.cassandra.locator.SimpleSnitch}}
{{     DynamicEndPointSnitch: enabled}}
{{     Partitioner: org.apache.cassandra.dht.Murmur3Partitioner}}
{{     Schema versions:}}
{{         7a107b6d-dbd3-32b2-9756-dd25f2f5bd7a: [127.0.0.1]}}

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[2/3] cassandra git commit: use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread dikang
use zero as default score in DynamicEndpointSnitch


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/15bd10af
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/15bd10af
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/15bd10af

Branch: refs/heads/cassandra-3.0
Commit: 15bd10afbfc5eda024e7048438b00bfc81a9e3ea
Parents: cccaf7c
Author: Dikang Gu 
Authored: Wed Feb 21 15:48:11 2018 -0800
Committer: Dikang Gu 
Committed: Tue Mar 20 15:09:32 2018 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 .../org/apache/cassandra/locator/DynamicEndpointSnitchTest.java| 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/15bd10af/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 6099b01..b967580 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.17
+ * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
  * Respect max hint window when hinting for LWT (CASSANDRA-14215)
  * Adding missing WriteType enum values to v3, v4, and v5 spec 
(CASSANDRA-13697)
  * Don't regenerate bloomfilter and summaries on startup (CASSANDRA-11163)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/15bd10af/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java 
b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index 8c255f5..5356e8c 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -186,7 +186,7 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa
         {
             Double score = scores.get(inet);
             if (score == null)
-                continue;
+                score = 0.0;
             subsnitchOrderedScores.add(score);
         }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/15bd10af/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java 
b/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
index af7dc17..866cd82 100644
--- a/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
+++ b/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
@@ -92,7 +92,7 @@ public class DynamicEndpointSnitchTest
         assertEquals(order, dsnitch.getSortedListByProximity(self, Arrays.asList(host1, host2, host3, host4)));
 
         setScores(dsnitch, 20, hosts, 10, 10, 10);
-        order = Arrays.asList(host1, host2, host3, host4);
+        order = Arrays.asList(host4, host1, host2, host3);
         assertEquals(order, dsnitch.getSortedListByProximity(self, Arrays.asList(host1, host2, host3, host4)));
     }
 }





[3/3] cassandra git commit: Use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread dikang
Use zero as default score in DynamicEndpointSnitch


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/29d6237a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/29d6237a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/29d6237a

Branch: refs/heads/cassandra-3.11
Commit: 29d6237a00b577400c4b4a63989d41f1baa2aa60
Parents: 656cca7
Author: Dikang Gu 
Authored: Thu Feb 22 14:35:20 2018 -0800
Committer: Dikang Gu 
Committed: Tue Mar 20 15:10:06 2018 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 .../org/apache/cassandra/locator/DynamicEndpointSnitchTest.java| 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/29d6237a/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index bae967f..27a25cc 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.11.3
+ * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
  * SASI tokenizer for simple delimiter based entries (CASSANDRA-14247)
  * Fix Loss of digits when doing CAST from varint/bigint to decimal 
(CASSANDRA-14170)
  * RateBasedBackPressure unnecessarily invokes a lock on the Guava RateLimiter 
(CASSANDRA-14163)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/29d6237a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java 
b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index 42fc26c..dce25ee 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -234,7 +234,7 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa
         {
             Double score = scores.get(inet);
             if (score == null)
-                continue;
+                score = 0.0;
             subsnitchOrderedScores.add(score);
         }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/29d6237a/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java 
b/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
index 8a59a4a..051d2c2 100644
--- a/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
+++ b/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
@@ -101,7 +101,7 @@ public class DynamicEndpointSnitchTest
         assertEquals(order, dsnitch.getSortedListByProximity(self, Arrays.asList(host1, host2, host3, host4)));
 
         setScores(dsnitch, 20, hosts, 10, 10, 10);
-        order = Arrays.asList(host1, host2, host3, host4);
+        order = Arrays.asList(host4, host1, host2, host3);
         assertEquals(order, dsnitch.getSortedListByProximity(self, Arrays.asList(host1, host2, host3, host4)));
     }
 }





[1/3] cassandra git commit: use zero as default score in DynamicEndpointSnitch

2018-03-20 Thread dikang
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 cccaf7ca2 -> 15bd10afb
  refs/heads/cassandra-3.11 656cca778 -> 29d6237a0
  refs/heads/trunk c09e298a4 -> f109f200a


use zero as default score in DynamicEndpointSnitch


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f109f200
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f109f200
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f109f200

Branch: refs/heads/trunk
Commit: f109f200a3a7f6002d7e1f6cc67e9ef5bf5cb2df
Parents: c09e298
Author: Dikang Gu 
Authored: Wed Feb 21 15:48:11 2018 -0800
Committer: Dikang Gu 
Committed: Tue Mar 20 15:08:36 2018 -0700

--
 CHANGES.txt| 1 +
 src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java   | 2 +-
 .../org/apache/cassandra/locator/DynamicEndpointSnitchTest.java| 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f109f200/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index fbcc1bb..c092a9f 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
  * Handle static and partition deletion properly on 
ThrottledUnfilteredIterator (CASSANDRA-14315)
  * NodeTool clientstats should show SSL Cipher (CASSANDRA-14322)
  * Add ability to specify driver name and version (CASSANDRA-14275)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f109f200/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java 
b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index b9c9ba0..9ea7e05 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -235,7 +235,7 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa
         {
             Double score = scores.get(inet);
             if (score == null)
-                continue;
+                score = 0.0;
             subsnitchOrderedScores.add(score);
         }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f109f200/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java 
b/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
index bf1e4c2..202d7f1 100644
--- a/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
+++ b/test/unit/org/apache/cassandra/locator/DynamicEndpointSnitchTest.java
@@ -100,7 +100,7 @@ public class DynamicEndpointSnitchTest
         assertEquals(order, dsnitch.getSortedListByProximity(self, Arrays.asList(host1, host2, host3, host4)));
 
         setScores(dsnitch, 20, hosts, 10, 10, 10);
-        order = Arrays.asList(host1, host2, host3, host4);
+        order = Arrays.asList(host4, host1, host2, host3);
         assertEquals(order, dsnitch.getSortedListByProximity(self, Arrays.asList(host1, host2, host3, host4)));
     }
 }





[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Preetika Tyagi updated CASSANDRA-13853:
---
Attachment: (was: cassandra-13853.patch)

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853-v2.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407473#comment-16407473
 ] 

Preetika Tyagi commented on CASSANDRA-13853:


[~rustyrazorblade] I have uploaded the new patch. Please let me know if it 
looks okay. Thanks.

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853-v2.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Preetika Tyagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Preetika Tyagi updated CASSANDRA-13853:
---
Attachment: cassandra-13853-v2.patch

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853-v2.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Updated] (CASSANDRA-14320) dtest tools/jmxutils.py JolokiaAgent raises TypeError using json.loads on bytes

2018-03-20 Thread Patrick Bannister (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Bannister updated CASSANDRA-14320:
--
Labels: Python3 dtest python3 repair  (was: Python3 dtest python3)
Status: Patch Available  (was: Open)

Please see [https://github.com/apache/cassandra-dtest/pull/21] for a pull 
request for changes to tools/jmxutils.py to address this bug in the dtests.

If wanted, I can submit a patch instead of a pull request.

> dtest tools/jmxutils.py JolokiaAgent raises TypeError using json.loads on 
> bytes
> ---
>
> Key: CASSANDRA-14320
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14320
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Patrick Bannister
>Priority: Minor
>  Labels: Python3, dtest, python3, repair
> Fix For: 3.0.x, 3.11.x
>
>
> JolokiaAgent in tools/jmxutils.py raises a TypeError when used, because its 
> _query function tries to use json.loads (which only accepts str input before 
> Python 3.6) on a bytes object.
> {code:java}
> def _query(self, body, verbose=True):
>     request_data = json.dumps(body).encode("utf-8")
>     url = 'http://%s:8778/jolokia/' % (self.node.network_interfaces['binary'][0],)
>     req = urllib.request.Request(url)
>     response = urllib.request.urlopen(req, data=request_data, timeout=10.0)
>     if response.code != 200:
>         raise Exception("Failed to query Jolokia agent; HTTP response code: %d; response: %s" % (response.code, response.readlines()))
>     # response is an http.client.HTTPResponse, which subclasses RawIOBase
>     # and therefore returns bytes when read
>     raw_response = response.readline()
>     response = json.loads(raw_response)  # this raises a TypeError now
>     if response['status'] != 200:
>         stacktrace = response.get('stacktrace')
>         if stacktrace and verbose:
>             print("Stacktrace from Jolokia error follows:")
>             for line in stacktrace.splitlines():
>                 print(line)
>         raise Exception("Jolokia agent returned non-200 status: %s" % (response,))
>     return response{code}
> This can be seen clearly by running the deprecated repair tests 
> (repair_tests/deprecated_repair_test.py). They all fail right now because of 
> this TypeError.
> This is a side effect of the migration to Python 3, which makes bytes objects 
> fundamentally different from strings. This will also happen anytime we try to 
> json.loads data returned from stdout or stderr piped from subprocess. I need 
> to take a closer look at offline_tools_test.py and 
> cqlsh_tests/cqlsh_copy_tests.py, because I suspect they're impacted as well.
> We can fix this issue by decoding bytes objects to strings before calling 
> json.loads(). For example, in the above:
> {code:java}
>     response = json.loads(raw_response.decode(encoding='utf-8')){code}
> I have a fix for the JolokiaAgent problem - I'll submit a pull request to 
> cassandra-dtest once I have this issue number to reference.
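
For reference, the decode-before-parse fix proposed above can be tried in isolation. This is a minimal sketch: the helper name loads_bytes and the sample payload are invented for illustration and do not come from tools/jmxutils.py.

```python
import json


def loads_bytes(raw, encoding="utf-8"):
    """Parse JSON from either bytes or str.

    bytes are decoded first so that the call also works on Python
    versions whose json.loads only accepts str.
    """
    if isinstance(raw, (bytes, bytearray)):
        raw = raw.decode(encoding)
    return json.loads(raw)


# Stand-in for the bytes returned by HTTPResponse.readline().
raw_response = b'{"status": 200, "value": {"HeapMemoryUsage": 1024}}'
response = loads_bytes(raw_response)
print(response["status"])
```

The same helper could be applied to output captured from subprocess pipes, which are also bytes unless the process is opened in text mode.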






[jira] [Created] (CASSANDRA-14331) Unable to delete snapshot in Cassandra

2018-03-20 Thread Vikas Gupta (JIRA)
Vikas Gupta created CASSANDRA-14331:
---

 Summary: Unable to delete snapshot in Cassandra
 Key: CASSANDRA-14331
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14331
 Project: Cassandra
  Issue Type: Bug
  Components: Repair
Reporter: Vikas Gupta


I'm using Cassandra 3.11 on Windows 10. When I take a snapshot of the
database, I find that I am unable to delete the snapshot directory
with the nodetool clearsnapshot command while Cassandra is running. If I stop 
Cassandra, I can delete the directory with no problem, but this is not a 
feasible option in a production environment.

Is there a reason why Cassandra must hold onto these files?






[jira] [Commented] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-20 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407376#comment-16407376
 ] 

Stefania commented on CASSANDRA-14308:
--

That's correct: while a compaction is in progress, a txn log file keeps track 
of the new and old sstables. The txn log is only committed once the compaction 
has finished. Upon startup, if txn log files are found, they are examined, and 
if they were not committed, the compaction leftovers are cleared. 

May I suggest checking whether anything prevents the stats component from being 
written, or synced to disk, before the compaction txn is committed?
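
The startup rule described in this comment can be illustrated with a toy model in Python. It is only a sketch of the idea: the record kinds ADD, REMOVE and COMMIT, the tuple layout and the function name are assumptions chosen for the example and are not meant to reproduce Cassandra's on-disk txn log format.

```python
def leftovers_to_delete(records):
    """Decide which sstables to remove when replaying a txn log at startup.

    records: list of (kind, sstable) tuples, where kind is "ADD" for a new
    sstable written by the compaction, "REMOVE" for an old input sstable,
    and "COMMIT" (with sstable None) once the compaction has finished.
    """
    committed = any(kind == "COMMIT" for kind, _ in records)
    # Committed: the new sstables are valid, so the old inputs are leftovers.
    # Not committed: the new sstables are incomplete, so they are leftovers.
    doomed_kind = "REMOVE" if committed else "ADD"
    return [name for kind, name in records if kind == doomed_kind]


interrupted = [("ADD", "mc-3-big"), ("REMOVE", "mc-1-big"), ("REMOVE", "mc-2-big")]
finished = interrupted + [("COMMIT", None)]

print(leftovers_to_delete(interrupted))
print(leftovers_to_delete(finished))
```

In this model an interrupted compaction should never leave its partial output behind after a restart, which is why a surviving sstable without a stats component is worth investigating.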

> Remove invalid SSTables from interrupted compaction
> ---
>
> Key: CASSANDRA-14308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> If the JVM crashes while a compaction is in progress, the incomplete SSTable 
> won't be cleaned up, which causes a "{{Stats component is missing for 
> sstable}}" error in the startup log:
> {noformat}
> ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - Exception in thread Thread[SSTableBatchOpen:3,5,main]
> java.lang.AssertionError: Stats component is missing for sstable /cassandra/data/keyspace/table-id/mc-12345-big
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533) ~[apache-cassandra-3.0.14.jar:3.0.14]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_121]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) [apache-cassandra-3.0.14.jar:3.0.14]
>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {noformat}
> The accumulated incomplete SSTables can take up a lot of space, especially 
> with STCS, which can produce very large SSTables.
> Here is the script we use to delete the SSTables after a node is restarted:
> {noformat}
> grep 'Stats component is missing for sstable' "$SYSTEM_LOG" | awk '{print $8}' > ~/invalid_sstables
> for ss in `cat ~/invalid_sstables`; do echo == $ss; ls -l $ss*; sudo rm $ss*; done
> {noformat}
> I would suggest removing these incomplete SSTables during startup.
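
The log-scanning half of the script above can also be written in Python, which makes it easy to review the affected SSTables before removing anything. This is a sketch only: the function name is made up, the sample text is the error from this report, and nothing is deleted.

```python
import re

# Matches the sstable path prefix that follows the error message.
MISSING_STATS = re.compile(r"Stats component is missing for sstable (\S+)")


def invalid_sstable_prefixes(log_text):
    """Return the distinct sstable path prefixes named in the errors, in order."""
    seen = []
    for match in MISSING_STATS.finditer(log_text):
        prefix = match.group(1)
        if prefix not in seen:
            seen.append(prefix)
    return seen


sample = (
    "ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - "
    "Exception in thread Thread[SSTableBatchOpen:3,5,main]\n"
    "java.lang.AssertionError: Stats component is missing for sstable "
    "/cassandra/data/keyspace/table-id/mc-12345-big\n"
)
prefixes = invalid_sstable_prefixes(sample)
print(prefixes)
```

Each prefix can then be expanded with glob.glob(prefix + "*") to list the component files, mirroring the ls step of the shell loop.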






[jira] [Commented] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-20 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407426#comment-16407426
 ] 

Stefania commented on CASSANDRA-14308:
--

I had a good look at the code, and the stats seem to be synced to disk correctly.

Could you, once the problem reproduces, start with logs at trace level and 
verify whether the compaction leftovers are removed before that exception is shown? 
You should see in the logs {{Removing temporary or obsoleted files from 
unfinished operations for table...}}. Also, you could check whether there are 
any txn log files and, if so, what they contain.

> Remove invalid SSTables from interrupted compaction
> ---
>
> Key: CASSANDRA-14308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>
> If the JVM crashes while a compaction is in progress, the incomplete SSTable 
> won't be cleaned up, which causes a "{{Stats component is missing for 
> sstable}}" error in the startup log:
> {noformat}
> ERROR [SSTableBatchOpen:3] 2018-03-11 00:17:35,597 CassandraDaemon.java:207 - 
> Exception in thread Thread[SSTableBatchOpen:3,5,main]
> java.lang.AssertionError: Stats component is missing for sstable 
> /cassandra/data/keyspace/table-id/mc-12345-big
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:458)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:374)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:533)
>  ~[apache-cassandra-3.0.14.jar:3.0.14]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_121]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_121]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_121]
> at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  [apache-cassandra-3.0.14.jar:3.0.14]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {noformat}
> The accumulated incomplete SSTables can take up a lot of space, especially with 
> STCS, which can produce very large SSTables.
> Here is the script we use to delete the SSTables after the node is restarted:
> {noformat}
> grep 'Stats component is missing for sstable' $SYSTEM_LOG | awk '{print $8}' 
> > ~/invalid_sstables ; for ss in `cat ~/invalid_sstables`; do echo == $ss; ll 
> $ss*; sudo rm $ss* ; done
> {noformat}
> I would suggest removing these incomplete SSTables during startup.






[jira] [Updated] (CASSANDRA-13121) repair progress message breaks legacy JMX support

2018-03-20 Thread Patrick Bannister (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Bannister updated CASSANDRA-13121:
--
Labels: jmx lhf notifications repair  (was: lhf)
Status: Patch Available  (was: Open)

I've developed a patch for 3.0. The same changes should also work for 3.11.

To effectively test the patch, I recommend using the 
TestDeprecatedRepairNotifications dtest available in the CASSANDRA-13121 branch 
of [https://github.com/ptbannister/cassandra-dtest]. 
There's a pull request in for this dtest, but right now it's only available in 
my fork.

> repair progress message breaks legacy JMX support
> -
>
> Key: CASSANDRA-13121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13121
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Scott Bale
>Priority: Minor
>  Labels: lhf, repair, jmx, notifications
> Attachments: 13121-3.0.txt
>
>
> The error progress message in {{RepairRunnable}} is not compliant with the 
> {{LegacyJMXProgressSupport}} class, which uses a regex to match on the text 
> of a progress event. Therefore, actual failures slip through as successes if 
> using legacy JMX for repairs.
> In {{RepairRunnable}}
> {code}
> protected void fireErrorAndComplete(String tag, int progressCount, int 
> totalProgress, String message)
> {
> fireProgressEvent(tag, new ProgressEvent(ProgressEventType.ERROR, 
> progressCount, totalProgress, message));
> fireProgressEvent(tag, new ProgressEvent(ProgressEventType.COMPLETE, 
> progressCount, totalProgress, String.format("Repair command #%d finished with 
> error", cmd)));
> }
> {code}
> Note the {{"Repair command #%d finished with error"}}
> See 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairRunnable.java#L109
> In {{LegacyJMXProgressSupport}}:
> {code}
> protected static final Pattern SESSION_FAILED_MATCHER = 
> Pattern.compile("Repair session .* for range .* failed with error .*");
> protected static final Pattern SESSION_SUCCESS_MATCHER = 
> Pattern.compile("Repair session .* for range .* finished");
> {code}
> See 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/progress/jmx/LegacyJMXProgressSupport.java#L38
> Legacy JMX support was introduced for CASSANDRA-11430 (version 2.2.6) and the 
> bug was introduced as part of CASSANDRA-12279 (version 2.2.8).
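
The mismatch described above can be checked in isolation. The following is a minimal sketch, not the actual Cassandra code: the two patterns are copied from the {{LegacyJMXProgressSupport}} snippet quoted in the ticket, while the class name and the {{classify}} helper are hypothetical, and the use of a full match ({{matches()}}) is an assumption about how the patterns are applied.

```java
import java.util.regex.Pattern;

class LegacyRepairMessageCheck {
    // Patterns copied from the LegacyJMXProgressSupport snippet quoted in the ticket.
    static final Pattern SESSION_FAILED_MATCHER =
        Pattern.compile("Repair session .* for range .* failed with error .*");
    static final Pattern SESSION_SUCCESS_MATCHER =
        Pattern.compile("Repair session .* for range .* finished");

    // Hypothetical helper: reports which pattern, if any, fully matches the message.
    static String classify(String message) {
        if (SESSION_FAILED_MATCHER.matcher(message).matches()) return "FAILED";
        if (SESSION_SUCCESS_MATCHER.matcher(message).matches()) return "SUCCESS";
        return "UNMATCHED";
    }

    public static void main(String[] args) {
        // Message text as produced by the format string in fireErrorAndComplete.
        String fromRepairRunnable = String.format("Repair command #%d finished with error", 1);
        System.out.println(classify(fromRepairRunnable));
        // A message in the shape the legacy patterns expect.
        System.out.println(classify("Repair session abc for range (1,2] failed with error boom"));
    }
}
```

The first message contains "Repair command" rather than "Repair session", so neither legacy pattern can recognise it as a failure.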






[jira] [Commented] (CASSANDRA-13805) dtest failure: repair_tests.repair_test.TestRepair.simple_parallel_repair_test

2018-03-20 Thread Patrick Bannister (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407419#comment-16407419
 ] 

Patrick Bannister commented on CASSANDRA-13805:
---

Hello - since cassci was shut down in late 2017, the description is a dead 
link. There are no details in the ticket on the environment or C* version for 
the test failure.

Currently, repair_tests/repair_test.py::TestRepair::test_simple_parallel_repair 
runs successfully on Ubuntu 16.04 LTS on an x86_64 platform for cassandra-3.0 
and cassandra-3.11.

Does this bug still need attention?

> dtest failure: repair_tests.repair_test.TestRepair.simple_parallel_repair_test
> --
>
> Key: CASSANDRA-13805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13805
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Marcus Eriksson
>Priority: Major
>  Labels: dtest
>
> http://cassci.datastax.com/job/trunk_dtest/1621/testReport/repair_tests/repair_test/TestRepair/simple_parallel_repair_test






[jira] [Created] (CASSANDRA-14332) Fix unbounded validation compactions on repair

2018-03-20 Thread Kurt Greaves (JIRA)
Kurt Greaves created CASSANDRA-14332:


 Summary: Fix unbounded validation compactions on repair
 Key: CASSANDRA-14332
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14332
 Project: Cassandra
  Issue Type: Bug
Reporter: Kurt Greaves


After CASSANDRA-13797 it's possible to cause unbounded, simultaneous validation 
compactions as we no longer wait for validations to finish. Potential fix is to 
have a sane default for the # of concurrent validation compactions
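
As an illustration of what a bound on concurrent validations could look like, here is a minimal sketch that uses a fixed-size thread pool. It is not the Cassandra implementation or the CASSANDRA-13521 patch; the class name, the {{peakConcurrency}} method and the sleep that stands in for validation work are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedValidationSketch {
    // Submits `requests` fake validations to a pool capped at `maxConcurrent`
    // threads and returns the highest number seen running at the same time.
    static int peakConcurrency(int maxConcurrent, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                try {
                    Thread.sleep(20); // stand-in for the validation work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak with cap 2: " + peakConcurrency(2, 10));
    }
}
```

Requests beyond the cap wait in the pool's queue instead of starting immediately, so the number of simultaneous validations stays at or below the configured limit.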






[jira] [Updated] (CASSANDRA-14332) Fix unbounded validation compactions on repair

2018-03-20 Thread Kurt Greaves (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Greaves updated CASSANDRA-14332:
-
Description: After CASSANDRA-13797 it's possible to cause unbounded, 
simultaneous validation compactions as we no longer wait for validations to 
finish. Potential fix is to have a sane default for the # of concurrent 
validation compactions by backporting CASSANDRA-13521 and setting a sane 
default.  (was: After CASSANDRA-13797 it's possible to cause unbounded, 
simultaneous validation compactions as we no longer wait for validations to 
finish. Potential fix is to have a sane default for the # of concurrent 
validation compactions)

> Fix unbounded validation compactions on repair
> --
>
> Key: CASSANDRA-14332
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14332
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Kurt Greaves
>Priority: Major
>
> After CASSANDRA-13797 it's possible to cause unbounded, simultaneous 
> validation compactions as we no longer wait for validations to finish. 
> Potential fix is to have a sane default for the # of concurrent validation 
> compactions by backporting CASSANDRA-13521 and setting a sane default.






[jira] [Commented] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-20 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407273#comment-16407273
 ] 

Jay Zhuang commented on CASSANDRA-14308:


Here is one proposal, Option 1: delete SSTables that have no stats component: 
[14308-wip|https://github.com/cooldoger/cassandra/tree/14308-wip]. I'm not sure 
it's safe to delete these SSTables, though; it may cause data loss if there's a bug.

Option 2: as the incomplete SSTables are generated by compaction, how about 
storing the new SSTable in a temporary directory (e.g. 
{{data/keyspace/table-id/compaction/}})? Once the compaction is done, move the 
new SSTable back to the data directory. Then, on startup, Cassandra can 
clean up any SSTables left in the temporary directory by an interrupted 
compaction. I'm not familiar with compaction, so I'm not sure if it's possible 
or the right way to do it. Any suggestions are welcome.

cc @[~jasobrown], @[~jjirsa]
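
To make the second option concrete, here is a minimal sketch of the write-to-staging-then-move idea, assuming the staging directory lives on the same filesystem as the table directory. It is not Cassandra code: the class, the method names and the single-file "SSTable" are hypothetical simplifications (a real SSTable consists of several component files).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class CompactionStagingSketch {
    static final String STAGING = "compaction"; // hypothetical staging subdirectory

    // Writes the file under <tableDir>/compaction/ and moves it into place only once complete.
    static Path writeThenPublish(Path tableDir, String fileName, byte[] data) {
        try {
            Path staging = Files.createDirectories(tableDir.resolve(STAGING));
            Path tmp = staging.resolve(fileName);
            Files.write(tmp, data);
            return Files.move(tmp, tableDir.resolve(fileName), StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Startup cleanup: anything still in the staging directory was never published.
    static int cleanStaging(Path tableDir) {
        Path staging = tableDir.resolve(STAGING);
        if (!Files.isDirectory(staging)) return 0;
        int removed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(staging)) {
            for (Path f : files) {
                Files.delete(f);
                removed++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return removed;
    }

    // Simulates a leftover from an interrupted compaction, then a clean run.
    static String demo() {
        try {
            Path tableDir = Files.createTempDirectory("table-id");
            Path staging = Files.createDirectories(tableDir.resolve(STAGING));
            Files.write(staging.resolve("mc-1-big-Data.db"), new byte[] {1});
            int removed = cleanStaging(tableDir);
            Path published = writeThenPublish(tableDir, "mc-2-big-Data.db", new byte[] {2});
            return "published=" + Files.exists(published) + " leftoversRemoved=" + removed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch does not address how this would interact with the existing compaction transaction logs; it only shows the directory handling.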







[jira] [Commented] (CASSANDRA-14308) Remove invalid SSTables from interrupted compaction

2018-03-20 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407367#comment-16407367
 ] 

Jeremiah Jordan commented on CASSANDRA-14308:
-

We have compaction transaction log files now (CASSANDRA-7066), so we should be 
able to tell if files were complete or not, as far as I know. CC [~Stefania]







[jira] [Updated] (CASSANDRA-13121) repair progress message breaks legacy JMX support

2018-03-20 Thread Patrick Bannister (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Bannister updated CASSANDRA-13121:
--
Attachment: 13121-3.0.txt

> repair progress message breaks legacy JMX support
> -
>
> Key: CASSANDRA-13121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13121
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Scott Bale
>Priority: Minor
>  Labels: lhf
> Attachments: 13121-3.0.txt
>
>
> The error progress message in {{RepairRunnable}} is not compliant with the 
> {{LegacyJMXProgressSupport}} class, which uses a regex to match on the text 
> of a progress event. Therefore, actual failures slip through as successes if 
> using legacy JMX for repairs.
> In {{RepairRunnable}}
> {code}
> protected void fireErrorAndComplete(String tag, int progressCount, int 
> totalProgress, String message)
> {
> fireProgressEvent(tag, new ProgressEvent(ProgressEventType.ERROR, 
> progressCount, totalProgress, message));
> fireProgressEvent(tag, new ProgressEvent(ProgressEventType.COMPLETE, 
> progressCount, totalProgress, String.format("Repair command #%d finished with 
> error", cmd)));
> }
> {code}
> Note the {{"Repair command #%d finished with error"}}
> See 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairRunnable.java#L109
> In {{LegacyJMXProgressSupport}}:
> {code}
> protected static final Pattern SESSION_FAILED_MATCHER = 
> Pattern.compile("Repair session .* for range .* failed with error .*");
> protected static final Pattern SESSION_SUCCESS_MATCHER = 
> Pattern.compile("Repair session .* for range .* finished");
> {code}
> See 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/progress/jmx/LegacyJMXProgressSupport.java#L38
> Legacy JMX support was introduced for CASSANDRA-11430 (version 2.2.6) and the 
> bug was introduced as part of CASSANDRA-12279 (version 2.2.8).






[jira] [Commented] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406721#comment-16406721
 ] 

Alexander Dejanovski commented on CASSANDRA-14326:
--

 

I agree it would be nice to keep logging incremental, so that verbose 
contains info + verbose, and debug contains info + verbose + debug, but then we 
would have to make 2 changes to enable debug logging at will: 
 * change  to 
 * uncomment the ASYNCDEBUGLOG appender

Otherwise:
 * if the appender is there, something is always written to debug.log (all 
INFO level stuff)
 * and if o.a.c is at DEBUG all the time, every call to logger.debug() will 
have to be in a conditional block to avoid the performance penalty of 
evaluating the calls only to have the appender filter out the debug stuff.

Unless there's a better way of achieving this?
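
For reference, the conditional block mentioned above looks roughly like the following. This is a minimal sketch using java.util.logging as a self-contained stand-in (FINE playing the role of DEBUG); with SLF4J the equivalent guard would be {{logger.isDebugEnabled()}}. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedDebugLogging {
    static final Logger logger = Logger.getLogger("GuardedDebugLogging");
    static final AtomicInteger expensiveCalls = new AtomicInteger();

    // Stand-in for a costly argument, such as serialising a large structure.
    static String expensiveSummary() {
        expensiveCalls.incrementAndGet();
        return "summary";
    }

    // The argument is built even though the message is then discarded.
    static void unguarded() {
        logger.fine("state: " + expensiveSummary());
    }

    // The argument is only built when the logger itself accepts FINE.
    static void guarded() {
        if (logger.isLoggable(Level.FINE)) {
            logger.fine("state: " + expensiveSummary());
        }
    }

    // Returns how many times the expensive argument was evaluated with debug off.
    static int callsWhenDebugOff(boolean guard) {
        logger.setLevel(Level.INFO);
        expensiveCalls.set(0);
        if (guard) guarded(); else unguarded();
        return expensiveCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded evaluations: " + callsWhenDebugOff(false));
        System.out.println("guarded evaluations: " + callsWhenDebugOff(true));
    }
}
```

Note that the guard consults the logger's level, so it only saves work when debug is disabled on the logger itself rather than filtered out later at the appender.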

> Handle verbose logging at a different level than DEBUG
> --
>
> Key: CASSANDRA-14326
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14326
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Alexander Dejanovski
>Priority: Major
> Fix For: 4.x
>
>
> CASSANDRA-10241 introduced debug logging turned on by default to act as a 
> verbose system.log and help troubleshoot production issues. 
> One of the consequences was to severely affect read performance in 2.2, as 
> contributors weren't all up to speed on how to use logging levels 
> (CASSANDRA-14318).
> As DEBUG level has a very specific meaning in dev, it is confusing to use it 
> for always on verbose logging and should probably not be used this way in 
> Cassandra.
> Options so far are :
>  # Bring back common loggings to INFO level (compactions, flushes, etc...) 
> and disable debug logging by default
>  # Use files named as verbose-system.log instead of debug.log and use a 
> custom logging level instead of DEBUG for verbose tracing, that would be 
> enabled by default. Debug logging would still exist and be disabled by 
> default in the root logger (not just filtered at the appender level).






[jira] [Commented] (CASSANDRA-14155) [TRUNK] Gossiper somewhat frequently hitting an NPE on node startup with dtests at org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:769)

2018-03-20 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406751#comment-16406751
 ] 

Sam Tunnicliffe commented on CASSANDRA-14155:
-

I'm not sure that the scenario above can happen quite as described. When 
{{loadRingState}} adds the endpoints to {{endpointStateMap}} they're created 
with a brand new {{HeartBeatState}}, one with {{(generation, version) == (0, 
0)}}. In {{Gossiper::examineGossiper}}, the empty digest list in a shadow SYN 
is replaced with a list containing one digest for every known endpoint and 
these are also initialized with {{(0,0)}}. So if a node were to finish its 
shadow round, load ring state, start gossip and immediately receive a shadow 
round SYN from a peer, it would not include any state for that peer as the 
generation/version in the digest would match the one in the local epState. 

Of course though, the stacktrace in the description certainly indicates that 
the epStates map obtained from the shadow round did contain a state for the 
node in question and that its {{HOST_ID}} appState is missing. So I'm all for 
adding the check & assertion error in {{isSafeForStartup}}, although I think we 
ought to log more detail here, probably the epStates map in its entirety. I'm 
less comfortable with changing the behaviour of the shadow round if we're not 
really clear on what's causing it. As we've only seen this sporadically in 
tests, how do you feel about adding the assertion (& any other error logging 
that may be useful) and seeing if that helps us track down the cause if/when we 
see the error in future test runs? My fear is that this is a symptom of a more 
pernicious race like the ones in CASSANDRA-13700 & CASSANDRA-11825.
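
The kind of check being suggested could be sketched as below. This is a simplified, hypothetical model and not the actual {{Gossiper}} code: endpoint states are represented as plain string maps, and the class and method names are invented for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

class ShadowRoundCheckSketch {
    // Hypothetical model: endpoint -> (application state name -> value).
    static String hostIdOrFail(Map<String, Map<String, String>> epStates, String endpoint) {
        Map<String, String> state = epStates.get(endpoint);
        if (state == null) return null; // the shadow round returned nothing for this endpoint
        String hostId = state.get("HOST_ID");
        if (hostId == null) {
            // Fail with the whole map in the message instead of a bare NPE.
            throw new AssertionError("Missing HOST_ID for " + endpoint + "; epStates=" + epStates);
        }
        return hostId;
    }

    // Builds a state that lacks HOST_ID and returns the resulting error message.
    static String demo() {
        Map<String, String> appStates = new TreeMap<>();
        appStates.put("STATUS", "NORMAL");
        Map<String, Map<String, String>> epStates = new TreeMap<>();
        epStates.put("127.0.0.1", appStates);
        try {
            return hostIdOrFail(epStates, "127.0.0.1");
        } catch (AssertionError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Including the full map in the error message is what would make a sporadic test failure diagnosable after the fact.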

> [TRUNK] Gossiper somewhat frequently hitting an NPE on node startup with 
> dtests at 
> org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:769)
> 
>
> Key: CASSANDRA-14155
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14155
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Kjellman
>Assignee: Jason Brown
>Priority: Major
>
> Gossiper is somewhat frequently hitting an NPE on node startup with dtests at 
> org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:769)
> {code}
> test teardown failure
> Unexpected error found in node logs (see stdout for full details). Errors: 
> [ERROR [main] 2018-01-08 21:41:01,832 CassandraDaemon.java:675 - Exception 
> encountered during startup
> java.lang.NullPointerException: null
> at 
> org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:769) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:511)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:761)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:621)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:568)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:360) 
> [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569)
>  [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:658) 
> [main/:na], ERROR [main] 2018-01-08 21:41:01,832 CassandraDaemon.java:675 - 
> Exception encountered during startup
> java.lang.NullPointerException: null
> at 
> org.apache.cassandra.gms.Gossiper.isSafeForStartup(Gossiper.java:769) 
> ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:511)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:761)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:621)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:568)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:360) 
> [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569)
>  [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:658) 
> [main/:na]]
> {code}




[jira] [Commented] (CASSANDRA-14197) SSTable upgrade should be automatic

2018-03-20 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406756#comment-16406756
 ] 

Ariel Weisberg commented on CASSANDRA-14197:


I am +1 on this formulation of the change.

> SSTable upgrade should be automatic
> ---
>
> Key: CASSANDRA-14197
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14197
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Major
> Fix For: 4.x
>
>
> Upgradesstables should run automatically on node upgrade






[jira] [Commented] (CASSANDRA-14232) Add metric for coordinator writes per column family

2018-03-20 Thread Jaydeepkumar Chovatia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406707#comment-16406707
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-14232:
---

Thanks [~sumanth.pasupuleti]

Updated patch looks good to me.

> Add metric for coordinator writes per column family
> ---
>
> Key: CASSANDRA-14232
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14232
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Major
> Fix For: 4.0
>
> Attachments: 14232-trunk.txt
>
>
> Includes write ops and latencies at coordinator per column family.
> Relevant discussion in dev mailing list - 
> [https://lists.apache.org/thread.html/f68f694b13b670a1fa28fa75620304603fc89e94ec515933199f4c37@%3Cdev.cassandra.apache.org%3E]
> Below are a few advantages of having such a metric:
>  * Ability to identify the specific column family to which coordinator writes 
> are slow
>  * Also useful in a multi-tenant cluster, where different column families are 
> owned by different teams






[jira] [Created] (CASSANDRA-14330) TBA

2018-03-20 Thread Aleksey Yeschenko (JIRA)
Aleksey Yeschenko created CASSANDRA-14330:
-

 Summary: TBA
 Key: CASSANDRA-14330
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14330
 Project: Cassandra
  Issue Type: Bug
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 3.0.x, 3.11.x, 4.x









[jira] [Updated] (CASSANDRA-14326) Handle verbose logging at a different level than DEBUG

2018-03-20 Thread Alexander Dejanovski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Dejanovski updated CASSANDRA-14326:
-
Description: 
CASSANDRA-10241 introduced debug logging turned on by default to act as a 
verbose system.log and help troubleshoot production issues. 

One of the consequences was to severely affect read performance in 2.2, as 
contributors weren't all up to speed on how to use logging levels 
(CASSANDRA-14318).

As DEBUG level has a very specific meaning in dev, it is confusing to use it 
for always on verbose logging and should probably not be used this way in 
Cassandra.

Options so far are :
 # Bring back common loggings to INFO level (compactions, flushes, etc...) and 
disable debug logging by default
 # Use files named as verbose-system.log instead of debug.log and use a custom 
logging level instead of DEBUG for verbose tracing, that would be enabled by 
default. Debug logging would still exist and be disabled by default in the root 
logger (not just filtered at the appender level).

  was:
CASSANDRA-10241 introduced debug logging turned on by default to act as a 
verbose system.log and help troubleshoot production issues. 

One of the consequence was to severely affect read performance in 2.2 as 
contributors weren't all up to speed on how to use logging levels 
(CASSANDRA-14318).

As DEBUG level has a very specific meaning in dev, it is confusing to use it 
for always on verbose logging and should probably not be used this way in 
Cassandra.

Options so far are :
 # Bring back common loggings to INFO level (compactions, flushes, etc...) and 
disable debug logging by default
 # Use files named as verbose-system.log instead of debug.log and use a custom 
logging level instead of DEBUG for verbose tracing, that would be enabled by 
default. Debug logging would still exist and be disabled by default and the 
root logger level (not just filtered at the appender level).








[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-03-20 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406896#comment-16406896
 ] 

Jon Haddad commented on CASSANDRA-13853:


Yes, that's a significant improvement.  

If you could put preformatted text in a code block next time it'll make it a 
bit easier to read.  Thanks!  Looking forward to the updated patch.

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Fix For: 4.x
>
> Attachments: cassandra-13853.patch
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)


