[jira] [Updated] (CASSANDRA-14246) Cassandra fails to start after upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-14246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-14246: --- Labels: proposed-wontfix (was: ) > Cassandra fails to start after upgrade > -- > > Key: CASSANDRA-14246 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14246 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Rajnesh Siwal >Priority: Critical > Labels: proposed-wontfix > > We are unable to start Cassandra after it has been upgraded. > We have a cluster of three nodes. After upgrading the first node, we found > that the service failed to start on it because of incompatible SSTables: > The older version: 2.0.17 > New version: 2.1.20 > Cassandra fails to start because it found an incompatible SSTable: > INFO 08:38:18 Opening > /var/lib/cassandra/data/cp_orgstmt_keyspace/cp_event_processing/cp_orgstmt_keyspace-cp_event_processing-jb-2 > (975 bytes) > java.lang.RuntimeException: Incompatible SSTable found. Current version ka is > unable to read file: > /var/lib/cassandra/data/cp_emb_us_keyspace/cp_event_processing/cp_emb_us_keyspace-cp_event_processing-ic-3. > Please run upgradesstables. > java.lang.RuntimeException: Incompatible SSTable found. Current version ka is > unable to read file: > /var/lib/cassandra/data/cp_emb_us_keyspace/cp_event_processing/cp_emb_us_keyspace-cp_event_processing-ic-3. > Please run upgradesstables. > Exception encountered during startup: Incompatible SSTable found. Current > version ka is unable to read file: > /var/lib/cassandra/data/cp_emb_us_keyspace/cp_event_processing/cp_emb_us_keyspace-cp_event_processing-ic-3. > Please run upgradesstables. > > While the Cassandra server is not running, we cannot run "nodetool > upgradesstables". > Please advise. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14246) Cassandra fails to start after upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-14246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372410#comment-16372410 ] Jeff Jirsa commented on CASSANDRA-14246: This doesn't appear to be a Cassandra bug, but rather, you missed running upgradesstables when you went from 1.2 to 2.0. You need to use the offline sstableupgrade tool from 2.0, which will convert the files with {{-ic}} into {{-jb}}, which can be read by 2.1.
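To illustrate why the node refuses to start: the SSTable format version is embedded in the file name ({{ic}} from the 1.2 era, {{jb}} from 2.0, {{ka}} from 2.1), so an operator can spot files needing an offline upgrade before starting the new version. The sketch below is a hypothetical pre-flight check; the compatibility table in it is an assumption for illustration, not the project's actual compatibility matrix.

```python
import re

# Which on-disk format versions each running version can read.
# Illustrative assumption only: 2.1 ("ka") reads 2.0 ("jb") files
# but not 1.2-era ("ic") files, matching the startup error above.
READABLE_BY = {
    "ka": {"ka", "jb"},
}

def sstable_version(path):
    # e.g. ".../cp_emb_us_keyspace-cp_event_processing-ic-3" -> "ic"
    m = re.search(r"-([a-z]{2})-\d+$", path)
    return m.group(1) if m else None

def incompatible_files(paths, current="ka"):
    # Return every SSTable whose format the current version cannot read.
    readable = READABLE_BY.get(current, {current})
    return [p for p in paths if sstable_version(p) not in readable]
```

Files flagged this way would need the offline sstableupgrade tool from the previous release before the new version can start.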
[jira] [Commented] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
[ https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372397#comment-16372397 ] Jeff Jirsa commented on CASSANDRA-14247: cc [~jrwest] who's somewhat familiar with this part of the codebase. > SASI tokenizer for simple delimiter based entries > - > > Key: CASSANDRA-14247 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: mck >Assignee: mck >Priority: Major > Fix For: 4.0, 3.11.x > > > Currently SASI offers only two tokenizer options: > - NonTokenizerAnalyser > - StandardAnalyzer > The latter is built upon Snowball, powerful for human languages but overkill > for simple tokenization. > A simple tokenizer is proposed here. The need for this arose as a workaround > of CASSANDRA-11182, and to avoid the disk usage explosion when having to > resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 > Example use of this would be: > {code} > CREATE CUSTOM INDEX span_annotation_query_idx > ON zipkin2.span (annotation_query) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > WITH OPTIONS = { > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', > 'delimiter': '░', > 'case_sensitive': 'true', > 'mode': 'prefix', > 'analyzed': 'true'}; > {code} > Original credit for this work goes to https://github.com/zuochangan
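Conceptually, the proposed analyzer just splits the stored value on the configured delimiter, with none of Snowball's stemming or language analysis. A minimal Python sketch of that behavior (an illustration, not the actual Java implementation) might be:

```python
def delimiter_tokenize(value, delimiter="░"):
    # Split on the configured delimiter and drop empty tokens --
    # no stemming, stop words, or language-specific analysis.
    return [token for token in value.split(delimiter) if token]
```

With {{'mode': 'prefix'}}, each emitted token would then be indexed for prefix matching rather than forcing a {{CONTAINS}} scan.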
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372394#comment-16372394 ] Jeff Jirsa commented on CASSANDRA-14239: Memtables auto-size to 25% of the heap - that means you're getting 25GB memtables, which is just silly big. Cap them to a few GB (probably 1-2GB), so the flushes are more frequent but smaller. > OutOfMemoryError when bootstrapping with less than 100GB RAM > > > Key: CASSANDRA-14239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14239 > Project: Cassandra > Issue Type: Bug > Environment: Details of the bootstrapping Node > * ProLiant BL460c G7 > * 56GB RAM > * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and > saved_caches) > * CentOS 7.4 on SD-Card > * /tmp and /var/log on tmpfs > * Oracle JDK 1.8.0_151 > * Cassandra 3.11.1 > Cluster > * 10 existing Nodes (Up and Normal) >Reporter: Jürgen Albersdorfer >Priority: Major > Attachments: Objects-by-class.csv, > Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, > jvm.options, jvm_opts.txt, stack-traces.txt > > > Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on > our 10 Node C* 3.11.1 Cluster. > During bootstrap, when I watch the cassandra.log I observe a growth in JVM > Heap Old Gen which gets not significantly freed up any more. > I know that JVM collects on Old Gen only when really needed. I can see > collections, but there is always a remainder which seems to grow forever > without ever getting freed. > After the Node successfully Joined the Cluster, I can remove the extra RAM I > have given it for bootstrapping without any further effect. > It feels like Cassandra will not forget about every single byte streamed over > the Network over time during bootstrapping, - which would be a memory leak > and a major problem, too. > I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB > assigned JVM Heap). 
YourKit Profiler shows a huge amount of memory allocated > for org.apache.cassandra.db.Memtable (22 GB), > org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer > (11 GB).
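The suggested cap corresponds to the memtable sizing knobs in cassandra.yaml. A sketch of the change (the 2048 MB figure is an illustrative value in the 1-2GB range suggested above, not a tested recommendation):

```yaml
# cassandra.yaml -- cap memtable space explicitly instead of the
# default, which is derived from the JVM heap size (1/4 of heap)
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
```

Smaller memtables mean more frequent but much smaller flushes, keeping old-gen heap pressure down during bootstrap streaming.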
[jira] [Commented] (CASSANDRA-12813) NPE in auth for bootstrapping node
[ https://issues.apache.org/jira/browse/CASSANDRA-12813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372381#comment-16372381 ] Varsha Mahadevan commented on CASSANDRA-12813: -- Is there a known workaround to get past this issue? I have a node to bootstrap into my cluster but cannot immediately undertake a version upgrade of cassandra. My node bootstrap does not succeed due to this error. > NPE in auth for bootstrapping node > -- > > Key: CASSANDRA-12813 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12813 > Project: Cassandra > Issue Type: Bug >Reporter: Charles Mims >Assignee: Alex Petrov >Priority: Major > Fix For: 2.2.9, 3.0.10, 3.10 > > > {code} > ERROR [SharedPool-Worker-1] 2016-10-19 21:40:25,991 Message.java:617 - > Unexpected exception during request; channel = [id: 0x15eb017f, / omitted>:40869 => /10.0.0.254:9042] > java.lang.NullPointerException: null > at > org.apache.cassandra.auth.PasswordAuthenticator.doAuthenticate(PasswordAuthenticator.java:144) > ~[apache-cassandra-3.0.9.jar:3.0.9] > at > org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:86) > ~[apache-cassandra-3.0.9.jar:3.0.9] > at > org.apache.cassandra.auth.PasswordAuthenticator.access$100(PasswordAuthenticator.java:54) > ~[apache-cassandra-3.0.9.jar:3.0.9] > at > org.apache.cassandra.auth.PasswordAuthenticator$PlainTextSaslAuthenticator.getAuthenticatedUser(PasswordAuthenticator.java:182) > ~[apache-cassandra-3.0.9.jar:3.0.9] > at > org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:78) > ~[apache-cassandra-3.0.9.jar:3.0.9] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) > [apache-cassandra-3.0.9.jar:3.0.9] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) > [apache-cassandra-3.0.9.jar:3.0.9] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > 
[netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) > [netty-all-4.0.23.Final.jar:4.0.23.Final] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_101] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > [apache-cassandra-3.0.9.jar:3.0.9] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [apache-cassandra-3.0.9.jar:3.0.9] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101] > {code} > I have a node that has been joining for around 24 hours. My application is > configured with the IP address of the joining node in the list of nodes to > connect to (ruby driver), and I have been getting around 200 events of this > NPE per hour. I removed the IP of the joining node from the list of nodes > for my app to connect to and the errors stopped.
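The only workaround noted in this report is on the client side: keep the joining node out of the driver's contact list until it finishes bootstrapping, exactly as the reporter did. A trivial sketch of that idea (the function name is illustrative, not a driver API):

```python
def usable_contact_points(contact_points, joining_nodes):
    # Drop nodes that are still bootstrapping (status Joining) so auth
    # requests are never routed to them; re-add them once Up/Normal.
    joining = set(joining_nodes)
    return [node for node in contact_points if node not in joining]
```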
[jira] [Commented] (CASSANDRA-11559) Enhance node representation
[ https://issues.apache.org/jira/browse/CASSANDRA-11559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372373#comment-16372373 ] Alex Lourie commented on CASSANDRA-11559: - I'm going to take a swing at this one. I'm going to tentatively call the interface IVirtualEndpoint (but clearly open to other suggestions), and will start using the implementation in all instances instead of InetAddressAndPort where relevant. I'd love to hear others' opinions before I go too deep into a refactoring spree. Thanks! > Enhance node representation > --- > > Key: CASSANDRA-11559 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11559 > Project: Cassandra > Issue Type: Sub-task > Components: Distributed Metadata >Reporter: Paulo Motta >Assignee: Alex Lourie >Priority: Minor > > We currently represent nodes as {{InetAddress}} objects on {{TokenMetadata}}, > which causes difficulties when replacing a node with the same address (see > CASSANDRA-8523 and CASSANDRA-9244). > Since CASSANDRA-4120 we index hosts by {{UUID}} in gossip, so I think it's > time to move that representation to {{TokenMetadata}}. > I propose representing nodes as {{InetAddress, UUID}} pairs on > {{TokenMetadata}}, encapsulated in a {{VirtualNode}} interface, so it will be > backward compatible with the current representation, while still allowing us > to enhance it in the future with additional metadata (and improved vnode > handling) if needed. > This change will probably affect interfaces of internal classes like > {{TokenMetadata}} and {{AbstractReplicationStrategy}}, so I'd like to hear > from integrators and other developers if it's possible to change these > without major hassle or if we need to wait until 4.0. > Besides updating {{TokenMetadata}} and {{AbstractReplicationStrategy}} (and > subclasses), we will also need to replace all {{InetAddress}} uses with > {{VirtualNode.getEndpoint()}} calls on {{StorageService}} and related classes > and tests.
We would probably already be able to replace some > {{TokenMetadata.getHostId(InetAddress endpoint)}} calls with > {{VirtualNode.getHostId()}}. > While we will still be dealing with {{InetAddress}} on {{StorageService}} in > this initial stage, in the future I think we should pass {{VirtualNode}} > instances around and only translate from {{VirtualNode}} to {{InetAddress}} > in the network layer. > Public interfaces like {{IEndpointSnitch}} will not be affected by this.
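The core of the proposal is that a node's identity becomes an (address, host UUID) pair instead of a bare address, so replacing a node that reuses the same IP yields a distinct identity. A hypothetical sketch of that shape (illustrative only; the real interface would be Java and its final name is still under discussion):

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class VirtualNode:
    # Node identity = (endpoint address, gossip host id). Two nodes
    # can share an endpoint across a replacement yet remain distinct,
    # because equality includes the UUID.
    endpoint: str
    host_id: uuid.UUID

    def get_endpoint(self):
        return self.endpoint

    def get_host_id(self):
        return self.host_id
```

Keying token metadata by such objects (rather than by {{InetAddress}}) is what avoids the same-address replacement ambiguity described above.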
[jira] [Assigned] (CASSANDRA-11559) Enhance node representation
[ https://issues.apache.org/jira/browse/CASSANDRA-11559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Lourie reassigned CASSANDRA-11559: --- Assignee: Alex Lourie
[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dikang Gu updated CASSANDRA-14252: -- Status: Patch Available (was: Open) > Use zero as default score in DynamicEndpointSnitch > -- > > Key: CASSANDRA-14252 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14252 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: Dikang Gu >Assignee: Dikang Gu >Priority: Major > Fix For: 3.11.x > > > The problem I want to solve is that in our deployment, one slow but > alive data node can slow down the whole cluster and even cause our > requests to time out. > We are using DynamicEndpointSnitch with badness_threshold 0.1. I expect the > DynamicEndpointSnitch to switch to sortByProximityWithScore if the local data > node's latency is too high. > I added some debug logging and figured out that in a lot of cases, the score > for a remote data node was not populated, so the fallback to > sortByProximityWithScore never happened. That's why a single slow data node > can cause huge problems for the whole cluster. > In this jira, I'd like to use zero as the default score, so that we get a > chance to try a remote data node if the local one is slow. > I tested it in our test cluster; it improved the client latency significantly > in the single-slow-data-node case. > I flag this as a Bug because it caused problems for our use cases multiple > times. > > logs === > _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: > sortByProximityWithBadness: after sorting by proximity, addresses order > change to [ip1, ip2], with scores [1.0]_ > _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: > sortByProximityWithBadness: after sorting by proximity, addresses order > change to [ip1, ip2], with scores [0.0]_ > _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: > sortByProximityWithBadness: after sorting by proximity, addresses order > change to [ip1, ip2], with scores [1.0]_ > _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: > sortByProximityWithBadness: after sorting by proximity, addresses order > change to [ip1, ip2], with scores [1.0]_
[jira] [Commented] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372232#comment-16372232 ] Dikang Gu commented on CASSANDRA-14252: --- |[patch | https://github.com/DikangGu/cassandra/commit/6cfa9f1ee05a76dd4b510bff6f8bea347d118d42]|[unit test | https://circleci.com/gh/DikangGu/cassandra/9]|
[jira] [Updated] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dikang Gu updated CASSANDRA-14252: -- Description: The problem I want to solve is that I found in our deployment, one slow but alive data node can slow down the whole cluster, even caused timeout of our requests. We are using DynamicEndpointSnitch, with badness_threshold 0.1. I expect the DynamicEndpointSnitch switch to sortByProximityWithScore, if local data node latency is too high. I added some debug log, and figured out that in a lot of cases, the score from remote data node was not populated, so the fallback to sortByProximityWithScore never happened. That's why a single slow data node, can cause huge problems to the whole cluster. In this jira, I'd like to use zero as default score, so that we will get a chance to try remote data node, if local one is slow. I tested it in our test cluster, it improved the client latency in single slow data node case significantly. I flag this as a Bug, because it caused problems to our use cases multiple times. logs === _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [1.0]_ _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [0.0]_ _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [1.0]_ _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [ip1, ip2], with scores [1.0]_ was: The problem I want to solve is that I found in our deployment, one slow but alive data node can slow down the whole cluster, even caused timeout of our requests. 
We are using DynamicEndpointSnitch, with badness_threshold 0.1. I expect the DynamicEndpointSnitch switch to sortByProximityWithScore, if local data node latency is too high. I added some debug log, and figured out that in a lot of cases, the score from remote data node was not populated, so the fallback to sortByProximityWithScore never happened. That's why a single slow data node, can cause huge problems to the whole cluster. In this jira, I'd like to use zero as default score, so that we will get a chance to try remote data node, if local one is slow. I tested it in our test cluster, it improved the client latency in single slow data node case significantly. I flag this as a Bug, because it caused problems to our use cases multiple times. logs === _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:5113:face:0:2d:0, /2401:db00:1030:90fb:face:0:5:0], with scores [1.0]_ _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:510d:face:0:5:0, /2401:db00:1030:a119:face:0:b:0], with scores [0.0]_ _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:5113:face:0:2d:0, /2401:db00:1030:a119:face:0:b:0], with scores [1.0]_ _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:5113:face:0:2d:0, /2401:db00:1030:90fb:face:0:5:0], with scores [1.0]_ > Use zero as default score in DynamicEndpointSnitch > -- > > Key: CASSANDRA-14252 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14252 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: Dikang Gu >Assignee: Dikang Gu >Priority: Major > Fix For: 3.11.x > > > The problem I want to solve is that I 
found in our deployment, one slow but > alive data node can slow down the whole cluster, even caused timeout of our > requests. > We are using DynamicEndpointSnitch, with badness_threshold 0.1. I expect the > DynamicEndpointSnitch switch to sortByProximityWithScore, if local data node > latency is too high. > I added some debug log, and figured out that in a lot of cases, the score > from remote data node was not populated, so the fallback to > sortByProximityWithScore never happened. That's why a single slow data node, > can cause huge problems to the whole cluster. > In this jira, I'd like to use zero as default score, so that we will get a > chance to try remote data node, if local one is slow. > I tested it in our test
[jira] [Created] (CASSANDRA-14252) Use zero as default score in DynamicEndpointSnitch
Dikang Gu created CASSANDRA-14252: - Summary: Use zero as default score in DynamicEndpointSnitch Key: CASSANDRA-14252 URL: https://issues.apache.org/jira/browse/CASSANDRA-14252 Project: Cassandra Issue Type: Bug Components: Coordination Reporter: Dikang Gu Assignee: Dikang Gu Fix For: 3.11.x The problem I want to solve is that I found in our deployment, one slow but alive data node can slow down the whole cluster, even caused timeout of our requests. We are using DynamicEndpointSnitch, with badness_threshold 0.1. I expect the DynamicEndpointSnitch switch to sortByProximityWithScore, if local data node latency is too high. I added some debug log, and figured out that in a lot of cases, the score from remote data node was not populated, so the fallback to sortByProximityWithScore never happened. That's why a single slow data node, can cause huge problems to the whole cluster. In this jira, I'd like to use zero as default score, so that we will get a chance to try remote data node, if local one is slow. I tested it in our test cluster, it improved the client latency in single slow data node case significantly. I flag this as a Bug, because it caused problems to our use cases multiple times. 
logs === _2018-02-21_23:08:57.54145 WARN 23:08:57 [RPC-Thread:978]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:5113:face:0:2d:0, /2401:db00:1030:90fb:face:0:5:0], with scores [1.0]_ _2018-02-21_23:08:57.54319 WARN 23:08:57 [RPC-Thread:967]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:510d:face:0:5:0, /2401:db00:1030:a119:face:0:b:0], with scores [0.0]_ _2018-02-21_23:08:57.55111 WARN 23:08:57 [RPC-Thread:453]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:5113:face:0:2d:0, /2401:db00:1030:a119:face:0:b:0], with scores [1.0]_ _2018-02-21_23:08:57.55687 WARN 23:08:57 [RPC-Thread:753]: sortByProximityWithBadness: after sorting by proximity, addresses order change to [/2401:db00:30:5113:face:0:2d:0, /2401:db00:1030:90fb:face:0:5:0], with scores [1.0]_
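A conceptual Python sketch of the proposed fix (not the actual Java implementation in DynamicEndpointSnitch): a node whose latency score was never populated defaults to 0.0 instead of being excluded, so the badness comparison can still trigger a score-based re-sort away from a slow local node.

```python
BADNESS_THRESHOLD = 0.1

def sorted_by_proximity_with_badness(nodes, scores):
    # nodes: replicas already sorted by proximity (local/rack first).
    # scores: latency-derived scores, higher = worse; a missing entry
    # defaults to 0.0 -- the change this ticket proposes.
    first_score = scores.get(nodes[0], 0.0)
    for node in nodes[1:]:
        # If the proximity-preferred node is more than badness_threshold
        # worse than an alternative, fall back to sorting purely by score.
        if first_score > scores.get(node, 0.0) * (1 + BADNESS_THRESHOLD):
            return sorted(nodes, key=lambda n: scores.get(n, 0.0))
    return list(nodes)
```

With the old behavior, an unpopulated remote score meant the comparison never fired and the slow local node kept winning; with a zero default, the fallback can kick in.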
[jira] [Comment Edited] (CASSANDRA-14251) View replica is not written to pending endpoint when base replica is also view replica
[ https://issues.apache.org/jira/browse/CASSANDRA-14251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372203#comment-16372203 ] Paulo Motta edited comment on CASSANDRA-14251 at 2/21/18 11:22 PM: --- Added regression [dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/dbf6eba59da67edaa50930046fc440430fe61d2d] with different RFs and [restored|https://github.com/pauloricardomg/cassandra/commit/1a06c97a7896621f346b806e6369a820b75fdd75] correct behavior along with a NEWS.txt asking MV users potentially affected by this to run repair on the view to ensure all replicas are up to date. ||3.0||3.11||trunk||dtest|| |[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14251]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:3.11-14251]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14251]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14251]| Will run internal CI and post results here when ready. was (Author: pauloricardomg): Added regression [dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/dbf6eba59da67edaa50930046fc440430fe61d2d] and [restored|https://github.com/pauloricardomg/cassandra/commit/1a06c97a7896621f346b806e6369a820b75fdd75] correct behavior along with a NEWS.txt asking MV users potentially affected by this to run repair on the view to ensure all replicas are up to date. ||3.0||3.11||trunk||dtest|| |[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14251]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:3.11-14251]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14251]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14251]| Will run internal CI and post results here when ready. 
> View replica is not written to pending endpoint when base replica is also > view replica > -- > > Key: CASSANDRA-14251 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14251 > Project: Cassandra > Issue Type: Bug > Components: Materialized Views >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Major > Fix For: 4.0, 3.0.17, 3.11.3 > > > From the [dev > list|https://www.mail-archive.com/dev@cassandra.apache.org/msg12084.html]: > bq. There's an optimization that when we're lucky enough that the paired view > replica is the same as this base replica, mutateMV doesn't use the normal > view-mutation-sending code (wrapViewBatchResponseHandler) and just writes the > mutation locally. In particular, in this case we do NOT write to the pending > node (unless I'm missing something). But, sometimes all replicas will be > paired with themselves - this can happen for example when number of nodes is > equal to RF, or when the base and view table have the same partition keys > (but different clustering keys). In this case, it seems the pending node will > not be written at all... > This was a regression from CASSANDRA-13069 and the original behavior should > be restored.
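A conceptual sketch of the restored behavior (illustrative Python, not the actual mutateMV Java code; all function names here are hypothetical): even when the paired view replica is the local node, so the mutation takes the fast local-apply path instead of the batchlog path, pending endpoints must still receive the view mutation, or a bootstrapping view replica silently misses writes.

```python
def write_view_mutation(mutation, paired_replica, local,
                        pending_endpoints, apply_locally, send):
    # Fast path: the paired view replica is this very node.
    if paired_replica == local:
        apply_locally(mutation)
    else:
        send(mutation, paired_replica)
    # The step the regression dropped on the fast path: pending
    # (bootstrapping) endpoints must always get the view write too.
    for endpoint in pending_endpoints:
        send(mutation, endpoint)
```

This matters precisely in the cases the report lists (cluster size equal to RF, or base and view sharing partition keys), where every replica is paired with itself and so every write takes the fast path.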
[jira] [Updated] (CASSANDRA-14251) View replica is not written to pending endpoint when base replica is also view replica
[ https://issues.apache.org/jira/browse/CASSANDRA-14251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14251: Reproduced In: 3.11.1, 3.0.15 (was: 3.0.15, 3.11.1) Status: Patch Available (was: Open) Added regression [dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/dbf6eba59da67edaa50930046fc440430fe61d2d] and [restored|https://github.com/pauloricardomg/cassandra/commit/1a06c97a7896621f346b806e6369a820b75fdd75] correct behavior along with a NEWS.txt asking MV users potentially affected by this to run repair on the view to ensure all replicas are up to date. ||3.0||3.11||trunk||dtest|| |[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-14251]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:3.11-14251]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14251]|[branch|https://github.com/apache/cassandra-dtest/compare/master...pauloricardomg:14251]| Will run internal CI and post results here when ready. > View replica is not written to pending endpoint when base replica is also > view replica > -- > > Key: CASSANDRA-14251 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14251 > Project: Cassandra > Issue Type: Bug > Components: Materialized Views >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Major > Fix For: 4.0, 3.0.17, 3.11.3 > > > From the [dev > list|https://www.mail-archive.com/dev@cassandra.apache.org/msg12084.html]: > bq. There's an optimization that when we're lucky enough that the paired view > replica is the same as this base replica, mutateMV doesn't use the normal > view-mutation-sending code (wrapViewBatchResponseHandler) and just writes the > mutation locally. In particular, in this case we do NOT write to the pending > node (unless I'm missing something). 
But, sometimes all replicas will be > paired with themselves - this can happen for example when number of nodes is > equal to RF, or when the base and view table have the same partition keys > (but different clustering keys). In this case, it seems the pending node will > not be written at all... > This was a regression from CASSANDRA-13069 and the original behavior should > be restored. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
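The failure mode described in CASSANDRA-14251 is easy to see in a small sketch. The following Python is purely illustrative (mutate_mv, the node names, and the fixed flag are hypothetical stand-ins, not Cassandra's actual API): when every base replica is paired with itself, the buggy local-write shortcut never touches the pending endpoint.

```python
# Illustrative sketch of the CASSANDRA-14251 regression; all names are
# hypothetical, not Cassandra's real API.

def mutate_mv(base_replica, paired_view_replica, pending_endpoints, fixed=True):
    """Return the set of endpoints a view mutation is written to."""
    targets = set()
    if paired_view_replica == base_replica:
        # Optimization: the paired view replica is the local node, so write
        # locally instead of going through the batchlog-based remote path.
        targets.add(base_replica)
        if fixed:
            # Restored behavior: pending (e.g. bootstrapping) view replicas
            # must still receive the write, or they silently miss data.
            targets.update(pending_endpoints)
    else:
        targets.add(paired_view_replica)
        targets.update(pending_endpoints)
    return targets

# When number of nodes == RF, every base replica pairs with itself:
pending = {"node4"}  # a bootstrapping view replica
print(mutate_mv("node1", "node1", pending, fixed=False))  # {'node1'} -- pending endpoint skipped
```

With fixed=True the same call returns both the local node and the pending endpoint, which is the behavior the patch restores.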
[jira] [Updated] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed
[ https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-11163: --- Reviewer: Chris Lohfink (was: Jeff Jirsa) > Summaries are needlessly rebuilt when the BF FP ratio is changed > > > Key: CASSANDRA-11163 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11163 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Brandon Williams >Assignee: Kurt Greaves >Priority: Major > Fix For: 3.0.x, 3.11.x, 4.x > > > This is from trunk, but I also saw this happen on 2.0: > Before: > {noformat} > root@bw-1:/srv/cassandra# ls -ltr > /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/ > total 221460 > drwxr-xr-x 2 root root 4096 Feb 11 23:34 backups > -rw-r--r-- 1 root root80 Feb 11 23:50 ma-6-big-TOC.txt > -rw-r--r-- 1 root root 26518 Feb 11 23:50 ma-6-big-Summary.db > -rw-r--r-- 1 root root 10264 Feb 11 23:50 ma-6-big-Statistics.db > -rw-r--r-- 1 root root 2607705 Feb 11 23:50 ma-6-big-Index.db > -rw-r--r-- 1 root root192440 Feb 11 23:50 ma-6-big-Filter.db > -rw-r--r-- 1 root root10 Feb 11 23:50 ma-6-big-Digest.crc32 > -rw-r--r-- 1 root root 35212125 Feb 11 23:50 ma-6-big-Data.db > -rw-r--r-- 1 root root 2156 Feb 11 23:50 ma-6-big-CRC.db > -rw-r--r-- 1 root root80 Feb 11 23:50 ma-7-big-TOC.txt > -rw-r--r-- 1 root root 26518 Feb 11 23:50 ma-7-big-Summary.db > -rw-r--r-- 1 root root 10264 Feb 11 23:50 ma-7-big-Statistics.db > -rw-r--r-- 1 root root 2607614 Feb 11 23:50 ma-7-big-Index.db > -rw-r--r-- 1 root root192432 Feb 11 23:50 ma-7-big-Filter.db > -rw-r--r-- 1 root root 9 Feb 11 23:50 ma-7-big-Digest.crc32 > -rw-r--r-- 1 root root 35190400 Feb 11 23:50 ma-7-big-Data.db > -rw-r--r-- 1 root root 2152 Feb 11 23:50 ma-7-big-CRC.db > -rw-r--r-- 1 root root80 Feb 11 23:50 ma-5-big-TOC.txt > -rw-r--r-- 1 root root104178 Feb 11 23:50 ma-5-big-Summary.db > -rw-r--r-- 1 root root 10264 Feb 11 23:50 ma-5-big-Statistics.db > -rw-r--r-- 1 root root 
10289077 Feb 11 23:50 ma-5-big-Index.db > -rw-r--r-- 1 root root757384 Feb 11 23:50 ma-5-big-Filter.db > -rw-r--r-- 1 root root 9 Feb 11 23:50 ma-5-big-Digest.crc32 > -rw-r--r-- 1 root root 139201355 Feb 11 23:50 ma-5-big-Data.db > -rw-r--r-- 1 root root 8508 Feb 11 23:50 ma-5-big-CRC.db > root@bw-1:/srv/cassandra# md5sum > /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/ma-5-big-Summary.db > 5fca154fc790f7cfa37e8ad6d1c7552c > {noformat} > BF ratio changed, node restarted: > {noformat} > root@bw-1:/srv/cassandra# ls -ltr > /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/ > total 242168 > drwxr-xr-x 2 root root 4096 Feb 11 23:34 backups > -rw-r--r-- 1 root root80 Feb 11 23:50 ma-6-big-TOC.txt > -rw-r--r-- 1 root root 10264 Feb 11 23:50 ma-6-big-Statistics.db > -rw-r--r-- 1 root root 2607705 Feb 11 23:50 ma-6-big-Index.db > -rw-r--r-- 1 root root192440 Feb 11 23:50 ma-6-big-Filter.db > -rw-r--r-- 1 root root10 Feb 11 23:50 ma-6-big-Digest.crc32 > -rw-r--r-- 1 root root 35212125 Feb 11 23:50 ma-6-big-Data.db > -rw-r--r-- 1 root root 2156 Feb 11 23:50 ma-6-big-CRC.db > -rw-r--r-- 1 root root80 Feb 11 23:50 ma-7-big-TOC.txt > -rw-r--r-- 1 root root 10264 Feb 11 23:50 ma-7-big-Statistics.db > -rw-r--r-- 1 root root 2607614 Feb 11 23:50 ma-7-big-Index.db > -rw-r--r-- 1 root root192432 Feb 11 23:50 ma-7-big-Filter.db > -rw-r--r-- 1 root root 9 Feb 11 23:50 ma-7-big-Digest.crc32 > -rw-r--r-- 1 root root 35190400 Feb 11 23:50 ma-7-big-Data.db > -rw-r--r-- 1 root root 2152 Feb 11 23:50 ma-7-big-CRC.db > -rw-r--r-- 1 root root80 Feb 11 23:50 ma-5-big-TOC.txt > -rw-r--r-- 1 root root 10264 Feb 11 23:50 ma-5-big-Statistics.db > -rw-r--r-- 1 root root 10289077 Feb 11 23:50 ma-5-big-Index.db > -rw-r--r-- 1 root root757384 Feb 11 23:50 ma-5-big-Filter.db > -rw-r--r-- 1 root root 9 Feb 11 23:50 ma-5-big-Digest.crc32 > -rw-r--r-- 1 root root 139201355 Feb 11 23:50 ma-5-big-Data.db > -rw-r--r-- 1 root root 8508 Feb 11 23:50 
ma-5-big-CRC.db > -rw-r--r-- 1 root root80 Feb 12 00:03 ma-8-big-TOC.txt > -rw-r--r-- 1 root root 14902 Feb 12 00:03 ma-8-big-Summary.db > -rw-r--r-- 1 root root 10264 Feb 12 00:03 ma-8-big-Statistics.db > -rw-r--r-- 1 root root 1458631 Feb 12 00:03 ma-8-big-Index.db > -rw-r--r-- 1 root root 10808 Feb 12 00:03 ma-8-big-Filter.db > -rw-r--r-- 1 root root10 Feb 12 00:03 ma-8-big-Digest.crc32 > -rw-r--r-- 1 root root 19660275 Feb 12 00:03 ma-8-big-Data.db > -rw-r--r-- 1 root root 1204 Feb 12 00:03 ma-8-big-CRC.db > -rw-r--r-- 1
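The intended behavior in CASSANDRA-11163 can be summarized as a small decision sketch (hypothetical Python, not Cassandra's code): changing the bloom filter FP chance should invalidate only Filter.db, while Summary.db should be rebuilt only when the summary's own parameters (the index intervals) change.

```python
# Hypothetical sketch of the desired component-rebuild decision: a BF FP
# ratio change should not force a Summary.db rebuild.

def components_to_rebuild(old_params, new_params):
    rebuild = set()
    if old_params["bloom_filter_fp_chance"] != new_params["bloom_filter_fp_chance"]:
        rebuild.add("Filter.db")  # the bloom filter depends on the FP chance
    if (old_params["min_index_interval"] != new_params["min_index_interval"]
            or old_params["max_index_interval"] != new_params["max_index_interval"]):
        rebuild.add("Summary.db")  # only interval changes affect the summary
    return rebuild

old = {"bloom_filter_fp_chance": 0.01, "min_index_interval": 128,
       "max_index_interval": 2048}
new = dict(old, bloom_filter_fp_chance=0.1)
print(components_to_rebuild(old, new))  # {'Filter.db'} -- Summary.db untouched
```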
[jira] [Updated] (CASSANDRA-14251) View replica is not written to pending endpoint when base replica is also view replica
[ https://issues.apache.org/jira/browse/CASSANDRA-14251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-14251: Component/s: Materialized Views > View replica is not written to pending endpoint when base replica is also > view replica > -- > > Key: CASSANDRA-14251 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14251 > Project: Cassandra > Issue Type: Bug > Components: Materialized Views >Reporter: Paulo Motta >Assignee: Paulo Motta >Priority: Major > Fix For: 4.0, 3.0.17, 3.11.3 > > > From the [dev > list|https://www.mail-archive.com/dev@cassandra.apache.org/msg12084.html]: > bq. There's an optimization that when we're lucky enough that the paired view > replica is the same as this base replica, mutateMV doesn't use the normal > view-mutation-sending code (wrapViewBatchResponseHandler) and just writes the > mutation locally. In particular, in this case we do NOT write to the pending > node (unless I'm missing something). But, sometimes all replicas will be > paired with themselves - this can happen for example when number of nodes is > equal to RF, or when the base and view table have the same partition keys > (but different clustering keys). In this case, it seems the pending node will > not be written at all... > This was a regression from CASSANDRA-13069 and the original behavior should > be restored. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14251) View replica is not written to pending endpoint when base replica is also view replica
Paulo Motta created CASSANDRA-14251: --- Summary: View replica is not written to pending endpoint when base replica is also view replica Key: CASSANDRA-14251 URL: https://issues.apache.org/jira/browse/CASSANDRA-14251 Project: Cassandra Issue Type: Bug Reporter: Paulo Motta Assignee: Paulo Motta Fix For: 4.0, 3.0.17, 3.11.3 >From the [dev >list|https://www.mail-archive.com/dev@cassandra.apache.org/msg12084.html]: bq. There's an optimization that when we're lucky enough that the paired view replica is the same as this base replica, mutateMV doesn't use the normal view-mutation-sending code (wrapViewBatchResponseHandler) and just writes the mutation locally. In particular, in this case we do NOT write to the pending node (unless I'm missing something). But, sometimes all replicas will be paired with themselves - this can happen for example when number of nodes is equal to RF, or when the base and view table have the same partition keys (but different clustering keys). In this case, it seems the pending node will not be written at all... This was a regression from CASSANDRA-13069 and the original behavior should be restored. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra-builds git commit: create a separate jenkins job for running dtests in docker
Repository: cassandra-builds
Updated Branches:
  refs/heads/master d3cd2e8ce -> 27bc20645

create a separate jenkins job for running dtests in docker

Project: http://git-wip-us.apache.org/repos/asf/cassandra-builds/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-builds/commit/27bc2064
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-builds/tree/27bc2064
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-builds/diff/27bc2064

Branch: refs/heads/master
Commit: 27bc2064575d236c9b2f5878f8fafed35b8ba6fc
Parents: d3cd2e8
Author: Marcus Eriksson
Authored: Wed Feb 21 14:02:38 2018 -0800
Committer: Marcus Eriksson
Committed: Wed Feb 21 14:02:38 2018 -0800
--
 docker/jenkins/dtest.sh                   |  9 +
 docker/jenkins/jenkinscommand.sh          | 13 ++
 jenkins-dsl/cassandra_job_dsl_seed.groovy | 55 ++
 3 files changed, 77 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra-builds/blob/27bc2064/docker/jenkins/dtest.sh
--
diff --git a/docker/jenkins/dtest.sh b/docker/jenkins/dtest.sh
new file mode 100644
index 000..87dd77f
--- /dev/null
+++ b/docker/jenkins/dtest.sh
@@ -0,0 +1,9 @@
+#!/bin/sh
+export WORKSPACE=/home/cassandra/cassandra
+export LANG=en_US.UTF-8
+export PYTHONIOENCODING=utf-8
+export PYTHONUNBUFFERED=true
+git clone --depth=1 --branch=$BRANCH https://github.com/$REPO/cassandra.git
+cd cassandra
+git clone --branch=$DTEST_BRANCH $DTEST_REPO
+../cassandra-builds/build-scripts/cassandra-dtest-pytest.sh

http://git-wip-us.apache.org/repos/asf/cassandra-builds/blob/27bc2064/docker/jenkins/jenkinscommand.sh
--
diff --git a/docker/jenkins/jenkinscommand.sh b/docker/jenkins/jenkinscommand.sh
new file mode 100644
index 000..f218ff9
--- /dev/null
+++ b/docker/jenkins/jenkinscommand.sh
@@ -0,0 +1,13 @@
+#!/bin/sh
+cat > env.list
[jira] [Resolved] (CASSANDRA-14249) Dtests aren't working on python3.5
[ https://issues.apache.org/jira/browse/CASSANDRA-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson resolved CASSANDRA-14249. - Resolution: Not A Bug Turns out this was user error: this is the error you get when `python3-dev` isn't installed (or failed to install). > Dtests aren't working on python3.5 > -- > > Key: CASSANDRA-14249 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14249 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Philip Thompson >Assignee: Ariel Weisberg >Priority: Major > > I'm running the default python3.5 on a fresh xenial box (so 3.5.1, I think). > I install a python3.5 virtualenv with `python3 -m venv venv`. From in that > virtualenv, I install the dtest requirements.txt, then I run > {code} > pytest --cassandra-dir=\$CASSANDRA_DIR --use-vnodes > --skip-resource-intensive-tests --verbose > {code} > This gives me the following error: > {code:java} > 00:06:47.056 File > "/home/user/venv/lib/python3.5/site-packages/_pytest/config.py", line 329, in > _getconftestmodules > 00:06:47.056 return self._path2confmods[path] > 00:06:47.056 KeyError: local('/home/user/cassandra-dtest') > 00:06:47.056 > 00:06:47.056 During handling of the above exception, another exception > occurred: > 00:06:47.056 Traceback (most recent call last): > 00:06:47.056 File > "/home/user/venv/lib/python3.5/site-packages/_pytest/config.py", line 360, in > _importconftest > 00:06:47.056 return self._conftestpath2mod[conftestpath] > 00:06:47.056 KeyError: local('/home/user/cassandra-dtest/conftest.py') > {code} > This same process works flawlessly on python 3.6. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
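A rough way to detect the missing python3-dev headers up front (an illustrative sketch, not part of the dtest tooling) is to look for Python.h in the interpreter's include directory:

```python
# Sanity check for the CPython development headers (the python3-dev package),
# whose absence produced the misleading pytest/conftest error above.
import os
import sysconfig

def has_python_headers():
    """True if Python.h is present in this interpreter's include directory."""
    include_dir = sysconfig.get_paths().get("include", "")
    return os.path.isfile(os.path.join(include_dir, "Python.h"))

print(has_python_headers())
```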
[jira] [Commented] (CASSANDRA-8675) COPY TO/FROM broken for newline characters
[ https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371947#comment-16371947 ] Simone Franzini commented on CASSANDRA-8675: [~vonalim] Thanks, the additional change to formatting.py did the trick! This is working for us now. > COPY TO/FROM broken for newline characters > -- > > Key: CASSANDRA-8675 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8675 > Project: Cassandra > Issue Type: Bug > Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native > protocol v3] > Ubuntu 14.04 64-bit >Reporter: Lex Lythius >Priority: Major > Labels: cqlsh > Fix For: 2.1.3 > > Attachments: copytest.csv > > > Exporting/importing does not preserve contents when texts containing newline > (and possibly other) characters are involved: > {code:sql} > cqlsh:test> create table if not exists copytest (id int primary key, t text); > cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline > ... character'); > cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " > character'); > cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t > character (typed backslash, t)'); > cqlsh:test> select * from copytest; > id | t > +- > 1 | This has a newline\ncharacter > 2 |This has a quote " character > 3 | This has a fake tab \t character (entered slash-t text) > (3 rows) > cqlsh:test> copy copytest to '/tmp/copytest.csv'; > 3 rows exported in 0.034 seconds. > cqlsh:test> copy copytest from '/tmp/copytest.csv'; > 3 rows imported in 0.005 seconds. > cqlsh:test> select * from copytest; > id | t > +--- > 1 | This has a newlinencharacter > 2 | This has a quote " character > 3 | This has a fake tab \t character (typed backslash, t) > (3 rows) > {code} > I tried replacing \n in the CSV file with \\n, which just expands to \n in > the table; and with an actual newline character, which fails with error since > it prematurely terminates the record. 
> It seems backslashes are only used to take the following character as a > literal > Until this is fixed, what would be the best way to refactor an old table with > a new, incompatible structure maintaining its content and name, since we > can't rename tables? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
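For comparison, standard CSV handling preserves embedded newlines by quoting the field, which is the round-trip behavior COPY was missing here (illustrative Python using the stdlib csv module, independent of cqlsh):

```python
# An embedded newline survives a CSV round-trip when the writer quotes the
# field -- the behavior cqlsh's COPY lacked in this report.
import csv
import io

row = [1, "This has a newline\ncharacter"]
buf = io.StringIO()
csv.writer(buf).writerow(row)  # QUOTE_MINIMAL quotes the newline field
buf.seek(0)
restored = next(csv.reader(buf))
assert restored[1] == row[1]   # the newline is preserved verbatim
print(repr(restored[1]))
```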
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371877#comment-16371877 ] Chris Lohfink commented on CASSANDRA-14239: --- Have cfstats/schema (names obfuscated or something) output from one of the replicas? > OutOfMemoryError when bootstrapping with less than 100GB RAM > > > Key: CASSANDRA-14239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14239 > Project: Cassandra > Issue Type: Bug > Environment: Details of the bootstrapping Node > * ProLiant BL460c G7 > * 56GB RAM > * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and > saved_caches) > * CentOS 7.4 on SD-Card > * /tmp and /var/log on tmpfs > * Oracle JDK 1.8.0_151 > * Cassandra 3.11.1 > Cluster > * 10 existing Nodes (Up and Normal) >Reporter: Jürgen Albersdorfer >Priority: Major > Attachments: Objects-by-class.csv, > Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, > jvm.options, jvm_opts.txt, stack-traces.txt > > > Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on > our 10 Node C* 3.11.1 Cluster. > During bootstrap, when I watch the cassandra.log I observe a growth in JVM > Heap Old Gen which gets not significantly freed up any more. > I know that JVM collects on Old Gen only when really needed. I can see > collections, but there is always a remainder which seems to grow forever > without ever getting freed. > After the Node successfully Joined the Cluster, I can remove the extra RAM I > have given it for bootstrapping without any further effect. > It feels like Cassandra will not forget about every single byte streamed over > the Network over time during bootstrapping, - which would be a memory leak > and a major problem, too. > I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB > assigned JVM Heap). 
YourKit Profiler shows huge amount of Memory allocated > for org.apache.cassandra.db.Memtable (22 GB) > org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer > (11 GB) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14250) ERROR 1815 (HY000) at line 28: Internal error: TimedOutException: Default TException.
Silvio Amorim created CASSANDRA-14250: - Summary: ERROR 1815 (HY000) at line 28: Internal error: TimedOutException: Default TException. Key: CASSANDRA-14250 URL: https://issues.apache.org/jira/browse/CASSANDRA-14250 Project: Cassandra Issue Type: Bug Components: CQL Environment: VMware Linux server CentOS 6 [cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4] Server version: 10.0.33-MariaDB MariaDB Server Reporter: Silvio Amorim Fix For: 3.11.3

Hello, good day. I need your support to solve this problem, and I thank you for any help you can give. I'm facing a very strange problem with Cassandra. I installed Cassandra version 3.11.1 on a Linux CentOS virtual machine, where I had already converted a MySQL database to MariaDB (Server version: 10.0.33-MariaDB MariaDB Server). In MariaDB I created the following table:

CREATE TABLE `tbl_paciente_crm_anexo` (
  `id` int(11) NOT NULL PRIMARY KEY,
  `id_crm` int(11) DEFAULT NULL,
  `patient_id` int(11) DEFAULT NULL,
  `int_organization` int(11) NOT NULL DEFAULT '0',
  `id_pasta_pai` int(11) DEFAULT '0',
  `file` longblob,
  `file_file` longblob,
  `subtitle` varchar(255) DEFAULT NULL,
  `key words` text,
  `filename` varchar(100) DEFAULT NULL,
  `filename` varchar(20) DEFAULT NULL,
  `char_type` char(1) DEFAULT NULL,
  `data_foto` blob,
  `date_inclusion` blob,
  `data_alteracao` blob,
  `data_ult_access` blob,
  `log_user` varchar(10) DEFAULT NULL,
  `sign` char(1) DEFAULT NULL,
  `login_id` int(11) DEFAULT NULL,
  `xml` blob,
  `hash_xml` varchar(64) DEFAULT NULL,
  `hash_verif` varchar(64) DEFAULT NULL,
  `status_assinatura` varchar(2) DEFAULT NULL
) ENGINE=CASSANDRA thrift_host=`localhost` keyspace=`md_paciente` column_family=`cf_crm_anexo`;

I'm trying to load 147 GB of data into this table, and there are 2 blob fields in this table.

* +During the loading, and after inserting several records, I receive the following error:+

[root@srvmeddbh01 dbbkp]# mysql -p -u root -pxx medicina_intramed
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371803#comment-16371803 ] Chris Lohfink commented on CASSANDRA-14239: --- Do you have a secondary index? Sometimes on bootstrapping, applying mutations during the 2i rebuild will overrun the write throughput the node is capable of. > OutOfMemoryError when bootstrapping with less than 100GB RAM > > > Key: CASSANDRA-14239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14239 > Project: Cassandra > Issue Type: Bug > Environment: Details of the bootstrapping Node > * ProLiant BL460c G7 > * 56GB RAM > * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and > saved_caches) > * CentOS 7.4 on SD-Card > * /tmp and /var/log on tmpfs > * Oracle JDK 1.8.0_151 > * Cassandra 3.11.1 > Cluster > * 10 existing Nodes (Up and Normal) >Reporter: Jürgen Albersdorfer >Priority: Major > Attachments: Objects-by-class.csv, > Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, > jvm.options, jvm_opts.txt, stack-traces.txt > > > Hi, I face an issue when bootstrapping a Node having less than 100GB RAM on > our 10 Node C* 3.11.1 Cluster. > During bootstrap, when I watch the cassandra.log I observe a growth in JVM > Heap Old Gen which gets not significantly freed up any more. > I know that JVM collects on Old Gen only when really needed. I can see > collections, but there is always a remainder which seems to grow forever > without ever getting freed. > After the Node successfully Joined the Cluster, I can remove the extra RAM I > have given it for bootstrapping without any further effect. > It feels like Cassandra will not forget about every single byte streamed over > the Network over time during bootstrapping, - which would be a memory leak > and a major problem, too. > I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB Node (40 GB > assigned JVM Heap). 
YourKit Profiler shows huge amount of Memory allocated > for org.apache.cassandra.db.Memtable (22 GB) > org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer > (11 GB) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14249) Dtests aren't working on python3.5
Philip Thompson created CASSANDRA-14249: --- Summary: Dtests aren't working on python3.5 Key: CASSANDRA-14249 URL: https://issues.apache.org/jira/browse/CASSANDRA-14249 Project: Cassandra Issue Type: Bug Components: Testing Reporter: Philip Thompson Assignee: Ariel Weisberg I'm running the default python3.5 on a fresh xenial box (so 3.5.1, I think). I install a python3.5 virtualenv with `python3 -m venv venv`. From in that virtualenv, I install the dtest requirements.txt, then I run {code} pytest --cassandra-dir=\$CASSANDRA_DIR --use-vnodes --skip-resource-intensive-tests --verbose {code} This gives me the following error: {code:java} 00:06:47.056 File "/home/user/venv/lib/python3.5/site-packages/_pytest/config.py", line 329, in _getconftestmodules 00:06:47.056 return self._path2confmods[path] 00:06:47.056 KeyError: local('/home/user/cassandra-dtest') 00:06:47.056 00:06:47.056 During handling of the above exception, another exception occurred: 00:06:47.056 Traceback (most recent call last): 00:06:47.056 File "/home/user/venv/lib/python3.5/site-packages/_pytest/config.py", line 360, in _importconftest 00:06:47.056 return self._conftestpath2mod[conftestpath] 00:06:47.056 KeyError: local('/home/user/cassandra-dtest/conftest.py') {code} This same process works flawlessly on python 3.6. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13993) Add optional startup delay to wait until peers are ready
[ https://issues.apache.org/jira/browse/CASSANDRA-13993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371686#comment-16371686 ] Ariel Weisberg commented on CASSANDRA-13993: I am generally +1, other than that I would like to see it spin more aggressively on checking whether the responses came back. I'm not sure about Joseph's point. I mean, this is going to improve the situation just by virtue of priming all the connections, even if it doesn't wait for all of them to complete setup. For nodes that are going to be available, they might now be available within the timeout budget of subsequent reads and writes. For nodes that aren't available in time, they might not have become available anyways. > Add optional startup delay to wait until peers are ready > > > Key: CASSANDRA-13993 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13993 > Project: Cassandra > Issue Type: Improvement > Components: Lifecycle >Reporter: Jason Brown >Assignee: Jason Brown >Priority: Minor > Fix For: 4.x > > > When bouncing a node in a large cluster, it can take a while to recognize the > rest of the cluster as available. This is especially true if using TLS on > internode messaging connections. The bouncing node (and any clients connected > to it) may see a series of Unavailable or Timeout exceptions until the node > is 'warmed up', as connecting to the rest of the cluster is asynchronous from > the rest of the startup process. > There are two aspects that drive a node's ability to successfully communicate > with a peer after a bounce: > - marking the peer as 'alive' (state that is held in gossip). This affects > the unavailable exceptions > - having both open outbound and inbound connections open and ready to each > peer. This affects timeouts. > Details of each of these mechanisms are described in the comments below. 
> This ticket proposes adding a mechanism, optional and configurable, to delay > opening the client native protocol port until some percentage of the peers in > the cluster is marked alive and connected to/from. Thus while we potentially > slow down startup (delay opening the client port), we alleviate the chance > that queries made by clients don't hit transient unavailable/timeout > exceptions. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
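The proposed startup gate can be sketched as a polling loop (hypothetical Python; wait_for_peers and alive_fraction_fn are illustrative names, not the patch's API). A short poll interval reflects the request to spin more aggressively on re-checking:

```python
# Hedged sketch of the proposed startup delay: block until a configured
# fraction of peers is alive/connected, or a timeout elapses, before opening
# the client native protocol port. All names are illustrative.
import time

def wait_for_peers(alive_fraction_fn, required_fraction=0.8,
                   timeout_s=30.0, poll_interval_s=0.05):
    """Return True once enough peers are up, False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if alive_fraction_fn() >= required_fraction:
            return True
        time.sleep(poll_interval_s)  # short interval = aggressive re-check
    return False

# Example: peers become available on the third poll.
state = {"checks": 0}
def fraction():
    state["checks"] += 1
    return 1.0 if state["checks"] >= 3 else 0.5

print(wait_for_peers(fraction))  # True
```

The trade-off matches the ticket text: startup is slowed slightly (the client port opens later), but clients are less likely to hit transient unavailable/timeout exceptions.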
[jira] [Created] (CASSANDRA-14248) SSTableIndex should not use Ref#globalCount() to determine when to delete index file
Jordan West created CASSANDRA-14248: --- Summary: SSTableIndex should not use Ref#globalCount() to determine when to delete index file Key: CASSANDRA-14248 URL: https://issues.apache.org/jira/browse/CASSANDRA-14248 Project: Cassandra Issue Type: Bug Components: sasi Reporter: Jordan West Assignee: Jordan West Fix For: 3.11.x {{SSTableIndex}} instances maintain a {{Ref}} to the underlying {{SSTableReader}} instance. When determining whether or not to delete the file after the last {{SSTableIndex}} reference is released, the implementation uses {{sstableRef.globalCount()}}: [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/sasi/SSTableIndex.java#L135.] This is incorrect because {{sstableRef.globalCount()}} returns the number of references to the specific instance of {{SSTableReader}}. However, in cases like index summary redistribution, there can be more than one instance of {{SSTableReader}}. Further, since the reader is shared across multiple indexes, not all indexes see the count go to 0. This can lead to cases where the {{SSTableIndex}} file is incorrectly deleted or not deleted when it should be. A more correct implementation would be to either: * Tie into the existing {{SSTableTidier}}. SASI indexes already are SSTable components but are not cleaned up by the {{SSTableTidier}} because they are not found with the currently cleanup implementation * Revamp {{SSTableIndex}} reference counting to use {{Ref}} and implement a new tidier. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
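The distinction between a per-instance count and a per-file count is easy to demonstrate (illustrative Python; FileRefs and ReaderInstance are hypothetical stand-ins, not Cassandra's Ref or SSTableReader):

```python
# Sketch of why a per-instance reference count is the wrong deletion signal:
# two reader instances can point at the same on-disk file (e.g. after index
# summary redistribution), so one instance's count reaching zero does not
# mean the file is unused. Names are illustrative, not Cassandra's classes.

class FileRefs:
    """Process-wide count of live references to one on-disk file."""
    def __init__(self):
        self.count = 0

class ReaderInstance:
    def __init__(self, file_refs):
        self.file_refs = file_refs
        self.local_count = 0
    def ref(self):
        self.local_count += 1
        self.file_refs.count += 1
    def release(self):
        self.local_count -= 1
        self.file_refs.count -= 1
    def safe_to_delete_buggy(self):
        return self.local_count == 0      # counts only this instance
    def safe_to_delete_fixed(self):
        return self.file_refs.count == 0  # counts across all instances

shared = FileRefs()
a, b = ReaderInstance(shared), ReaderInstance(shared)
a.ref(); b.ref()
a.release()
print(a.safe_to_delete_buggy())  # True  -- would delete a file b still uses
print(a.safe_to_delete_fixed())  # False
```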
[jira] [Commented] (CASSANDRA-14055) Index redistribution breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371687#comment-16371687 ] Jordan West commented on CASSANDRA-14055: - [~lboutros] Great! Thanks for taking a look. I've created https://issues.apache.org/jira/browse/CASSANDRA-14248. > Index redistribution breaks SASI index > -- > > Key: CASSANDRA-14055 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14055 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Ludovic Boutros >Assignee: Ludovic Boutros >Priority: Major > Labels: patch > Fix For: 3.11.x, 4.x > > Attachments: 14055-jrwest-3.11.patch, 14055-jrwest-trunk.patch, > CASSANDRA-14055-jrwest.patch, CASSANDRA-14055.patch, CASSANDRA-14055.patch, > CASSANDRA-14055.patch > > > During index redistribution process, a new view is created. > During this creation, old indexes should be released. > But, new indexes are "attached" to the same SSTable as the old indexes. > This leads to the deletion of the last SASI index file and breaks the index. > The issue is in this function : > [https://github.com/apache/cassandra/blob/9ee44db49b13d4b4c91c9d6332ce06a6e2abf944/src/java/org/apache/cassandra/index/sasi/conf/view/View.java#L62] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14167) IndexOutOfBoundsException when selecting column counter and consistency quorum
[ https://issues.apache.org/jira/browse/CASSANDRA-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371519#comment-16371519 ] Dhawal Mody commented on CASSANDRA-14167: - [~slebresne] [~iamaleksey] - Hi Aleksey and Sylvain - I saw https://issues.apache.org/jira/browse/CASSANDRA-11726 which is similar to the issue I'm facing - can you please let me know on how can I fix this? Should I downgrade to 3.11.0 from 3.11.1 - would that help in fixing this issue? > IndexOutOfBoundsException when selecting column counter and consistency quorum > -- > > Key: CASSANDRA-14167 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14167 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.11.1 > Ubuntu 14-04 >Reporter: Tristan Last >Priority: Major > > This morning I upgraded my cluster from 3.11.0 to 3.11.1 and it appears when > I perform a query on a counter specifying the column name cassandra throws > the following exception: > {code:java} > WARN [ReadStage-1] 2018-01-15 10:58:30,121 > AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread > Thread[ReadStage-1,5,main]: {} > java.lang.IndexOutOfBoundsException: null > java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_144] > java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) ~[na:1.8.0_144] > org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:173) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.context.CounterContext.updateDigest(CounterContext.java:696) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.rows.AbstractCell.digest(AbstractCell.java:126) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.rows.AbstractRow.digest(AbstractRow.java:73) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:181) > ~[apache-cassandra-3.11.1.jar:3.11.1] > 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators.digest(UnfilteredPartitionIterators.java:263) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.ReadResponse.makeDigest(ReadResponse.java:120) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.ReadResponse.createDigestResponse(ReadResponse.java:87) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:345) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:50) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) > ~[apache-cassandra-3.11.1.jar:3.11.1] > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_144] > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > ~[apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) > [apache-cassandra-3.11.1.jar:3.11.1] > org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.1.jar:3.11.1] > java.lang.Thread.run(Thread.java:748) [na:1.8.0_144] > {code} > The query works completely fine at consistency level ONE but not at QUORUM. > Is this possibly related to CASSANDRA-11726? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
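The top frames of the trace, Buffer.checkIndex followed by HeapByteBuffer.getShort, indicate that CounterContext.headerLength tried to read a 2-byte value from a counter context buffer too small to contain one. A minimal sketch of that failure mode, using only the JDK (the empty buffer here is illustrative, not an actual Cassandra counter context):

```java
import java.nio.ByteBuffer;

public class CounterHeaderDemo {
    public static void main(String[] args) {
        // An empty cell value: too short to hold the 2-byte header count
        // that CounterContext.headerLength() reads at position 0.
        ByteBuffer ctx = ByteBuffer.allocate(0);
        try {
            ctx.getShort(0); // mirrors HeapByteBuffer.getShort in the trace
            System.out.println("read ok");
        } catch (IndexOutOfBoundsException e) {
            // Same exception type as the reported ReadStage error.
            System.out.println("IndexOutOfBoundsException");
        }
    }
}
```

This matches the observation that the digest path (QUORUM) fails while ONE works: digests are only computed when responses from multiple replicas must be compared.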
[jira] [Commented] (CASSANDRA-14223) Provide ability to do custom certificate validations (e.g. hostname validation, certificate revocation checks)
[ https://issues.apache.org/jira/browse/CASSANDRA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371518#comment-16371518 ] Ron Blechman commented on CASSANDRA-14223: -- I'm not sure the approach of using one's own security provider will work in a FIPS environment (e.g. Bouncy Castle FIPS mode). > Provide ability to do custom certificate validations (e.g. hostname > validation, certificate revocation checks) > -- > > Key: CASSANDRA-14223 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14223 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Ron Blechman >Priority: Major > Labels: security > Fix For: 4.x > > > Cassandra server should be able to do additional certificate validations, > such as hostname validation and certificate revocation checking against > CRLs and/or using OCSP. > One approach could be to have SSLFactory use SSLContext.getDefault() instead > of forcing the creation of a new SSLContext using SSLContext.getInstance(). > Using the default SSLContext would allow a user to plug in their own custom > SSLSocketFactory via the java.security properties file. The custom > SSLSocketFactory could create a default SSLContext that was customized to do > any extra validation such as certificate revocation, host name validation, > etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371508#comment-16371508 ] Jürgen Albersdorfer commented on CASSANDRA-14239: - Of course I tried with smaller heaps, too, and also with CMS GC before. Neither worked. The amount of heap just delays the point at which the bootstrap fails. > OutOfMemoryError when bootstrapping with less than 100GB RAM > > > Key: CASSANDRA-14239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14239 > Project: Cassandra > Issue Type: Bug > Environment: Details of the bootstrapping Node > * ProLiant BL460c G7 > * 56GB RAM > * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and > saved_caches) > * CentOS 7.4 on SD-Card > * /tmp and /var/log on tmpfs > * Oracle JDK 1.8.0_151 > * Cassandra 3.11.1 > Cluster > * 10 existing Nodes (Up and Normal) >Reporter: Jürgen Albersdorfer >Priority: Major > Attachments: Objects-by-class.csv, > Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, > jvm.options, jvm_opts.txt, stack-traces.txt > > > Hi, I face an issue when bootstrapping a node with less than 100GB RAM on our > 10-node C* 3.11.1 cluster. > During bootstrap, watching cassandra.log, I observe growth in the JVM heap > old gen which no longer gets significantly freed up. > I know that the JVM collects the old gen only when really needed. I can see > collections, but there is always a remainder which seems to grow forever > without ever getting freed. > After the node successfully joined the cluster, I can remove the extra RAM I > gave it for bootstrapping without any further effect. > It feels like Cassandra never forgets a single byte streamed over the network > during bootstrapping, which would be a memory leak and a major problem, too. > I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40 GB > assigned JVM heap). 
YourKit Profiler shows a huge amount of memory allocated > for org.apache.cassandra.db.Memtable (22 GB), > org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer > (11 GB) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14239) OutOfMemoryError when bootstrapping with less than 100GB RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371508#comment-16371508 ] Jürgen Albersdorfer edited comment on CASSANDRA-14239 at 2/21/18 2:55 PM: -- Of course I tried with smaller heaps, too, and also with CMS GC before. Neither worked. Increasing the amount of heap just delays the point at which the bootstrap fails. was (Author: jalbersdorfer): Of course I tried with smaller Heaps, too, and also with CMS GC before. Didn't work either. The amount of Heap just delays the Point on which it will fail to bootstrap. > OutOfMemoryError when bootstrapping with less than 100GB RAM > > > Key: CASSANDRA-14239 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14239 > Project: Cassandra > Issue Type: Bug > Environment: Details of the bootstrapping Node > * ProLiant BL460c G7 > * 56GB RAM > * 2x 146GB 10K HDD (One dedicated for Commitlog, one for Data, Hints and > saved_caches) > * CentOS 7.4 on SD-Card > * /tmp and /var/log on tmpfs > * Oracle JDK 1.8.0_151 > * Cassandra 3.11.1 > Cluster > * 10 existing Nodes (Up and Normal) >Reporter: Jürgen Albersdorfer >Priority: Major > Attachments: Objects-by-class.csv, > Objects-with-biggest-retained-size.csv, cassandra-env.sh, cassandra.yaml, > jvm.options, jvm_opts.txt, stack-traces.txt > > > Hi, I face an issue when bootstrapping a node with less than 100GB RAM on our > 10-node C* 3.11.1 cluster. > During bootstrap, watching cassandra.log, I observe growth in the JVM heap > old gen which no longer gets significantly freed up. > I know that the JVM collects the old gen only when really needed. I can see > collections, but there is always a remainder which seems to grow forever > without ever getting freed. > After the node successfully joined the cluster, I can remove the extra RAM I > gave it for bootstrapping without any further effect. 
> It feels like Cassandra never forgets a single byte streamed over the network > during bootstrapping, which would be a memory leak and a major problem, too. > I was able to produce a HeapDumpOnOutOfMemoryError from a 56GB node (40 GB > assigned JVM heap). YourKit Profiler shows a huge amount of memory allocated > for org.apache.cassandra.db.Memtable (22 GB), > org.apache.cassandra.db.rows.BufferCell (19 GB) and java.nio.HeapByteBuffer > (11 GB) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14223) Provide ability to do custom certificate validations (e.g. hostname validation, certificate revocation checks)
[ https://issues.apache.org/jira/browse/CASSANDRA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371424#comment-16371424 ] Jeremiah Jordan commented on CASSANDRA-14223: - I can confirm that the above works for doing custom SSL validations in Cassandra. I have done exactly that in the past. > Provide ability to do custom certificate validations (e.g. hostname > validation, certificate revocation checks) > -- > > Key: CASSANDRA-14223 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14223 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Ron Blechman >Priority: Major > Labels: security > Fix For: 4.x > > > Cassandra server should be able to do additional certificate validations, > such as hostname validation and certificate revocation checking against > CRLs and/or using OCSP. > One approach could be to have SSLFactory use SSLContext.getDefault() instead > of forcing the creation of a new SSLContext using SSLContext.getInstance(). > Using the default SSLContext would allow a user to plug in their own custom > SSLSocketFactory via the java.security properties file. The custom > SSLSocketFactory could create a default SSLContext that was customized to do > any extra validation such as certificate revocation, host name validation, > etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14064) Allow using file based certificates instead of keystores
[ https://issues.apache.org/jira/browse/CASSANDRA-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-14064: --- Status: Patch Available (was: In Progress) > Allow using file based certificates instead of keystores > > > Key: CASSANDRA-14064 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14064 > Project: Cassandra > Issue Type: New Feature > Components: Streaming and Messaging >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski >Priority: Major > Labels: security > Fix For: 4.x > > > The requirement of having to use a secure archive (keystore) for your > certificates and keys is not very common outside the Java ecosystem. Most > servers will accept individual certificate and key files and will come with > instructions on how to generate them using openssl. This should also be an > option for Cassandra for users who see no reason to additionally have to > deal with keystores. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14223) Provide ability to do custom certificate validations (e.g. hostname validation, certificate revocation checks)
[ https://issues.apache.org/jira/browse/CASSANDRA-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371205#comment-16371205 ] Stefan Podkowinski commented on CASSANDRA-14223: It is already possible to use your own trust manager implementation that will validate certificates using your custom validation logic. You'll have to register your own security provider in {{java.security}}. The provider needs to specify your TrustManagerFactorySpi implementation using a specific algorithm name that you then set as {{server_encryption_options.algorithms}} in {{cassandra.yaml}}. But this should not involve any SSLSocket-related code changes and is not really Cassandra-specific at all. > Provide ability to do custom certificate validations (e.g. hostname > validation, certificate revocation checks) > -- > > Key: CASSANDRA-14223 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14223 > Project: Cassandra > Issue Type: Improvement > Components: Configuration >Reporter: Ron Blechman >Priority: Major > Labels: security > Fix For: 4.x > > > Cassandra server should be able to do additional certificate validations, > such as hostname validation and certificate revocation checking against > CRLs and/or using OCSP. > One approach could be to have SSLFactory use SSLContext.getDefault() instead > of forcing the creation of a new SSLContext using SSLContext.getInstance(). > Using the default SSLContext would allow a user to plug in their own custom > SSLSocketFactory via the java.security properties file. The custom > SSLSocketFactory could create a default SSLContext that was customized to do > any extra validation such as certificate revocation, host name validation, > etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
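A minimal sketch of the provider registration described in the comment above. The names here (CustomTrustProvider, the "CustomPKIX" algorithm, com.example.MyTrustManagerFactorySpi) are hypothetical placeholders, and the TrustManagerFactorySpi implementation itself is omitted:

```java
import java.security.Provider;
import java.security.Security;

// Hypothetical provider: the class and algorithm names below are
// placeholders, not Cassandra or JDK APIs.
public class CustomTrustProvider extends Provider {
    public CustomTrustProvider() {
        super("CustomTrust", 1.0, "TrustManagerFactory with extra certificate checks");
        // Map an algorithm name to your own TrustManagerFactorySpi
        // implementation; that algorithm name is what the encryption
        // options in cassandra.yaml would then reference.
        put("TrustManagerFactory.CustomPKIX", "com.example.MyTrustManagerFactorySpi");
    }

    public static void main(String[] args) {
        // Registering statically in the java.security file would have the
        // same effect:  security.provider.N=CustomTrustProvider
        Security.addProvider(new CustomTrustProvider());
        System.out.println(Security.getProvider("CustomTrust").getName());
    }
}
```

As the comment notes, this is plain JSSE machinery, not Cassandra-specific: the JVM resolves the algorithm name through the registered providers when building the trust manager.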
[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
[ https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14247: Description: Currently SASI offers only two tokenizer options: - NonTokenizerAnalyser - StandardAnalyzer The latter is built upon Snowball, powerful for human languages but overkill for simple tokenization. A simple tokenizer is proposed here. The need for this arose as a workaround of CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 Example use of this would be: {code} CREATE CUSTOM INDEX span_annotation_query_idx ON zipkin2.span (annotation_query) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 'delimiter': '░', 'case_sensitive': 'true', 'mode': 'prefix', 'analyzed': 'true'}; {code} Original credit for this work goes to https://github.com/zuochangan was: Currently SASI offers only two tokenizer options: - NonTokenizerAnalyser - StandardAnalyzer The latter is built upon Snowball, powerful for human languages but overkill for simple tokenization. A simple tokenizer is proposed here. The need for this arose as a workaround around CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to {{CONTAINS}}. 
See https://github.com/openzipkin/zipkin/issues/1861 Example use of this would be: {code} CREATE CUSTOM INDEX span_annotation_query_idx ON zipkin2.span (annotation_query) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 'delimiter': '░', 'case_sensitive': 'true', 'mode': 'prefix', 'analyzed': 'true'}; {code} Original credit for this work goes to https://github.com/zuochangan > SASI tokenizer for simple delimiter based entries > - > > Key: CASSANDRA-14247 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: mck >Assignee: mck >Priority: Major > Fix For: 4.0, 3.11.x > > > Currently SASI offers only two tokenizer options: > - NonTokenizerAnalyser > - StandardAnalyzer > The latter is built upon Snowball, powerful for human languages but overkill > for simple tokenization. > A simple tokenizer is proposed here. The need for this arose as a workaround > of CASSANDRA-11182, and to avoid the disk usage explosion when having to > resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 > Example use of this would be: > {code} > CREATE CUSTOM INDEX span_annotation_query_idx > ON zipkin2.span (annotation_query) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > WITH OPTIONS = { > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', > 'delimiter': '░', > 'case_sensitive': 'true', > 'mode': 'prefix', > 'analyzed': 'true'}; > {code} > Original credit for this work goes to https://github.com/zuochangan -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
[ https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371188#comment-16371188 ] mck commented on CASSANDRA-14247: - WIP… || branch || testall || dtest || | [cassandra-3.11_14247|https://github.com/thelastpickle/cassandra/tree/mck/cassandra-3.11_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Fcassandra-3.11_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XXX] | | [trunk_14247|https://github.com/thelastpickle/cassandra/tree/mck/trunk_14247] | [testall|https://circleci.com/gh/thelastpickle/cassandra/tree/mck%2Ftrunk_14247] | [dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/XXX] | > SASI tokenizer for simple delimiter based entries > - > > Key: CASSANDRA-14247 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: mck >Assignee: mck >Priority: Major > Fix For: 4.0, 3.11.x > > > Currently SASI offers only two tokenizer options: > - NonTokenizerAnalyser > - StandardAnalyzer > The latter is built upon Snowball, powerful for human languages but overkill > for simple tokenization. > A simple tokenizer is proposed here. The need for this arose as a workaround > around CASSANDRA-11182, and to avoid the disk usage explosion when having to > resort to {{CONTAINS}}. 
See https://github.com/openzipkin/zipkin/issues/1861 > Example use of this would be: > {code} > CREATE CUSTOM INDEX span_annotation_query_idx > ON zipkin2.span (annotation_query) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > WITH OPTIONS = { > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', > 'delimiter': '░', > 'case_sensitive': 'true', > 'mode': 'prefix', > 'analyzed': 'true'}; > {code} > Original credit for this work goes to https://github.com/zuochangan -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
[ https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14247: Description: Currently SASI offers only two tokenizer options: - NonTokenizerAnalyser - StandardAnalyzer The latter is built upon Snowball, powerful for human languages but overkill for simple tokenization. A simple tokenizer is proposed here. The need for this arose as a workaround around CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 Example use of this would be: {code} CREATE CUSTOM INDEX span_annotation_query_idx ON zipkin2.span (annotation_query) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 'delimiter': '░', 'case_sensitive': 'true', 'mode': 'prefix', 'analyzed': 'true'}; {code} Original credit for this work goes to https://github.com/zuochangan was: Currently SASI offers only two tokenizer options: - NonTokenizerAnalyser - StandardAnalyzer The latter is built upon Snowball, powerful for human languages but overkill for simple tokenization. A simple tokenizer is proposed here. The need for this arose as a workaround around CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to {{CONTAINS}}. 
See https://github.com/openzipkin/zipkin/issues/1861 Example use of this would be: {code} CREATE CUSTOM INDEX span_annotation_query_idx ON zipkin2.span (annotation_query) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 'delimiter': '░', 'case_sensitive': 'true', 'mode': 'prefix', 'analyzed': 'true'}; {code} > SASI tokenizer for simple delimiter based entries > - > > Key: CASSANDRA-14247 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: mck >Assignee: mck >Priority: Major > Fix For: 4.0, 3.11.x > > > Currently SASI offers only two tokenizer options: > - NonTokenizerAnalyser > - StandardAnalyzer > The latter is built upon Snowball, powerful for human languages but overkill > for simple tokenization. > A simple tokenizer is proposed here. The need for this arose as a workaround > around CASSANDRA-11182, and to avoid the disk usage explosion when having to > resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 > Example use of this would be: > {code} > CREATE CUSTOM INDEX span_annotation_query_idx > ON zipkin2.span (annotation_query) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > WITH OPTIONS = { > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', > 'delimiter': '░', > 'case_sensitive': 'true', > 'mode': 'prefix', > 'analyzed': 'true'}; > {code} > Original credit for this work goes to https://github.com/zuochangan -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
[ https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14247: Description: Currently SASI offers only two tokenizer options: - NonTokenizerAnalyser - StandardAnalyzer The latter is built upon Snowball, powerful for human languages but overkill for simple tokenization. A simple tokenizer is proposed here. The need for this arose as a workaround around CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 Example use of this would be: {code} CREATE CUSTOM INDEX span_annotation_query_idx ON zipkin2.span (annotation_query) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', 'delimiter': '░', 'case_sensitive': 'true', 'mode': 'prefix', 'analyzed': 'true'}; {code} was: Currently SASI offers only two tokenizer options: - NonTokenizerAnalyser - StandardAnalyzer The latter is built upon Snowball, powerful for human languages but overkill for simple tokenization. A simple tokenizer is proposed here. The need for this arose as a workaround around CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 > SASI tokenizer for simple delimiter based entries > - > > Key: CASSANDRA-14247 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: mck >Assignee: mck >Priority: Major > Fix For: 4.0, 3.11.x > > > Currently SASI offers only two tokenizer options: > - NonTokenizerAnalyser > - StandardAnalyzer > The latter is built upon Snowball, powerful for human languages but overkill > for simple tokenization. > A simple tokenizer is proposed here. 
The need for this arose as a workaround > around CASSANDRA-11182, and to avoid the disk usage explosion when having to > resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 > Example use of this would be: > {code} > CREATE CUSTOM INDEX span_annotation_query_idx > ON zipkin2.span (annotation_query) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > WITH OPTIONS = { > 'analyzer_class': > 'org.apache.cassandra.index.sasi.analyzer.DelimiterTokenizerAnalyzer', > 'delimiter': '░', > 'case_sensitive': 'true', > 'mode': 'prefix', > 'analyzed': 'true'}; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
mck created CASSANDRA-14247: --- Summary: SASI tokenizer for simple delimiter based entries Key: CASSANDRA-14247 URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 Project: Cassandra Issue Type: Improvement Components: sasi Reporter: mck Currently SASI offers only two tokenizer options: - NonTokenizerAnalyser - StandardAnalyzer The latter is built upon Snowball, powerful for human languages but overkill for simple tokenization. A simple tokenizer is proposed here. The need for this arose as a workaround around CASSANDRA-11182, and to avoid the disk usage explosion when having to resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
[ https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14247: Fix Version/s: 3.11.x 4.0 > SASI tokenizer for simple delimiter based entries > - > > Key: CASSANDRA-14247 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: mck >Priority: Major > Fix For: 4.0, 3.11.x > > > Currently SASI offers only two tokenizer options: > - NonTokenizerAnalyser > - StandardAnalyzer > The latter is built upon Snowball, powerful for human languages but overkill > for simple tokenization. > A simple tokenizer is proposed here. The need for this arose as a workaround > around CASSANDRA-11182, and to avoid the disk usage explosion when having to > resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-14247) SASI tokenizer for simple delimiter based entries
[ https://issues.apache.org/jira/browse/CASSANDRA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck reassigned CASSANDRA-14247: --- Assignee: mck > SASI tokenizer for simple delimiter based entries > - > > Key: CASSANDRA-14247 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14247 > Project: Cassandra > Issue Type: Improvement > Components: sasi >Reporter: mck >Assignee: mck >Priority: Major > Fix For: 4.0, 3.11.x > > > Currently SASI offers only two tokenizer options: > - NonTokenizerAnalyser > - StandardAnalyzer > The latter is built upon Snowball, powerful for human languages but overkill > for simple tokenization. > A simple tokenizer is proposed here. The need for this arose as a workaround > around CASSANDRA-11182, and to avoid the disk usage explosion when having to > resort to {{CONTAINS}}. See https://github.com/openzipkin/zipkin/issues/1861 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14246) Cassandra fails to start after upgrade
Rajnesh Siwal created CASSANDRA-14246: - Summary: Cassandra fails to start after upgrade Key: CASSANDRA-14246 URL: https://issues.apache.org/jira/browse/CASSANDRA-14246 Project: Cassandra Issue Type: Bug Components: Core Reporter: Rajnesh Siwal We are unable to start Cassandra after upgrading it. We have a cluster of three nodes. After upgrading the first node, we found that the service failed to start on it because of incompatible SSTables. The older version: 2.0.17 New version: 2.1.20 Cassandra fails to start because it found an incompatible SSTable: INFO 08:38:18 Opening /var/lib/cassandra/data/cp_orgstmt_keyspace/cp_event_processing/cp_orgstmt_keyspace-cp_event_processing-jb-2 (975 bytes) java.lang.RuntimeException: Incompatible SSTable found. Current version ka is unable to read file: /var/lib/cassandra/data/cp_emb_us_keyspace/cp_event_processing/cp_emb_us_keyspace-cp_event_processing-ic-3. Please run upgradesstables. java.lang.RuntimeException: Incompatible SSTable found. Current version ka is unable to read file: /var/lib/cassandra/data/cp_emb_us_keyspace/cp_event_processing/cp_emb_us_keyspace-cp_event_processing-ic-3. Please run upgradesstables. Exception encountered during startup: Incompatible SSTable found. Current version ka is unable to read file: /var/lib/cassandra/data/cp_emb_us_keyspace/cp_event_processing/cp_emb_us_keyspace-cp_event_processing-ic-3. Please run upgradesstables. While the Cassandra server is not running, we cannot run "nodetool upgradesstables". Please advise. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14245) SELECT JSON prints null on empty strings
Norbert Schultz created CASSANDRA-14245: --- Summary: SELECT JSON prints null on empty strings Key: CASSANDRA-14245 URL: https://issues.apache.org/jira/browse/CASSANDRA-14245 Project: Cassandra Issue Type: Bug Components: CQL Environment: Cassandra 3.11.2, Ubuntu 16.04 LTS Reporter: Norbert Schultz SELECT JSON reports an empty string as null. Example:
{code:java}
cqlsh:unittest> create table test(id INT, name TEXT, PRIMARY KEY(id));
cqlsh:unittest> insert into test (id, name) VALUES (1, 'Foo');
cqlsh:unittest> insert into test (id, name) VALUES (2, '');
cqlsh:unittest> insert into test (id, name) VALUES (3, null);
cqlsh:unittest> select * from test;

 id | name
----+------
  1 |  Foo
  2 |
  3 | null

(3 rows)

cqlsh:unittest> select JSON * from test;

 [json]
--------------------------
 {"id": 1, "name": "Foo"}
 {"id": 2, "name": null}
 {"id": 3, "name": null}

(3 rows)
{code}
This even happens if the string is part of the primary key, which makes the generated JSON not re-insertable.
{code:java}
cqlsh:unittest> create table test2 (id INT, name TEXT, age INT, PRIMARY KEY(id, name));
cqlsh:unittest> insert into test2 (id, name, age) VALUES (1, '', 42);
cqlsh:unittest> select JSON * from test2;

 [json]
------------------------------------
 {"id": 1, "name": null, "age": 42}

(1 rows)

cqlsh:unittest> insert into test2 JSON '{"id": 1, "name": null, "age": 42}';
InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid null value in condition for column name"
{code}
An older version of Cassandra (3.0.8) does not have this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed
[ https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371081#comment-16371081 ] Chris Lohfink edited comment on CASSANDRA-11163 at 2/21/18 8:26 AM:
* In {{load(ValidationMetadata validation, boolean isOffline)}}, everywhere you're calling {{load(bool, true)}} you can instead call {{load(bool, !isOffline)}}, since you never want to save the summary in those other situations either. This will break your test, but IMHO that test is checking that the wrong case occurs: if the summary file is not there, the load should not create it. Tools may be running as a different user; if someone runs one against a data directory and this occurs, it will create a file that C* would be unable to delete, causing compaction threads to die and compactions to back up, etc. I think that in offline mode the tools should _never_ delete, touch, or create unnecessary files, especially the summary/BF files, since they are mostly there to speed up startup and are not necessary for the reader to work anyway. You can also make {{recreateBloomFilter}} always false in offline mode (wherever it is true, put {{!isOffline}} instead), since it will then just use what is there. The one exception is where the FILTER component is missing; there you can just use the AlwaysPresent BF and skip, so that code using it doesn't NPE.
* In the unit tests, is the 1000 ms sleep necessary? {{lastModified}} is in ms, so I thought it may be OK to set it lower.
* Just checking the patch out and running it over and over, the unit test fails occasionally (rarely): the line 407 check {{assertNotEquals(bloomModified, bloomFile.lastModified());}} fails because the two values are the same.
* Nitpick: I think you can reuse the last option (track hotness), since it is currently only false in situations where we don't want or need to recreate anything, if it is renamed to something like "allowChanges". That way we are not adding additional booleans to the end of that load function.
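The flag-threading suggested in the first bullet can be sketched roughly as follows; all names here ({{load_summary}}, {{open_sstable}}, {{persist}}) are hypothetical stand-ins for the SSTableReader internals discussed above, not the actual Cassandra API:

```python
import os

def load_summary(summary_path, persist):
    """Hypothetical stand-in for summary loading: only write a rebuilt
    summary to disk when the caller explicitly allows persistence."""
    if os.path.exists(summary_path):
        return "loaded-from-disk"
    if persist:
        with open(summary_path, "wb") as f:
            f.write(b"summary")  # stand-in for summary serialization
        return "rebuilt-and-saved"
    return "rebuilt-in-memory"  # offline: never touch the data directory

def open_sstable(summary_path, is_offline):
    # The review suggestion: pass (not is_offline) rather than a literal
    # True, so offline tools never create files that the server's user
    # might later be unable to delete.
    return load_summary(summary_path, persist=not is_offline)
```

The design point is that the decision to write lives at one choke point, keyed off the offline flag, instead of being repeated as a hard-coded boolean at every call site.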
> Summaries are needlessly rebuilt when the BF FP ratio is changed
> ----------------------------------------------------------------
>
> Key: CASSANDRA-11163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11163
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Brandon Williams
> Assignee: Kurt Greaves
> Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
> This is from trunk, but I also saw this happen on 2.0:
> Before:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 221460
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-6-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb 11 23:50 ma-6-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 11 23:50 ma-6-big-Digest.crc32
> -rw-r--r-- 1 root root  35212125 Feb 11 23:50 ma-6-big-Data.db
>
[jira] [Commented] (CASSANDRA-11163) Summaries are needlessly rebuilt when the BF FP ratio is changed
[ https://issues.apache.org/jira/browse/CASSANDRA-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371081#comment-16371081 ] Chris Lohfink commented on CASSANDRA-11163:
---
> Summaries are needlessly rebuilt when the BF FP ratio is changed
> ----------------------------------------------------------------
>
> Key: CASSANDRA-11163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11163
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Brandon Williams
> Assignee: Kurt Greaves
> Priority: Major
> Fix For: 3.0.x, 3.11.x, 4.x
>
> This is from trunk, but I also saw this happen on 2.0:
> Before:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 221460
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-6-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb 11 23:50 ma-6-big-Filter.db
> -rw-r--r-- 1 root root        10 Feb 11 23:50 ma-6-big-Digest.crc32
> -rw-r--r-- 1 root root  35212125 Feb 11 23:50 ma-6-big-Data.db
> -rw-r--r-- 1 root root      2156 Feb 11 23:50 ma-6-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-7-big-TOC.txt
> -rw-r--r-- 1 root root     26518 Feb 11 23:50 ma-7-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-7-big-Statistics.db
> -rw-r--r-- 1 root root   2607614 Feb 11 23:50 ma-7-big-Index.db
> -rw-r--r-- 1 root root    192432 Feb 11 23:50 ma-7-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-7-big-Digest.crc32
> -rw-r--r-- 1 root root  35190400 Feb 11 23:50 ma-7-big-Data.db
> -rw-r--r-- 1 root root      2152 Feb 11 23:50 ma-7-big-CRC.db
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-5-big-TOC.txt
> -rw-r--r-- 1 root root    104178 Feb 11 23:50 ma-5-big-Summary.db
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-5-big-Statistics.db
> -rw-r--r-- 1 root root  10289077 Feb 11 23:50 ma-5-big-Index.db
> -rw-r--r-- 1 root root    757384 Feb 11 23:50 ma-5-big-Filter.db
> -rw-r--r-- 1 root root         9 Feb 11 23:50 ma-5-big-Digest.crc32
> -rw-r--r-- 1 root root 139201355 Feb 11 23:50 ma-5-big-Data.db
> -rw-r--r-- 1 root root      8508 Feb 11 23:50 ma-5-big-CRC.db
> root@bw-1:/srv/cassandra# md5sum /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/ma-5-big-Summary.db
> 5fca154fc790f7cfa37e8ad6d1c7552c
> {noformat}
> BF ratio changed, node restarted:
> {noformat}
> root@bw-1:/srv/cassandra# ls -ltr /var/lib/cassandra/data/keyspace1/standard1-071efdc0d11811e590c3413ee28a6c90/
> total 242168
> drwxr-xr-x 2 root root      4096 Feb 11 23:34 backups
> -rw-r--r-- 1 root root        80 Feb 11 23:50 ma-6-big-TOC.txt
> -rw-r--r-- 1 root root     10264 Feb 11 23:50 ma-6-big-Statistics.db
> -rw-r--r-- 1 root root   2607705 Feb 11 23:50 ma-6-big-Index.db
> -rw-r--r-- 1 root root    192440 Feb
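As background on the listings above: changing the bloom filter FP ratio legitimately changes the Filter.db sizes, since in the classic sizing formula the bits per key depend only on the target false-positive probability, whereas the index summary samples the partition index and does not depend on the FP ratio at all, which is why rebuilding Summary.db after an FP-ratio change is needless. A textbook-formula sketch (not Cassandra's exact implementation, which rounds bucket sizes):

```python
import math

def bloom_bits_per_key(p):
    """Classic Bloom filter sizing: m/n = -ln(p) / (ln 2)^2,
    where p is the target false-positive probability."""
    return -math.log(p) / (math.log(2) ** 2)

# Tightening p grows the filter (roughly 9.6 -> 14.4 bits per key here),
# but the index summary's size is independent of p.
size_1pct = bloom_bits_per_key(0.01)    # ~9.6 bits per key
size_01pct = bloom_bits_per_key(0.001)  # ~14.4 bits per key
```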