[jira] [Created] (CASSANDRA-14813) Crash frequently due to fatal error caused by "StubRoutines::updateBytesCRC32"

2018-10-10 Thread Jinchao Zhang (JIRA)
Jinchao Zhang created CASSANDRA-14813:
-

 Summary: Crash frequently due to fatal error caused by 
"StubRoutines::updateBytesCRC32"
 Key: CASSANDRA-14813
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14813
 Project: Cassandra
  Issue Type: Bug
 Environment: *OS:* 
{code:java}
CentOS release 6.9 (Final){code}
*JAVA:*
{noformat}
java version "1.8.0_101"
 Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
 Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode){noformat}
 

*Memory:*
{noformat}
256GB{noformat}
*CPU:*
{noformat}
4*  Intel® Xeon® Processor E5-2620 v4{noformat}
*DISK:*
{noformat}
Filesystem Size Used Avail Use% Mounted on
 /dev/sda3 423G 28G 374G 7% /
 tmpfs 126G 68K 126G 1% /dev/shm
 /dev/sda1 240M 40M 188M 18% /boot
 /dev/sdb 3.7T 33M 3.7T 1% /mpp-data/c/cache
 /dev/sdc 3.7T 2.7T 984G 74% /mpp-data/c/data00
 /dev/sdd 3.7T 2.5T 1.2T 68% /mpp-data/c/data01
 /dev/sde 3.7T 2.7T 1.1T 72% /mpp-data/c/data02
 /dev/sdf 3.7T 2.5T 1.2T 68% /mpp-data/c/data03
 /dev/sdg 3.7T 2.4T 1.3T 66% /mpp-data/c/data04
 /dev/sdh 3.7T 2.6T 1.2T 69% /mpp-data/c/data05
 /dev/sdi 3.7T 2.6T 1.2T 70% /mpp-data/c/data06{noformat}
Reporter: Jinchao Zhang
 Attachments: hs1.png, hs2.png, hs_err_pid26350.log

Recently, we encountered the same problem described in 
[CASSANDRA-14283|https://issues.apache.org/jira/browse/CASSANDRA-14283] in our 
production system, which ran Cassandra 3.11.2. We noticed that this issue 
had been resolved in Cassandra 3.11.3 
([CASSANDRA-14284|https://issues.apache.org/jira/browse/CASSANDRA-14284]), so 
we upgraded our system to 3.11.3. However, the crashes became more frequent, as 
shown in the attached screenshots; the reduced hs_err file is attached as 
well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-14765) Evaluate Recovery Time on Single Token Cluster Test

2018-10-10 Thread Vinay Chella (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella reassigned CASSANDRA-14765:


Assignee: Sumanth Pasupuleti

> Evaluate Recovery Time on Single Token Cluster Test
> ---
>
> Key: CASSANDRA-14765
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14765
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Joseph Lynch
>Assignee: Sumanth Pasupuleti
>Priority: Major
>
> *Setup:*
>  * Cassandra: 6 (2*3 rack) node i3.8xlarge AWS instance (32 cpu cores, 240GB 
> ram) running cassandra trunk with Jason's 14503 changes vs the same footprint 
> running 3.0.17
>  * One datacenter, single tokens
>  * No compression, encryption, or coalescing turned on
> *Test #1:*
> ndbench loaded ~150GB of data per node into a LCS table. Then we killed a 
> node and let a new node stream. With a single token this should be a worst 
> case recovery scenario (only a few peers to stream from).
> *Result:*
> As the table used LCS and we did not have encryption on, the zero copy 
> transfer was used via CASSANDRA-14556. We recovered *150GB in 5 minutes,* 
> going at a consistent rate of about 3 gigabit per second. Theoretically we 
> should be able to get 10 gigabit, but this is still something like an 
> estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for 
> a hard comparison.
> *Follow Ups:*
> We need to get more rigorous measurements (over more terminations), as well 
> as finish the 3.0.x test. [~sumanth.pasupuleti] and [~djoshi3] are driving 
> this.






[jira] [Commented] (CASSANDRA-14808) Support ORDER BY with 2ndary Indexes

2018-10-10 Thread Yan Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645498#comment-16645498
 ] 

Yan Yao commented on CASSANDRA-14808:
-

[~blerer], you are right about the result order after removing ORDER BY. However, 
what if users need queries in both ASC and DESC order? 

> Support ORDER BY with 2ndary Indexes
> 
>
> Key: CASSANDRA-14808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14808
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Apache Cassandra 3.11.3
>Reporter: Yan Yao
>Priority: Major
> Fix For: 3.11.4, 4.0
>
>
>  Suppose we have a generic table:
> {code:java}
> CREATE TABLE base_table(
> partition1 uuid,
> ...
> partitionN uuid,
> static_column text static,
> clustering1 uuid,
> ...
> clusteringM uuid,
> regular text,
> list_text list<text>,
> set_text set<text>,
> map_int_text map<int, text>,
> PRIMARY KEY((partition1, ..., partitionN), clustering1, ..., clusteringM)
> );
> {code}
>  
> If we create an index on the _regular text_ column, the schema of the index 
> table will be:
> {code:java}
> CREATE TABLE regular_idx(
> regular text,
> partitionColumns blob,
> clustering1 uuid,
> ...
> clusteringM uuid,
> PRIMARY KEY((regular), partitionColumns, clustering1, ..., clusteringM)
> );
> {code}
>  
> Then it's possible to execute queries like:
> {code:java}
> SELECT * FROM base_table WHERE regular = ? AND partition1 = ? 
>  AND ... AND partitionN = ? ORDER BY 
> clustering1;{code}
>  
> However, CQL3 checks whether a _secondary index_ is used with _ORDER BY_ while 
> preparing a SELECT statement, and immediately throws an exception for queries 
> like the one above. 
> Could we support ORDER BY with 2ndary indexes, at least for the above query 
> pattern?   
>  






[jira] [Created] (CASSANDRA-14812) Multiget Thrift query skip records in case of DigestMismatch

2018-10-10 Thread Sivukhin Nikita (JIRA)
Sivukhin Nikita created CASSANDRA-14812:
---

 Summary: Multiget Thrift query skip records in case of 
DigestMismatch
 Key: CASSANDRA-14812
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14812
 Project: Cassandra
  Issue Type: Bug
Reporter: Sivukhin Nikita
 Attachments: repro_script.py, requirements.txt, small_repro_script.py

Since Cassandra 3.0.0 there has been a subtle bug in the {{multiget}} Thrift 
query. It appears when you read many partitions and the read causes a 
{{DigestMismatch}} for some partition. When this happens, Cassandra cuts the 
response stream off right at the point where the first {{DigestMismatch}} 
occurred.

This bug reproduces in all versions of Cassandra since 3.0.0. The pre-release 
version 3.0.0-rc2 works fine (there is also no such problem in Cassandra 2.x 
versions). It looks like the big refactoring of the partition iterator 
architecture for CASSANDRA-9975 
([link to 
commit|https://github.com/apache/cassandra/commit/609497471441273367013c09a1e0e1c990726ec7])
 causes the wrong behavior.

When the concatenated iterator is returned from 
[StorageProxy.fetchRows(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/service/StorageProxy.java#L1770],
 Cassandra starts to consume this combined iterator. Because of the 
{{DigestMismatch}}, some elements of this combined iterator contain an additional 
{{ThriftCounter}} that was added during 
[DataResolver.resolve(...)|https://github.com/apache/cassandra/blob/ee9e06b5a75c0be954694b191ea4170456015b98/src/java/org/apache/cassandra/service/reads/DataResolver.java#L120]
 execution. While consuming the iterator for many partitions, Cassandra calls the 
[BaseIterator.tryGetMoreContents(...)|https://github.com/apache/cassandra/blob/a05785d82c621c9cd04d8a064c38fd2012ef981c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L115]
 method, which must switch from one partition iterator to the next once the 
former is exhausted. In this case, all transformations of the next iterator are 
applied to the whole BaseIterator, which enumerates the sequence of many 
partitions. As a result, the iterator stops enumerating after it fully consumes 
the partition with the {{DigestMismatch}} error, because that partition has an 
additional {{ThriftCounter}} data limit that was applied to the whole composite 
iterator.
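
The failure mode described above can be illustrated outside Cassandra with a minimal Python sketch. This is not Cassandra code: the {{partitions}} data, the helper names, and the limit of 2 are hypothetical stand-ins. It only shows how a row limit meant for one partition ends enumeration early when it is attached to the concatenated iterator instead.

```python
from itertools import chain, islice

# Hypothetical data: three partitions; "p2" stands in for the partition
# that triggered a digest mismatch and therefore received a row limit.
partitions = {
    "p1": ["a", "b"],
    "p2": ["c", "d"],
    "p3": ["e", "f"],
}


def rows(key):
    """Yield (partition_key, row) pairs for one partition."""
    return ((key, row) for row in partitions[key])


def intended(keys, limited_key, limit):
    """The limit wraps only the iterator of the affected partition."""
    return list(chain.from_iterable(
        islice(rows(k), limit) if k == limited_key else rows(k)
        for k in keys))


def buggy(keys, limited_key, limit):
    """The limit is attached to the rest of the combined stream, so every
    partition after the affected one is cut off once the limit is reached."""
    out = []
    for i, k in enumerate(keys):
        if k == limited_key:
            rest = chain.from_iterable(rows(r) for r in keys[i:])
            out.extend(islice(rest, limit))
            return out
        out.extend(rows(k))
    return out


keys = ["p1", "p2", "p3"]
print(intended(keys, "p2", 2))  # rows from all three partitions
print(buggy(keys, "p2", 2))     # rows from p1 and p2 only; p3 is skipped
```

In this toy model the skipped partition is silent: no error is raised, the result is simply shorter, which matches the symptom reported for the multiget response.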

The attached Python 2 script [^small_repro_script.py] reproduces this bug on a 
3-node cluster controlled by ccmlib. There is also an extended version of this 
script, [^repro_script.py], which contains more logging information and makes 
it possible to test the behavior of many Cassandra versions (to run all test 
cases from repro_script.py you can call 
{{python -m unittest2 -v repro_script.ThriftMultigetTestCase}}). All the 
necessary dependencies are listed in [^requirements.txt].

 
This bug is critical in our production environment because we cannot tolerate 
any skipped data.

Any ideas about a patch for this issue?






[jira] [Commented] (CASSANDRA-10304) support running on JRE compact3 profile

2018-10-10 Thread vincent royer (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645226#comment-16645226
 ] 

vincent royer commented on CASSANDRA-10304:
---

For logback, this is fixed in [https://jira.qos.ch/browse/LOGBACK-1071] !

> support running on JRE compact3 profile
> ---
>
> Key: CASSANDRA-10304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10304
> Project: Cassandra
>  Issue Type: Wish
>Reporter: Adrian Cole
>Priority: Minor
>
> It would be nice if cassandra could run unaltered on compact3 profile, as it 
> leads to far smaller images (like docker images)
> http://openjdk.java.net/jeps/161
> Right now, the snakeyaml and logback dependencies use java.beans package 
> which is defined outside of that.
> I've verified cassandra works, if you use a different slf4j binding, and make 
> a configuration manually. Since our default cassandra.yaml doesn't use any 
> advanced yaml features, we could consider just making it json or adjusting 
> defaults so that a yaml file isn't needed.
> https://github.com/openzipkin/docker-zipkin/blob/master/cassandra/ZipkinConfigurationLoader.java
> http://jira.qos.ch/browse/LOGBACK-1071 
> https://bitbucket.org/asomov/snakeyaml/issues/315/make-javabeans-optional-or-not-used






[jira] [Created] (CASSANDRA-14811) RPC_READY flag handled inconsistently

2018-10-10 Thread Tibor Repasi (JIRA)
Tibor Repasi created CASSANDRA-14811:


 Summary: RPC_READY flag handled inconsistently
 Key: CASSANDRA-14811
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14811
 Project: Cassandra
  Issue Type: Bug
  Components: Distributed Metadata
Reporter: Tibor Repasi


With version 3.0.17 we see inconsistent handling of the RPC_READY 
flag. We identified 3 scenarios with unexpected behaviour.
 # Using {{nodetool disablebinary}} on a node in normal operation
 ** Observed behaviour:
 *** (/) the CQL listener is closed and Netty shut down
 *** (x) gossipinfo still contains the {{RPC_READY}} *true* advertisement.
 ** Expected behaviour: gossip announcement of the {{RPC_READY}} flag should 
switch to *false.* This is what we observe with version 2.2.13
 # Starting up a node with JVM option 
{{-Dcassandra.start_native_transport=false}}
 ** Observed behaviour:
 *** (/) Netty is not started to listen for CQL clients
 *** (/) logging {{cassandra[1765]: INFO 13:58:46 Not starting native transport 
as requested. Use JMX (StorageService->startNativeTransport()) or nodetool 
(enablebinary) to start it}}
 *** (?) the gossipinfo does not contain the {{RPC_READY}} flag at all, 
however, this is also observed on 2.2.13.
 # Issuing {{nodetool enablebinary}} command on a node started with 
{{-Dcassandra.start_native_transport=false}}
 ** Observed behaviour:
 *** (/) Netty is started up and opens the CQL port for listening
 *** (x) the {{RPC_READY}} flag is not announced for this node any more, 
causing clients to not consider this node up and not try to connect.
 ** Expected behaviour: gossip flag {{RPC_READY}} should be added to announce 
*true*, as observed with version 2.2.13.

 






[jira] [Commented] (CASSANDRA-14566) Cache CSM.onlyPurgeRepairedTombstones()

2018-10-10 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645032#comment-16645032
 ] 

Stefan Podkowinski commented on CASSANDRA-14566:


Thanks for your contribution, Thomas! We're currently testing the upcoming 
Cassandra 4.0 release and have a code freeze for non-critical fixes in place 
for the other branches. I'll look into accepting the patch for 4.0, depending 
on how the testing goes, or in a later version, as the expected performance 
gain should not be significant. 

> Cache CSM.onlyPurgeRepairedTombstones()
> ---
>
> Key: CASSANDRA-14566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14566
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Stefan Podkowinski
>Assignee: Thomas Pouget-Abadie
>Priority: Minor
>  Labels: lhf, performance
> Attachments: 14566-3.11.txt
>
>
> We currently call {{CompactionStrategyManager.onlyPurgeRepairedTombstones()}} 
> *a lot* during compaction, I think at least for every key. So we should 
> probably cache the value, instead of constantly reading from a volatile and 
> calling parseBoolean.
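
The caching idea in the description can be sketched as follows. This is an illustrative Python sketch, not the project's Java code: both class names, the {{parse_calls}} counter, and the option key used here are assumptions made for the example, and Python has no equivalent of a Java volatile read, so only the "parse once, read many times" part is shown.

```python
OPTION = "only_purge_repaired_tombstones"  # assumed option key, for illustration


class UncachedManager:
    """Parses the option string on every call (the pattern being criticised)."""

    def __init__(self, options):
        self.options = options
        self.parse_calls = 0

    def only_purge_repaired_tombstones(self):
        self.parse_calls += 1
        return self.options.get(OPTION, "false").lower() == "true"


class CachedManager:
    """Parses once whenever the options change and serves the cached boolean."""

    def __init__(self, options):
        self.parse_calls = 0
        self.set_options(options)

    def set_options(self, options):
        # The only place where the string is parsed.
        self.parse_calls += 1
        self._only_purge_repaired = options.get(OPTION, "false").lower() == "true"

    def only_purge_repaired_tombstones(self):
        # Hot path: a plain attribute read, no parsing.
        return self._only_purge_repaired


uncached = UncachedManager({OPTION: "true"})
cached = CachedManager({OPTION: "true"})
for _ in range(1000):  # stands in for "at least once per key" during compaction
    uncached.only_purge_repaired_tombstones()
    cached.only_purge_repaired_tombstones()
print(uncached.parse_calls, cached.parse_calls)
```

The trade-off is that the cached value has to be refreshed wherever the options can change, which the sketch models with {{set_options}}.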






[jira] [Updated] (CASSANDRA-14566) Cache CSM.onlyPurgeRepairedTombstones()

2018-10-10 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-14566:
---
Status: Patch Available  (was: Open)

> Cache CSM.onlyPurgeRepairedTombstones()
> ---
>
> Key: CASSANDRA-14566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14566
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Stefan Podkowinski
>Assignee: Thomas Pouget-Abadie
>Priority: Minor
>  Labels: lhf, performance
> Attachments: 14566-3.11.txt
>
>
> We currently call {{CompactionStrategyManager.onlyPurgeRepairedTombstones()}} 
> *a lot* during compaction, I think at least for every key. So we should 
> probably cache the value, instead of constantly reading from a volatile and 
> calling parseBoolean.






[jira] [Assigned] (CASSANDRA-14566) Cache CSM.onlyPurgeRepairedTombstones()

2018-10-10 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski reassigned CASSANDRA-14566:
--

Assignee: Thomas Pouget-Abadie

> Cache CSM.onlyPurgeRepairedTombstones()
> ---
>
> Key: CASSANDRA-14566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14566
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Stefan Podkowinski
>Assignee: Thomas Pouget-Abadie
>Priority: Minor
>  Labels: lhf, performance
> Attachments: 14566-3.11.txt
>
>
> We currently call {{CompactionStrategyManager.onlyPurgeRepairedTombstones()}} 
> *a lot* during compaction, I think at least for every key. So we should 
> probably cache the value, instead of constantly reading from a volatile and 
> calling parseBoolean.






[jira] [Commented] (CASSANDRA-14808) Support ORDER BY with 2ndary Indexes

2018-10-10 Thread Benjamin Lerer (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644962#comment-16644962
 ] 

Benjamin Lerer commented on CASSANDRA-14808:


If you remove the {{ORDER BY}} clause, the query will be executed and the rows 
should be returned in clustering order, which seems to be what you want.

> Support ORDER BY with 2ndary Indexes
> 
>
> Key: CASSANDRA-14808
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14808
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Apache Cassandra 3.11.3
>Reporter: Yan Yao
>Priority: Major
> Fix For: 3.11.4, 4.0
>
>
>  Suppose we have a generic table:
> {code:java}
> CREATE TABLE base_table(
> partition1 uuid,
> ...
> partitionN uuid,
> static_column text static,
> clustering1 uuid,
> ...
> clusteringM uuid,
> regular text,
> list_text list<text>,
> set_text set<text>,
> map_int_text map<int, text>,
> PRIMARY KEY((partition1, ..., partitionN), clustering1, ..., clusteringM)
> );
> {code}
>  
> If we create an index on the _regular text_ column, the schema of the index 
> table will be:
> {code:java}
> CREATE TABLE regular_idx(
> regular text,
> partitionColumns blob,
> clustering1 uuid,
> ...
> clusteringM uuid,
> PRIMARY KEY((regular), partitionColumns, clustering1, ..., clusteringM)
> );
> {code}
>  
> Then it's possible to execute queries like:
> {code:java}
> SELECT * FROM base_table WHERE regular = ? AND partition1 = ? 
>  AND ... AND partitionN = ? ORDER BY 
> clustering1;{code}
>  
> However, CQL3 checks whether a _secondary index_ is used with _ORDER BY_ while 
> preparing a SELECT statement, and immediately throws an exception for queries 
> like the one above. 
> Could we support ORDER BY with 2ndary indexes, at least for the above query 
> pattern?   
>  






[jira] [Resolved] (CASSANDRA-14809) cluster initialization was aborted after timing out

2018-10-10 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas resolved CASSANDRA-14809.
--
Resolution: Information Provided

> cluster initialization was aborted after timing out
> ---
>
> Key: CASSANDRA-14809
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14809
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Max
>Priority: Major
> Fix For: 3.11.1
>
>
> I get the error "cluster initialization was aborted after timing out". It is shown 
> in the UI.
> We have ReleaseVersion: 3.11.1 and there are a lot of errors. Please help.
> In the system log we see these errors:
> INFO [epollEventLoopGroup-2-4] 2018-10-09 19:30:50,113 Message.java:623 - 
> Unexpected exception during request; channel = [id: 0x295ce677, 
> L:/192.168.xx.xxx:9042 - R:/192.168.xx.xxx:58644]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
> at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) 
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>  






[jira] [Commented] (CASSANDRA-14809) cluster initialization was aborted after timing out

2018-10-10 Thread Max (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644884#comment-16644884
 ] 

Max commented on CASSANDRA-14809:
-

Thanks, please delete this post.

> cluster initialization was aborted after timing out
> ---
>
> Key: CASSANDRA-14809
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14809
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Max
>Priority: Major
> Fix For: 3.11.1
>
>
> I get the error "cluster initialization was aborted after timing out". It is shown 
> in the UI.
> We have ReleaseVersion: 3.11.1 and there are a lot of errors. Please help.
> In the system log we see these errors:
> INFO [epollEventLoopGroup-2-4] 2018-10-09 19:30:50,113 Message.java:623 - 
> Unexpected exception during request; channel = [id: 0x295ce677, 
> L:/192.168.xx.xxx:9042 - R:/192.168.xx.xxx:58644]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
> at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) 
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>  






[jira] [Commented] (CASSANDRA-14778) Merge DigestResolver / DataResolver

2018-10-10 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644871#comment-16644871
 ] 

Benedict commented on CASSANDRA-14778:
--

Hi [~abhishagarwal]

It's great that you're looking to participate in the project, but this is 
probably a challenging thing to take on as your first contribution.  These 
classes are critical to the central function of the database, and a wider 
knowledge of the systems they interact with is essential.  

Don't be deceived by the 'minor' priority tag - this is a complicated task.  A 
good starting point would be to search for tickets with the 'lhf' label.


> Merge DigestResolver / DataResolver
> ---
>
> Key: CASSANDRA-14778
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14778
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Benedict
>Priority: Minor
>
> These classes are unnecessarily distinct, and this complicates read-repair 
> logic, as well as transient replication merging, with complex nesting of 
> different kinds of resolver within each other.






[jira] [Commented] (CASSANDRA-14809) cluster initialization was aborted after timing out

2018-10-10 Thread C. Scott Andreas (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644863#comment-16644863
 ] 

C. Scott Andreas commented on CASSANDRA-14809:
--

Hi Max, thanks for reaching out. This bug tracker is used by developers of the 
database while building Cassandra.

The best forum for community help and support deploying and operating the 
database is via the Apache Cassandra user email list, or via IRC in the 
#cassandra channel on Freenode.

When reaching out via these channels, please include additional information 
such as the client you're using to connect (and its version), along with a full 
stacktrace for the issue you've seen, and any relevant configuration parameters 
for client/server communication.

The error message you've shared suggests that your client is able to at least 
establish an initial connection to the Cassandra server, but that something may 
be configured incorrectly causing it to disconnect quickly after. You may try 
establishing a connection and issuing queries via cqlsh or another client to 
rule out the possibility of an issue with the server or database itself.

Links to the proper support channels are here: 
[http://cassandra.apache.org/community/]

Cheers!

> cluster initialization was aborted after timing out
> ---
>
> Key: CASSANDRA-14809
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14809
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Max
>Priority: Major
> Fix For: 3.11.1
>
>
> I get the error "cluster initialization was aborted after timing out". It is shown 
> in the UI.
> We have ReleaseVersion: 3.11.1 and there are a lot of errors. Please help.
> In the system log we see these errors:
> INFO [epollEventLoopGroup-2-4] 2018-10-09 19:30:50,113 Message.java:623 - 
> Unexpected exception during request; channel = [id: 0x295ce677, 
> L:/192.168.xx.xxx:9042 - R:/192.168.xx.xxx:58644]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
> at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) 
> ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
>  






[jira] [Commented] (CASSANDRA-14778) Merge DigestResolver / DataResolver

2018-10-10 Thread Abhish Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644852#comment-16644852
 ] 

Abhish Agarwal commented on CASSANDRA-14778:


[~benedict] can you assign this to me? I am not able to do it myself. I will 
submit the patch in some time.

> Merge DigestResolver / DataResolver
> ---
>
> Key: CASSANDRA-14778
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14778
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Benedict
>Priority: Minor
>
> These classes are unnecessarily distinct, and this complicates read-repair 
> logic, as well as transient replication merging, with complex nesting of 
> different kinds of resolver within each other.






[jira] [Created] (CASSANDRA-14810) Upgrade dtests to pytest-3.8

2018-10-10 Thread Stefan Podkowinski (JIRA)
Stefan Podkowinski created CASSANDRA-14810:
--

 Summary: Upgrade dtests to pytest-3.8
 Key: CASSANDRA-14810
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14810
 Project: Cassandra
  Issue Type: Improvement
  Components: Testing
Reporter: Stefan Podkowinski


The [dtest project|https://github.com/apache/cassandra-dtest] uses pytest as 
its test runner for executing tests on builds.apache.org or CircleCI. The 
pytest dependency has recently been upgraded to 3.6, but couldn't be upgraded to 
the most recent 3.8 version, due to the issues described below.

Before test execution, the {{run_dtests.py}} script will gather a list of all 
tests:
 {{./run_dtests.py --dtest-print-tests-only}}

Afterwards pytest can be started with any of the output lines as argument. With 
pytest-3.8, however, the output format changed, which prevents pytest from 
finding the test:
 {{pytest 
upgrade_tests/upgrade_supercolumns_test.py::TestSCUpgrade::test_upgrade_super_columns_through_limited_versions}}
 vs
 {{pytest 
upgrade_supercolumns_test.py::TestSCUpgrade::test_upgrade_super_columns_through_limited_versions}}

The underlying issue appears to be the changes in the {{pytest --collect-only}} 
output, consumed in {{run_dtests.py}}, which now includes a  element 
that needs to be parsed as well to arrive at the path as we did before. We'd 
have to parse the new output and assemble the correct paths again, so we can 
use run_dtests.py as we did before with pytest 3.6.
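
A rough sketch of the path reassembly described above, in Python. It rests on assumptions the ticket does not confirm: that the collected tree is printed as indented {{<Kind 'name'>}} lines, and that any element above the module carries a directory name. The sample text, the {{Dir}} element name, and the {{node_ids}} helper are all hypothetical; the real {{pytest --collect-only}} output should be checked before relying on any of this.

```python
import re

# Matches lines such as "  <Module 'some_test.py'>" (assumed output shape).
LINE = re.compile(r"^(?P<indent>\s*)<(?P<kind>\w+) '(?P<name>[^']+)'>$")


def node_ids(collect_output):
    """Turn an indented collection tree into pytest-style node ids."""
    stack = []  # (indent, kind, name) for the current chain of ancestors
    ids = []
    for raw in collect_output.splitlines():
        m = LINE.match(raw.rstrip())
        if not m:
            continue
        indent = len(m.group("indent"))
        # Drop entries that are not ancestors of the current line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, m.group("kind"), m.group("name")))
        kinds = [kind for _, kind, _ in stack]
        if m.group("kind") == "Function" and "Module" in kinds:
            mod = kinds.index("Module")
            # Everything up to and including the module forms the file path.
            path = "/".join(name for _, _, name in stack[:mod + 1])
            rest = [name for _, _, name in stack[mod + 1:]]
            ids.append("::".join([path] + rest))
    return ids


# Hypothetical sample; "Dir" is a placeholder element name, not pytest's.
SAMPLE = """\
<Dir 'upgrade_tests'>
  <Module 'upgrade_supercolumns_test.py'>
    <Class 'TestSCUpgrade'>
      <Function 'test_upgrade_super_columns_through_limited_versions'>
"""

for node_id in node_ids(SAMPLE):
    print(node_id)
```

Tracking ancestors by indentation keeps the sketch independent of the exact element names, which seemed safer than hard-coding them given that the output format has already changed once.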






[jira] [Commented] (CASSANDRA-14713) Update docker image used for testing

2018-10-10 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644722#comment-16644722
 ] 

Stefan Podkowinski commented on CASSANDRA-14713:


The dtests changes have now been committed as {{77be87ecf51a}}. These changes 
should also work with the old docker image from Michael, and I'll let them 
stay in effect for a while before activating the new docker image as well. 
This should make it easier to track down any unexpected regressions 
from this ticket.

> Update docker image used for testing
> 
>
> Key: CASSANDRA-14713
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14713
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Testing
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Attachments: Dockerfile
>
>
> Tests executed on builds.apache.org ({{docker/jenkins/jenkinscommand.sh}}) 
> and circleCI ({{.circleci/config.yml}}) will currently use the same 
> [cassandra-test|https://hub.docker.com/r/kjellman/cassandra-test/] docker 
> image ([github|https://github.com/mkjellman/cassandra-test-docker]) by 
> [~mkjellman].
> We should manage this image on our own as part of cassandra-builds, to keep 
> it updated. There's also an [Apache 
> user|https://hub.docker.com/u/apache/?page=1] on docker hub for publishing 
> images.






[jira] [Updated] (CASSANDRA-14566) Cache CSM.onlyPurgeRepairedTombstones()

2018-10-10 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-14566:
---
Reviewer: Stefan Podkowinski

> Cache CSM.onlyPurgeRepairedTombstones()
> ---
>
> Key: CASSANDRA-14566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14566
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Stefan Podkowinski
>Priority: Minor
>  Labels: lhf, performance
> Attachments: 14566-3.11.txt
>
>
> We currently call {{CompactionStrategyManager.onlyPurgeRepairedTombstones()}} 
> *a lot* during compaction, I think at least for every key. So we should 
> probably cache the value, instead of constantly reading from a volatile and 
> calling parseBoolean.






[jira] [Resolved] (CASSANDRA-14561) Make token-generator Python3 compatible

2018-10-10 Thread Stefan Podkowinski (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski resolved CASSANDRA-14561.

Resolution: Not A Problem

The {{token-generator}} script is part of cassandra-2.x and can be found in the 
tools/bin directory. The {{token_generator_test.py}} script is part of the 
dtest project. We should probably just get rid of the dtest py script, after 
4.0 has been released and we stop supporting 2.x. Thanks for pointing me to it, 
I guess I was just assuming that {{token-generator}} is shipped with more 
recent versions as well, but it's not and I'm just going to close this ticket 
then.

> Make token-generator Python3 compatible
> ---
>
> Key: CASSANDRA-14561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14561
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing, Tools
>Reporter: Stefan Podkowinski
>Priority: Major
>  Labels: lhf
>
> Among all scripts in {{tools/bin/}}, {{token-generator}} is a bit of an oddball 
> there. While all other scripts are simple shell script wrappers around Java 
> tools, the generator is a standalone Python script. Although it seems to be 
> still working just fine, it's not Python3 compatible yet, which causes 
> permanent test failures for {{token_generator_test.py}}, after we migrated to 
> Python3.
> It would be great if we could make the script work with both Python2+3.






[jira] [Created] (CASSANDRA-14809) cluster initialization was aborted after timing out

2018-10-10 Thread Max (JIRA)
Max created CASSANDRA-14809:
---

 Summary: cluster initialization was aborted after timing out
 Key: CASSANDRA-14809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14809
 Project: Cassandra
  Issue Type: Bug
Reporter: Max
 Fix For: 3.11.1


I am getting the error "cluster initialization was aborted after timing out". 
It is shown in the UI.

We are running ReleaseVersion 3.11.1 and see a lot of errors. Please help.

In the system log we see these errors:

INFO [epollEventLoopGroup-2-4] 2018-10-09 19:30:50,113 Message.java:623 - 
Unexpected exception during request; channel = [id: 0x295ce677, 
L:/192.168.xx.xxx:9042 - R:/192.168.xx.xxx:58644]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
Connection reset by peer
at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) 
~[netty-all-4.0.44.Final.jar:4.0.44.Final]

 






cassandra-dtest git commit: Migrate to pytest 3.6 and fix Python 3 warnings

2018-10-10 Thread spod
Repository: cassandra-dtest
Updated Branches:
  refs/heads/master 2415aa0ef -> 77be87ecf


Migrate to pytest 3.6 and fix Python 3 warnings

patch by Stefan Podkowinski; reviewed by Marcus Eriksson for CASSANDRA-14713
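
For context on the kind of warning this commit addresses: since Python 3.6, an unrecognized escape such as `\[` in a normal string literal triggers a DeprecationWarning (a SyntaxWarning in newer releases). The sketch below is an illustration and not part of the patch; the helper name `compiles_cleanly` is made up for it. It shows that doubling the backslashes and using a raw string produce the same regular expression, and that only the unescaped spelling is flagged.

```python
import re
import warnings

# Spelling used by the patch: backslashes doubled in a normal string.
ESCAPED = 'Only superusers are allowed to perform CREATE (\\[ROLE\\|USER\\]|USER) queries'
# Equivalent raw-string spelling.
RAW = r'Only superusers are allowed to perform CREATE (\[ROLE\|USER\]|USER) queries'


def compiles_cleanly(source):
    """Return True if compiling the expression `source` emits no warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            compile(source, "<example>", "eval")
        except SyntaxError:
            # A future Python may reject invalid escapes outright.
            return False
    return not caught


def matches(pattern, message):
    """Check a server error message against the expected pattern."""
    return re.search(pattern, message) is not None
```

Both spellings are interchangeable at runtime; the raw string is usually easier to read when the pattern contains many backslashes.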


Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/77be87ec
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/77be87ec
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/77be87ec

Branch: refs/heads/master
Commit: 77be87ecf51aa6a3cb61c43f3d41e33f5344fddd
Parents: 2415aa0
Author: Stefan Podkowinski 
Authored: Fri Sep 21 10:37:29 2018 +0200
Committer: Stefan Podkowinski 
Committed: Wed Oct 10 10:23:34 2018 +0200

--
 auth_test.py                                   | 32 ++--
 batch_test.py                                  |  4 +-
 commitlog_test.py                              |  6 +--
 concurrent_schema_changes_test.py              |  2 +-
 conftest.py                                    | 27 +--
 cqlsh_tests/cqlsh_tests.py                     |  2 +-
 dtest.py                                       |  4 +-
 internode_messaging_test.py                    |  2 +-
 jmx_test.py                                    |  2 +-
 materialized_views_test.py                     | 14 ++
 repair_tests/deprecated_repair_test.py         |  8 +--
 repair_tests/repair_test.py                    | 22 -
 replication_test.py                            |  8 +--
 requirements.txt                               |  2 +-
 run_dtests.py                                  |  4 +-
 secondary_indexes_test.py                      |  2 +-
 thrift_test.py                                 |  8 +--
 tools/assertions.py                            |  2 +-
 tools/datahelp.py                              |  2 +-
 tools/jmxutils.py                              | 54 ++---
 topology_test.py                               |  2 +-
 transient_replication_test.py                  |  6 +--
 upgrade_tests/regression_test.py               |  2 +-
 upgrade_tests/upgrade_schema_agreement_test.py |  2 +-
 24 files changed, 107 insertions(+), 112 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/77be87ec/auth_test.py
--
diff --git a/auth_test.py b/auth_test.py
index f2cd844..7c275e0 100644
--- a/auth_test.py
+++ b/auth_test.py
@@ -123,7 +123,7 @@ class TestAuth(Tester):
 cassandra.execute("CREATE USER jackob WITH PASSWORD '12345' NOSUPERUSER")
 
 jackob = self.get_session(user='jackob', password='12345')
-assert_unauthorized(jackob, "CREATE USER james WITH PASSWORD '54321' NOSUPERUSER", 'Only superusers are allowed to perform CREATE (\[ROLE\|USER\]|USER) queries', )
+assert_unauthorized(jackob, "CREATE USER james WITH PASSWORD '54321' NOSUPERUSER", 'Only superusers are allowed to perform CREATE (\\[ROLE\\|USER\\]|USER) queries', )
 
 @since('1.2', max_version='2.1.x')
 def test_password_authenticator_create_user_requires_password(self):
@@ -256,7 +256,7 @@ class TestAuth(Tester):
 assert 3 == len(rows)
 
 cathy = self.get_session(user='cathy', password='12345')
-assert_unauthorized(cathy, 'DROP USER dave', 'Only superusers are allowed to perform DROP (\[ROLE\|USER\]|USER) queries')
+assert_unauthorized(cathy, 'DROP USER dave', 'Only superusers are allowed to perform DROP (\\[ROLE\\|USER\\]|USER) queries')
 
 rows = list(cassandra.execute("LIST USERS"))
 assert 3 == len(rows)
@@ -2202,7 +2202,7 @@ class TestAuthRoles(Tester):
 self.superuser.execute("GRANT EXECUTE ON FUNCTION ks.func_one(int) TO mike")
 as_mike.execute(select_one)
 assert_unauthorized(as_mike, select_two,
-"User mike has no EXECUTE permission on  or any of its parents")
+r"User mike has no EXECUTE permission on  or any of its parents")
 # granting EXECUTE on all of the parent keyspace's should enable mike to use both functions
 self.superuser.execute("GRANT EXECUTE ON ALL FUNCTIONS IN KEYSPACE ks TO mike")
 as_mike.execute(select_one)
@@ -2211,7 +2211,7 @@ class TestAuthRoles(Tester):
 self.superuser.execute("REVOKE EXECUTE ON ALL FUNCTIONS IN KEYSPACE ks FROM mike")
 as_mike.execute(select_one)
 assert_unauthorized(as_mike, select_two,
-"User mike has no EXECUTE permission on  or any of its parents")
+r"User mike has no EXECUTE permission on  or any of its parents")
 # now check that EXECUTE on ALL FUNCTIONS works in the same way
 self.superuser.execute("GRANT EXECUTE ON ALL FUNCTIONS TO mike")
 as_m

[jira] [Commented] (CASSANDRA-13649) Uncaught exceptions in Netty pipeline

2018-10-10 Thread Max (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644644#comment-16644644
 ] 

Max commented on CASSANDRA-13649:
-

We are running ReleaseVersion 3.11.1 and see a lot of errors. Please help.

INFO [epollEventLoopGroup-2-4] 2018-10-09 19:30:50,113 Message.java:623 - 
Unexpected exception during request; channel = [id: 0x295ce677, 
L:/192.168.xx.xxx:9042 - R:/192.168.xx.xxx:58644]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
Connection reset by peer
 at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) 
~[netty-all-4.0.44.Final.jar:4.0.44.Final]

 

Also, I sometimes get the error "cluster initialization was aborted after 
timing out".

> Uncaught exceptions in Netty pipeline
> -
>
> Key: CASSANDRA-13649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13649
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging, Testing
>Reporter: Stefan Podkowinski
>Assignee: Norman Maurer
>Priority: Major
>  Labels: patch
> Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0
>
> Attachments: 
> 0001-CASSANDRA-13649-Ensure-all-exceptions-are-correctly-.patch, 
> test_stdout.txt
>
>
> I've noticed some netty related errors in trunk in [some of the dtest 
> results|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/106/#showFailuresLink].
>  Just want to make sure that we don't have to change anything related to the 
> exception handling in our pipeline and that this isn't a netty issue. 
> Actually if this causes flakiness but is otherwise harmless, we should do 
> something about it, even if it's just on the dtest side.
> {noformat}
> WARN  [epollEventLoopGroup-2-9] 2017-06-28 17:23:49,699 Slf4JLogger.java:151 
> - An exceptionCaught() event was fired, and it reached at the tail of the 
> pipeline. It usually means the last handler in the pipeline did not handle 
> the exception.
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
>   at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown 
> Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> {noformat}
> And again in another test:
> {noformat}
> WARN  [epollEventLoopGroup-2-8] 2017-06-29 02:27:31,300 Slf4JLogger.java:151 
> - An exceptionCaught() event was fired, and it reached at the tail of the 
> pipeline. It usually means the last handler in the pipeline did not handle 
> the exception.
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: 
> Connection reset by peer
>   at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown 
> Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> {noformat}
> Edit:
> The {{io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() 
> failed}} error also causes tests to fail for 3.0 and 3.11. 






[jira] [Commented] (CASSANDRA-14373) Allow using custom script for chronicle queue BinLog archival

2018-10-10 Thread Sam Tunnicliffe (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644633#comment-16644633
 ] 

Sam Tunnicliffe commented on CASSANDRA-14373:
-

Couple of tiny things:
* I think the help text for --archive-command should show the command quoted, 
not in backticks (it's probably obvious to a user that the command should be 
quoted, but it doesn't hurt to be explicit)
* the title param of the archiveRetries annotation is wrong (think this is used 
by the help command)

otherwise LGTM

> Allow using custom script for chronicle queue BinLog archival
> -
>
> Key: CASSANDRA-14373
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14373
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stefan Podkowinski
>Assignee: Pramod K Sivaraju
>Priority: Major
>  Labels: lhf, pull-request-available
> Fix For: 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It would be nice to allow the user to configure an archival script that will 
> be executed in {{BinLog.onReleased(cycle, file)}} for every deleted bin log, 
> just as we do in {{CommitLogArchiver}}. The script should be able to copy the 
> released file to an external location or do whatever the author had in mind. 
> Deleting the log file should be delegated to the script as well.
> See CASSANDRA-13983, CASSANDRA-12151 for use cases.
>  
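
One possible shape for such an archival hook, sketched in Python for brevity. Everything here is hypothetical: the `%path` placeholder, the function names and the example script path are assumptions for illustration and are not taken from the patch under review. The configured command is split into arguments first and the placeholder is substituted per argument, so a file name containing spaces stays a single argument.

```python
import shlex
import subprocess

# Assumed placeholder for the released file; not confirmed by the ticket text.
PLACEHOLDER = "%path"


def build_archive_argv(command_template, released_file):
    """Split the configured command and substitute the released file path."""
    return [token.replace(PLACEHOLDER, released_file)
            for token in shlex.split(command_template)]


def on_released(command_template, released_file):
    """Run the archive command for one released file and return its exit code.

    Deleting or moving the file is left to the script, as the ticket suggests.
    """
    argv = build_archive_argv(command_template, released_file)
    return subprocess.run(argv, check=False).returncode
```

For example, `build_archive_argv("/usr/local/bin/archive.sh %path", "/var/binlog/20181010.cq4")` yields the script path followed by the file path as two separate arguments. A real implementation would also need to decide what to do when the script fails, for instance retrying a bounded number of times.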


