[cassandra-website] branch asf-staging updated: add .asf.yaml to get asf deployment working

2020-11-05 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


The following commit(s) were added to refs/heads/asf-staging by this push:
 new 0aba4d9  add .asf.yaml to get asf deployment working
0aba4d9 is described below

commit 0aba4d9d25a4731041fc2d477c699a44ced43cfc
Author: mck 
AuthorDate: Fri Nov 6 08:19:46 2020 +0100

add .asf.yaml to get asf deployment working
---
 .asf.yaml | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/.asf.yaml b/.asf.yaml
new file mode 100644
index 000..928a49d
--- /dev/null
+++ b/.asf.yaml
@@ -0,0 +1,18 @@
+notifications:
+  commits:  commits@cassandra.apache.org
+  issues:   commits@cassandra.apache.org
+  pullrequests: p...@cassandra.apache.org
+
+github:
+  enabled_merge_buttons:
+    squash:  false
+    merge:   false
+    rebase:  true
+
+staging:
+  profile: ~
+  whoami:  asf-staging
+
+publish:
+  whoami: asf-site
+


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-11-05 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227175#comment-17227175
 ] 

Yifan Cai commented on CASSANDRA-16146:
---

Hi [~brandon.williams] and [~bdeggleston],

Would you please take another look at the fixup for each branch? Links are 
posted above. Although the CI result looked good at commit time, several newly 
added tests were actually broken because we did not rebase and rerun CI. 

In the 3.x branches, the fixup sets the operationMode to normal directly in 
StorageService when using mock gossip.

In trunk, the fixup corrects the bytebuddy usage in the failing test class, in 
addition to the operationMode fix in the test. 

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stopping and starting it) 
> overrides the actual gossip state.
>   
> It can happen in the scenario below.
> # Bootstrap fails. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # An operator runs nodetool to stop and re-start gossip. The gossip state gets 
> flipped to {{NORMAL}}
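For illustration, the blind state overwrite described above can be sketched as follows. This is a simplified model, not the actual StorageService code; the class and method names (GossipStateSketch, startGossipBuggy, startGossipFixed) are made up:

```java
// Simplified sketch of CASSANDRA-16146: re-enabling gossip should restore
// the saved state instead of blindly announcing NORMAL. All names here are
// illustrative, not Cassandra's actual API.
public class GossipStateSketch {
    enum State { JOINING, NORMAL }

    private State state = State.JOINING; // bootstrap failed; node is still JOINING
    private State savedState;

    void stopGossip() { savedState = state; }

    // Behaviour described above: the state is set to NORMAL blindly
    void startGossipBuggy() { state = State.NORMAL; }

    // Expected behaviour: restore whatever state the node was actually in
    void startGossipFixed() { state = savedState; }

    State state() { return state; }

    public static void main(String[] args) {
        GossipStateSketch buggy = new GossipStateSketch();
        buggy.stopGossip();
        buggy.startGossipBuggy();
        System.out.println("buggy restart: " + buggy.state()); // NORMAL (wrong)

        GossipStateSketch fixed = new GossipStateSketch();
        fixed.stopGossip();
        fixed.startGossipFixed();
        System.out.println("fixed restart: " + fixed.state()); // JOINING (preserved)
    }
}
```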



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (CASSANDRA-16246) Unexpected warning "Ignoring Unrecognized strategy option" for NetworkTopologyStrategy when restarting

2020-11-05 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16246:
--
Test and Documentation Plan: ci, jvm dtest
 Status: Patch Available  (was: Open)

The cause is that when creating the NetworkTopologyStrategy of keyspaces during 
{{CassandraDaemon#setup}}, the tokenMetadata does not yet have hostIds populated 
from the saved hostIds. {{Datacenters#getValidDatacenters}} is then unable to look 
up the remote DCs, and a {{ConfigurationException}} is thrown later. 

To fix the unexpected warning message, {{StorageService#populateTokenMetadata}} 
is updated to also load the {{hostId -> endpoint}} map from the saved hostIds. 

PR: https://github.com/apache/cassandra/pull/810
CI: 
https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=C-16246-warn-networktopologystrategy-restart
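As a rough illustration of the fix described above, the sketch below models why DC lookup fails before the hostId -> endpoint map is loaded. All class and method names are made-up stand-ins for TokenMetadata, Datacenters#getValidDatacenters, and populateTokenMetadata, not Cassandra's actual API:

```java
// Hypothetical sketch of the startup-order issue: until the hostId ->
// endpoint map is populated from the saved host IDs, remote datacenters
// cannot be resolved, so their replication options look "unrecognized".
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

public class DcLookupSketch {
    // stand-in for the snitch: endpoint -> datacenter
    static final Map<String, String> DC_OF_ENDPOINT = Map.of(
        "10.0.0.1", "datacenter1",
        "10.0.1.1", "datacenter2");

    // stand-in for TokenMetadata's hostId -> endpoint map
    final Map<UUID, String> hostIdToEndpoint = new HashMap<>();

    // stand-in for Datacenters#getValidDatacenters
    Set<String> validDatacenters() {
        Set<String> dcs = new HashSet<>();
        dcs.add(DC_OF_ENDPOINT.get("10.0.0.1")); // the local node is always known
        for (String ep : hostIdToEndpoint.values())
            dcs.add(DC_OF_ENDPOINT.get(ep));
        return dcs;
    }

    // stand-in for the fixed populateTokenMetadata: also load saved host IDs
    void populateFromSavedHostIds(Map<UUID, String> saved) {
        hostIdToEndpoint.putAll(saved);
    }

    public static void main(String[] args) {
        DcLookupSketch node = new DcLookupSketch();
        // before the fix: only the local DC resolves at keyspace-open time
        System.out.println(node.validDatacenters()); // [datacenter1] -> warning for datacenter2

        node.populateFromSavedHostIds(Map.of(UUID.randomUUID(), "10.0.1.1"));
        System.out.println(node.validDatacenters()); // both DCs resolve, no warning
    }
}
```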

> Unexpected warning "Ignoring Unrecognized strategy option" for 
> NetworkTopologyStrategy when restarting
> --
>
> Key: CASSANDRA-16246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16246
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Logging
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During a restart, a number of warning messages like 
> "AbstractReplicationStrategy.java:364 - Ignoring Unrecognized strategy option 
> {datacenter2} passed to NetworkTopologyStrategy for keyspace 
> distributed_test_keyspace" are logged. 
> The warnings are not expected since the mentioned DC exists. 
> This seems to be caused by an improper startup order: when keyspaces are 
> opened, the node is not yet aware of the DCs. 
> The warning can be reproduced using the test below. 
> {code:java}
> @Test
> public void testEmitsWarningsForNetworkTopologyStategyConfigOnRestart() 
> throws Exception {
> int nodesPerDc = 2;
> try (Cluster cluster = builder().withConfig(c -> c.with(GOSSIP, NETWORK))
> .withRacks(2, 1, nodesPerDc)
> .start()) {
> cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE +
>  " WITH replication = {'class': 
> 'NetworkTopologyStrategy', " +
>  "'datacenter1' : " + nodesPerDc + ", 
> 'datacenter2' : " + nodesPerDc + " };");
> cluster.get(2).nodetool("flush");
> System.out.println("Stop node 2 in datacenter 1");
> cluster.get(2).shutdown().get();
> System.out.println("Start node 2 in datacenter 1");
> cluster.get(2).startup();
> List<String> result = cluster.get(2).logs().grep("Ignoring 
> Unrecognized strategy option \\{datacenter2\\}").getResult();
> Assert.assertFalse(result.isEmpty());
> }
> }
> {code}






[jira] [Comment Edited] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227168#comment-17227168
 ] 

Berenguer Blasi edited comment on CASSANDRA-16121 at 11/6/20, 5:04 AM:
---

[~dcapwell] thanks for looking into this one. Yes, that's the point of running 
these tests: if there are hidden flakies they will surface now. Also, as this 
hadn't been rebased in a long time, good catch changing 'master' to 'trunk' 
after the recent change in branch names.

One question: Why was the fixVersion removed?


was (Author: bereng):
[~dcapwell] thanks for looking into this one. Yes that's the point running 
these tests. If there are hidden flakies they will surface now. Also as this 
hadn't been rebased in a long time good catch changing 'master' to 'trunk' 
after the recent change in branch names.

One questions. Why was the fixVersion removed?

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: NA
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Comment Edited] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227168#comment-17227168
 ] 

Berenguer Blasi edited comment on CASSANDRA-16121 at 11/6/20, 5:04 AM:
---

[~dcapwell] thanks for looking into this one. Yes, that's the point of running 
these tests: if there are hidden flakies they will surface now. Also, as this 
hadn't been rebased in a long time, good catch changing 'master' to 'trunk' 
after the recent change in branch names.

One question: Why was the fixVersion removed?


was (Author: bereng):
[~dcapwell] thanks for looking into this one. Yes that's the point running 
these tests. If there are hidden flakies they will surface now. Also as this 
hadn't been rebased in a long time good catch changing 'master' to 'trunk' 
after the recent change in branch names.

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: NA
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Commented] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227168#comment-17227168
 ] 

Berenguer Blasi commented on CASSANDRA-16121:
-

[~dcapwell] thanks for looking into this one. Yes, that's the point of running 
these tests: if there are hidden flakies they will surface now. Also, as this 
hadn't been rebased in a long time, good catch changing 'master' to 'trunk' 
after the recent change in branch names.

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: NA
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Comment Edited] (CASSANDRA-16241) ArrayClustering does not properly handle null clustering key elements left over from tables created WITH COMPACT STORAGE

2020-11-05 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227035#comment-17227035
 ] 

Jordan West edited comment on CASSANDRA-16241 at 11/6/20, 12:21 AM:


+1 LGTM. Ran the test locally as well and verified it failed without the fix. 
Also looked at the call-sites of {{getBufferArray}}. 


was (Author: jrwest):
+1 LGTM. Ran the test locally as well and verified it vailed without the fix. 
Also looked at the call-sites of {{getBufferArrray}}. 

> ArrayClustering does not properly handle null clustering key elements left 
> over from tables created WITH COMPACT STORAGE
> 
>
> Key: CASSANDRA-16241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16241
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
>  Labels: compaction
> Fix For: 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The only way we can produce null clustering key elements is leaving them 
> empty on insert while a table is still compact. If we subsequently DROP 
> COMPACT STORAGE, those null elements linger, and {{ArrayClustering}} does not 
> handle them appropriately on compaction. 
> If you run the test 
> [here|https://github.com/maedhroz/cassandra/commit/e247b7868cae383168153bbe8bbbaa47a660f76b],
>  you should be able to observe an exception that looks roughly like this:
> {noformat}
> java.lang.NullPointerException
>   at java.base/java.nio.ByteBuffer.wrap(ByteBuffer.java:422)
>   at 
> org.apache.cassandra.db.AbstractArrayClusteringPrefix.getBufferArray(AbstractArrayClusteringPrefix.java:45)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataCollector.finalizeMetadata(MetadataCollector.java:246)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableWriter.finalizeMetadata(SSTableWriter.java:315)
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.access$200(BigTableWriter.java:52)
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:415)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:283)
>   at 
> org.apache.cassandra.io.sstable.SSTableRewriter.doPrepare(SSTableRewriter.java:380)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168)
>   at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.doPrepare(CompactionAwareWriter.java:118)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:179)
>   at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.finish(CompactionAwareWriter.java:128)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:225)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> {noformat}
> There are already numerous places where we respect the fact that clustering 
> elements may be null, so this should be pretty straightforward to fix, and 
> the tests that accompany it will probably be more complex than the fix itself.
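For context, the NPE at {{ByteBuffer.wrap}} in the stack trace above comes from wrapping a null element. A null-safe variant of that pattern might look like the sketch below; the method name mirrors getBufferArray, but this is an illustration, not the actual AbstractArrayClusteringPrefix code:

```java
// Illustrative null-safe variant of the failing pattern: wrap each
// clustering element only when it is present. Names are assumptions,
// not the actual AbstractArrayClusteringPrefix implementation.
import java.nio.ByteBuffer;
import java.util.Arrays;

public class NullSafeClustering {
    static ByteBuffer[] getBufferArray(byte[][] elements) {
        ByteBuffer[] out = new ByteBuffer[elements.length];
        for (int i = 0; i < elements.length; i++)
            // ByteBuffer.wrap(null) throws NullPointerException; keep the null
            out[i] = elements[i] == null ? null : ByteBuffer.wrap(elements[i]);
        return out;
    }

    public static void main(String[] args) {
        // second element was left empty while the table was still compact
        byte[][] clustering = { { 1, 2 }, null };
        System.out.println(Arrays.toString(getBufferArray(clustering)));
    }
}
```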






[jira] [Commented] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227083#comment-17227083
 ] 

David Capwell commented on CASSANDRA-16121:
---

Saw test failures, but they passed on the next run or on the other JDK pipeline.  
Since this patch just runs these tests and doesn't change them, I committed 
knowing that this will surface more flaky tests; we can't fix them if we don't 
know about them.

Thanks for your work [~Bereng]!

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: NA
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Comment Edited] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227073#comment-17227073
 ] 

David Capwell edited comment on CASSANDRA-16121 at 11/5/20, 11:44 PM:
--

CI Results: Yellow; it seems these tests can be flaky.
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|unknown]|



was (Author: dcapwell):
Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|unknown]|


> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: NA
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Updated] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16121:
--
  Fix Version/s: (was: 4.0-beta)
 NA
  Since Version: NA
Source Control Link: 
https://github.com/apache/cassandra/commit/530179c7afe6e86c0bd84e5f4557345c93fcbc0a
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: NA
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[cassandra] branch trunk updated: Circleci should run cqlshlib tests as well

2020-11-05 Thread dcapwell

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 0700d79  Circleci should run cqlshlib tests as well
0700d79 is described below

commit 0700d795bcc4d79c3f2e52872ac865fa735917d8
Author: Berenguer Blasi 
AuthorDate: Thu Nov 5 15:12:56 2020 -0800

Circleci should run cqlshlib tests as well

patch by Berenguer Blasi; reviewed by David Capwell, Ekaterina Dimitrova 
for CASSANDRA-16121
---
 .circleci/config-2_1.yml| 49 +
 .circleci/config-2_1.yml.high_res.patch | 96 +
 .circleci/config-2_1.yml.mid_res.patch  | 52 +-
 .circleci/config.yml| 79 +++
 .circleci/config.yml.HIGHRES| 79 +++
 .circleci/config.yml.LOWRES | 79 +++
 .circleci/config.yml.MIDRES | 79 +++
 pylib/cassandra-cqlsh-tests.sh  |  4 +-
 8 files changed, 457 insertions(+), 60 deletions(-)

diff --git a/.circleci/config-2_1.yml b/.circleci/config-2_1.yml
index 076f7ad..9e46fe2 100644
--- a/.circleci/config-2_1.yml
+++ b/.circleci/config-2_1.yml
@@ -28,6 +28,12 @@ j8_small_par_executor: &j8_small_par_executor
 #exec_resource_class: xlarge
   parallelism: 1
 
+j8_small_executor: &j8_small_executor
+  executor:
+name: java8-executor
+exec_resource_class: medium
+  parallelism: 1
+
j8_medium_par_executor: &j8_medium_par_executor
   executor:
 name: java8-executor
@@ -52,6 +58,12 @@ j11_small_par_executor: &j11_small_par_executor
 #exec_resource_class: xlarge
   parallelism: 1
 
+j11_small_executor: &j11_small_executor
+  executor:
+name: java11-executor
+#exec_resource_class: medium
+  parallelism: 1
+
j8_with_dtests_jobs: &j8_with_dtests_jobs
   jobs:
 - j8_build
@@ -70,6 +82,9 @@ j8_with_dtests_jobs: &j8_with_dtests_jobs
   - start_j11_unit_tests
   - j8_build
 # specialized unit tests (all run on request using Java 8)
+- j8_cqlshlib_tests:
+requires:
+  - j8_build
 - start_utests_long:
 type: approval
 - utests_long:
@@ -202,6 +217,9 @@ j11_with_dtests_jobs: &j11_with_dtests_jobs
 - j11_jvm_dtests:
 requires:
   - j11_build
+- j11_cqlshlib_tests:
+requires:
+  - j11_build
 # Java 11 dtests (on request)
 - start_j11_dtests:
 type: approval
@@ -389,6 +407,20 @@ jobs:
   - log_environment
   - run_parallel_junit_tests
 
+  j8_cqlshlib_tests:
+<<: *j8_small_executor
+steps:
+  - attach_workspace:
+  at: /home/cassandra
+  - run_cqlshlib_tests
+
+  j11_cqlshlib_tests:
+<<: *j11_small_executor
+steps:
+  - attach_workspace:
+  at: /home/cassandra
+  - run_cqlshlib_tests
+
   utests_long:
 <<: *j8_seq_executor
 steps:
@@ -866,6 +898,23 @@ commands:
 path: /tmp/cassandra/build/test/logs
 destination: logs
 
+  run_cqlshlib_tests:
+parameters:
+  no_output_timeout:
+type: string
+default: 15m
+steps:
+- run:
+name: Run cqlshlib Unit Tests
+command: |
+  export PATH=$JAVA_HOME/bin:$PATH
+  time mv ~/cassandra /tmp
+  cd /tmp/cassandra/pylib
+  ./cassandra-cqlsh-tests.sh ..
+no_output_timeout: << parameters.no_output_timeout >>
+- store_test_results:
+path: /tmp/cassandra/pylib
+
   run_parallel_junit_tests:
 parameters:
   target:
diff --git a/.circleci/config-2_1.yml.high_res.patch 
b/.circleci/config-2_1.yml.high_res.patch
index d0799e4..1a0ba53 100644
--- a/.circleci/config-2_1.yml.high_res.patch
+++ b/.circleci/config-2_1.yml.high_res.patch
@@ -1,34 +1,62 @@
-22,23c22,23
-< #exec_resource_class: xlarge
-<   parallelism: 4

-> exec_resource_class: xlarge
->   parallelism: 100
-28,29c28,29
-< #exec_resource_class: xlarge
-<   parallelism: 1

-> exec_resource_class: xlarge
->   parallelism: 5
-34,35c34,35
-< #exec_resource_class: xlarge
-<   parallelism: 1

-> exec_resource_class: xlarge
->   parallelism: 2
-40c40
-< #exec_resource_class: xlarge

-> exec_resource_class: xlarge
-46,47c46,47
-< #exec_resource_class: xlarge
-<   parallelism: 4

-> exec_resource_class: xlarge
->   parallelism: 100
-52,53c52,53
-< #exec_resource_class: xlarge
-<   parallelism: 1

-> exec_resource_class: xlarge
->   parallelism: 2
+@@ -19,14 +19,14 @@ default_env_vars: &default_env_vars
+ j8_par_executor: &j8_par_executor
+   executor:
+ name: java8-executor
+-#exec_resource_class: xlarge
+-  parallelism: 4
++exec_resource_class: xlarge
++  parallelism: 100
+ 
+ j8_small_par_executor: &j8_small_par_executor
+   executor:
+ name: java8-executor
+-#exec_resource_class: xlarge
+-  

[jira] [Assigned] (CASSANDRA-16189) Add tests for the Hint service metrics

2020-11-05 Thread Mohamed Zafraan (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohamed Zafraan reassigned CASSANDRA-16189:
---

Assignee: Mohamed Zafraan  (was: Uchenna)

> Add tests for the Hint service metrics
> --
>
> Key: CASSANDRA-16189
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16189
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/python
>Reporter: Benjamin Lerer
>Assignee: Mohamed Zafraan
>Priority: Normal
> Fix For: 4.0-beta
>
>
> There are currently no tests for the hint metrics






[jira] [Updated] (CASSANDRA-16192) Add more tests to cover compaction metrics

2020-11-05 Thread Mohamed Zafraan (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohamed Zafraan updated CASSANDRA-16192:

Reviewers: Adam Holmberg, Mohamed Zafraan
   Status: Review In Progress  (was: Patch Available)

> Add more tests to cover compaction metrics
> --
>
> Key: CASSANDRA-16192
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16192
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Benjamin Lerer
>Assignee: Mohamed Zafraan
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 0001-added-unit-tests-to-cover-compaction-metrics.patch
>
>
> Some compaction metrics do not seem to be tested.






[jira] [Comment Edited] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227073#comment-17227073
 ] 

David Capwell edited comment on CASSANDRA-16121 at 11/5/20, 11:24 PM:
--

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|unknown]|



was (Author: dcapwell):
Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[LOWER
 
build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|unknown]|


> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Updated] (CASSANDRA-16192) Add more tests to cover compaction metrics

2020-11-05 Thread Mohamed Zafraan (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohamed Zafraan updated CASSANDRA-16192:

Impacts: None
Test and Documentation Plan: 
Compaction Metrics were divided into three categories for testing:

1. testSimpleCompactionMetricsForCompletedTasks: 
(totalCompactionsCompleted, completedTasks, bytesCompacted)

2. testCompactionMetricsForPendingTasks: 
(pendingTasks, pendingTasksByTableName)

3. testCompactionMetricsForFailedTasks: 
   (compactionsAborted, compactionsReduced, sstablesDropppedFromCompactions)

Tests 2 and 3 make use of a TestCompactionTask written in the same class to 
trigger the metrics for pending-task and aborted-sstable workflows during 
compaction. 
The class was written so we did not have to simulate scenarios where compaction 
requests were stalled or the disk was full.
 Status: Patch Available  (was: In Progress)
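A minimal sketch of what category 1 above asserts (the field names mirror the metrics listed, but this is an illustrative stand-in, not the actual Cassandra CompactionMetrics class):

```java
// Illustrative counter model for the completed-task metrics category:
// totalCompactionsCompleted, completedTasks, bytesCompacted.
public class CompactionMetricsSketch {
    long totalCompactionsCompleted;
    long completedTasks;
    long bytesCompacted;

    void onCompactionCompleted(long bytes) {
        totalCompactionsCompleted++;
        completedTasks++;
        bytesCompacted += bytes;
    }

    public static void main(String[] args) {
        CompactionMetricsSketch m = new CompactionMetricsSketch();
        m.onCompactionCompleted(1024);
        m.onCompactionCompleted(2048);
        System.out.println(m.totalCompactionsCompleted + " compactions, "
                           + m.bytesCompacted + " bytes"); // 2 compactions, 3072 bytes
    }
}
```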

> Add more tests to cover compaction metrics
> --
>
> Key: CASSANDRA-16192
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16192
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Benjamin Lerer
>Assignee: Mohamed Zafraan
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 0001-added-unit-tests-to-cover-compaction-metrics.patch
>
>
> Some compaction metrics do not seem to be tested.






[jira] [Updated] (CASSANDRA-16192) Add more tests to cover compaction metrics

2020-11-05 Thread Mohamed Zafraan (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohamed Zafraan updated CASSANDRA-16192:

Attachment: 0001-added-unit-tests-to-cover-compaction-metrics.patch

> Add more tests to cover compaction metrics
> --
>
> Key: CASSANDRA-16192
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16192
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Benjamin Lerer
>Assignee: Mohamed Zafraan
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 0001-added-unit-tests-to-cover-compaction-metrics.patch
>
>
> Some compaction metrics do not seem to be tested.






[jira] [Updated] (CASSANDRA-16246) Unexpected warning "Ignoring Unrecognized strategy option" for NetworkTopologyStrategy when restarting

2020-11-05 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16246:
--
Fix Version/s: 4.0-beta

> Unexpected warning "Ignoring Unrecognized strategy option" for 
> NetworkTopologyStrategy when restarting
> --
>
> Key: CASSANDRA-16246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16246
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Logging
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>
> During a restart, a number of warning messages like 
> "AbstractReplicationStrategy.java:364 - Ignoring Unrecognized strategy option 
> {datacenter2} passed to NetworkTopologyStrategy for keyspace 
> distributed_test_keyspace" are logged. 
> The warnings are not expected since the mentioned DC exists. 
> This seems to be caused by an improper startup order: when keyspaces are 
> opened, the node is not yet aware of the DCs. 
> The warning can be reproduced using the test below. 
> {code:java}
> @Test
> public void testEmitsWarningsForNetworkTopologyStategyConfigOnRestart() 
> throws Exception {
> int nodesPerDc = 2;
> try (Cluster cluster = builder().withConfig(c -> c.with(GOSSIP, NETWORK))
> .withRacks(2, 1, nodesPerDc)
> .start()) {
> cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE +
>  " WITH replication = {'class': 
> 'NetworkTopologyStrategy', " +
>  "'datacenter1' : " + nodesPerDc + ", 
> 'datacenter2' : " + nodesPerDc + " };");
> cluster.get(2).nodetool("flush");
> System.out.println("Stop node 2 in datacenter 1");
> cluster.get(2).shutdown().get();
> System.out.println("Start node 2 in datacenter 1");
> cluster.get(2).startup();
> List<String> result = cluster.get(2).logs().grep("Ignoring 
> Unrecognized strategy option \\{datacenter2\\}").getResult();
> Assert.assertFalse(result.isEmpty());
> }
> }
> {code}






[jira] [Updated] (CASSANDRA-16246) Unexpected warning "Ignoring Unrecognized strategy option" for NetworkTopologyStrategy when restarting

2020-11-05 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-16246:
--
 Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear 
Impact(13164)
   Complexity: Low Hanging Fruit
Discovered By: User Report
 Severity: Low
   Status: Open  (was: Triage Needed)

> Unexpected warning "Ignoring Unrecognized strategy option" for 
> NetworkTopologyStrategy when restarting
> --
>
> Key: CASSANDRA-16246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16246
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Logging
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> During a restart, a number of warning messages like 
> "AbstractReplicationStrategy.java:364 - Ignoring Unrecognized strategy option 
> {datacenter2} passed to NetworkTopologyStrategy for keyspace 
> distributed_test_keyspace" are logged. 
> The warnings are unexpected since the mentioned DC exists. 
> The cause appears to be an improper ordering during startup: keyspaces are 
> opened before the node is aware of its DCs. 
> The warning can be reproduced with the test below. 
> {code:java}
> @Test
> public void testEmitsWarningsForNetworkTopologyStategyConfigOnRestart() 
> throws Exception {
> int nodesPerDc = 2;
> try (Cluster cluster = builder().withConfig(c -> c.with(GOSSIP, NETWORK))
> .withRacks(2, 1, nodesPerDc)
> .start()) {
> cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE +
>  " WITH replication = {'class': 
> 'NetworkTopologyStrategy', " +
>  "'datacenter1' : " + nodesPerDc + ", 
> 'datacenter2' : " + nodesPerDc + " };");
> cluster.get(2).nodetool("flush");
> System.out.println("Stop node 2 in datacenter 1");
> cluster.get(2).shutdown().get();
> System.out.println("Start node 2 in datacenter 1");
> cluster.get(2).startup();
> List<String> result = cluster.get(2).logs().grep("Ignoring 
> Unrecognized strategy option \\{datacenter2\\}").getResult();
> Assert.assertFalse(result.isEmpty());
> }
> }
> {code}






[jira] [Commented] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227075#comment-17227075
 ] 

David Capwell commented on CASSANDRA-16121:
---

Testing out the commit, but with a slight change; it seems the commit was 
reverting the python dtest change to use the trunk branch, so I corrected that.

{code}
diff --git a/.circleci/config.yml.MIDRES b/.circleci/config.yml.MIDRES
index 691cc2886f..823a550e04 100644
--- a/.circleci/config.yml.MIDRES
+++ b/.circleci/config.yml.MIDRES
@@ -1459,7 +1459,7 @@ jobs:
 - CASS_DRIVER_NO_CYTHON: true
 - CASSANDRA_SKIP_SYNC: true
 - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
-- DTEST_BRANCH: master
+- DTEST_BRANCH: trunk
 - CCM_MAX_HEAP_SIZE: 1024M
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64
@@ -2002,7 +2002,7 @@ jobs:
 - CASS_DRIVER_NO_CYTHON: true
 - CASSANDRA_SKIP_SYNC: true
 - DTEST_REPO: git://github.com/apache/cassandra-dtest.git
-- DTEST_BRANCH: master
+- DTEST_BRANCH: trunk
 - CCM_MAX_HEAP_SIZE: 1024M
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
{code}

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Commented] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227073#comment-17227073
 ] 

David Capwell commented on CASSANDRA-16121:
---

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[LOWER
 
build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16121-trunk-13B96DF0-43D0-4161-977E-5D1FEFDE4DE8]|[build|unknown]|


> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Updated] (CASSANDRA-16247) Fix flaky test testTPStats - org.apache.cassandra.tools.NodeToolGossipInfoTest

2020-11-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16247:
--
 Bug Category: Parent values: Correctness(12982)Level 1 values: Test 
Failure(12990)
   Complexity: Normal
Discovered By: Unit Test
Fix Version/s: 4.0-beta
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Fix flaky test testTPStats - org.apache.cassandra.tools.NodeToolGossipInfoTest
> --
>
> Key: CASSANDRA-16247
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16247
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/764/workflows/6d7a6adc-59d1-4f3c-baae-1f8329dca9b7/jobs/4363
> {code}
> junit.framework.AssertionFailedError
>   at 
> org.apache.cassandra.tools.NodeToolGossipInfoTest.testTPStats(NodeToolGossipInfoTest.java:128)
> {code}






[jira] [Comment Edited] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226985#comment-17226985
 ] 

David Capwell edited comment on CASSANDRA-16144 at 11/5/20, 10:50 PM:
--

CI Results: Yellow, known issues and CASSANDRA-16247
||Branch||Source||Circle CI||Jenkins||
|cassandra-2.2|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-2.2-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-2.2-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/178/]|
|cassandra-3.0|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.0-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.0-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/179/]|
|cassandra-3.11|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.11-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.11-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/180/]|
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-trunk-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-trunk-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/181/]|



was (Author: dcapwell):
Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|cassandra-2.2|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-2.2-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-2.2-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/178/]|
|cassandra-3.0|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.0-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.0-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/179/]|
|cassandra-3.11|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.11-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.11-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/180/]|
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-trunk-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-trunk-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/181/]|


> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta4
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node that has all encryption 
> disabled, configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> 

[jira] [Created] (CASSANDRA-16247) Fix flaky test testTPStats - org.apache.cassandra.tools.NodeToolGossipInfoTest

2020-11-05 Thread David Capwell (Jira)
David Capwell created CASSANDRA-16247:
-

 Summary: Fix flaky test testTPStats - 
org.apache.cassandra.tools.NodeToolGossipInfoTest
 Key: CASSANDRA-16247
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16247
 Project: Cassandra
  Issue Type: Bug
  Components: Test/unit
Reporter: David Capwell


https://app.circleci.com/pipelines/github/dcapwell/cassandra/764/workflows/6d7a6adc-59d1-4f3c-baae-1f8329dca9b7/jobs/4363

{code}
junit.framework.AssertionFailedError
at 
org.apache.cassandra.tools.NodeToolGossipInfoTest.testTPStats(NodeToolGossipInfoTest.java:128)
{code}






[jira] [Updated] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-11-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16144:
--
  Fix Version/s: (was: 4.0-beta)
 4.0-beta4
  Since Version: 4.0-beta1
Source Control Link:  
https://github.com/apache/cassandra/commit/76555088561d3412ca6a06c0b6359cd01f07326c
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta4
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node that has all encryption 
> disabled, configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> org.apache.cassandra.security.SSLFactory.buildKeyManagerFactory(SSLFactory.java:232)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.createNettySslContext(SSLFactory.java:300)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:276)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:257)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:107)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:71)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> 

[jira] [Updated] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-11-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16144:
--
Status: Ready to Commit  (was: Changes Suggested)

> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node that has all encryption 
> disabled, configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> org.apache.cassandra.security.SSLFactory.buildKeyManagerFactory(SSLFactory.java:232)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.createNettySslContext(SSLFactory.java:300)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:276)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:257)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:107)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:71)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.AbstractChannelHandlerContext.callHandlerAdded(AbstractChannelHandlerContext.java:938)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:609)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> 

[cassandra] branch cassandra-2.2 updated: TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-11-05 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch cassandra-2.2
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-2.2 by this push:
 new f293376  TLS connections to the storage port on a node without server 
encryption configured causes java.io.IOException accessing missing keystore
f293376 is described below

commit f293376aa8dd315a208ef2f03bdcb7a84dcc675c
Author: Jon Meredith 
AuthorDate: Thu Nov 5 12:58:07 2020 -0800

TLS connections to the storage port on a node without server encryption 
configured causes java.io.IOException accessing missing keystore

patch by Jon Meredith; reviewed by David Capwell, Dinesh Joshi for 
CASSANDRA-16144
---
 .../cassandra/config/YamlConfigurationLoader.java|  4 +++-
 .../cassandra/distributed/impl/InstanceConfig.java   | 20 +++-
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java 
b/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java
index 07b149c..2ca978f 100644
--- a/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java
+++ b/src/java/org/apache/cassandra/config/YamlConfigurationLoader.java
@@ -143,7 +143,9 @@ public class YamlConfigurationLoader implements 
ConfigurationLoader
 return node;
 }
 });
-return (T) constructor.getSingleData(klass);
+T value = (T) constructor.getSingleData(klass);
+propertiesChecker.check();
+return value;
 }
 
 static class CustomConstructor extends CustomClassLoaderConstructor
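The YamlConfigurationLoader hunk above defers propertiesChecker.check() until after getSingleData(...) has built the value, so unknown yaml properties are surfaced instead of silently dropped. A minimal standalone sketch of that pattern follows; the names (CheckedLoader, KNOWN) are illustrative, not the real YamlConfigurationLoader API:

```java
import java.util.*;

public class CheckedLoader {
    // Keys the "configuration class" understands (illustrative subset).
    static final Set<String> KNOWN = Set.of("cluster_name", "num_tokens");

    static class PropertiesChecker {
        final List<String> unknown = new ArrayList<>();
        void record(String key) { if (!KNOWN.contains(key)) unknown.add(key); }
        void check() {
            if (!unknown.isEmpty())
                throw new IllegalArgumentException("Invalid yaml: " + unknown);
        }
    }

    // Build the config object first, then run the deferred property check,
    // mirroring the change above where check() runs after getSingleData().
    static Map<String, Object> load(Map<String, Object> raw) {
        PropertiesChecker checker = new PropertiesChecker();
        Map<String, Object> config = new HashMap<>();
        for (Map.Entry<String, Object> e : raw.entrySet()) {
            checker.record(e.getKey());   // note unknown keys while constructing
            config.put(e.getKey(), e.getValue());
        }
        checker.check();                  // fail loudly after construction
        return config;
    }

    public static void main(String[] args) {
        load(Map.of("cluster_name", "test"));        // valid keys load fine
        try {
            load(Map.of("clustr_name", "typo"));     // misspelled key
            throw new AssertionError("expected failure");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

The design point is ordering: recording problems during construction but checking only afterwards lets the loader report every unknown key at once, without aborting mid-parse.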
diff --git 
a/test/distributed/org/apache/cassandra/distributed/impl/InstanceConfig.java 
b/test/distributed/org/apache/cassandra/distributed/impl/InstanceConfig.java
index 861e2ea..b71f22c 100644
--- a/test/distributed/org/apache/cassandra/distributed/impl/InstanceConfig.java
+++ b/test/distributed/org/apache/cassandra/distributed/impl/InstanceConfig.java
@@ -57,6 +57,7 @@ public class InstanceConfig implements IInstanceConfig
 public final UUID hostId;
 public UUID hostId() { return hostId; }
 private final Map params = new TreeMap<>();
+private final Map dtestParams = new TreeMap<>();
 
 private final EnumSet featureFlags;
 
@@ -119,6 +120,7 @@ public class InstanceConfig implements IInstanceConfig
 this.num = copy.num;
 this.networkTopology = new NetworkTopology(copy.networkTopology);
 this.params.putAll(copy.params);
+this.dtestParams.putAll(copy.dtestParams);
 this.hostId = copy.hostId;
 this.featureFlags = copy.featureFlags;
 this.broadcastAddressAndPort = copy.broadcastAddressAndPort;
@@ -185,7 +187,7 @@ public class InstanceConfig implements IInstanceConfig
 if (value == null)
 value = NULL;
 
-params.put(fieldName, value);
+getParams(fieldName).put(fieldName, value);
 return this;
 }
 
@@ -195,10 +197,18 @@ public class InstanceConfig implements IInstanceConfig
 value = NULL;
 
 // test value
-params.put(fieldName, value);
+getParams(fieldName).put(fieldName, value);
 return this;
 }
 
+private Map getParams(String fieldName)
+{
+Map map = params;
+if (fieldName.startsWith("dtest"))
+map = dtestParams;
+return map;
+}
+
 public void propagate(Object writeToConfig, Map, Function> mapping)
 {
 throw new IllegalStateException("In-JVM dtests no longer support 
propagate");
@@ -212,17 +222,17 @@ public class InstanceConfig implements IInstanceConfig
 
 public Object get(String name)
 {
-return params.get(name);
+return getParams(name).get(name);
 }
 
 public int getInt(String name)
 {
-return (Integer)params.get(name);
+return (Integer) get(name);
 }
 
 public String getString(String name)
 {
-return (String)params.get(name);
+return (String) get(name);
 }
 
 public Map getParams()
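The InstanceConfig change above routes any field whose name starts with "dtest" into a separate map, keeping in-JVM-dtest-only settings out of the params that back cassandra.yaml. A minimal standalone sketch of that prefix-routing idea; the class and method names here are illustrative, not the real dtest API:

```java
import java.util.*;

public class PrefixRoutedConfig {
    private final Map<String, Object> params = new TreeMap<>();
    private final Map<String, Object> dtestParams = new TreeMap<>();

    // Route "dtest"-prefixed fields into a side map, as in the
    // InstanceConfig change above; everything else is a real yaml param.
    private Map<String, Object> mapFor(String fieldName) {
        return fieldName.startsWith("dtest") ? dtestParams : params;
    }

    public PrefixRoutedConfig set(String fieldName, Object value) {
        mapFor(fieldName).put(fieldName, value);
        return this;
    }

    public Object get(String fieldName) {
        return mapFor(fieldName).get(fieldName);
    }

    // Only the real params should be written out to cassandra.yaml.
    public Map<String, Object> yamlParams() {
        return Collections.unmodifiableMap(params);
    }

    public static void main(String[] args) {
        PrefixRoutedConfig cfg = new PrefixRoutedConfig()
            .set("num_tokens", 16)
            .set("dtest.api.version", "0.0.5");

        if (!cfg.get("num_tokens").equals(16)) throw new AssertionError();
        if (cfg.yamlParams().containsKey("dtest.api.version")) throw new AssertionError();
        System.out.println(cfg.yamlParams().keySet());
        // prints: [num_tokens]
    }
}
```

Callers still read and write every field through the same get/set entry points; only the backing storage differs, which is what keeps dtest-only keys from ever reaching the node's yaml.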





[cassandra] branch cassandra-3.0 updated (fa9bbd4 -> e74bd9f)

2020-11-05 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a change to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from fa9bbd4  remove bad import from CASSANDRA-15789
 new f293376  TLS connections to the storage port on a node without server 
encryption configured causes java.io.IOException accessing missing keystore
 new e74bd9f  Merge branch 'cassandra-2.2' into cassandra-3.0

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../cassandra/config/YamlConfigurationLoader.java|  4 +++-
 .../cassandra/distributed/impl/InstanceConfig.java   | 20 +++-
 2 files changed, 18 insertions(+), 6 deletions(-)





[cassandra] branch trunk updated (d68c45e -> 001767d)

2020-11-05 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from d68c45e  Merge branch 'cassandra-3.11' into trunk
 new f293376  TLS connections to the storage port on a node without server 
encryption configured causes java.io.IOException accessing missing keystore
 new e74bd9f  Merge branch 'cassandra-2.2' into cassandra-3.0
 new 3200bcf  Merge branch 'cassandra-3.0' into cassandra-3.11
 new 001767d  Merge branch 'cassandra-3.11' into trunk

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   1 +
 .../cassandra/config/DatabaseDescriptor.java   |  23 +-
 .../apache/cassandra/config/EncryptionOptions.java | 234 
 .../cassandra/config/YamlConfigurationLoader.java  |   4 +-
 .../apache/cassandra/db/virtual/SettingsTable.java |   2 +-
 .../cassandra/net/InboundConnectionInitiator.java  |  83 +++---
 .../cassandra/net/InboundConnectionSettings.java   |   6 +-
 .../org/apache/cassandra/net/InboundSockets.java   |   4 +-
 .../apache/cassandra/net/OutboundConnection.java   |   3 +-
 .../cassandra/net/OutboundConnectionSettings.java  |   3 +-
 .../org/apache/cassandra/net/SocketFactory.java|  52 ++--
 .../org/apache/cassandra/security/SSLFactory.java  |  14 +-
 .../cassandra/service/NativeTransportService.java  |  50 ++--
 .../org/apache/cassandra/tools/LoaderOptions.java  |  12 +-
 .../org/apache/cassandra/transport/Client.java |   4 +-
 .../org/apache/cassandra/transport/Server.java |  43 ++-
 .../apache/cassandra/transport/SimpleClient.java   |   2 +-
 .../cassandra/distributed/impl/InstanceConfig.java |  21 +-
 .../test/AbstractEncryptionOptionsImpl.java| 295 +
 .../distributed/test/IncRepairTruncationTest.java  |   3 +-
 .../test/InternodeEncryptionOptionsTest.java   | 218 +++
 .../test/NativeTransportEncryptionOptionsTest.java | 137 ++
 .../distributed/test/PreviewRepairTest.java|   3 +-
 .../cassandra/config/EncryptionOptionsTest.java| 178 +
 .../config/YamlConfigurationLoaderTest.java|  11 +-
 test/unit/org/apache/cassandra/cql3/CQLTester.java |   2 +-
 .../cassandra/db/virtual/SettingsTableTest.java|   6 +-
 .../apache/cassandra/net/MessagingServiceTest.java |   6 +-
 .../service/NativeTransportServiceTest.java|  86 +-
 .../stress/settings/SettingsTransport.java |   2 +-
 .../cassandra/stress/util/JavaDriverClient.java|   2 +-
 31 files changed, 1298 insertions(+), 212 deletions(-)
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/AbstractEncryptionOptionsImpl.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/InternodeEncryptionOptionsTest.java
 create mode 100644 
test/distributed/org/apache/cassandra/distributed/test/NativeTransportEncryptionOptionsTest.java
 create mode 100644 
test/unit/org/apache/cassandra/config/EncryptionOptionsTest.java





[cassandra] 01/01: Merge branch 'cassandra-2.2' into cassandra-3.0

2020-11-05 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit e74bd9fa264b863ccc5221020b8aa7ec4b70406e
Merge: fa9bbd4 f293376
Author: David Capwell 
AuthorDate: Thu Nov 5 14:43:54 2020 -0800

Merge branch 'cassandra-2.2' into cassandra-3.0

 .../cassandra/config/YamlConfigurationLoader.java|  4 +++-
 .../cassandra/distributed/impl/InstanceConfig.java   | 20 +++-
 2 files changed, 18 insertions(+), 6 deletions(-)






[cassandra] 01/01: Merge branch 'cassandra-3.0' into cassandra-3.11

2020-11-05 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 3200bcfb0e2813680b3f535ab416eab3f13bf5ce
Merge: 9960cf1 e74bd9f
Author: David Capwell 
AuthorDate: Thu Nov 5 14:44:59 2020 -0800

Merge branch 'cassandra-3.0' into cassandra-3.11

 .../cassandra/config/YamlConfigurationLoader.java|  4 +++-
 .../cassandra/distributed/impl/InstanceConfig.java   | 20 +++-
 2 files changed, 18 insertions(+), 6 deletions(-)






[cassandra] branch cassandra-3.11 updated (9960cf1 -> 3200bcf)

2020-11-05 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 9960cf1  Merge branch 'cassandra-3.0' into cassandra-3.11
 new f293376  TLS connections to the storage port on a node without server 
encryption configured causes java.io.IOException accessing missing keystore
 new e74bd9f  Merge branch 'cassandra-2.2' into cassandra-3.0
 new 3200bcf  Merge branch 'cassandra-3.0' into cassandra-3.11

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../cassandra/config/YamlConfigurationLoader.java|  4 +++-
 .../cassandra/distributed/impl/InstanceConfig.java   | 20 +++-
 2 files changed, 18 insertions(+), 6 deletions(-)





[jira] [Updated] (CASSANDRA-16183) Add tests to cover ClientRequest metrics

2020-11-05 Thread Adam Holmberg (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Holmberg updated CASSANDRA-16183:
--
Impacts: None
Test and Documentation Plan: Added new python dtest.
 Status: Patch Available  (was: In Progress)

> Add tests to cover ClientRequest metrics 
> -
>
> Key: CASSANDRA-16183
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16183
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Benjamin Lerer
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> We do not have tests that cover the ClientRequest metrics.
> * ClientRequestMetrics
> * CASClientRequestMetrics
> * CASClientWriteRequestMetrics
> * ClientWriteRequestMetrics
> * ViewWriteMetrics



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (CASSANDRA-16241) ArrayClustering does not properly handle null clustering key elements left over from tables created WITH COMPACT STORAGE

2020-11-05 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16241:

Reviewers: Jordan West, Caleb Rackliffe  (was: Caleb Rackliffe, Jordan West)
   Jordan West, Caleb Rackliffe  (was: Jordan West)
   Status: Review In Progress  (was: Patch Available)

> ArrayClustering does not properly handle null clustering key elements left 
> over from tables created WITH COMPACT STORAGE
> 
>
> Key: CASSANDRA-16241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16241
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
>  Labels: compaction
> Fix For: 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The only way we can produce null clustering key elements is leaving them 
> empty on insert while a table is still compact. If we subsequently DROP 
> COMPACT STORAGE, those null elements linger, and {{ArrayClustering}} does not 
> handle them appropriately on compaction. 
> If you run the test 
> [here|https://github.com/maedhroz/cassandra/commit/e247b7868cae383168153bbe8bbbaa47a660f76b],
>  you should be able to observe an exception that looks roughly like this:
> {noformat}
> java.lang.NullPointerException
>   at java.base/java.nio.ByteBuffer.wrap(ByteBuffer.java:422)
>   at 
> org.apache.cassandra.db.AbstractArrayClusteringPrefix.getBufferArray(AbstractArrayClusteringPrefix.java:45)
>   at 
> org.apache.cassandra.io.sstable.metadata.MetadataCollector.finalizeMetadata(MetadataCollector.java:246)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableWriter.finalizeMetadata(SSTableWriter.java:315)
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.access$200(BigTableWriter.java:52)
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:415)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:283)
>   at 
> org.apache.cassandra.io.sstable.SSTableRewriter.doPrepare(SSTableRewriter.java:380)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168)
>   at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.doPrepare(CompactionAwareWriter.java:118)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168)
>   at 
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:179)
>   at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.finish(CompactionAwareWriter.java:128)
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:225)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> {noformat}
> There are already numerous places where we respect the fact that clustering 
> elements may be null, so this should be pretty straightforward to fix, and 
> the tests that accompany it will probably be more complex than the fix itself.
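The NullPointerException above comes from passing a null clustering element to {{ByteBuffer.wrap}}. A minimal, hypothetical sketch of the null-safe conversion the fix implies (the class and method names here are illustrative stand-ins, not the actual {{getBufferArray}} implementation):

```java
import java.nio.ByteBuffer;

public class NullSafeClustering {
    // Hypothetical analogue of AbstractArrayClusteringPrefix#getBufferArray:
    // convert raw clustering values (byte[][]) to ByteBuffer[], preserving
    // null elements instead of handing them to ByteBuffer.wrap (which NPEs).
    static ByteBuffer[] toBuffers(byte[][] values) {
        ByteBuffer[] out = new ByteBuffer[values.length];
        for (int i = 0; i < values.length; i++)
            out[i] = values[i] == null ? null : ByteBuffer.wrap(values[i]);
        return out;
    }

    public static void main(String[] args) {
        // A clustering with a trailing null element, as left over from a
        // COMPACT STORAGE table after DROP COMPACT STORAGE.
        byte[][] clustering = { new byte[]{ 1, 2 }, null };
        ByteBuffer[] buffers = toBuffers(clustering);
        System.out.println(buffers[0].remaining() + " " + (buffers[1] == null)); // prints "2 true"
    }
}
```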







[jira] [Updated] (CASSANDRA-16241) ArrayClustering does not properly handle null clustering key elements left over from tables created WITH COMPACT STORAGE

2020-11-05 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16241:

Reviewers: Jordan West








[jira] [Updated] (CASSANDRA-16241) ArrayClustering does not properly handle null clustering key elements left over from tables created WITH COMPACT STORAGE

2020-11-05 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16241:

Reviewers: Jordan West  (was: Caleb Rackliffe, Jordan West)








[jira] [Commented] (CASSANDRA-16241) ArrayClustering does not properly handle null clustering key elements left over from tables created WITH COMPACT STORAGE

2020-11-05 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227035#comment-17227035
 ] 

Jordan West commented on CASSANDRA-16241:
-

+1 LGTM. Ran the test locally as well and verified it failed without the fix.
Also looked at the call-sites of {{getBufferArray}}.








[jira] [Commented] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-05 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227009#comment-17227009
 ] 

Stefan Miklosovic commented on CASSANDRA-14013:
---

??SSTableLoader does not rely on the code of 
Descriptor::fromFilenameWithComponent for creating the Descriptor instances, it 
has its own mechanism which assume that there will be no TableID.??

That is not true; it does. Follow the rabbit hole:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableLoader.java#L88
 

> Data loss in snapshots keyspace after service restart
> -
>
> Key: CASSANDRA-14013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Gregor Uhlenheuer
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I am posting this bug in the hope of discovering the stupid mistake I am 
> making, because I can't imagine a reasonable explanation for the behavior I 
> see right now :-)
> In short, I observe data loss in a keyspace called *snapshots* after 
> restarting the Cassandra service. Say I have 1000 records in a table 
> called *snapshots.test_idx*; after a restart the table has fewer entries or 
> is even empty.
> My kind of "mysterious" observation is that it happens only in a keyspace 
> called *snapshots*...
> h3. Steps to reproduce
> These steps to reproduce show the described behavior in "most" attempts (not 
> every single time though).
> {code}
> # create keyspace
> CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> # create table
> CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key));
> # insert some test data
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1);
> ...
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000);
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 1000
> # restart service
> kill 
> cassandra -f
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 0
> {code}
> I hope someone can point me to the obvious mistake I am doing :-)
> This happened to me using both Cassandra 3.9 and 3.11.0







[jira] [Commented] (CASSANDRA-16183) Add tests to cover ClientRequest metrics

2020-11-05 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227005#comment-17227005
 ] 

Adam Holmberg commented on CASSANDRA-16183:
---

That was... more involved than I was expecting. Lots of metrics, interference, 
and developing techniques to make them happen. I think I came up with something 
reasonably compact that exercises them. 
[Here|https://github.com/apache/cassandra-dtest/compare/trunk...aholmberg:CASSANDRA-16183]
 is a potential patch containing a new metrics test.

CI against this dtest branch 
[here|https://app.circleci.com/pipelines/github/aholmberg/cassandra?branch=CASSANDRA-16183].

One thing I noted while creating this test: While {{ViewWriteMetrics}} inherits 
from {{ClientRequestMetrics}}, none of the [additional 
members|https://github.com/aholmberg/cassandra/blob/93c2d763eb5107c2001384700be4b240ecf7a4b8/src/java/org/apache/cassandra/metrics/ClientRequestMetrics.java#L31-L33]
 from that class are used. Would it be worth deprecating 
{{ViewWriteMetrics.timeouts}}, etc and making that class derive directly from 
{{LatencyMetrics}} in a later release, or should we just leave it?








[jira] [Created] (CASSANDRA-16246) Unexpected warning "Ignoring Unrecognized strategy option" for NetworkTopologyStrategy when restarting

2020-11-05 Thread Yifan Cai (Jira)
Yifan Cai created CASSANDRA-16246:
-

 Summary: Unexpected warning "Ignoring Unrecognized strategy 
option" for NetworkTopologyStrategy when restarting
 Key: CASSANDRA-16246
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16246
 Project: Cassandra
  Issue Type: Bug
  Components: Observability/Logging
Reporter: Yifan Cai
Assignee: Yifan Cai


During restart, a number of warning messages like 
"AbstractReplicationStrategy.java:364 - Ignoring Unrecognized strategy option 
{datacenter2} passed to NetworkTopologyStrategy for keyspace 
distributed_test_keyspace" are logged. 
The warnings are unexpected since the mentioned DC exists. 
They seem to be caused by improper ordering during startup: keyspaces are 
opened before the node is aware of the other DCs. 

The warning can be reproduced using the test below. 

{code:java}
@Test
public void testEmitsWarningsForNetworkTopologyStrategyConfigOnRestart() throws Exception {
    int nodesPerDc = 2;
    try (Cluster cluster = builder().withConfig(c -> c.with(GOSSIP, NETWORK))
                                    .withRacks(2, 1, nodesPerDc)
                                    .start()) {
        cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE +
                             " WITH replication = {'class': 'NetworkTopologyStrategy', " +
                             "'datacenter1' : " + nodesPerDc + ", 'datacenter2' : " + nodesPerDc + " };");
        cluster.get(2).nodetool("flush");
        System.out.println("Stop node 2 in datacenter 1");
        cluster.get(2).shutdown().get();
        System.out.println("Start node 2 in datacenter 1");
        cluster.get(2).startup();
        List<String> result = cluster.get(2).logs()
                                     .grep("Ignoring Unrecognized strategy option \\{datacenter2\\}")
                                     .getResult();
        Assert.assertFalse(result.isEmpty());
    }
}
{code}








[jira] [Commented] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226985#comment-17226985
 ] 

David Capwell commented on CASSANDRA-16144:
---

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|cassandra-2.2|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-2.2-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-2.2-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/178/]|
|cassandra-3.0|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.0-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.0-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/179/]|
|cassandra-3.11|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.11-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.11-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/180/]|
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-trunk-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-trunk-EACE893A-0011-4FDD-B14A-CDF9AFCA71BD]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/181/]|


> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node with all encryption disabled, 
> i.e. configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> 

[jira] [Commented] (CASSANDRA-15299) CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta

2020-11-05 Thread Sam Tunnicliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226898#comment-17226898
 ] 

Sam Tunnicliffe commented on CASSANDRA-15299:
-

Thanks, [~ifesdjeen], the flow chart is definitely useful (and looks correct to 
me, aside from the missing labels that Caleb mentioned).  Re: your comments...

{quote}in CQLMessageHeader#processLargeMessage, we seem to be creating a 
LargeMessage object to call processCqlFrame() over a frame that is assembled 
from a single byte buffer. Maybe we can just call processCqlFrame 
directly?{quote}

thanks, removed in {{ce6223cb}}

{quote}this is probably something that we should address outside this ticket, 
but still: "corrupt frame recovered" seems to be slightly misleading wording, 
same as Frame#recoverable. We can not recover the frame itself, we just can 
skip/drop it. Maybe we can rename this, along with metrics in Messaging, before 
they become public in 4.0.{quote}

I tend to agree, maybe something which conveys that the frames are 
skipped/dropped rather than recovered would be better? We can open a new JIRA 
for this.

{quote}in CQLMessageHeader#processCqlFrame, we only call handleErrorAndRelease. 
However, it may theoretically happen that we fail before we finish 
messageDecoder.decode(channel, frame). Maybe we can do something like the [1], 
to make it consistent with what we do in ProtocolDecoder#decode?{quote}

Good call, added in {{8dbf7e42}}.

{quote}in CQLMessageHeader$LargeMessage#onComplete, we wrap 
processCqlFrame(assembleFrame() call in try/catch which is identical to 
try/catch block from processCqlFrame itself. Just checking if this is 
intended.{quote}

no, it was a redundancy left over from earlier, removed it in {{8dbf7e42}}.

{quote}in Dispatcher#processRequest, we can slightly simplify code by declaring 
FlushItem flushItem variable outside try/catch block and assigning a 
corresponding value in try or catch, and only calling flush once.{quote}

done in {{8dbf7e42}}
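As a hedged illustration of that review suggestion (the names here are stand-ins, not the actual {{Dispatcher#processRequest}} code): the result variable is declared outside try/catch, assigned in either branch, and flushed exactly once.

```java
public class FlushOnce {
    // Stand-in for Dispatcher#processRequest: the flush item is assigned in
    // either the try or the catch branch, and flush() is called exactly once.
    static String process(boolean fail) {
        String flushItem;                           // declared outside try/catch
        try {
            if (fail) throw new RuntimeException("boom");
            flushItem = "response";                 // success path
        } catch (RuntimeException e) {
            flushItem = "error:" + e.getMessage();  // error path
        }
        return flush(flushItem);                    // single flush call
    }

    static String flush(String item) { return item; }

    public static void main(String[] args) {
        System.out.println(process(false)); // prints "response"
        System.out.println(process(true));  // prints "error:boom"
    }
}
```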

{quote}during server bootstrap initialization, we're using deprecated low/high 
watermark child options, probably we should use 
.childOption(ChannelOption.WRITE_BUFFER_WATER_MARK, new WriteBufferWaterMark(8 
* 1024, 32 * 1024)) instead.{quote}

Thanks, fixed in {{8dbf7e42}}

{quote}SimpleClientBurnTest#random is unused{quote}

thanks, removed.


[~maedhroz], thanks, those are decent improvements and seem to me worth making. 

{quote}testChangingLimitsAtRuntime has a bunch of places where it checks the 
same value twice, for instance from ClientResourceLimits.getGlobalLimit() and 
then DatabaseDescriptor.getNativeTransportMaxConcurrentRequestsInBytes(). Seems 
like we could do away with the second?{quote}

Thanks, it is redundant for the global limit, as the {{ClientResourceLimits}} 
method just delegates to {{DatabaseDescriptor}},  which was always the case I 
think. For the per-endpoint limits, it's definitely worth checking both places 
to ensure existing and new limits observe the same cap. 


> CASSANDRA-13304 follow-up: improve checksumming and compression in protocol 
> v5-beta
> ---
>
> Key: CASSANDRA-15299
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15299
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Messaging/Client
>Reporter: Aleksey Yeschenko
>Assignee: Sam Tunnicliffe
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-alpha
>
> Attachments: Process CQL Frame.png, V5 Flow Chart.png
>
>
> CASSANDRA-13304 made an important improvement to our native protocol: it 
> introduced checksumming/CRC32 to request and response bodies. It’s an 
> important step forward, but it doesn’t cover the entire stream. In 
> particular, the message header is not covered by a checksum or a crc, which 
> poses a correctness issue if, for example, {{streamId}} gets corrupted.
> Additionally, we aren’t quite using CRC32 correctly, in two ways:
> 1. We are calculating the CRC32 of the *decompressed* value instead of 
> computing the CRC32 on the bytes written on the wire - losing the properties 
> of the CRC32. In some cases, due to this sequencing, attempting to decompress 
> a corrupt stream can cause a segfault by LZ4.
> 2. When using CRC32, the CRC32 value is written in the incorrect byte order, 
> also losing some of the protections.
> See https://users.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf for 
> explanation for the two points above.
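> The two points above can be sketched with a minimal illustration using
> java.util.zip (the frame layout and class name here are hypothetical and do
> not match Cassandra's actual LZ4 framing code): the CRC is computed over the
> compressed bytes that actually hit the wire, and written in one fixed,
> explicit byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32;
import java.util.zip.Deflater;

public class WireChecksum {
    // Hypothetical frame: [4-byte big-endian CRC of compressed bytes][compressed bytes].
    static byte[] frame(byte[] payload) {
        Deflater deflater = new Deflater();
        deflater.setInput(payload);
        deflater.finish();
        byte[] buf = new byte[payload.length + 64];
        int len = deflater.deflate(buf);
        deflater.end();

        CRC32 crc = new CRC32();
        crc.update(buf, 0, len); // checksum the bytes written on the wire, not the plaintext

        ByteBuffer out = ByteBuffer.allocate(4 + len).order(ByteOrder.BIG_ENDIAN);
        out.putInt((int) crc.getValue()); // fixed byte order preserves CRC error-detection properties
        out.put(buf, 0, len);
        return out.array();
    }

    public static void main(String[] args) {
        byte[] framed = frame("hello".getBytes());
        // The receiver verifies the CRC *before* decompressing, so a corrupt
        // stream is rejected instead of being fed to the decompressor.
        CRC32 check = new CRC32();
        check.update(framed, 4, framed.length - 4);
        System.out.println(ByteBuffer.wrap(framed, 0, 4).getInt() == (int) check.getValue()); // true
    }
}
```

> Checking the CRC ahead of decompression is what prevents the segfault-in-LZ4
> failure mode described above.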
> Separately, there are some long-standing issues with the protocol - since 
> *way* before CASSANDRA-13304. Importantly, both checksumming and compression 
> operate on individual message bodies rather than frames of multiple complete 
> messages. In reality, this has several important 

[jira] [Updated] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16121:
--
Status: Ready to Commit  (was: Review In Progress)

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226880#comment-17226880
 ] 

David Capwell commented on CASSANDRA-16121:
---

thanks for replying [~Bereng], LGTM, will commit today (unless [~e.dimitrova] 
has anything open).

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Updated] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-11-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16121:
--
Reviewers: David Capwell, Ekaterina Dimitrova, David Capwell  (was: David 
Capwell, Ekaterina Dimitrova)
   Status: Review In Progress  (was: Patch Available)

> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Commented] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226879#comment-17226879
 ] 

David Capwell commented on CASSANDRA-16144:
---

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|cassandra-2.2|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-2.2-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-2.2-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/174/]|
|cassandra-3.0|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.0-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.0-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/175/]|
|cassandra-3.11|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-cassandra-3.11-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-cassandra-3.11-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/176/]|
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-16144-trunk-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16144-trunk-32A0C15C-7EBF-4373-87D5-72DAFE6BC340]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/177/]|


> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node with all encryption disabled 
> in its configuration, i.e. configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> 

[jira] [Commented] (CASSANDRA-16159) Reduce the Severity of Errors Reported in FailureDetector#isAlive()

2020-11-05 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226872#comment-17226872
 ] 

Caleb Rackliffe commented on CASSANDRA-16159:
-

[~shubhamaro] [~ua038697] Hi there! I noticed this switched assignees. Let me 
know if there's anything I can do to help.

> Reduce the Severity of Errors Reported in FailureDetector#isAlive()
> ---
>
> Key: CASSANDRA-16159
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16159
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Caleb Rackliffe
>Assignee: Shubham Arora
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Noticed the following error in the failure detector during a host replacement:
> {noformat}
> java.lang.IllegalArgumentException: Unknown endpoint: 10.38.178.98:7000
>   at 
> org.apache.cassandra.gms.FailureDetector.isAlive(FailureDetector.java:281)
>   at 
> org.apache.cassandra.service.StorageService.handleStateBootreplacing(StorageService.java:2502)
>   at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:2182)
>   at 
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:3145)
>   at 
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1242)
>   at 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1368)
>   at 
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
>   at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:884)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> {noformat}
> This particular error looks benign, given that even if it occurs, the node 
> continues to handle the {{BOOT_REPLACE}} state. There are two things we might 
> be able to do to improve {{FailureDetector#isAlive()}} though:
> 1.) We don’t short circuit in the case that the endpoint in question is in 
> quarantine after being removed. It may be useful to check for this so we can 
> avoid logging an ERROR when the endpoint is clearly doomed/dead. (Quarantine 
> works great when the gossip message is _from_ a quarantined endpoint, but in 
> this case, that would be the new/replacing and not the old/replaced one.)
> 2.) We can reduce the severity of the logging from ERROR to WARN and provide 
> better context around how to determine whether or not there’s actually a 
> problem. (ex. “If this occurs while trying to determine liveness for a node 
> that is currently being replaced, it can be safely ignored.”)
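> The two suggestions could look roughly like the sketch below. The class and
> field names here are hypothetical and do not match Cassandra's actual
> {{FailureDetector}} internals; it only illustrates the quarantine
> short-circuit and the WARN-with-context logging.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

public class IsAliveSketch {
    static final Logger log = Logger.getLogger("FailureDetector");
    // endpoint -> last-reported liveness; endpoints removed from the cluster
    // land in `quarantined` for a grace period before being forgotten.
    static final Map<String, Boolean> endpointStates = new ConcurrentHashMap<>();
    static final Map<String, Instant> quarantined = new ConcurrentHashMap<>();

    static boolean isAlive(String endpoint) {
        Boolean state = endpointStates.get(endpoint);
        if (state != null)
            return state;
        // Suggestion 1: a quarantined (recently removed) endpoint is known to
        // be dead, so short-circuit instead of treating the miss as an error.
        if (quarantined.containsKey(endpoint))
            return false;
        // Suggestion 2: log at WARN rather than ERROR, with operator guidance.
        log.warning("Unknown endpoint: " + endpoint + ". If this occurs while " +
                    "determining liveness for a node that is currently being " +
                    "replaced, it can be safely ignored.");
        return false;
    }

    public static void main(String[] args) {
        quarantined.put("10.38.178.98:7000", Instant.now());
        System.out.println(isAlive("10.38.178.98:7000")); // false, without an ERROR log
    }
}
```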






[jira] [Commented] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-11-05 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226861#comment-17226861
 ] 

David Capwell commented on CASSANDRA-16144:
---

Looks like 
https://github.com/apache/cassandra/pull/763/commits/24f8507294b79f1d2e551adc09353c3aeb4a2465
 and 
https://github.com/apache/cassandra/pull/763/commits/dd52e2ff391d95e418e0a033bde687bbf8c0e7a8
 are missing, something must have gone wrong with my merge attempt, will retry.

> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node with all encryption disabled 
> in its configuration, i.e. configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> org.apache.cassandra.security.SSLFactory.buildKeyManagerFactory(SSLFactory.java:232)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.createNettySslContext(SSLFactory.java:300)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:276)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:257)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:107)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:71)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> 

[jira] [Commented] (CASSANDRA-16217) Minimal 4.0 COMPACT STORAGE backport

2020-11-05 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226716#comment-17226716
 ] 

Alex Petrov commented on CASSANDRA-16217:
-

[~marcuse] thank you for a review.

I think it's a great idea to backport the [CASSANDRA-10857] tests; they did 
find several edge cases with CQL generation. I've also made changes similar to 
[CASSANDRA-13917], although it turned out we can implement them a bit more 
simply in 4.0.

I've triggered another test run just now after fixing the last failure, so this 
should be ready for another round.

> Minimal 4.0 COMPACT STORAGE backport
> 
>
> Key: CASSANDRA-16217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16217
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
>
> There are several behavioural changes related to compact storage, and these 
> differences are larger than most of us had anticipated: we first thought 
> there would only be that “appearing column”, but there is also the implicit 
> nulls in clusterings behaviour, and row vs column deletion.
> Some of the recent issues on the subject are CASSANDRA-16048, which allows 
> these differences to be ignored, and CASSANDRA-15811, which tried to improve 
> the user experience of anyone still using compact storage.
> Easily reproducible differences are:
> (1) hidden columns show up, which breaks SELECT * queries
> (2) DELETE v and UPDATE v WITH TTL would result in row removals in 
> non-dense compact tables (CASSANDRA-16069)
> (3) INSERT allows skipping clusterings, which are filled with nulls by 
> default.
> Some of these are tricky to support, as 15811 has shown. Anyone on OSS side 
> who might want to upgrade to 4.0 while still using compact storage might be 
> affected by being forced into one of these behaviours.
> Possible solutions are to document these behaviours, or to bring back a 
> minimal set of COMPACT STORAGE to keep supporting these.
> It looks like it is possible to leave some of the functionality related to 
> the DENSE flag and allow it to be present in 4.0, but only for these three 
> (and potentially related, however not directly visible) cases.
> [~e.dimitrova] since you were working on the removal of compact storage, I 
> wanted to reassure you that this is not a revert of your patch. On the 
> contrary: your patch was instrumental in identifying the right places.
> cc [~slebresne] [~aleksey] [~benedict] [~marcuse]
> |[patch|https://github.com/apache/cassandra/pull/785]|[ci|https://app.circleci.com/pipelines/github/ifesdjeen/cassandra?branch=13994-followup]|






[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-11-05 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226714#comment-17226714
 ] 

Benjamin Lerer commented on CASSANDRA-12126:


{quote}I hope that I will be able to provide the community with an alternative 
solution in the near future, without these (and many other existing) 
pitfalls.{quote}

[~benedict] Few questions regarding your comment:
* What timeframe do you have in mind? 
* Is it a solution only for 4.0 or for all the branches?
* Can we help you with that?

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Legacy/Coordination
>Reporter: Sankalp Kohli
>Assignee: Sylvain Lebresne
>Priority: Normal
>  Labels: LWT, pull-request-available
> Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. A machine A 
> replies true to a propose and saves the commit in its accepted field. The 
> other two machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in its paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read 
> the value written in step 1. This read behaves as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the 
> value written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> If you read Lamport's “Paxos Made Simple” paper, section 2.3 talks about 
> this issue: how learners can find out whether a majority of the acceptors 
> have accepted a proposal. 
> In step 3, it is correct that we propose the value again, since we don't 
> know whether it was accepted by a majority of acceptors. When we ask a 
> majority of acceptors, and some but not a majority have something in flight, 
> we have no way of knowing whether it was accepted by a majority. So this 
> behavior is correct. 
> However, we need to fix step 2, since it causes reads not to be linearizable 
> with respect to writes and other reads. In this case, we know that a 
> majority of acceptors have no in-flight commit, which means a majority agree 
> that nothing was accepted by a majority. I think we should run a propose 
> step here with an empty commit, which will cause the write from step 1 never 
> to be visible afterwards. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or never see it, which is what we want. 
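> The four-step scenario above can be traced with a small toy model. This is an
> illustrative sketch, not Cassandra's Paxos implementation: a quorum read that
> sees no in-flight proposal simply returns the committed value, so A's lone
> accepted value flickers in and out of visibility depending on which quorum is
> contacted.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class CasReadTrace {
    // acceptor -> value it accepted but never saw committed
    static final Map<String, Integer> accepted = new HashMap<>();
    static Integer committed = null;

    static Integer quorumRead(List<String> quorum) {
        Integer inflight = quorum.stream().map(accepted::get)
                                 .filter(Objects::nonNull).findFirst().orElse(null);
        if (inflight != null) {
            committed = inflight; // step 3: re-propose and commit the stalled value
            return committed;
        }
        // Step 2 as currently implemented: nothing in flight at this quorum, so
        // return the committed value without fencing off A's stale proposal.
        return committed;
    }

    public static void main(String[] args) {
        accepted.put("A", 1);                              // step 1: only A accepted value 1
        System.out.println(quorumRead(List.of("B", "C"))); // step 2: null -- value 1 invisible
        System.out.println(quorumRead(List.of("A", "B"))); // step 3: 1 -- value 1 now visible
    }
}
```

> The fix proposed above would make the first read propose an empty commit at
> its quorum, so the step 1 value is either completed or permanently
> invalidated, never resurrected later.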






[jira] [Commented] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-11-05 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226704#comment-17226704
 ] 

Stefan Miklosovic commented on CASSANDRA-16161:
---

https://github.com/apache/cassandra/pull/809

> Validation Compactions causing Java GC pressure
> ---
>
> Key: CASSANDRA-16161
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16161
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction, Local/Config, Tool/nodetool
>Reporter: Cameron Zemek
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 3.11.x
>
> Attachments: 16161.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Validation compactions are not rate limited, which can cause Java GC pressure 
> and result in latency spikes.
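> The idea can be illustrated with a hand-rolled token bucket that caps the
> bytes/sec a validation scan may read, smoothing allocation pressure. This is
> only a sketch; the actual patch presumably hooks validation reads into
> Cassandra's existing compaction throughput rate limiter rather than anything
> like the class below.

```java
public class ThroughputThrottle {
    private final double bytesPerSecond;
    private double available;       // tokens (bytes) currently in the bucket
    private long lastRefillNanos;

    ThroughputThrottle(double bytesPerSecond) {
        this.bytesPerSecond = bytesPerSecond;
        this.available = bytesPerSecond; // start with a full bucket
        this.lastRefillNanos = System.nanoTime();
    }

    // Block until `bytes` tokens are available, refilling at the configured rate.
    synchronized void acquire(long bytes) {
        while (true) {
            long now = System.nanoTime();
            available = Math.min(bytesPerSecond,
                                 available + (now - lastRefillNanos) * bytesPerSecond / 1e9);
            lastRefillNanos = now;
            if (available >= bytes) {
                available -= bytes;
                return;
            }
            try {
                Thread.sleep(10); // wait for the bucket to refill
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        ThroughputThrottle throttle = new ThroughputThrottle(1 << 20); // 1 MiB/s
        long start = System.nanoTime();
        for (int i = 0; i < 3; i++)
            throttle.acquire(512 * 1024); // each "chunk" read by a validation scan
        System.out.println("elapsed ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```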






[jira] [Comment Edited] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226668#comment-17226668
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-12126 at 11/5/20, 11:56 AM:
---

So, before we commit this I wanted to share that some experimentation found 
that this can lead to a significant increase in timeouts, particularly for 
read-heavy workloads, that previously would not have competed with each other. 
I think committing this to a patch release is honestly problematic, as it could 
surprise users with a service outage. At the very least, there should be HUGE 
warnings in {{NEWS.txt}}, but honestly I would prefer to have users opt-in for 
patch releases.

As much as I agree that it is problematic to provide the wrong semantics, I 
think it is also problematic to force a decision between stability and 
correctness onto our users without their informed and positive consent.

I hope that I will be able to provide the community with an alternative 
solution in the near future, without these (and many other existing) pitfalls. 
However I'm not sure how that should affect this decision.


was (Author: benedict):
So, before we commit this I wanted to share that some internal experimentation 
found that this can lead to a significant increase in timeouts, particularly 
for read-heavy workloads, that previously would not have competed with each 
other. I think committing this to a patch release is honestly problematic, as 
it could surprise users with a service outage. At the very least, there should 
be HUGE warnings in {{NEWS.txt}}, but honestly I would prefer to have users 
opt-in for patch releases.

As much as I agree that it is problematic to provide the wrong semantics, I 
think it is also problematic to force a decision between stability and 
correctness onto our users without their informed and positive consent.

I hope that I will be able to provide the community with an alternative 
solution in the near future, without these (and many other existing) pitfalls. 
However I'm not sure how that should affect this decision.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Legacy/Coordination
>Reporter: Sankalp Kohli
>Assignee: Sylvain Lebresne
>Priority: Normal
>  Labels: LWT, pull-request-available
> Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS Reads. Here is how it can happen with RF=3
> 1) You issue a CAS Write and it fails in the propose phase. A machine A 
> replies true to a propose and saves the commit in its accepted field. The 
> other two machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in its paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS Read and it goes to only B and C. You won't be able to read 
> the value written in step 1. This read behaves as if nothing is in flight. 
> 3) Issue another CAS Read and it goes to A and B. Now we will discover that 
> there is something in flight from A and will propose and commit it with the 
> current ballot. Now we can read the value written in step 1 as part of this 
> CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the 
> value written in step 1. 
> 4) Issue a CAS Write and it involves only B and C. This will succeed and 
> commit a different value than step 1. The step 1 value will never be seen 
> again and was never seen before. 
> If you read Lamport's “Paxos Made Simple” paper, section 2.3 talks about 
> this issue: how learners can find out whether a majority of the acceptors 
> have accepted a proposal. 
> In step 3, it is correct that we propose the value again, since we don't 
> know whether it was accepted by a majority of acceptors. When we ask a 
> majority of acceptors, and some but not a majority have something in flight, 
> we have no way of knowing whether it was accepted by a majority. So this 
> behavior is correct. 
> However, we need to fix step 2, since it causes reads not to be linearizable 
> with respect to writes and other reads. In this case, we know that a 
> majority of acceptors have no in-flight commit, which means a majority agree 
> that nothing was accepted by a majority. I think we should run a propose 
> step here with an empty commit, which will cause the write from step 1 never 
> to be visible afterwards. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or never see it, which is what we want. 



[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226668#comment-17226668
 ] 

Benedict Elliott Smith commented on CASSANDRA-12126:


So, before we commit this I wanted to share that some internal experimentation 
found that this can lead to a significant increase in timeouts, particularly 
for read-heavy workloads, that previously would not have competed with each 
other. I think committing this to a patch release is honestly problematic, as 
it could surprise users with a service outage. At the very least, there should 
be HUGE warnings in {{NEWS.txt}}, but honestly I would prefer to have users 
opt-in for patch releases.

As much as I agree that it is problematic to provide the wrong semantics, I 
think it is also problematic to force a decision between stability and 
correctness onto our users without their informed and positive consent.

I hope that I will be able to provide the community with an alternative 
solution in the near future, without these (and many other existing) pitfalls. 
However I'm not sure how that should affect this decision.

> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Legacy/Coordination
>Reporter: Sankalp Kohli
>Assignee: Sylvain Lebresne
>Priority: Normal
>  Labels: LWT, pull-request-available
> Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While looking at the CAS code in Cassandra, I found a potential issue with 
> CAS reads. Here is how it can happen with RF=3:
> 1) You issue a CAS write and it fails in the propose phase. Machine A replies 
> true to a propose and saves the commit in the accepted field. The other two 
> machines, B and C, do not get to the accept phase. 
> The current state is that machine A has this commit in the paxos table as 
> accepted but not committed, and B and C do not. 
> 2) Issue a CAS read and it goes to only B and C. You won't be able to read the 
> value written in step 1. This step behaves as if nothing is in flight. 
> 3) Issue another CAS read and it goes to A and B. Now we will discover that 
> there is something in flight from A, and we will propose and commit it with 
> the current ballot. Now we can read the value written in step 1 as part of 
> this CAS read.
> If we skip step 3 and instead run step 4, we will never learn about the value 
> written in step 1. 
> 4) Issue a CAS write that involves only B and C. This will succeed and 
> commit a different value than step 1. The value from step 1 will never be 
> seen again, and was never seen before. 
> If you read Lamport's "Paxos Made Simple" paper, section 2.3 discusses this 
> issue: how learners can find out whether a majority of the acceptors have 
> accepted a proposal. 
> In step 3, it is correct that we propose the value again, since we don't know 
> whether it was accepted by a majority of acceptors. When we ask a majority of 
> acceptors, and more than one acceptor (but not a majority) has something in 
> flight, we have no way of knowing whether it was accepted by a majority of 
> acceptors. So this behavior is correct. 
> However, we need to fix step 2, since it causes reads to not be linearizable 
> with respect to writes and other reads. In this case, we know that a majority 
> of acceptors have no in-flight commit, which means a majority confirms that 
> nothing was accepted by a majority. I think we should run a propose step here 
> with an empty commit, which will cause the write from step 1 to never be 
> visible afterwards. 
> With this fix, we will either see the data written in step 1 on the next 
> serial read, or never see it, which is what we want. 
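The failure mode and the proposed fix can be sketched as a toy Paxos model. This is a minimal illustration under simplifying assumptions (integer ballots, no commit phase, hypothetical names); it is not Cassandra's actual implementation, which derives ballots from time-based UUIDs:

```python
QUORUM = 2  # RF=3 -> a quorum is 2 replicas

class Replica:
    def __init__(self):
        self.promised = 0      # highest ballot this acceptor has promised
        self.accepted = None   # (ballot, value) accepted but never committed

_ballot = [1]
def next_ballot():
    # Ballots are globally increasing (Cassandra derives them from timestamps).
    _ballot[0] += 1
    return _ballot[0]

def cas_read(contacted):
    """Serial read over a quorum of replicas, with the proposed step-2 fix."""
    assert len(contacted) >= QUORUM
    ballot = next_ballot()
    inflight = []
    for r in contacted:                 # phase 1: prepare/promise
        r.promised = max(r.promised, ballot)
        if r.accepted is not None:
            inflight.append(r.accepted)
    if inflight:
        # Classic Paxos rule: re-propose the highest-ballot in-flight value.
        value = max(inflight, key=lambda t: t[0])[1]
    else:
        # Proposed fix: the quorum saw nothing in flight, so propose an empty
        # update; the orphaned value from step 1 can then never resurface.
        value = None
    for r in contacted:                 # phase 2: accept
        if ballot >= r.promised:
            r.accepted = (ballot, value)
    return value

a, b, c = Replica(), Replica(), Replica()
a.promised, a.accepted = 1, (1, "v1")   # step 1: failed write reached only A

print(cas_read([b, c]))  # step 2: None -- the empty proposal seals B and C
print(cas_read([a, b]))  # later read including A: still None; "v1" is gone
```

With the fix, the second read adopts the higher-ballot empty value from B rather than resurrecting A's orphaned "v1", so reads stay linearizable.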



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-12126) CAS Reads Inconsistencies

2020-11-05 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226643#comment-17226643
 ] 

Sylvain Lebresne commented on CASSANDRA-12126:
--

Thanks for the review. I've rebased the branches, but since the last runs were 
a while ago, I restarted CI runs. I'll commit if those look clean.

||branch||CI||
|[3.0|https://github.com/pcmanus/cassandra/tree/C-12126-3.0]|[Run 
#171|https://ci-cassandra.apache.org/job/Cassandra-devbranch/171/]|
|[3.11|https://github.com/pcmanus/cassandra/tree/C-12126-3.11]|[Run 
#172|https://ci-cassandra.apache.org/job/Cassandra-devbranch/172/]|
|[4.0|https://github.com/pcmanus/cassandra/tree/C-12126-4.0]|[Run 
#173|https://ci-cassandra.apache.org/job/Cassandra-devbranch/173/]|


> CAS Reads Inconsistencies 
> --
>
> Key: CASSANDRA-12126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Legacy/Coordination
>Reporter: Sankalp Kohli
>Assignee: Sylvain Lebresne
>Priority: Normal
>  Labels: LWT, pull-request-available
> Fix For: 3.0.x, 3.11.x, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-05 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226641#comment-17226641
 ] 

Benjamin Lerer commented on CASSANDRA-14013:


{quote}for example, for sstableloader, people might put sstable into 
/tmp/some/path/mykeyspace/mytable/(data files), and that "mytable" will not 
have any "id" on it ...{quote} 

{{SSTableLoader}} does not rely on the code of 
{{Descriptor::fromFilenameWithComponent}} for creating the {{Descriptor}} 
instances; it has its own mechanism, which assumes that there will be no 
{{TableID}}.

{quote}And you can have also a snapshot taken which is called "snapshots" That 
complicates things ever further.{quote}

It does not. I did a quick proof of concept and used your PR test to validate 
it 
[here|https://github.com/apache/cassandra/compare/trunk...blerer:CASSANDRA-14013-review].
 Of course, the test that checks whether a directory is a table one could 
probably be done better.

We know that, outside of this {{snapshots}} keyspace problem, this code works 
and has been battle-tested. Consequently, being pragmatic and pretty 
paranoiac ;-), it feels safer to me not to reimplement the whole thing, 
especially if we take into account that we have to backport the fix to 3.11 
and 3.0 (not sure about 2.2).  
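A minimal sketch of the kind of directory-name check being discussed. It assumes, hypothetically, that a table data directory is named `<table>-<32-hex-table-id>`; the regex and helper name are illustrative, not the actual {{Descriptor}} code:

```python
import re

# Hypothetical heuristic: a table directory carries a 32-hex-digit table id
# suffix, whereas a keyspace directory (even one named "snapshots") does not.
TABLE_DIR_RE = re.compile(r"^(?P<table>\w+)-(?P<id>[0-9a-f]{32})$")

def is_table_directory(dirname: str) -> bool:
    """Return True if the directory name matches the <table>-<id> pattern."""
    return TABLE_DIR_RE.match(dirname) is not None

print(is_table_directory("test_idx-3f22a07741dc11e8bc04a55ed562cd82"))  # True
print(is_table_directory("snapshots"))                                  # False
```

A check along these lines lets the directory walker distinguish a keyspace named "snapshots" from a table's snapshots subdirectory by position and shape rather than by name alone.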


> Data loss in snapshots keyspace after service restart
> -
>
> Key: CASSANDRA-14013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Gregor Uhlenheuer
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I am posting this bug in the hope of discovering the stupid mistake I am 
> making, because I can't imagine a reasonable explanation for the behavior I 
> see right now :-)
> In short, I observe data loss in a keyspace called *snapshots* after 
> restarting the Cassandra service. Say I have 1000 records in a table called 
> *snapshots.test_idx*; after a restart the table has fewer entries or is even 
> empty.
> My somewhat "mysterious" observation is that this happens only in a keyspace 
> called *snapshots*...
> h3. Steps to reproduce
> These steps reproduce the described behavior in "most" attempts (though not 
> every single time).
> {code}
> # create keyspace
> CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> # create table
> CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key));
> # insert some test data
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1);
> ...
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000);
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 1000
> # restart service
> kill 
> cassandra -f
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 0
> {code}
> I hope someone can point me to the obvious mistake I am making :-)
> This happened to me using both Cassandra 3.9 and 3.11.0
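To make the ambiguity behind this report concrete, here is a self-contained sketch (hypothetical paths, not Cassandra's real directory-walking code) of how a naive "skip anything under a snapshots segment" heuristic misclassifies the live data of a keyspace literally named snapshots:

```python
from pathlib import PurePosixPath

sstable_paths = [
    # regular keyspace: data/<keyspace>/<table>-<id>/<sstable>
    "data/mykeyspace/mytable-0011223344556677889900112233aabb/na-1-big-Data.db",
    # a snapshot of that table: .../<table>-<id>/snapshots/<tag>/<sstable>
    "data/mykeyspace/mytable-0011223344556677889900112233aabb/snapshots/tag1/na-1-big-Data.db",
    # live sstable of a keyspace that is itself called "snapshots"
    "data/snapshots/test_idx-00112233445566778899001122ccddee/na-1-big-Data.db",
]

def naive_is_snapshot(path: str) -> bool:
    # Buggy heuristic: any "snapshots" segment means "this is snapshot data".
    return "snapshots" in PurePosixPath(path).parts

for p in sstable_paths:
    print(naive_is_snapshot(p), p)
# -> False, True, True: the live sstable of keyspace "snapshots" is wrongly
#    classified as snapshot data, consistent with the data loss reported above.
```

A position-aware check, which only treats "snapshots" as special when it sits directly under a table directory, avoids this misclassification.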






[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-05 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226641#comment-17226641
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/5/20, 10:56 AM:
---

[~stefan.miklosovic]
{quote}for example, for sstableloader, people might put sstable into 
/tmp/some/path/mykeyspace/mytable/(data files), and that "mytable" will not 
have any "id" on it ...{quote} 

{{SSTableLoader}} does not rely on the code of 
{{Descriptor::fromFilenameWithComponent}} for creating the {{Descriptor}} 
instances; it has its own mechanism, which assumes that there will be no 
{{TableID}}.

{quote}And you can have also a snapshot taken which is called "snapshots" That 
complicates things ever further.{quote}

It does not. I did a quick proof of concept and used your PR test to validate 
it 
[here|https://github.com/apache/cassandra/compare/trunk...blerer:CASSANDRA-14013-review].
 Of course, the test that checks whether a directory is a table one could 
probably be done better.

We know that, outside of this {{snapshots}} keyspace problem, this code works 
and has been battle-tested. Consequently, being pragmatic and pretty 
paranoiac ;-), it feels safer to me not to reimplement the whole thing, 
especially if we take into account that we have to backport the fix to 3.11 
and 3.0 (not sure about 2.2).  



was (Author: blerer):
{quote}for example, for sstableloader, people might put sstable into 
/tmp/some/path/mykeyspace/mytable/(data files), and that "mytable" will not 
have any "id" on it ...{quote} 

{{SSTableLoader}} does not rely on the code of 
{{Descriptor::fromFilenameWithComponent}} for creating the {{Descriptor}} 
instances, it has its own mechanism which assume that there will be no 
{{TableID}}.

{quote}And you can have also a snapshot taken which is called "snapshots" That 
complicates things ever further.{quote}

It does not. I did a quick proof of concept and used your PR test to validate 
it 
[here|https://github.com/apache/cassandra/compare/trunk...blerer:CASSANDRA-14013-review].
 Of course, the test to check if a directory is a table one or not might be 
done better.

We know that, outside of this {{snapshots}} keyspace problem, this code work 
and has been battle tested. By consequence, being pragmatic and pretty 
paranoiac ;-), it feels safer to me to not reimplement the all thing. Specially 
if we take into account that we have to backport the fix to 3.11 and 3.0 (not 
sure for 2.2).  


> Data loss in snapshots keyspace after service restart
> -
>
> Key: CASSANDRA-14013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Gregor Uhlenheuer
>Assignee: Stefan Miklosovic
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (CASSANDRA-16245) Implement repair quality test scenarios

2020-11-05 Thread Alexander Dejanovski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Dejanovski reassigned CASSANDRA-16245:


Assignee: Radovan Zvoncek

> Implement repair quality test scenarios
> ---
>
> Key: CASSANDRA-16245
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16245
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java
>Reporter: Alexander Dejanovski
>Assignee: Radovan Zvoncek
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Implement the following test scenarios in a new test suite for repair 
> integration testing with significant load:
> Generate/restore a workload of ~100GB per node. Medusa should be considered 
> for creating the initial backup, which could then be restored from an S3 
> bucket to speed up node population.
>  Data should deliberately require repair and be generated accordingly.
> Perform repairs on a 3-node cluster with 4 cores each and 16-32GB of RAM 
> (m5d.xlarge instances would be the most cost-efficient type).
>  Repaired keyspaces will use RF=3, or RF=2 in some cases (the latter is for 
> subranges with different sets of replicas).
> ||Mode||Version||Settings||Checks||
> |Full repair|trunk|Sequential + All token ranges|"No anticompaction 
> (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range"|
> |Full repair|trunk|Parallel + Primary range|"No anticompaction (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range"|
> |Full repair|trunk|Force terminate repair shortly after it was 
> triggered|Repair threads must be cleaned up|
> |Subrange repair|trunk|Sequential + single token range|"No anticompaction 
> (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range"|
> |Subrange repair|trunk|Parallel + 10 token ranges which have the same 
> replicas|"No anticompaction (repairedAt == 0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range
> A single repair session will handle all subranges at once"|
> |Subrange repair|trunk|Parallel + 10 token ranges which have different 
> replicas|"No anticompaction (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range
> More than one repair session is triggered to process all subranges"|
> |Subrange repair|trunk|"Single token range.
>  Force terminate repair shortly after it was triggered."|Repair threads must 
> be cleaned up|
> |Incremental repair|trunk|"Parallel (mandatory)
>  No compaction during repair"|"Anticompaction status (repairedAt != 0) on all 
> SSTables
>  No pending repair on SSTables after completion (could require to wait a bit 
> as this will happen asynchronously)
>  Out of sync ranges > 0 + Subsequent run must show no out of sync range"|
> |Incremental repair|trunk|"Parallel (mandatory)
>  Major compaction triggered during repair"|"Anticompaction status (repairedAt 
> != 0) on all SSTables
>  No pending repair on SSTables after completion (could require to wait a bit 
> as this will happen asynchronously)
>  Out of sync ranges > 0 + Subsequent run must show no out of sync range"|
> |Incremental repair|trunk|Force terminate repair shortly after it was 
> triggered.|Repair threads must be cleaned up|
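One way the checks column of the table above could be encoded for a harness. The field and function names below are hypothetical, and a real dtest would inspect sstable metadata via nodetool/JMX rather than the stub objects used here:

```python
from dataclasses import dataclass

@dataclass
class SSTable:
    repaired_at: int = 0        # 0 means "not anticompacted"
    pending_repair: bool = False

def check_full_repair(sstables, out_of_sync_first, out_of_sync_second):
    # Full and subrange repairs must not anticompact (repairedAt stays 0),
    # must find mismatches on the first run, and none on a rerun.
    assert all(s.repaired_at == 0 for s in sstables)
    assert out_of_sync_first > 0
    assert out_of_sync_second == 0

def check_incremental_repair(sstables, out_of_sync_first, out_of_sync_second):
    # Incremental repair must anticompact all sstables (repairedAt != 0) and
    # leave no pending repair once the asynchronous cleanup has finished.
    assert all(s.repaired_at != 0 for s in sstables)
    assert not any(s.pending_repair for s in sstables)
    assert out_of_sync_first > 0
    assert out_of_sync_second == 0

# Stub data standing in for post-repair cluster state:
check_full_repair([SSTable(), SSTable()], out_of_sync_first=3, out_of_sync_second=0)
check_incremental_repair([SSTable(repaired_at=17), SSTable(repaired_at=17)],
                         out_of_sync_first=2, out_of_sync_second=0)
print("scenario checks passed")
```

Encoding the expectations this way keeps each table row as one data-driven case, so the "subsequent run must show no out of sync range" condition is asserted uniformly across modes.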






[jira] [Updated] (CASSANDRA-16245) Implement repair quality test scenarios

2020-11-05 Thread Alexander Dejanovski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Dejanovski updated CASSANDRA-16245:
-
Fix Version/s: 4.0-rc

> Implement repair quality test scenarios
> ---
>
> Key: CASSANDRA-16245
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16245
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java
>Reporter: Alexander Dejanovski
>Priority: Normal
> Fix For: 4.0-rc
>
>






[jira] [Created] (CASSANDRA-16245) Implement repair quality test scenarios

2020-11-05 Thread Alexander Dejanovski (Jira)
Alexander Dejanovski created CASSANDRA-16245:


 Summary: Implement repair quality test scenarios
 Key: CASSANDRA-16245
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16245
 Project: Cassandra
  Issue Type: Task
  Components: Test/dtest/java
Reporter: Alexander Dejanovski


Implement the following test scenarios in a new test suite for repair 
integration testing with significant load:

Generate/restore a workload of ~100GB per node. Medusa should be considered for 
creating the initial backup, which could then be restored from an S3 bucket to 
speed up node population.
 Data should deliberately require repair and be generated accordingly.

Perform repairs on a 3-node cluster with 4 cores each and 16-32GB of RAM 
(m5d.xlarge instances would be the most cost-efficient type).
 Repaired keyspaces will use RF=3, or RF=2 in some cases (the latter is for 
subranges with different sets of replicas).
||Mode||Version||Settings||Checks||
|Full repair|trunk|Sequential + All token ranges|"No anticompaction 
(repairedAt==0)
 Out of sync ranges > 0
 Subsequent run must show no out of sync range"|
|Full repair|trunk|Parallel + Primary range|"No anticompaction (repairedAt==0)
 Out of sync ranges > 0
 Subsequent run must show no out of sync range"|
|Full repair|trunk|Force terminate repair shortly after it was triggered|Repair 
threads must be cleaned up|
|Subrange repair|trunk|Sequential + single token range|"No anticompaction 
(repairedAt==0)
 Out of sync ranges > 0
 Subsequent run must show no out of sync range"|
|Subrange repair|trunk|Parallel + 10 token ranges which have the same 
replicas|"No anticompaction (repairedAt == 0)
 Out of sync ranges > 0
 Subsequent run must show no out of sync range
A single repair session will handle all subranges at once"|
|Subrange repair|trunk|Parallel + 10 token ranges which have different 
replicas|"No anticompaction (repairedAt==0)
 Out of sync ranges > 0
 Subsequent run must show no out of sync range
More than one repair session is triggered to process all subranges"|
|Subrange repair|trunk|"Single token range.
 Force terminate repair shortly after it was triggered."|Repair threads must be 
cleaned up|
|Incremental repair|trunk|"Parallel (mandatory)
 No compaction during repair"|"Anticompaction status (repairedAt != 0) on all 
SSTables
 No pending repair on SSTables after completion (could require to wait a bit as 
this will happen asynchronously)
 Out of sync ranges > 0 + Subsequent run must show no out of sync range"|
|Incremental repair|trunk|"Parallel (mandatory)
 Major compaction triggered during repair"|"Anticompaction status (repairedAt 
!= 0) on all SSTables
 No pending repair on SSTables after completion (could require to wait a bit as 
this will happen asynchronously)
 Out of sync ranges > 0 + Subsequent run must show no out of sync range"|
|Incremental repair|trunk|Force terminate repair shortly after it was 
triggered.|Repair threads must be cleaned up|






[jira] [Updated] (CASSANDRA-16244) Create a jvm upgrade dtest for mixed versions repairs

2020-11-05 Thread Alexander Dejanovski (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Dejanovski updated CASSANDRA-16244:
-
Fix Version/s: 4.0-rc

> Create a jvm upgrade dtest for mixed versions repairs
> -
>
> Key: CASSANDRA-16244
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16244
> Project: Cassandra
>  Issue Type: Task
>Reporter: Alexander Dejanovski
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Repair during upgrades should fail on mixed-version clusters.
> We'd need an in-jvm upgrade dtest to check that repair indeed fails as 
> expected on clusters mixing the current version with the previous major 
> version.


