[jira] [Commented] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251574#comment-17251574 ] Michael Semb Wever commented on CASSANDRA-16226: bq. There are actually a couple changes I want to make to the docs around compact storage, as I mentioned in the Documentation Plan above. Would it be best to include that only in the trunk patch? A separate Jira altogether? 3.11 (if doc exists there) and trunk, this jira, please. > COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by > timestamp due to missing primary key liveness info > --- > > Key: CASSANDRA-16226 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16226 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: perfomance, upgrade > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 5h 10m > Remaining Estimate: 0h > > This was discovered while tracking down a spike in the number of SSTables > per read for a COMPACT STORAGE table after a 2.1 -> 3.0 upgrade. Before 3.0, > there is no direct analog of 3.0's primary key liveness info. When we upgrade > 2.1 COMPACT STORAGE SSTables to the mf format, we simply don't write row > timestamps, even if the original mutations were INSERTs. On read, when we > look at SSTables in order from newest to oldest max timestamp, we expect to > have this primary key liveness information to determine whether we can skip > older SSTables after finding completely populated rows. > ex. I have three SSTables in a COMPACT STORAGE table with max timestamps > 1000, 2000, and 3000. There are many rows in a particular partition, making > filtering on the min and max clustering effectively a no-op. All data is > inserted, and there are no partial updates. A fully specified row with > timestamp 2500 exists in the SSTable with a max timestamp of 3000. 
With a > proper row timestamp in hand, we can easily ignore the SSTables w/ max > timestamps of 1000 and 2000. Without it, we read 3 SSTables instead of 1, > which likely means a significant performance regression. > The following test illustrates this difference in behavior between 2.1 and > 3.0: > https://github.com/maedhroz/cassandra/commit/84ce9242bedd735ca79d4f06007d127de6a82800 > A solution here might be as simple as having > {{SinglePartitionReadCommand#canRemoveRow()}} only inspect primary key > liveness information for non-compact/CQL tables. Tombstones seem to be > handled at a level above that anyway. (One potential problem with that is > whether or not the distinction will continue to exist in 4.0, and dropping > compact storage from a table doesn't magically make pk liveness information > appear.) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
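The skipping decision described above can be sketched as a toy model. This is a hedged illustration only (hypothetical class and method names, not Cassandra's actual read path): SSTables are visited newest-to-oldest by max timestamp, and once a complete row with a known timestamp is in hand, any SSTable whose max timestamp is older than that row timestamp can be skipped.

```java
// Toy model of the skip-by-timestamp decision; hypothetical names, not
// Cassandra's real SinglePartitionReadCommand logic.
public class SkipByTimestampSketch
{
    /**
     * @param maxTimestampsNewestFirst SSTable max timestamps, in the newest-first
     *                                 order the read path visits them
     * @param rowTimestamp the complete row's timestamp, or null when primary key
     *                     liveness info is missing (the pre-3.0 upgrade case)
     * @return how many SSTables the read must touch
     */
    static int sstablesRead(long[] maxTimestampsNewestFirst, Long rowTimestamp)
    {
        int read = 0;
        for (long maxTs : maxTimestampsNewestFirst)
        {
            // Without a row timestamp we can never prove the older SSTables
            // are irrelevant, so we keep reading all of them.
            if (rowTimestamp != null && rowTimestamp > maxTs)
                break;
            read++;
        }
        return read;
    }

    public static void main(String[] args)
    {
        long[] tables = { 3000, 2000, 1000 };
        System.out.println(sstablesRead(tables, 2500L)); // row timestamp known: 1
        System.out.println(sstablesRead(tables, null));  // liveness info missing: 3
    }
}
```

With the ticket's example (max timestamps 1000/2000/3000, complete row at timestamp 2500), the model reads 1 SSTable when the row timestamp is present and all 3 when it is missing.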
[jira] [Updated] (CASSANDRA-16332) Fix upgrade python dtest test_static_columns_with_2i - upgrade_tests.cql_tests.TestCQLNodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-16332: Reviewers: Caleb Rackliffe > Fix upgrade python dtest test_static_columns_with_2i - > upgrade_tests.cql_tests.TestCQLNodes > --- > > Key: CASSANDRA-16332 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16332 > Project: Cassandra > Issue Type: Bug > Components: CI, Test/dtest/python >Reporter: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/843/workflows/9545f259-0a61-4ba8-8dea-485a33136032/jobs/4964 > {code} > # We don't support that > > assert_invalid(cursor, "SELECT s FROM test WHERE v = 1") > upgrade_tests/cql_tests.py:4137: > {code} > {code} > > assert False, "Expecting query to raise an exception, but nothing > > was raised." > E AssertionError: Expecting query to raise an exception, but > nothing was raised. > tools/assertions.py:63: AssertionError > {code}
[jira] [Commented] (CASSANDRA-4938) CREATE INDEX can block for creation now that schema changes may be concurrent
[ https://issues.apache.org/jira/browse/CASSANDRA-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251458#comment-17251458 ] maxwellguo commented on CASSANDRA-4938: --- I think we could also add a nodetool command to show the progress of secondary index builds, since our indexes are local. If that is OK, I would like to open an issue. [~slebresne] [~brandon.williams] > CREATE INDEX can block for creation now that schema changes may be concurrent > - > > Key: CASSANDRA-4938 > URL: https://issues.apache.org/jira/browse/CASSANDRA-4938 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: Krzysztof Cieslinski Cognitum >Assignee: Kirk True >Priority: Low > Labels: lhf > Fix For: 4.x > > > The response from the CREATE INDEX command comes back faster than the secondary index is actually built. So the code below: > {code:xml} > CREATE INDEX ON tab(name); > SELECT * FROM tab WHERE name = 'Chris'; > {code} > doesn't return any rows (of course, in column family "tab" there are some records with "name" value = 'Chris'..) and no errors (I would expect something like ??"Bad Request: No indexed columns present in by-columns clause with Equal operator"??). > Inserting a delay between those two commands resolves the problem, so: > {code:xml} > CREATE INDEX ON tab(name); > Sleep(timeout); // for a column family with 2000 rows the timeout had to be set to ~1 second > SELECT * FROM tab WHERE name = 'Chris'; > {code} > will return all rows with values as specified. > I'm using a single-node cluster.
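Because the index build is asynchronous, a sturdier client-side workaround than the fixed Sleep(timeout) above is to poll against a deadline. This is a hedged sketch: the BooleanSupplier is a placeholder for whatever signal the client actually has (for example, retrying the indexed SELECT until it stops failing), not an API the server provides.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Sketch of a deadline-based poll replacing a guessed sleep. The
// isIndexBuilt check is a hypothetical placeholder supplied by the caller.
public class AwaitIndexBuilt
{
    static boolean await(BooleanSupplier isIndexBuilt, long timeoutMillis, long pollMillis)
    {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true)
        {
            if (isIndexBuilt.getAsBoolean())
                return true;                  // index reported built
            if (System.nanoTime() >= deadline)
                return false;                 // give up; caller decides what to do
            try
            {
                Thread.sleep(pollMillis);     // back off between checks
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

Unlike a fixed sleep, this returns as soon as the check succeeds and fails deterministically after the timeout, so it does not need retuning as the table grows.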
[jira] [Commented] (CASSANDRA-16332) Fix upgrade python dtest test_static_columns_with_2i - upgrade_tests.cql_tests.TestCQLNodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251454#comment-17251454 ] David Capwell commented on CASSANDRA-16332: --- Yep, that sounds about it! Mind running CI and putting the JIRA in the correct state? [~maedhroz] mind reviewing as well? > Fix upgrade python dtest test_static_columns_with_2i - > upgrade_tests.cql_tests.TestCQLNodes > --- > > Key: CASSANDRA-16332 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16332 > Project: Cassandra > Issue Type: Bug > Components: CI, Test/dtest/python >Reporter: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/843/workflows/9545f259-0a61-4ba8-8dea-485a33136032/jobs/4964 > {code} > # We don't support that > > assert_invalid(cursor, "SELECT s FROM test WHERE v = 1") > upgrade_tests/cql_tests.py:4137: > {code} > {code} > > assert False, "Expecting query to raise an exception, but nothing > > was raised." > E AssertionError: Expecting query to raise an exception, but > nothing was raised. > tools/assertions.py:63: AssertionError > {code}
[jira] [Commented] (CASSANDRA-4938) CREATE INDEX can block for creation now that schema changes may be concurrent
[ https://issues.apache.org/jira/browse/CASSANDRA-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251422#comment-17251422 ] Kirk True commented on CASSANDRA-4938: -- When trying to query against the table using the index before it's fully created, I see this error on the server: {noformat} java.lang.RuntimeException: org.apache.cassandra.index.IndexNotAvailableException: The secondary index 'test_table_user_name_idx' is not yet available at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2729) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: org.apache.cassandra.index.IndexNotAvailableException: The secondary index 'test_table_user_name_idx' is not yet available at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:445) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:2011) at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2725) ... 6 common frames omitted {noformat} and this in `cqlsh`: {noformat} ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures: UNKNOWN from localhost/127.0.0.1:7000" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} {noformat} It looks like the schema update event is fired on the _start_ of the index creation, not on its _completion_. 
> CREATE INDEX can block for creation now that schema changes may be concurrent > - > > Key: CASSANDRA-4938 > URL: https://issues.apache.org/jira/browse/CASSANDRA-4938 > Project: Cassandra > Issue Type: Improvement > Components: Feature/2i Index >Reporter: Krzysztof Cieslinski Cognitum >Assignee: Kirk True >Priority: Low > Labels: lhf > Fix For: 4.x > > > The response from the CREATE INDEX command comes back faster than the secondary index is actually built. So the code below: > {code:xml} > CREATE INDEX ON tab(name); > SELECT * FROM tab WHERE name = 'Chris'; > {code} > doesn't return any rows (of course, in column family "tab" there are some records with "name" value = 'Chris'..) and no errors (I would expect something like ??"Bad Request: No indexed columns present in by-columns clause with Equal operator"??). > Inserting a delay between those two commands resolves the problem, so: > {code:xml} > CREATE INDEX ON tab(name); > Sleep(timeout); // for a column family with 2000 rows the timeout had to be set to ~1 second > SELECT * FROM tab WHERE name = 'Chris'; > {code} > will return all rows with values as specified. > I'm using a single-node cluster.
[jira] [Updated] (CASSANDRA-16362) SSLFactory should initialize SSLContext before setting protocols
[ https://issues.apache.org/jira/browse/CASSANDRA-16362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Merkle updated CASSANDRA-16362: Description: Trying to use sstableloader from the latest trunk produced the following Exception: {quote} Exception in thread "main" java.lang.RuntimeException: Could not create SSL Context. at org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:261) at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:64) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) Caused by: java.io.IOException: Error creating/initializing the SSL Context at org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:184) at org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:257) ... 2 more Caused by: java.lang.IllegalStateException: SSLContext is not initialized at sun.security.ssl.SSLContextImpl.engineGetSocketFactory(SSLContextImpl.java:208) at javax.net.ssl.SSLContextSpi.getDefaultSocket(SSLContextSpi.java:158) at javax.net.ssl.SSLContextSpi.engineGetDefaultSSLParameters(SSLContextSpi.java:184) at javax.net.ssl.SSLContext.getDefaultSSLParameters(SSLContext.java:435) at org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:178) ... 3 more {quote} I believe this is because of a change to SSLFactory for CASSANDRA-13325 here: [https://github.com/apache/cassandra/commit/919a8964a83511d96766c3e53ba603e77bca626c#diff-0d569398cfd58566fc56bfb80c971a72afe3f392addc2df731a0b44baf29019eR177-R178] I think the solution is to call {{ctx.init()}} before trying to call {{ctx.getDefaultSSLParameters()}}, essentially swapping the two lines in the link above. 
was: Trying to use sstableloader from the latest trunk produced the following Exception: {quote} Exception in thread "main" java.lang.RuntimeException: Could not create SSL Context. at org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:261) at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:64) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) Caused by: java.io.IOException: Error creating/initializing the SSL Context at org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:184) at org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:257) ... 2 more Caused by: java.lang.IllegalStateException: SSLContext is not initialized at sun.security.ssl.SSLContextImpl.engineGetSocketFactory(SSLContextImpl.java:208) at javax.net.ssl.SSLContextSpi.getDefaultSocket(SSLContextSpi.java:158) at javax.net.ssl.SSLContextSpi.engineGetDefaultSSLParameters(SSLContextSpi.java:184) at javax.net.ssl.SSLContext.getDefaultSSLParameters(SSLContext.java:435) at org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:178) ... 3 more {quote} I believe this is because of a change to SSLFactory for CASSANDRA-13325 here: [https://github.com/apache/cassandra/commit/919a8964a83511d96766c3e53ba603e77bca626c#diff-0d569398cfd58566fc56bfb80c971a72afe3f392addc2df731a0b44baf29019eR177-R178] I think the solution is to call {{ctx.init()}} before trying to call {{ctx.getDefaultSSLParameters()}}, essentially swapping the two lines in the link above.
> SSLFactory should initialize SSLContext before setting protocols > > > Key: CASSANDRA-16362 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16362 > Project: Cassandra > Issue Type: Bug >Reporter: Erik Merkle >Priority: Normal > > Trying to use sstableloader from the latest trunk produced the following > Exception: > {quote} > Exception in thread "main" java.lang.RuntimeException: Could not create SSL > Context. > at > org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:261) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:64) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: java.io.IOException: Error creating/initializing the SSL Context > at > org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:184) > at > org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:257) > ... 2 more > Caused by: java.lang.IllegalStateException: SSLContext is not initialized > at > sun.security.ssl.SSLContextImpl.engineGetSocketFactory(SSLContextImpl.java:208) > at javax.net.ssl.SSLContextSpi.getDefaultSocket(SSLContextSpi.java:158) > at > javax.net.ssl.SSLContextSpi.engineGetDefaultSSLParameters(SSLContextSpi.java:184) > at
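The ordering problem is easy to reproduce with plain JSSE, independent of Cassandra. The standalone demo below (not SSLFactory code) shows that querying default SSL parameters on an uninitialized SSLContext throws IllegalStateException, while calling init() first succeeds, which is exactly the reordering the ticket proposes.

```java
import javax.net.ssl.SSLContext;
import java.security.GeneralSecurityException;

// Standalone JSSE demo of the init-ordering bug described above.
public class SslContextInitOrder
{
    static boolean defaultParamsAvailable(boolean initFirst)
    {
        try
        {
            SSLContext ctx = SSLContext.getInstance("TLS");
            if (initFirst)
                ctx.init(null, null, null); // nulls fall back to JSSE defaults
            ctx.getDefaultSSLParameters(); // throws if the context is uninitialized
            return true;
        }
        catch (IllegalStateException e)
        {
            return false; // "SSLContext is not initialized"
        }
        catch (GeneralSecurityException e)
        {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(defaultParamsAvailable(false)); // parameters first: fails
        System.out.println(defaultParamsAvailable(true));  // init first: works
    }
}
```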
[jira] [Created] (CASSANDRA-16362) SSLFactory should initialize SSLContext before setting protocols
Erik Merkle created CASSANDRA-16362: --- Summary: SSLFactory should initialize SSLContext before setting protocols Key: CASSANDRA-16362 URL: https://issues.apache.org/jira/browse/CASSANDRA-16362 Project: Cassandra Issue Type: Bug Reporter: Erik Merkle Trying to use sstableloader from the latest trunk produced the following Exception: {quote} Exception in thread "main" java.lang.RuntimeException: Could not create SSL Context. at org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:261) at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:64) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) Caused by: java.io.IOException: Error creating/initializing the SSL Context at org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:184) at org.apache.cassandra.tools.BulkLoader.buildSSLOptions(BulkLoader.java:257) ... 2 more Caused by: java.lang.IllegalStateException: SSLContext is not initialized at sun.security.ssl.SSLContextImpl.engineGetSocketFactory(SSLContextImpl.java:208) at javax.net.ssl.SSLContextSpi.getDefaultSocket(SSLContextSpi.java:158) at javax.net.ssl.SSLContextSpi.engineGetDefaultSSLParameters(SSLContextSpi.java:184) at javax.net.ssl.SSLContext.getDefaultSSLParameters(SSLContext.java:435) at org.apache.cassandra.security.SSLFactory.createSSLContext(SSLFactory.java:178) ... 3 more {quote} I believe this is because of a change to SSLFactory for CASSANDRA-13325 here: [https://github.com/apache/cassandra/commit/919a8964a83511d96766c3e53ba603e77bca626c#diff-0d569398cfd58566fc56bfb80c971a72afe3f392addc2df731a0b44baf29019eR177-R178] I think the solution is to call {{ctx.init()}} before trying to call {{ctx.getDefaultSSLParameters()}}, essentially swapping the two lines in the link above.
[jira] [Commented] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251375#comment-17251375 ] Caleb Rackliffe commented on CASSANDRA-16226: - [~mck] There are actually a couple changes I want to make to the docs around compact storage, as I mentioned in the Documentation Plan above. Would it be best to include that only in the trunk patch? A separate Jira altogether? > COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by > timestamp due to missing primary key liveness info > --- > > Key: CASSANDRA-16226 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16226 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: perfomance, upgrade > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 5h 10m > Remaining Estimate: 0h > > This was discovered while tracking down a spike in the number of SSTables > per read for a COMPACT STORAGE table after a 2.1 -> 3.0 upgrade. Before 3.0, > there is no direct analog of 3.0's primary key liveness info. When we upgrade > 2.1 COMPACT STORAGE SSTables to the mf format, we simply don't write row > timestamps, even if the original mutations were INSERTs. On read, when we > look at SSTables in order from newest to oldest max timestamp, we expect to > have this primary key liveness information to determine whether we can skip > older SSTables after finding completely populated rows. > ex. I have three SSTables in a COMPACT STORAGE table with max timestamps > 1000, 2000, and 3000. There are many rows in a particular partition, making > filtering on the min and max clustering effectively a no-op. All data is > inserted, and there are no partial updates. A fully specified row with > timestamp 2500 exists in the SSTable with a max timestamp of 3000. 
With a > proper row timestamp in hand, we can easily ignore the SSTables w/ max > timestamps of 1000 and 2000. Without it, we read 3 SSTables instead of 1, > which likely means a significant performance regression. > The following test illustrates this difference in behavior between 2.1 and > 3.0: > https://github.com/maedhroz/cassandra/commit/84ce9242bedd735ca79d4f06007d127de6a82800 > A solution here might be as simple as having > {{SinglePartitionReadCommand#canRemoveRow()}} only inspect primary key > liveness information for non-compact/CQL tables. Tombstones seem to be > handled at a level above that anyway. (One potential problem with that is > whether or not the distinction will continue to exist in 4.0, and dropping > compact storage from a table doesn't magically make pk liveness information > appear.)
[jira] [Comment Edited] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251357#comment-17251357 ] Caleb Rackliffe edited comment on CASSANDRA-16226 at 12/17/20, 10:33 PM: - bq. Will the failure to invalidate the prepared statement cache when dropping compact storage in trunk be addressed in a separate ticket? See CASSANDRA-16361. (UPDATE: A patch is now available there.) was (Author: maedhroz): bq. Will the failure to invalidate the prepared statement cache when dropping compact storage in trunk be addressed in a separate ticket? See CASSANDRA-16361 ;) > COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by > timestamp due to missing primary key liveness info > --- > > Key: CASSANDRA-16226 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16226 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: perfomance, upgrade > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 5h 10m > Remaining Estimate: 0h > > This was discovered while tracking down a spike in the number of SSTables > per read for a COMPACT STORAGE table after a 2.1 -> 3.0 upgrade. Before 3.0, > there is no direct analog of 3.0's primary key liveness info. When we upgrade > 2.1 COMPACT STORAGE SSTables to the mf format, we simply don't write row > timestamps, even if the original mutations were INSERTs. On read, when we > look at SSTables in order from newest to oldest max timestamp, we expect to > have this primary key liveness information to determine whether we can skip > older SSTables after finding completely populated rows. > ex. I have three SSTables in a COMPACT STORAGE table with max timestamps > 1000, 2000, and 3000. There are many rows in a particular partition, making > filtering on the min and max clustering effectively a no-op. All data is > inserted, and there are no partial updates. 
A fully specified row with > timestamp 2500 exists in the SSTable with a max timestamp of 3000. With a > proper row timestamp in hand, we can easily ignore the SSTables w/ max > timestamps of 1000 and 2000. Without it, we read 3 SSTables instead of 1, > which likely means a significant performance regression. > The following test illustrates this difference in behavior between 2.1 and > 3.0: > https://github.com/maedhroz/cassandra/commit/84ce9242bedd735ca79d4f06007d127de6a82800 > A solution here might be as simple as having > {{SinglePartitionReadCommand#canRemoveRow()}} only inspect primary key > liveness information for non-compact/CQL tables. Tombstones seem to be > handled at a level above that anyway. (One potential problem with that is > whether or not the distinction will continue to exist in 4.0, and dropping > compact storage from a table doesn't magically make pk liveness information > appear.) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
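The skip-by-timestamp shortcut described above can be modeled with a small sketch. This is an illustrative toy model only; the class and method names are hypothetical, not Cassandra's actual read path (the real logic lives around `SinglePartitionReadCommand`):

```java
import java.util.Comparator;
import java.util.List;

// Simplified model of timestamp-based SSTable skipping on the
// single-partition read path. Illustrative only, not Cassandra code.
public class SSTableSkipSketch {
    // An SSTable is modeled only by its max timestamp here.
    public record SSTable(long maxTimestamp) {}

    /**
     * Count how many SSTables must be consulted to resolve a row, given the
     * row's primary key liveness timestamp, or -1 when it is missing (as for
     * pre-3.0 COMPACT STORAGE data upgraded to the mf format).
     */
    public static int sstablesRead(List<SSTable> sstables, long rowLivenessTimestamp) {
        // Read newest-first by max timestamp.
        List<SSTable> ordered = sstables.stream()
                .sorted(Comparator.comparingLong(SSTable::maxTimestamp).reversed())
                .toList();
        int read = 0;
        for (SSTable s : ordered) {
            read++;
            // Once we hold a fully populated row whose liveness timestamp is
            // newer than everything in the remaining SSTables, we can stop.
            if (rowLivenessTimestamp >= 0 && rowLivenessTimestamp > nextMax(ordered, read))
                break;
        }
        return read;
    }

    private static long nextMax(List<SSTable> ordered, int index) {
        return index < ordered.size() ? ordered.get(index).maxTimestamp() : Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        List<SSTable> sstables = List.of(new SSTable(1000), new SSTable(2000), new SSTable(3000));
        // With a row liveness timestamp of 2500, the SSTables at 1000 and 2000 are skipped.
        System.out.println(sstablesRead(sstables, 2500)); // prints 1
        // Without primary key liveness info (modeled as -1), all three are read.
        System.out.println(sstablesRead(sstables, -1));   // prints 3
    }
}
```

This mirrors the ticket's example: a liveness timestamp of 2500 lets the read stop after the newest SSTable (max timestamp 3000), while upgraded pre-3.0 data with no liveness info forces all three to be consulted.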
[jira] [Updated] (CASSANDRA-16361) DROP COMPACT STORAGE should invalidate prepared statements still using CompactTableMetadata
[ https://issues.apache.org/jira/browse/CASSANDRA-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-16361: Test and Documentation Plan: a new test that verifies the contents of the prepared statement cache around the DROP COMPACT STORAGE logic Status: Patch Available (was: In Progress) patch: https://github.com/apache/cassandra/pull/858 CircleCI: https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16361 > DROP COMPACT STORAGE should invalidate prepared statements still using > CompactTableMetadata > --- > > Key: CASSANDRA-16361 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16361 > Project: Cassandra > Issue Type: Bug > Components: Legacy/CQL >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > When we drop compact storage from a table, existing prepared statements > continue to refer to an instance of {{CompactTableMetadata}}, rather than > being invalidated so they can be assigned a new {{TableMetadata}} instance. > There are perhaps some brute force ways to fix this, like bouncing the node, > but that is obviously sub-optimal. > One idea is to have {{TableMetadata#changeAffectsPreparedStatements()}} > return true when we go from having to not having the DENSE flag. It should be > pretty easy to validate a fix with a small addition to {{CompactTableTest}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16355) Fix flaky test incompletePropose - org.apache.cassandra.distributed.test.CASTest
[ https://issues.apache.org/jira/browse/CASSANDRA-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251358#comment-17251358 ] Ekaterina Dimitrova commented on CASSANDRA-16355: - Just saw it again today in CircleCI: https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/545/workflows/f797bf58-b572-4d8c-831e-a61936d23624/jobs/3027 > Fix flaky test incompletePropose - > org.apache.cassandra.distributed.test.CASTest > > > Key: CASSANDRA-16355 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16355 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Test/dtest/java >Reporter: David Capwell >Assignee: Benjamin Lerer >Priority: Normal > Fix For: 4.0-beta > > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/853/workflows/0766c0de-956e-4831-aa40-9303748a2708/jobs/5030 > {code} > junit.framework.AssertionFailedError: Expected: [[1, 1, 2]] > Actual: [] > at > org.apache.cassandra.distributed.shared.AssertUtils.fail(AssertUtils.java:193) > at > org.apache.cassandra.distributed.shared.AssertUtils.assertEquals(AssertUtils.java:163) > at > org.apache.cassandra.distributed.shared.AssertUtils.assertRows(AssertUtils.java:63) > at > org.apache.cassandra.distributed.test.CASTest.incompletePropose(CASTest.java:124) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251357#comment-17251357 ] Caleb Rackliffe commented on CASSANDRA-16226: - bq. Will the failure to invalidate the prepared statement cache when dropping compact storage in trunk be addressed in a separate ticket? See CASSANDRA-16361 ;) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251356#comment-17251356 ] Michael Semb Wever edited comment on CASSANDRA-16226 at 12/17/20, 9:50 PM: --- +1 to the 3.0 patch. Will the failure to invalidate the prepared statement cache when dropping compact storage in trunk be addressed in a separate ticket? was (Author: michaelsembwever): +1 to the 3.0 patch. Will the failure to invalidate the prepared statement cache when dropping compact storage in trunk being addressed in a separate ticket? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251356#comment-17251356 ] Michael Semb Wever commented on CASSANDRA-16226: +1 to the 3.0 patch. Will the failure to invalidate the prepared statement cache when dropping compact storage in trunk be addressed in a separate ticket? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16361) DROP COMPACT STORAGE should invalidate prepared statements still using CompactTableMetadata
[ https://issues.apache.org/jira/browse/CASSANDRA-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-16361: Bug Category: Parent values: Correctness(12982)Level 1 values: Transient Incorrect Response(12987) Complexity: Normal Discovered By: Unit Test Fix Version/s: 4.0-beta Severity: Normal Assignee: Caleb Rackliffe Status: Open (was: Triage Needed) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16361) DROP COMPACT STORAGE should invalidate prepared statements still using CompactTableMetadata
Caleb Rackliffe created CASSANDRA-16361: --- Summary: DROP COMPACT STORAGE should invalidate prepared statements still using CompactTableMetadata Key: CASSANDRA-16361 URL: https://issues.apache.org/jira/browse/CASSANDRA-16361 Project: Cassandra Issue Type: Bug Components: Legacy/CQL Reporter: Caleb Rackliffe When we drop compact storage from a table, existing prepared statements continue to refer to an instance of {{CompactTableMetadata}}, rather than being invalidated so they can be assigned a new {{TableMetadata}} instance. There are perhaps some brute force ways to fix this, like bouncing the node, but that is obviously sub-optimal. One idea is to have {{TableMetadata#changeAffectsPreparedStatements()}} return true when we go from having to not having the DENSE flag. It should be pretty easy to validate a fix with a small addition to {{CompactTableTest}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
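The invalidation idea in the description can be sketched with a minimal model. This is a hypothetical, simplified cache, not Cassandra's implementation; only `changeAffectsPreparedStatements` echoes a real method name, and the DENSE-flag check reflects the proposal above:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of the proposed fix: when a table's metadata change is one that
// affects prepared statements (e.g. the DENSE flag disappearing after
// DROP COMPACT STORAGE), evict cached statements so they re-prepare against
// the new metadata. All names here are illustrative, not Cassandra's.
public class PreparedCacheSketch {
    public record TableMetadata(String name, Set<String> flags) {}
    public record PreparedStatement(String query, TableMetadata metadata) {}

    private final Map<String, PreparedStatement> cache = new HashMap<>();

    // Mirrors the idea behind TableMetadata#changeAffectsPreparedStatements():
    // going from having to not having the DENSE flag must trigger invalidation.
    public static boolean changeAffectsPreparedStatements(TableMetadata before, TableMetadata after) {
        return before.flags().contains("DENSE") && !after.flags().contains("DENSE");
    }

    public void prepare(String query, TableMetadata metadata) {
        cache.put(query, new PreparedStatement(query, metadata));
    }

    public void onAlter(TableMetadata before, TableMetadata after) {
        // Without this check, statements keep referring to stale metadata.
        if (changeAffectsPreparedStatements(before, after))
            cache.values().removeIf(ps -> ps.metadata().name().equals(before.name()));
    }

    public int size() { return cache.size(); }

    public static void main(String[] args) {
        PreparedCacheSketch cache = new PreparedCacheSketch();
        TableMetadata dense = new TableMetadata("t", Set.of("COMPOUND", "DENSE"));
        TableMetadata regular = new TableMetadata("t", Set.of("COMPOUND"));
        cache.prepare("SELECT * FROM t WHERE k = ?", dense);
        cache.onAlter(dense, regular); // DENSE flag dropped -> statement evicted
        System.out.println(cache.size()); // prints 0
    }
}
```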
[jira] [Updated] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever updated CASSANDRA-13701: --- Fix Version/s: (was: 4.0-alpha) 4.0-beta4 4.0 Source Control Link: https://github.com/apache/cassandra/commit/3cfc8502b82ba88da6ffc69fdad476f7fa0819ca https://github.com/apache/cassandra-builds/commit/6ff05f088ccab9a2376d2b83f9ef66a800b0c787 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed [3cfc8502b82ba88da6ffc69fdad476f7fa0819ca|https://github.com/apache/cassandra/commit/3cfc8502b82ba88da6ffc69fdad476f7fa0819ca] and [6ff05f088ccab9a2376d2b83f9ef66a800b0c787|https://github.com/apache/cassandra-builds/commit/6ff05f088ccab9a2376d2b83f9ef66a800b0c787] > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Alexander Dejanovski >Priority: Low > Fix For: 4.0, 4.0-beta4 > > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operational processes and scanning. It has > come up a lot, and it is now standard, well-known practice in the community to > reduce num_tokens. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
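For reference, the committed defaults land in conf/cassandra.yaml roughly as follows (the values are taken from the commit messages in this thread; the comments here are paraphrased, not the file's actual comment text):

```yaml
# conf/cassandra.yaml (4.0): new vnode defaults from CASSANDRA-13701
num_tokens: 16
# With fewer tokens per node, the token allocation algorithm keeps
# ownership balanced for the local replication factor.
allocate_tokens_for_local_replication_factor: 3
```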
[cassandra] branch trunk updated: Updated default num_tokens from 256 to 16 with associated allocate_tokens_for_local_replication_factor set to 3
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new 3cfc850  Updated default num_tokens from 256 to 16 with associated allocate_tokens_for_local_replication_factor set to 3
3cfc850 is described below

commit 3cfc8502b82ba88da6ffc69fdad476f7fa0819ca
Author: Jeremy Hanna
AuthorDate: Tue Jul 7 13:19:55 2020 +1000

    Updated default num_tokens from 256 to 16 with associated allocate_tokens_for_local_replication_factor set to 3

    patch by Jeremy Hanna; reviewed by Alexander Dejanovski, Paulo Motta, Brandon Williams, Michael Semb Wever for CASSANDRA-13701
---
 .circleci/config-2_1.yml               | 16
 .circleci/config-2_1.yml.mid_res.patch |  4 ++--
 .circleci/config.yml                   | 16
 .circleci/config.yml.HIGHRES           | 16
 .circleci/config.yml.LOWRES            | 16
 .circleci/config.yml.MIDRES            | 16
 CHANGES.txt                            |  1 +
 NEWS.txt                               |  8
 conf/cassandra.yaml                    |  7 +--
 9 files changed, 56 insertions(+), 44 deletions(-)

diff --git a/.circleci/config-2_1.yml b/.circleci/config-2_1.yml
index c80eaee..07c406b 100644
--- a/.circleci/config-2_1.yml
+++ b/.circleci/config-2_1.yml
@@ -467,7 +467,7 @@ jobs:
       run_dtests_extra_args: "--use-vnodes --skip-resource-intensive-tests --pytest-options '-k not cql'"
     - run_dtests:
         file_tag: j8_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
   j11_dtests-with-vnodes:
     <<: *j11_par_executor
@@ -482,7 +482,7 @@
       run_dtests_extra_args: "--use-vnodes --skip-resource-intensive-tests --pytest-options '-k not cql'"
     - run_dtests:
         file_tag: j11_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
   j8_dtests-no-vnodes:
     <<: *j8_par_executor
@@ -539,7 +539,7 @@
       run_dtests_extra_args: "--use-vnodes --skip-resource-intensive-tests --pytest-options '-k cql'"
     - run_dtests:
         file_tag: j8_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
         extra_env_args: 'CQLSH_PYTHON=/usr/bin/python2.7'
   j8_cqlsh-dtests-py3-with-vnodes:
@@ -554,7 +554,7 @@
       run_dtests_extra_args: "--use-vnodes --skip-resource-intensive-tests --pytest-options '-k cql'"
     - run_dtests:
         file_tag: j8_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
         extra_env_args: 'CQLSH_PYTHON=/usr/bin/python3.6'
   j8_cqlsh-dtests-py38-with-vnodes:
@@ -571,7 +571,7 @@
       python_version: '3.8'
     - run_dtests:
         file_tag: j8_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
         extra_env_args: 'CQLSH_PYTHON=/usr/bin/python3.8'
         python_version: '3.8'
@@ -635,7 +635,7 @@
       run_dtests_extra_args: "--use-vnodes --skip-resource-intensive-tests --pytest-options '-k cql'"
     - run_dtests:
         file_tag: j11_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
         extra_env_args: 'CQLSH_PYTHON=/usr/bin/python2.7'
   j11_cqlsh-dtests-py3-with-vnodes:
@@ -650,7 +650,7 @@
       run_dtests_extra_args: "--use-vnodes --skip-resource-intensive-tests --pytest-options '-k cql'"
     - run_dtests:
         file_tag: j11_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
         extra_env_args: 'CQLSH_PYTHON=/usr/bin/python3.6'
   j11_cqlsh-dtests-py38-with-vnodes:
@@ -667,7 +667,7 @@
       python_version: '3.8'
     - run_dtests:
         file_tag: j11_with_vnodes
-        pytest_extra_args: '--use-vnodes --num-tokens=32 --skip-resource-intensive-tests'
+        pytest_extra_args: '--use-vnodes --num-tokens=16 --skip-resource-intensive-tests'
[cassandra-builds] branch trunk updated: Lower default num_tokens to 16 (CASSANDRA-13701)
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git

The following commit(s) were added to refs/heads/trunk by this push:
     new 6ff05f0  Lower default num_tokens to 16 (CASSANDRA-13701)
6ff05f0 is described below

commit 6ff05f088ccab9a2376d2b83f9ef66a800b0c787
Author: Mick Semb Wever
AuthorDate: Thu Dec 17 22:07:14 2020 +0100

    Lower default num_tokens to 16 (CASSANDRA-13701)
---
 build-scripts/cassandra-dtest-pytest.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/build-scripts/cassandra-dtest-pytest.sh b/build-scripts/cassandra-dtest-pytest.sh
index 069a868..d985e7d 100755
--- a/build-scripts/cassandra-dtest-pytest.sh
+++ b/build-scripts/cassandra-dtest-pytest.sh
@@ -18,7 +18,7 @@ export CASS_DRIVER_NO_CYTHON=true
 export CCM_MAX_HEAP_SIZE="1024M"
 export CCM_HEAP_NEWSIZE="512M"
 export CCM_CONFIG_DIR=${WORKSPACE}/.ccm
-export NUM_TOKENS="32"
+export NUM_TOKENS="16"
 export CASSANDRA_DIR=${WORKSPACE}
 #Have Cassandra skip all fsyncs to improve test performance and reliability
 export CASSANDRA_SKIP_SYNC=true

- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16332) Fix upgrade python dtest test_static_columns_with_2i - upgrade_tests.cql_tests.TestCQLNodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251323#comment-17251323 ] Adam Holmberg commented on CASSANDRA-16332: --- Is [this|https://github.com/apache/cassandra-dtest/pull/109] all you had in mind, [~dcapwell]? > Fix upgrade python dtest test_static_columns_with_2i - > upgrade_tests.cql_tests.TestCQLNodes > --- > > Key: CASSANDRA-16332 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16332 > Project: Cassandra > Issue Type: Bug > Components: CI, Test/dtest/python >Reporter: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/843/workflows/9545f259-0a61-4ba8-8dea-485a33136032/jobs/4964 > {code} > # We don't support that > > assert_invalid(cursor, "SELECT s FROM test WHERE v = 1") > upgrade_tests/cql_tests.py:4137: > {code} > {code} > > assert False, "Expecting query to raise an exception, but nothing > > was raised." > E AssertionError: Expecting query to raise an exception, but > nothing was raised. > tools/assertions.py:63: AssertionError > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251275#comment-17251275 ] Caleb Rackliffe commented on CASSANDRA-16226: - I've added a few more tests in the wake of [~mck]'s review in the 3.0 branch and addressed the concerns raised. If those discussions are resolved, I'll begin the process of updating the 3.11 and trunk PRs... -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16226) COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info
[ https://issues.apache.org/jira/browse/CASSANDRA-16226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-16226: Authors: Caleb Rackliffe, Michael Semb Wever (was: Caleb Rackliffe) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16160) cqlsh row_id resets on page boundaries
[ https://issues.apache.org/jira/browse/CASSANDRA-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251235#comment-17251235 ] David Capwell commented on CASSANDRA-16160: --- thanks for the work! > cqlsh row_id resets on page boundaries > -- > > Key: CASSANDRA-16160 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16160 > Project: Cassandra > Issue Type: Bug > Components: Tool/cqlsh >Reporter: David Capwell >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta4 > > > When you run a query such as > {code} > expand on; > select * from table_with_clustering_keys where token(partition_key) = > 1192326969048244361; > {code} > We print out a header for each row that looks like the following > @ Row 1 > In 3.0 all values printed were uniq, but in 4.0 they are no longer unique > {code} > $ grep Row 3.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10 > 1 @ Row 999 > 1 @ Row 998 > 1 @ Row 997 > 1 @ Row 996 > 1 @ Row 995 > 1 @ Row 994 > 1 @ Row 993 > 1 @ Row 992 > 1 @ Row 991 > 1 @ Row 990 > {code} > {code} > $ grep Row 4.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10 > 10 @ Row 9 > 10 @ Row 8 > 10 @ Row 7 > 10 @ Row 6 > 10 @ Row 5 > 10 @ Row 48 > 10 @ Row 47 > 10 @ Row 46 > 10 @ Row 45 > 10 @ Row 44 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
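The regression above can be modeled with a tiny sketch: unique "@ Row N" headers require the row counter to live outside the per-page loop so it survives page boundaries. This is a hypothetical Java model for illustration; the actual fix is in cqlsh's Python formatting code:

```java
import java.util.ArrayList;
import java.util.List;

// Model of the cqlsh paging bug: if the row counter were reset inside the
// per-page loop, headers would restart at "@ Row 1" on every page (the 4.0
// behavior in the ticket). Keeping it outside restores unique headers.
// Illustrative sketch only, not cqlsh's actual code.
public class PagedRowCounter {
    /** Returns the header printed for each row across all pages. */
    public static List<String> headers(List<List<String>> pages) {
        List<String> out = new ArrayList<>();
        int rowId = 0; // counter survives page boundaries
        for (List<String> page : pages)
            for (String ignoredRow : page)
                out.add("@ Row " + (++rowId));
        return out;
    }

    public static void main(String[] args) {
        // Two pages of sizes 2 and 1: headers keep counting across the boundary.
        System.out.println(headers(List.of(List.of("a", "b"), List.of("c"))));
        // prints [@ Row 1, @ Row 2, @ Row 3]
    }
}
```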
[jira] [Updated] (CASSANDRA-16160) cqlsh row_id resets on page boundaries
[ https://issues.apache.org/jira/browse/CASSANDRA-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Capwell updated CASSANDRA-16160:
--
    Fix Version/s: (was: 4.0-beta) 4.0-beta4
    Since Version: 4.0-alpha1
    Source Control Link: https://github.com/apache/cassandra/commit/cb71f2395896d29fd1f7d248cf48c69cb12c0411
    Resolution: Fixed
    Status: Resolved (was: Ready to Commit)
[jira] [Commented] (CASSANDRA-16160) cqlsh row_id resets on page boundaries
[ https://issues.apache.org/jira/browse/CASSANDRA-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251233#comment-17251233 ]

David Capwell commented on CASSANDRA-16160:
---
ok, see they passed in jenkins, will commit.
[cassandra] branch trunk updated: cqlsh row_id resets on page boundaries
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new cb71f23  cqlsh row_id resets on page boundaries
cb71f23 is described below

commit cb71f2395896d29fd1f7d248cf48c69cb12c0411
Author: Adam Holmberg
AuthorDate: Wed Dec 16 15:08:37 2020 -0800

    cqlsh row_id resets on page boundaries

    patch by Adam Holmberg; reviewed by Brandon Williams, David Capwell for CASSANDRA-16160
---
 CHANGES.txt                              |  1 +
 bin/cqlsh.py                             | 14 +++---
 pylib/cqlshlib/test/test_cqlsh_output.py | 10 ++
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 93d9d4e..6aa6343 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -25,6 +25,7 @@
  * Bring back the accepted encryption protocols list as configurable option (CASSANDRA-13325)
  * DigestResolver.getData throws AssertionError since dataResponse is null (CASSANDRA-16097)
  * Cannot replace_address /X because it doesn't exist in gossip (CASSANDRA-16213)
+ * cqlsh row_id resets on page boundaries (CASSANDRA-16160)
 Merged from 3.11:
  * SASI's `max_compaction_flush_memory_in_mb` settings over 100GB revert to default of 1GB (CASSANDRA-16071)
 Merged from 3.0:
diff --git a/bin/cqlsh.py b/bin/cqlsh.py
index 003515d..5162a00 100644
--- a/bin/cqlsh.py
+++ b/bin/cqlsh.py
@@ -1131,9 +1131,9 @@ class Shell(cmd.Cmd):
         while True:
             # Always print for the first page even it is empty
             if result.current_rows or isFirst:
-                num_rows += len(result.current_rows)
                 with_header = isFirst or tty
-                self.print_static_result(result, table_meta, with_header, tty)
+                self.print_static_result(result, table_meta, with_header, tty, num_rows)
+                num_rows += len(result.current_rows)
             if result.has_more_pages:
                 if self.shunted_query_out is None and tty:
                     # Only pause when not capturing.
@@ -1156,7 +1156,7 @@ class Shell(cmd.Cmd):
             self.writeresult('%d more decoding errors suppressed.'
                              % (len(self.decoding_errors) - 2), color=RED)

-    def print_static_result(self, result, table_meta, with_header, tty):
+    def print_static_result(self, result, table_meta, with_header, tty, row_count_offset=0):
         if not result.column_names and not table_meta:
             return
@@ -1176,7 +1176,7 @@ class Shell(cmd.Cmd):
         formatted_values = [list(map(self.myformat_value, [row[c] for c in column_names], cql_types)) for row in result.current_rows]
         if self.expand_enabled:
-            self.print_formatted_result_vertically(formatted_names, formatted_values)
+            self.print_formatted_result_vertically(formatted_names, formatted_values, row_count_offset)
         else:
             self.print_formatted_result(formatted_names, formatted_values, with_header, tty)
@@ -1207,13 +1207,13 @@ class Shell(cmd.Cmd):
         if tty:
             self.writeresult("")

-    def print_formatted_result_vertically(self, formatted_names, formatted_values):
+    def print_formatted_result_vertically(self, formatted_names, formatted_values, row_count_offset):
         max_col_width = max([n.displaywidth for n in formatted_names])
         max_val_width = max([n.displaywidth for row in formatted_values for n in row])

         # for each row returned, list all the column-value pairs
-        for row_id, row in enumerate(formatted_values):
-            self.writeresult("@ Row %d" % (row_id + 1))
+        for i, row in enumerate(formatted_values):
+            self.writeresult("@ Row %d" % (row_count_offset + i + 1))
             self.writeresult('-%s-' % '-+-'.join(['-' * max_col_width, '-' * max_val_width]))
             for field_id, field in enumerate(row):
                 column = formatted_names[field_id].ljust(max_col_width, color=self.color)
diff --git a/pylib/cqlshlib/test/test_cqlsh_output.py b/pylib/cqlshlib/test/test_cqlsh_output.py
index 304050d..4962167 100644
--- a/pylib/cqlshlib/test/test_cqlsh_output.py
+++ b/pylib/cqlshlib/test/test_cqlsh_output.py
@@ -910,3 +910,13 @@ class TestCqlshOutput(BaseTestCase):
             """),
         ))
+
+    def test_expanded_output_counts_past_page(self):
+        query = "PAGING 5; EXPAND ON; SELECT * FROM twenty_rows_table;"
+        output, result = testcall_cqlsh(prompt=None, env=self.default_env,
+                                        tty=False, input=query)
+        self.assertEqual(0, result)
+        # format is "@ Row 1"
+        row_headers = [s for s in output.splitlines() if "@ Row" in s]
+        row_ids = [int(s.split(' ')[2]) for s in row_headers]
+
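The essence of the patch above can be sketched in a few lines (illustrative names, not cqlsh's real structure): the paging loop accumulates a running row count and passes it down as an offset *before* advancing it, so the per-page printer no longer restarts `enumerate()` at 1 on each page boundary.

```python
# Minimal model of the CASSANDRA-16160 fix: thread a running offset through
# the per-page printer instead of restarting enumerate() on every page.
def print_rows_paged(pages):
    out = []
    num_rows = 0
    for page in pages:
        print_page(page, num_rows, out)   # pass the offset, then advance it
        num_rows += len(page)
    return out

def print_page(rows, row_count_offset, out):
    # "@ Row N" headers now continue counting across page boundaries
    for i, _row in enumerate(rows):
        out.append("@ Row %d" % (row_count_offset + i + 1))
```

For example, three pages of sizes 2, 2, and 1 yield `@ Row 1` through `@ Row 5` with no repeats, matching the 3.0 behavior shown in the bug report.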
[jira] [Commented] (CASSANDRA-16160) cqlsh row_id resets on page boundaries
[ https://issues.apache.org/jira/browse/CASSANDRA-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251232#comment-17251232 ]

David Capwell commented on CASSANDRA-16160:
---
sorry for the delay, ci had issues with checking out python dtests, so rerunning again.
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer updated CASSANDRA-15810:
---
    Fix Version/s: 4.0-beta4
    Since Version: 2.0.4
    Source Control Link: https://github.com/apache/cassandra/commit/0d56f70ae7def5b8ff9e3aef14cfa7dff01a71ac
    Resolution: Fixed
    Status: Resolved (was: Ready to Commit)

Committed into trunk at 0d56f70ae7def5b8ff9e3aef14cfa7dff01a71ac

> Default StringTableSize parameter causes GC slowdown
> --
>
> Key: CASSANDRA-15810
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15810
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Config
> Reporter: Tom van der Woerdt
> Assignee: Benjamin Lerer
> Priority: Normal
> Labels: gc, performance
> Fix For: 4.0-beta4
>
> While looking at tail latency on a Cassandra cluster, it came up that the default StringTableSize in Cassandra is set to a million:
> {code:java}
> # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
> -XX:StringTableSize=1000003{code}
> This was done for CASSANDRA-6410 by [~jbellis] in '13, to optimize heap usage on a test case, running with 500 nodes and num_tokens=512.
> Until Java 13, this string table is implemented as native code, and has to be traversed entirely during the GC initial marking phase, which is a STW event.
> Some testing on my end shows that the pause time of a GC cycle can be reduced by approximately 10 milliseconds if we lower the string table size back to the Java 8 default of 60013 entries.
> Thus, I would recommend this patch (3.11 branch, similar patch for 4.0):
> {code:java}
> diff --git a/conf/jvm.options b/conf/jvm.options
> index 01bb1685b3..c184d18c5d 100644
> --- a/conf/jvm.options
> +++ b/conf/jvm.options
> @@ -107,9 +107,6 @@
>  # Per-thread stack size.
>  -Xss256k
> -# Larger interned string table, for gossip's benefit (CASSANDRA-6410)
> --XX:StringTableSize=1000003
> -
>  # Make sure all memory is faulted and zeroed on startup.
>  # This helps prevent soft faults in containers and makes
>  # transparent hugepage allocation more effective.
> {code}
> It does need some testing on more extreme clusters than I have access to, but I ran some Cassandra nodes with {{-XX:+PrintStringTableStatistics}} which suggested that the Java default will suffice here.
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer updated CASSANDRA-15810:
---
    Fix Version/s: (was: 4.0-beta)
[cassandra] branch trunk updated: Remove use of String.intern()
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new 0d56f70  Remove use of String.intern()
0d56f70 is described below

commit 0d56f70ae7def5b8ff9e3aef14cfa7dff01a71ac
Author: Benjamin Lerer
AuthorDate: Thu Dec 17 11:24:46 2020 +0100

    Remove use of String.intern()

    patch by Benjamin Lerer; reviewed by Brandon Williams for CASSANDRA-15810
---
 CHANGES.txt                                           | 1 +
 conf/jvm-server.options                               | 3 ---
 src/java/org/apache/cassandra/gms/VersionedValue.java | 6 +-
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index e8f89c4..93d9d4e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-beta4
+ * Remove use of String.intern() (CASSANDRA-15810)
  * Fix the missing bb position in ByteBufferAccessor.getUnsignedShort (CASSANDRA-16249)
  * Make sure OOM errors are rethrown on truncation failure (CASSANDRA-16254)
  * Send back client warnings when creating too many tables/keyspaces (CASSANDRA-16309)
diff --git a/conf/jvm-server.options b/conf/jvm-server.options
index c52e192..46967f4 100644
--- a/conf/jvm-server.options
+++ b/conf/jvm-server.options
@@ -103,9 +103,6 @@
 # Per-thread stack size.
 -Xss256k

-# Larger interned string table, for gossip's benefit (CASSANDRA-6410)
--XX:StringTableSize=1000003
-
 # Make sure all memory is faulted and zeroed on startup.
 # This helps prevent soft faults in containers and makes
 # transparent hugepage allocation more effective.
diff --git a/src/java/org/apache/cassandra/gms/VersionedValue.java b/src/java/org/apache/cassandra/gms/VersionedValue.java
index 3dc4c57..7c54559 100644
--- a/src/java/org/apache/cassandra/gms/VersionedValue.java
+++ b/src/java/org/apache/cassandra/gms/VersionedValue.java
@@ -87,11 +87,7 @@ public class VersionedValue implements Comparable
     private VersionedValue(String value, int version)
     {
         assert value != null;
-        // blindly interning everything is somewhat suboptimal -- lots of VersionedValues are unique --
-        // but harmless, and interning the non-unique ones saves significant memory. (Unfortunately,
-        // we don't really have enough information here in VersionedValue to tell the probably-unique
-        // values apart.) See CASSANDRA-6410.
-        this.value = value.intern();
+        this.value = value;
         this.version = version;
     }
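The trade-off the removed comment described can be seen in miniature with Python's `sys.intern`, a rough analogue of Java's `String.intern()`: interning collapses equal strings onto one shared object, which saves memory only when duplicates are common; for mostly-unique values (like many gossip `VersionedValue`s) it just grows the global table.

```python
import sys

# Two equal strings built at runtime are distinct objects; interning maps
# both onto a single shared object. The payoff depends entirely on how
# often equal values actually recur.
a = ",".join(["NORMAL", "-9223372036854775808"])
b = ",".join(["NORMAL", "-9223372036854775808"])
assert a == b and a is not b          # equal values, two objects in memory
assert sys.intern(a) is sys.intern(b)  # one shared object after interning
```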
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer updated CASSANDRA-15810:
---
    Workflow: Cassandra Bug Workflow (was: Cassandra Default Workflow)
    Issue Type: Bug (was: Improvement)
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer updated CASSANDRA-15810:
---
    Status: Ready to Commit (was: Review In Progress)
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer updated CASSANDRA-15810:
---
    Reviewers: Brandon Williams
    Status: Review In Progress (was: Patch Available)
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer updated CASSANDRA-15810:
---
    Test and Documentation Plan: No additional test needed
    Status: Patch Available (was: Open)
[jira] [Updated] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer updated CASSANDRA-15810:
---
    Status: Open (was: Triage Needed)
[jira] [Assigned] (CASSANDRA-16355) Fix flaky test incompletePropose - org.apache.cassandra.distributed.test.CASTest
[ https://issues.apache.org/jira/browse/CASSANDRA-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Lerer reassigned CASSANDRA-16355:
--
    Assignee: Benjamin Lerer

> Fix flaky test incompletePropose - org.apache.cassandra.distributed.test.CASTest
> --
>
> Key: CASSANDRA-16355
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16355
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Lightweight Transactions, Test/dtest/java
> Reporter: David Capwell
> Assignee: Benjamin Lerer
> Priority: Normal
> Fix For: 4.0-beta
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/853/workflows/0766c0de-956e-4831-aa40-9303748a2708/jobs/5030
> {code}
> junit.framework.AssertionFailedError: Expected: [[1, 1, 2]]
> Actual: []
> at org.apache.cassandra.distributed.shared.AssertUtils.fail(AssertUtils.java:193)
> at org.apache.cassandra.distributed.shared.AssertUtils.assertEquals(AssertUtils.java:163)
> at org.apache.cassandra.distributed.shared.AssertUtils.assertRows(AssertUtils.java:63)
> at org.apache.cassandra.distributed.test.CASTest.incompletePropose(CASTest.java:124)
> {code}
[jira] [Commented] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251142#comment-17251142 ] Brandon Williams commented on CASSANDRA-15810: -- +1 > Default StringTableSize parameter causes GC slowdown > > > Key: CASSANDRA-15810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15810 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Tom van der Woerdt >Assignee: Benjamin Lerer >Priority: Normal > Labels: gc, performance > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > While looking at tail latency on a Cassandra cluster, it came up that the > default StringTableSize in Cassandra is set to a million: > {code:java} > # Larger interned string table, for gossip's benefit (CASSANDRA-6410) > -XX:StringTableSize=103{code} > This was done for CASSANDRA-6410 by [~jbellis] in '13, to optimize heap usage > on a test case, running with 500 nodes and num_tokens=512. > Until Java 13, this string table is implemented as native code, and has to be > traversed entirely during the GC initial marking phase, which is a STW event. > Some testing on my end shows that the pause time of a GC cycle can be reduced > by approximately 10 milliseconds if we lower the string table size back to > the Java 8 default of 60013 entries. > Thus, I would recommend this patch (3.11 branch, similar patch for 4.0): > {code:java} > diff --git a/conf/jvm.options b/conf/jvm.options > index 01bb1685b3..c184d18c5d 100644 > --- a/conf/jvm.options > +++ b/conf/jvm.options > @@ -107,9 +107,6 @@ > # Per-thread stack size. > -Xss256k > -# Larger interned string table, for gossip's benefit (CASSANDRA-6410) > --XX:StringTableSize=103 > - > # Make sure all memory is faulted and zeroed on startup. > # This helps prevent soft faults in containers and makes > # transparent hugepage allocation more effective. 
> {code} > It does need some testing on more extreme clusters than I have access to, but > I ran some Cassandra nodes with {{-XX:+PrintStringTableStatistics}} which > suggested that the Java default will suffice here.
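[Editor's note] The mechanism behind this ticket is the JVM-wide interned-string table: {{String.intern()}} returns one canonical object per distinct string value, and {{-XX:StringTableSize}} sizes the hash table holding those canonical copies. A minimal illustrative sketch (class name and sample values are made up, not from Cassandra):

```java
// Demonstrates what the interned-string table stores: one canonical
// object per distinct string value, shared JVM-wide.
public class InternDemo {
    // True iff both strings resolve to the same canonical table entry.
    static boolean sameInterned(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct heap objects with equal contents.
        String a = new String("10.0.0.1");
        String b = new String("10.0.0.1");
        System.out.println(a == b);              // false: distinct objects
        System.out.println(sameInterned(a, b));  // true: one table entry
    }
}
```

Every entry in that table is walked during the initial-mark pause, which is why an oversized table (1,000,003 buckets vs. the Java 8 default of 60,013) shows up as GC stop-the-world time.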
[jira] [Updated] (CASSANDRA-16360) CRC32 is inefficient on x86
[ https://issues.apache.org/jira/browse/CASSANDRA-16360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-16360: Labels: protocolv5 (was: ) > CRC32 is inefficient on x86 > --- > > Key: CASSANDRA-16360 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16360 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Client >Reporter: Avi Kivity >Priority: Normal > Labels: protocolv5 > Fix For: 4.0-beta > > > The client/server protocol specifies CRC24 and CRC32 as the checksum > algorithm (cql_protocol_V5_framing.asc). Those however are expensive to > compute; this affects both the client and the server. > > A better checksum algorithm is CRC32C, which has hardware support on x86 (as > well as other modern architectures).
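[Editor's note] The JDK already ships both algorithms: {{java.util.zip.CRC32}} and, since Java 9, {{java.util.zip.CRC32C}} (the Castagnoli polynomial, which x86 accelerates via the SSE4.2 {{crc32}} instruction). A sketch comparing the two over the standard "123456789" check input (class name is illustrative):

```java
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Computes a checksum over a byte array using either CRC32 or CRC32C;
// both classes implement java.util.zip.Checksum.
public class ChecksumCompare {
    static long checksum(Checksum c, byte[] data) {
        c.update(data, 0, data.length);
        return c.getValue();
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes();
        // Well-known check values for the two polynomials:
        // CRC32  -> 0xCBF43926, CRC32C -> 0xE3069283
        System.out.printf("CRC32  = %08X%n", checksum(new CRC32(), check));
        System.out.printf("CRC32C = %08X%n", checksum(new CRC32C(), check));
    }
}
```

Swapping the algorithm is mechanical on the JVM side since both implement the same {{Checksum}} interface; the protocol change is what requires a version bump.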
[jira] [Updated] (CASSANDRA-16360) CRC32 is inefficient on x86
[ https://issues.apache.org/jira/browse/CASSANDRA-16360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-16360: Change Category: Semantic Complexity: Normal Fix Version/s: 4.0-beta Status: Open (was: Triage Needed) > CRC32 is inefficient on x86 > --- > > Key: CASSANDRA-16360 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16360 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Client >Reporter: Avi Kivity >Priority: Normal > Fix For: 4.0-beta
[jira] [Commented] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250981#comment-17250981 ] Benjamin Lerer commented on CASSANDRA-15810: [~brandon.williams] do you have the time to review this patch? > Default StringTableSize parameter causes GC slowdown > > > Key: CASSANDRA-15810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15810 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Tom van der Woerdt >Assignee: Benjamin Lerer >Priority: Normal > Labels: gc, performance > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250980#comment-17250980 ] Benjamin Lerer commented on CASSANDRA-15810: I pushed a [PR|https://github.com/apache/cassandra/pull/857] for 4.0. CI results: [j8|https://app.circleci.com/pipelines/github/blerer/cassandra/84/workflows/05e5708a-0378-4b67-bd3b-427487a866a0], [j11|https://app.circleci.com/pipelines/github/blerer/cassandra/84/workflows/5fe0c950-9886-4df1-9246-711a5a542ae2]. > Default StringTableSize parameter causes GC slowdown > > > Key: CASSANDRA-15810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15810 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Tom van der Woerdt >Assignee: Benjamin Lerer >Priority: Normal > Labels: gc, performance > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250971#comment-17250971 ] Avi Kivity commented on CASSANDRA-13304: I filed https://issues.apache.org/jira/browse/CASSANDRA-16360 proposing to change to CRC32C. > Add checksumming to the native protocol > --- > > Key: CASSANDRA-13304 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13304 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Core >Reporter: Michael Kjellman >Assignee: Sam Tunnicliffe >Priority: Urgent > Fix For: 4.0, 4.0-alpha1 > > Attachments: 13304_v1.diff, boxplot-read-throughput.png, > boxplot-write-throughput.png > > > The native binary transport implementation doesn't include checksums. This > makes it highly susceptible to silently inserting corrupted data either due > to hardware issues causing bit flips on the sender/client side, C*/receiver > side, or network in between. > Attaching an implementation that makes checksum'ing mandatory (assuming both > client and server know about a protocol version that supports checksums) -- > and also adds checksumming to clients that request compression. > The serialized format looks something like this: > {noformat} > * 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 > * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Number of Compressed Chunks | Compressed Length (e1)/ > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * / Compressed Length cont. (e1) |Uncompressed Length (e1) / > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)| > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Checksum of Lengths cont. 
(e1)|Compressed Bytes (e1)+// > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | CRC32 Checksum (e1) || > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |Compressed Length (e2) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Uncompressed Length (e2)| > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |CRC32 Checksum of Lengths (e2) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Compressed Bytes (e2) +// > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | CRC32 Checksum (e2) || > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |Compressed Length (en) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Uncompressed Length (en)| > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |CRC32 Checksum of Lengths (en) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Compressed Bytes (en) +// > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | CRC32 Checksum (en) || > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > {noformat} > The first pass here adds checksums only to the actual contents of the frame > body itself (and doesn't actually checksum lengths and headers). While it > would be great to fully add checksuming across the entire protocol, the > proposed implementation will ensure we at least catch corrupted data and > likely protect ourselves pretty well anyways. > I didn't go to the trouble of implementing a Snappy Checksum'ed Compressor > implementation as it's been deprecated for a while -- is really slow and > crappy compared to LZ4 -- and we should do everything in our power to make > sure no one in the community is still using it. I left it in (for obvious > backwards compatibility aspects) old for clients that don't know about the > new protocol. 
> The current protocol has a 256MB (max) frame body -- where the serialized > contents are simply written in to the frame body. > If the client sends a compression option in the startup, we will install a > FrameCompressor inline. Unfortunately, we went with a decision to treat the > frame body separately from the header bits etc in a given message. So, > instead we put a compressor implementation in the options and then
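[Editor's note] A rough sketch of the per-chunk framing described above: each chunk carries its compressed and uncompressed lengths, a CRC32 over those lengths, the compressed bytes, and a trailing CRC32 over the bytes. Field widths and the class/method names here are illustrative assumptions, not the exact wire layout from the patch:

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Illustrative encoder for one chunk of the checksummed frame format:
// [compressed len][uncompressed len][CRC32 of lengths][payload][CRC32 of payload]
// Widths (4-byte ints) are an assumption for the sketch only.
public class ChunkFramer {
    static byte[] frameChunk(byte[] compressed, int uncompressedLength) {
        // Serialize the two lengths, then checksum them so a corrupted
        // length can't cause a bogus read of the payload region.
        ByteBuffer lengths = ByteBuffer.allocate(8)
                .putInt(compressed.length)
                .putInt(uncompressedLength);
        CRC32 lengthsCrc = new CRC32();
        lengthsCrc.update(lengths.array());

        // Independent checksum over the (compressed) payload itself.
        CRC32 payloadCrc = new CRC32();
        payloadCrc.update(compressed, 0, compressed.length);

        ByteBuffer out = ByteBuffer.allocate(8 + 4 + compressed.length + 4);
        out.put(lengths.array());
        out.putInt((int) lengthsCrc.getValue());
        out.put(compressed);
        out.putInt((int) payloadCrc.getValue());
        return out.array();
    }

    public static void main(String[] args) {
        byte[] chunk = frameChunk("abc".getBytes(), 3);
        System.out.println(chunk.length); // 8 + 4 + 3 + 4 = 19
    }
}
```

Checksumming the lengths separately (before trusting them to delimit the payload) is the design point the diagram encodes: the reader validates the length fields first, then the payload.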
[jira] [Created] (CASSANDRA-16360) CRC32 is inefficient on x86
Avi Kivity created CASSANDRA-16360: -- Summary: CRC32 is inefficient on x86 Key: CASSANDRA-16360 URL: https://issues.apache.org/jira/browse/CASSANDRA-16360 Project: Cassandra Issue Type: Improvement Components: Messaging/Client Reporter: Avi Kivity The client/server protocol specifies CRC24 and CRC32 as the checksum algorithm (cql_protocol_V5_framing.asc). Those however are expensive to compute; this affects both the client and the server. A better checksum algorithm is CRC32C, which has hardware support on x86 (as well as other modern architectures).
[jira] [Assigned] (CASSANDRA-15810) Default StringTableSize parameter causes GC slowdown
[ https://issues.apache.org/jira/browse/CASSANDRA-15810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-15810: -- Assignee: Benjamin Lerer > Default StringTableSize parameter causes GC slowdown > > > Key: CASSANDRA-15810 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15810 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Tom van der Woerdt >Assignee: Benjamin Lerer >Priority: Normal > Labels: gc, performance > Fix For: 4.0-beta