[jira] [Updated] (CASSANDRA-15803) Separate out allow filtering scanning through a partition versus scanning over the table
[ https://issues.apache.org/jira/browse/CASSANDRA-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15803: - Description:

Currently allow filtering can mean two things, in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, a query could specify the full partition key; if it also specifies a criterion on a non-key field, it requires allow filtering. The second case is significantly less work: it only scans through one partition. It is still extra work over seeking to a specific row and getting N sequential rows, though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case.

I propose that we separate the way to specify allow filtering across an entire table (involving a scatter-gather) from specifying allow filtering within a partition, in a backwards-compatible way. One idea brought up in the cassandra-dev Slack room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition, you would use something like {{ALLOW FILTERING [WITHIN PARTITION]}}. Such a query would succeed if you specify non-key criteria within a single partition, but fail with a message saying it requires the full allow filtering. This would keep full allow filtering backwards compatible while allowing a user to specify that they want to scan only within a partition, and error out when trying to scan a full table.
This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion).
> Separate out allow filtering scanning through a partition versus scanning > over the table > > > Key: CASSANDRA-15803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15803 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Syntax > Reporter: Jeremy Hanna > Priority: Normal
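The cost difference the proposal rests on - filtering inside a single looked-up partition versus touching every partition in the table - can be sketched in a toy model (Python, illustrative only; this is not Cassandra's storage engine, and all names are made up):

```python
# Toy model of the two ALLOW FILTERING costs. A "table" maps a partition
# key to that partition's ordered rows.
table = {
    "p1": [{"ck": 1, "v": "a"}, {"ck": 2, "v": "b"}],
    "p2": [{"ck": 1, "v": "b"}, {"ck": 2, "v": "c"}],
}

def filter_within_partition(table, pk, pred):
    """Cheaper: one partition lookup, then a linear pass over its rows only."""
    return [row for row in table.get(pk, []) if pred(row)]

def filter_whole_table(table, pred):
    """Expensive: touches every partition (a scatter-gather in a cluster)."""
    return [row for rows in table.values() for row in rows if pred(row)]

# Filtering on a non-key column within one known partition:
print(filter_within_partition(table, "p1", lambda r: r["v"] == "b"))
# The same predicate as a full table scan:
print(filter_whole_table(table, lambda r: r["v"] == "b"))
```

The proposed {{ALLOW FILTERING [WITHIN PARTITION]}} syntax would, in effect, permit only the first shape of query and reject the second.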
[jira] [Commented] (CASSANDRA-15775) Configuration to disallow queries with "allow filtering"
[ https://issues.apache.org/jira/browse/CASSANDRA-15775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105070#comment-17105070 ] Jeremy Hanna commented on CASSANDRA-15775: -- See CASSANDRA-8303, which generalizes this. The problem, though, is that allow filtering has two purposes - first, it allows you to scan over multiple partitions, which is almost always bad. Second, it is needed if you are scanning through a partition. See CASSANDRA-15803 for a proposal to separate the first case from the second. > Configuration to disallow queries with "allow filtering" > > > Key: CASSANDRA-15775 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15775 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Local Write-Read Paths > Reporter: Christian Fredriksson > Priority: Normal > > Problem: We have inexperienced developers not following guidelines or best > practices who do queries with "allow filtering" which have a negative impact on > performance for other queries and developers. > It would be beneficial to have a (server side) configuration to disallow > these queries altogether. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
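The server-side switch requested in this ticket amounts to a validation guard in front of query execution. A generic sketch (Python; hypothetical names, not Cassandra's actual config or validation code):

```python
# Hypothetical server-side guard rejecting ALLOW FILTERING queries when an
# operator-set flag disables them. Illustrative only.
class Config:
    allow_filtering_enabled = False  # operator-controlled setting

class InvalidRequest(Exception):
    pass

def validate(query: str, config: Config) -> bool:
    # Reject up front, before any partitions are read.
    if "ALLOW FILTERING" in query.upper() and not config.allow_filtering_enabled:
        raise InvalidRequest(
            "ALLOW FILTERING has been disabled by the operator on this cluster")
    return True

validate("SELECT * FROM users WHERE id = 1", Config())  # passes
try:
    validate("SELECT * FROM users WHERE name = 'x' ALLOW FILTERING", Config())
except InvalidRequest as e:
    print(e)
```

Per the comment above, a real implementation would likely want to distinguish partition-scoped filtering from whole-table filtering (CASSANDRA-15803) rather than rejecting both uniformly.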
[jira] [Created] (CASSANDRA-15803) Separate out allow filtering scanning through a partition versus scanning over the table
Jeremy Hanna created CASSANDRA-15803: Summary: Separate out allow filtering scanning through a partition versus scanning over the table Key: CASSANDRA-15803 URL: https://issues.apache.org/jira/browse/CASSANDRA-15803 Project: Cassandra Issue Type: Improvement Components: CQL/Syntax Reporter: Jeremy Hanna

Currently allow filtering can mean two things, in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, a query could specify the full partition key; if it also specifies a criterion on a non-key field, it requires allow filtering. The second case is significantly less work: it only scans through one partition. It is still extra work over seeking to a specific row and getting N sequential rows, though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case.

I propose that we separate the way to specify allow filtering across an entire table (involving a scatter-gather) from specifying allow filtering within a partition, in a backwards-compatible way. One idea brought up in the cassandra-dev Slack room was to have allow filtering mean the superset - scanning across the table - plus a way to specify that you *only* want to scan within a partition. Such a query would succeed if you specify non-key criteria within a single partition, but fail with a message saying it requires the full allow filtering.
One way would be to have it be {{ALLOW FILTERING [WITHIN PARTITION]}}. This would keep full allow filtering backwards compatible while allowing a user to specify that they want to scan only within a partition, and error out when trying to scan a full table. This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion).
[jira] [Commented] (CASSANDRA-15670) Transient Replication: unable to insert data when the keyspace is configured with the SimpleStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105035#comment-17105035 ] Yifan Cai commented on CASSANDRA-15670: --- Hi [~fcofdezc], thank you for the patch. Can you also link the CircleCI result? > Transient Replication: unable to insert data when the keyspace is configured > with the SimpleStrategy > > > Key: CASSANDRA-15670 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15670 > Project: Cassandra > Issue Type: Bug > Components: Feature/Transient Replication > Reporter: Alan Boudreault > Priority: Normal > Labels: pull-request-available > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > An error is thrown when trying to insert data with the transient replication > + SimpleStrategy configured. > Test case: > {code:java} > CREATE KEYSPACE test_tr WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': '3/1'}; > CREATE TABLE test_tr.users (id int PRIMARY KEY, username text) with > read_repair ='NONE'; > INSERT INTO test_tr.users (id, username) VALUES (1, 'alan');{code} > > traceback: > {code:java} > ERROR [Native-Transport-Requests-8] 2020-03-27 10:27:17,188 > ErrorMessage.java:450 - Unexpected exception during request > java.lang.ClassCastException: org.apache.cassandra.locator.SimpleStrategy > cannot be cast to org.apache.cassandra.locator.NetworkTopologyStrategy > at > org.apache.cassandra.db.ConsistencyLevel.eachQuorumForRead(ConsistencyLevel.java:103) > at > org.apache.cassandra.db.ConsistencyLevel.eachQuorumForWrite(ConsistencyLevel.java:112) > at > org.apache.cassandra.locator.ReplicaPlans$2.select(ReplicaPlans.java:409) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:353) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:348) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:341) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:330) > 
at > org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1171) > at > org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:713) > at > org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:951) > at > org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:475) > at > org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:453) > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:216) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:247) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:233) > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:108) > at > org.apache.cassandra.transport.Message$Request.execute(Message.java:253) > at > org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:725) > at > org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:630) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:165) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {code} > > --> > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L103 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
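The traceback shows an unconditional downcast of the keyspace's replication strategy. The general shape of the defect and a guard for it can be illustrated generically (Python sketch with hypothetical class names - this is not the actual Cassandra fix):

```python
# Illustration of guarding a strategy-specific code path instead of
# casting unconditionally. Class names are hypothetical.
class ReplicationStrategy: ...

class SimpleStrategy(ReplicationStrategy): ...

class NetworkTopologyStrategy(ReplicationStrategy):
    def datacenters(self):
        return ["dc1"]

def each_quorum_replica_count(strategy):
    # EACH_QUORUM is only meaningful per datacenter, so check the type
    # before using NTS-specific accessors; an unchecked cast is what
    # produced the ClassCastException in the traceback above.
    if not isinstance(strategy, NetworkTopologyStrategy):
        raise ValueError("EACH_QUORUM requires NetworkTopologyStrategy, got "
                         + type(strategy).__name__)
    return len(strategy.datacenters())
```

Whether the real fix should reject SimpleStrategy with a clear error or support it in the transient-replication write path is exactly what the ticket has to decide.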
[jira] [Commented] (CASSANDRA-15733) jvm dtest builder should be provided to the factory and expose state
[ https://issues.apache.org/jira/browse/CASSANDRA-15733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104987#comment-17104987 ] David Capwell commented on CASSANDRA-15733: --- [~ifesdjeen] poke =) > jvm dtest builder should be provided to the factory and expose state > > > Key: CASSANDRA-15733 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15733 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest > Reporter: David Capwell > Assignee: David Capwell > Priority: Normal > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > Currently the builder is rather heavy: it creates configs and calls the > factory with specific fields only. This isn't very flexible and makes it > harder to have custom cluster definitions that require additional fields to > be defined. To solve this, we should send the builder to the factory > and expose its state so the factory can get all the fields it needs; the > factory should also be in charge of creating the configs.
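The refactoring described in the ticket - pass the whole builder to the factory and let the factory read the state it needs, including config creation - can be sketched minimally (Python; hypothetical names, not the jvm-dtest API):

```python
# Minimal sketch of "builder provided to the factory and exposing state".
# All names are hypothetical.
class ClusterBuilder:
    def __init__(self, factory):
        self.factory = factory
        self.node_count = 1
        self.extra = {}

    def with_nodes(self, n):
        self.node_count = n
        return self

    def with_option(self, key, value):
        self.extra[key] = value
        return self

    def start(self):
        # Hand the whole builder over; the factory pulls whatever fields
        # it needs and owns config creation, so custom factories can use
        # fields the builder itself does not special-case.
        return self.factory(self)

def default_factory(builder):
    return {"nodes": builder.node_count, **builder.extra}

cluster = ClusterBuilder(default_factory).with_nodes(3).with_option("vnodes", False).start()
```

A custom cluster definition would then just supply a different factory function without the builder needing to know its extra fields.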
[jira] [Commented] (CASSANDRA-15776) python dtest regression caused by CASSANDRA-15637
[ https://issues.apache.org/jira/browse/CASSANDRA-15776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104986#comment-17104986 ] David Capwell commented on CASSANDRA-15776: --- Thanks mck. The linked build had a few failures. unit: https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch-test/detail/Cassandra-devbranch-test/90/tests. org.apache.cassandra.audit.AuditLoggerAuthTest: the jvm crashed; I don't see the logs archived, so I wasn't sure what those logs said. cdc: it looks like something killed the ant process - is this a timeout? It then failed reading a file that didn't exist. > python dtest regression caused by CASSANDRA-15637 > - > > Key: CASSANDRA-15776 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15776 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest > Reporter: David Capwell > Assignee: David Capwell > Priority: Normal > Fix For: 4.0-alpha > > Time Spent: 1h 40m > Remaining Estimate: 0h > > CASSANDRA-15637 deprecated size_estimates in favor of table_estimates to > allow for local primary range estimates (needed for MapReduce). This appears > to have caused a regression in the python dtest nodetool_test.TestNodetool. > test_refresh_size_estimates_clears_invalid_entries (as seen by [Circle > CI|https://app.circleci.com/pipelines/github/dcapwell/cassandra/255/workflows/21907001-93ed-4963-9314-6a0ac6ea0f1d/jobs/1246/tests] > and > [Jenkins|https://ci-cassandra.apache.org/job/Cassandra-trunk-dtest/56/]).
[jira] [Comment Edited] (CASSANDRA-15797) Fix flaky BinLogTest - org.apache.cassandra.utils.binlog.BinLogTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104880#comment-17104880 ] Yifan Cai edited comment on CASSANDRA-15797 at 5/11/20, 11:17 PM: -- PR: https://github.com/apache/cassandra/pull/588 Code: https://github.com/yifan-c/cassandra/tree/CASSANDRA-15797-Flaky-BinLogTest Test: https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=CASSANDRA-15797-Flaky-BinLogTest _update: unit and jvm dtests passed on Java 8 and Java 11_ *Failed tests* * BinLogTest.testPutAfterStop: there is a chance that, at the time of the assertion, the consumer thread in BinLog has not yet drained the queue, so the assertion gets the last NO_OP object and fails. The behavior is expected. The fix to the test is to assert that none of the records taken from the queue is the one inserted after stopping the BinLog. * BinLogTest.testBinLogStartStop: there is no barrier in the test to block the assertion until the records in the queue are consumed. Added a CountDownLatch as the barrier. *Chronicle reference counter history trace* The trace indicates that the last release operation sees the reference counter already at 0, so it prints the history. This is because the resources have already been released by the try-with-resources statement. And according to StoreComponentReferenceHandler#processWireQueue, *the exception can be ignored as the resources have already been released*. In this case, every test that calls readBinLogRecords can see the reference counter history. Why is the history trace not printed every time? Because the test has already ended and there is no time to print it. If a sleep (5 seconds) is added at the end of the test, the reference count history is guaranteed to be printed.
> Fix flaky BinLogTest - org.apache.cassandra.utils.binlog.BinLogTest > --- > > Key: CASSANDRA-15797 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15797 > Project: Cassandra > Issue Type: Bug > Components: Test/unit > Reporter: Jon Meredith > Assignee: Yifan Cai > Priority: Normal > Fix For: 4.0-alpha > > > An internal CI system is failing BinLogTest somewhat frequently under JDK11. 
> Configuration was recently changed to reduce the number of cores the tests > run with, however it is reproducible on an 8 core laptop. > {code} > [junit-timeout] OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC > was deprecated in version 9.0 and will likely be removed in a future release. > [junit-timeout] Testsuite: org.apache.cassandra.utils.binlog.BinLogTest > [Junit-timeout] WARNING: An illegal reflective access operation has occurred > [junit-timeout] WARNING: Illegal reflective access by > net.openhft.chronicle.core.Jvm (file:/.../lib/chronicle-core-1.16.4.jar) to > field java.nio.Bits.RESERVED_MEMORY > [junit-timeout] WARNING: Please consider reporting this to the maintainers of > net.openhft.chronicle.core.Jvm > [junit-timeout] WARNING: Use --illegal-access=warn to enable warnings of > further illegal reflective access operations > [junit-timeout] WARNING: All illegal access operations will be denied in a > future release > [junit-timeout] Testsuite: org.apache.cassandra.utils.binlog.BinLogTest Tests > run: 13, Failures: 1,
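The barrier fix described in the comment above - make the assertion wait until the consumer has drained the queue - can be sketched generically in Python, with `threading.Event` standing in for the CountDownLatch (this is not the actual BinLogTest code):

```python
import queue
import threading

q = queue.Queue()
drained = threading.Event()  # plays the role of the CountDownLatch
records = []

def consumer():
    # Drain records until the stop sentinel, then signal the barrier.
    while True:
        item = q.get()
        if item is None:
            drained.set()
            return
        records.append(item)

t = threading.Thread(target=consumer)
t.start()
q.put("record-1")
q.put("record-2")
q.put(None)  # sentinel: stop

# Without this wait, the assertion can run before the consumer thread has
# drained the queue -- the exact race that made the test flaky.
assert drained.wait(timeout=5)
assert records == ["record-1", "record-2"]
t.join()
```

Deleting the `drained.wait(...)` line reintroduces the race: the asserts would then only pass when the consumer happens to win the scheduling lottery.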
[jira] [Commented] (CASSANDRA-15677) Topology events are not sent to clients if the nodes use the same network interface
[ https://issues.apache.org/jira/browse/CASSANDRA-15677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104902#comment-17104902 ] Alan Boudreault commented on CASSANDRA-15677: - I've unassigned myself in case someone else is looking for a task to pick up, since I haven't been able to get back to this yet. This ticket only affects a cluster that is using a single interface for all nodes (different ports). From my discussion with Ariel: CASSANDRA-7544 was tested by manually running all dtests with the single-interface ccm option. There is actually no specific dtest for that, and it's not something that is run regularly. So to complete this ticket: * If there is no way yet to easily run the dtests with the single-interface mode, it might be a good idea to add something for this. * Check if there is a test for those (expected) topology events. * Run it and see why it is not failing with the single-interface mode. * Run the entire dtest suite with the single-interface mode and ensure everything is OK. > Topology events are not sent to clients if the nodes use the same network > interface > --- > > Key: CASSANDRA-15677 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15677 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Client > Reporter: Alan Boudreault > Priority: Normal > Labels: pull-request-available > Fix For: 4.0-rc > > Time Spent: 20m > Remaining Estimate: 0h > > *This bug only happens when the cassandra nodes are configured to use a > single network interface (ip) but different ports. See CASSANDRA-7544.* > Issue: The topology events aren't sent to clients. 
The problem is that the > port is not taken into account when determining if we send it or not: > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/Server.java#L624 > To reproduce: > {code} > # I think the cassandra-test branch is required to get the -S option > (USE_SINGLE_INTERFACE) > ccm create -n4 local40 -v 4.0-alpha2 -S > {code} > > Then run this small python driver script: > {code} > import time > from cassandra.cluster import Cluster > cluster = Cluster() > session = cluster.connect() > while True: > print(cluster.metadata.all_hosts()) > print([h.is_up for h in cluster.metadata.all_hosts()]) > time.sleep(5) > {code} > Then decommission a node: > {code} > ccm node2 nodetool disablebinary > ccm node2 nodetool decommission > {code} > > You should see that the node is never removed from the client cluster > metadata and the reconnector started. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
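The root cause described above - the port is ignored when comparing endpoints, so nodes sharing one IP are conflated - can be illustrated generically (Python sketch, hypothetical functions; not the Server.java code):

```python
# Illustration of the endpoint-comparison bug: comparing by address only
# vs. by (address, port). Function names are hypothetical.
def same_endpoint_addr_only(a, b):
    # Buggy comparison: ignores the port, so two nodes sharing one IP but
    # listening on different ports are treated as the same endpoint, and
    # topology events for the "other" node are suppressed.
    return a[0] == b[0]

def same_endpoint(a, b):
    # Fixed comparison: address AND port must match.
    return a == b

node_a = ("127.0.0.1", 9042)
node_b = ("127.0.0.1", 9043)
assert same_endpoint_addr_only(node_a, node_b)  # wrongly considered equal
assert not same_endpoint(node_a, node_b)
```

This matches the repro: in a ccm cluster created with `-S`, every node shares one IP, so the address-only check can never distinguish the decommissioned node from the others.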
[jira] [Assigned] (CASSANDRA-15677) Topology events are not sent to clients if the nodes use the same network interface
[ https://issues.apache.org/jira/browse/CASSANDRA-15677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Boudreault reassigned CASSANDRA-15677: --- Assignee: (was: Alan Boudreault) > Topology events are not sent to clients if the nodes use the same network > interface
[jira] [Commented] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104885#comment-17104885 ] Ekaterina Dimitrova commented on CASSANDRA-15783: - Thank you both [~djoshi] and [~bdeggleston]! > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest > Reporter: Ekaterina Dimitrova > Assignee: Ekaterina Dimitrova > Priority: Normal > Fix For: 4.0-alpha > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests
[jira] [Updated] (CASSANDRA-15797) Fix flaky BinLogTest - org.apache.cassandra.utils.binlog.BinLogTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-15797: -- Source Control Link: https://github.com/apache/cassandra/pull/588 > Fix flaky BinLogTest - org.apache.cassandra.utils.binlog.BinLogTest > --- > > Key: CASSANDRA-15797 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15797 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Jon Meredith >Assignee: Yifan Cai >Priority: Normal > Fix For: 4.0-alpha > > > An internal CI system is failing BinLogTest somewhat frequently under JDK11. > Configuration was recently changed to reduce the number of cores the tests > run with, however it is reproducible on an 8 core laptop. > {code} > [junit-timeout] OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC > was deprecated in version 9.0 and will likely be removed in a future release. > [junit-timeout] Testsuite: org.apache.cassandra.utils.binlog.BinLogTest > [Junit-timeout] WARNING: An illegal reflective access operation has occurred > [junit-timeout] WARNING: Illegal reflective access by > net.openhft.chronicle.core.Jvm (file:/.../lib/chronicle-core-1.16.4.jar) to > field java.nio.Bits.RESERVED_MEMORY > [junit-timeout] WARNING: Please consider reporting this to the maintainers of > net.openhft.chronicle.core.Jvm > [junit-timeout] WARNING: Use --illegal-access=warn to enable warnings of > further illegal reflective access operations > [junit-timeout] WARNING: All illegal access operations will be denied in a > future release > [junit-timeout] Testsuite: org.apache.cassandra.utils.binlog.BinLogTest Tests > run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.895 sec > [junit-timeout] > [junit-timeout] Testcase: > testPutAfterStop(org.apache.cassandra.utils.binlog.BinLogTest): FAILED > [junit-timeout] expected: but > was: > [junit-timeout] junit.framework.AssertionFailedError: expected: but > was: > [junit-timeout] at > 
org.apache.cassandra.utils.binlog.BinLogTest.testPutAfterStop(BinLogTest.java:431) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > [junit-timeout] > [junit-timeout] > [junit-timeout] Test org.apache.cassandra.utils.binlog.BinLogTest FAILED > {code} > There's also a different failure under JDK8 > {code} > junit-timeout] Testsuite: org.apache.cassandra.utils.binlog.BinLogTest > [junit-timeout] Testsuite: org.apache.cassandra.utils.binlog.BinLogTest Tests > run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 15.273 sec > [junit-timeout] > [junit-timeout] Testcase: > testBinLogStartStop(org.apache.cassandra.utils.binlog.BinLogTest): FAILED > [junit-timeout] expected:<2> but was:<0> > [junit-timeout] junit.framework.AssertionFailedError: expected:<2> but was:<0> > [junit-timeout] at > org.apache.cassandra.utils.binlog.BinLogTest.testBinLogStartStop(BinLogTest.java:172) > [junit-timeout] > [junit-timeout] > [junit-timeout] Test org.apache.cassandra.utils.binlog.BinLogTest FAILED > {code} > Reproducer > {code} > PASSED=0; time { while ant testclasslist -Dtest.classlistprefix=unit > -Dtest.classlistfile=<(echo > org/apache/cassandra/utils/binlog/BinLogTest.java); do PASSED=$((PASSED+1)); > echo PASSED $PASSED; done }; echo FAILED after $PASSED runs. > {code} > In the last four attempts it has taken 31, 38, 27 and 10 rounds respectively > under JDK11 and took 51 under JDK8 (about 15 minutes). > I have not tried running in a cpu-limited container or anything like that yet. > Additionally, this went past in the logs a few times (under JDK11). No idea > if it's just an artifact of weird test setup, or something more serious. 
> {code} > [junit-timeout] WARNING: Please consider reporting this to the maintainers of > net.openhft.chronicle.core.Jvm > [junit-timeout] WARNING: Use --illegal-access=warn to enable warnings of > further illegal reflective access operations > [junit-timeout] WARNING: All illegal access operations will be denied in a > future release > [junit-timeout] Testsuite: org.apache.cassandra.utils.binlog.BinLogTest Tests > run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.839 sec > [junit-timeout] > [junit-timeout] java.lang.Throwable: 1e53135d-main creation ref-count=1 > [junit-timeout] at > net.openhft.chronicle.core.ReferenceCounter.newRefCountHistory(ReferenceCounter.java:45) > [junit-timeout] at >
[jira] [Updated] (CASSANDRA-15797) Fix flaky BinLogTest - org.apache.cassandra.utils.binlog.BinLogTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-15797: -- Test and Documentation Plan: unit test Status: Patch Available (was: In Progress)
[jira] [Commented] (CASSANDRA-15797) Fix flaky BinLogTest - org.apache.cassandra.utils.binlog.BinLogTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104880#comment-17104880 ] Yifan Cai commented on CASSANDRA-15797: --- PR: https://github.com/apache/cassandra/pull/588 Code: https://github.com/yifan-c/cassandra/tree/CASSANDRA-15797-Flaky-BinLogTest Test: https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=CASSANDRA-15797-Flaky-BinLogTest *Failed tests* * BinLogTest.testPutAfterStop: there is a chance that, at the time of the assertion, the consumer thread in BinLog has not yet drained the queue, so the assertion reads the last NO_OP object and fails. The behavior is expected. The fix to the test is to assert that none of the records read from the queue is the one inserted after stopping the BinLog. * BinLogTest.testBinLogStartStop: there is no barrier in the test to delay the assertion until the records in the queue are consumed. Added a CountDownLatch as the barrier. *Chronicle reference counter history trace* The trace indicates that the last release operation sees a reference counter that is already 0, so it prints the history. This happens because the resources have already been released by the try-with-resources statement. And according to StoreComponentReferenceHandler#processWireQueue, *the exception can be ignored as the resources have already been released*. In this case, every test that calls readBinLogRecords can see the reference counter history. Why is the history trace not printed every time? Because the test has already ended and there is no time to print it. If a sleep (5 seconds) is added at the end of the test, the reference count history is guaranteed to be printed.
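The drain race and its barrier fix can be sketched as follows. This is a hedged Python illustration of the pattern using `threading.Event` (the actual fix is Java and uses a `CountDownLatch` in `BinLogTest`); all names here are hypothetical:

```python
import queue
import threading

def start_consumer(q, consumed, drained):
    # Background consumer, standing in for BinLog's consumer thread:
    # it drains the queue and signals `drained` once it sees the sentinel.
    def run():
        while True:
            item = q.get()
            if item is None:  # sentinel: "binlog stopped"
                break
            consumed.append(item)
        drained.set()
    t = threading.Thread(target=run)
    t.start()
    return t

q = queue.Queue()
consumed = []
drained = threading.Event()
t = start_consumer(q, consumed, drained)

q.put("record-1")
q.put("record-2")
q.put(None)                  # stop the consumer
q.put("record-after-stop")   # arrives after stop; must never be consumed

# The fix pattern: block on a barrier (a CountDownLatch in the Java test)
# until the consumer has drained the queue, instead of asserting
# immediately and racing the consumer thread.
assert drained.wait(timeout=5)
t.join()

# testBinLogStartStop-style assertion: all records before stop were consumed.
assert consumed == ["record-1", "record-2"]
# testPutAfterStop-style assertion: the late record never reached the log.
assert "record-after-stop" not in consumed
```

Without the `drained.wait(...)` barrier, the final assertions could run while the consumer is mid-drain, which is exactly the intermittent failure described above.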
[jira] [Updated] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-15783: - Reviewers: Blake Eggleston, Dinesh Joshi, Dinesh Joshi (was: Blake Eggleston, Dinesh Joshi) Status: Review In Progress (was: Patch Available) > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-alpha > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-15783: - Reviewers: Blake Eggleston, Dinesh Joshi [~jasonstack], [~e.dimitrova] I'll take a look. I'd prefer [~bdeggleston] also gets in a review before we commit this change.
[jira] [Updated] (CASSANDRA-15802) Commented-out lines that end in a semicolon cause an error.
[ https://issues.apache.org/jira/browse/CASSANDRA-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan West updated CASSANDRA-15802: Bug Category: Parent values: Correctness(12982)Level 1 values: API / Semantic Implementation(12988) Complexity: Normal Discovered By: User Report Severity: Low Status: Open (was: Triage Needed) This seems to be specific to multi-line comments: {code} cqlsh> /* describe keyspaces */ cqlsh> /* describe keyspaces; */ cqlsh> -- describe keyspaces; cqlsh> -- describe keyspaces cqlsh> -- describe keyspaces cqlsh> /* ... describe keyspaces; SyntaxException: line 2:19 mismatched character '' expecting '*' cqlsh> // describe keyspaces; {code} {code} $ cat no-semi.cql /* this is a comment with no semi-colon */ $ ./bin/cqlsh -f no-semi.cql $ {code} {code} $ cat semi.cql /* this is a multi-line comment with a semi-colon; */ $ ./bin/cqlsh -f semi.cql semi.cql:4:SyntaxException: line 3:27 mismatched character '' expecting '*' semi.cql:5:Incomplete statement at end of file {code} > Commented-out lines that end in a semicolon cause an error. > --- > > Key: CASSANDRA-15802 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15802 > Project: Cassandra > Issue Type: Bug > Components: CQL/Interpreter, CQL/Syntax >Reporter: null >Priority: Normal > Attachments: cqlsh.png > > > Commented-out lines that end in a semicolon cause an error. > For example: > /* > describe keyspaces; > */ > > This produces an error: > SyntaxException: line 2:22 no viable alternative at input ' (...* > describe keyspaces;...) 
> > It works as expected if you use syntax: > -- describe keyspaces; > > Environment: > python:3.7.7-slim-stretch (docker image) > > I found that this was first seen here, and was patched, but the bug appears > to have resurfaced: > https://issues.apache.org/jira/browse/CASSANDRA-2488
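The reproductions above are consistent with statement splitting that treats every ';' as a terminator, even when the semicolon sits inside a /* ... */ block comment. A minimal Python sketch of that failure mode and a comment-aware alternative; the helper names are hypothetical and this is not cqlsh's actual parser code:

```python
def naive_split(text):
    # Split on every semicolon -- mirrors the reported failure mode,
    # where a ';' inside a /* ... */ comment ends the "statement".
    return [s for s in (p.strip() for p in text.split(';')) if s]

def comment_aware_split(text):
    # Track whether we are inside a /* ... */ block comment and only
    # treat ';' as a statement terminator outside of it.
    stmts, buf, i, in_comment = [], [], 0, False
    while i < len(text):
        two = text[i:i + 2]
        if in_comment:
            if two == '*/':
                in_comment = False
                i += 2
            else:
                i += 1
            continue
        if two == '/*':
            in_comment = True
            i += 2
            continue
        if text[i] == ';':
            stmt = ''.join(buf).strip()
            if stmt:
                stmts.append(stmt)
            buf = []
        else:
            buf.append(text[i])
        i += 1
    tail = ''.join(buf).strip()
    if tail:
        stmts.append(tail)
    return stmts

script = "/* describe keyspaces; */ SELECT * FROM t;"
# Naive splitting yields a bogus first "statement" from inside the comment.
assert naive_split(script)[0] == "/* describe keyspaces"
# Comment-aware splitting sees exactly one real statement.
assert comment_aware_split(script) == ["SELECT * FROM t"]
```

This matches the observed behavior: a comment with no semicolon is harmless, while a semicolon inside a multi-line comment produces a truncated fragment that the CQL grammar then rejects mid-comment.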
[jira] [Commented] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104773#comment-17104773 ] Ekaterina Dimitrova commented on CASSANDRA-15783: - Thanks for the fix [~jasonstack] [~djoshi]. I am not a committer; if you approve the current patch, could you please commit it?
[jira] [Updated] (CASSANDRA-15725) Add support for adding custom Verbs
[ https://issues.apache.org/jira/browse/CASSANDRA-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15725: -- Status: Review In Progress (was: Changes Suggested) LGTM +1 > Add support for adding custom Verbs > --- > > Key: CASSANDRA-15725 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15725 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Internode >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 4.0-alpha > > Attachments: feedback_15725.patch > > > It should be possible to safely add custom/internal Verbs - without risking > conflicts when new ones are added.
[jira] [Updated] (CASSANDRA-15670) Transient Replication: unable to insert data when the keyspace is configured with the SimpleStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg updated CASSANDRA-15670: -- Test and Documentation Plan: Tests added/modified as part of the patch. Status: Patch Available (was: Open) > Transient Replication: unable to insert data when the keyspace is configured > with the SimpleStrategy > > > Key: CASSANDRA-15670 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15670 > Project: Cassandra > Issue Type: Bug > Components: Feature/Transient Replication >Reporter: Alan Boudreault >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > An error is thrown when trying to insert data with transient replication > + SimpleStrategy configured. > Test case: > {code:java} > CREATE KEYSPACE test_tr WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': '3/1'}; > CREATE TABLE test_tr.users (id int PRIMARY KEY, username text) with > read_repair ='NONE'; > INSERT INTO test_tr.users (id, username) VALUES (1, 'alan');{code} > > traceback: > {code:java} > ERROR [Native-Transport-Requests-8] 2020-03-27 10:27:17,188 > ErrorMessage.java:450 - Unexpected exception during request > java.lang.ClassCastException: org.apache.cassandra.locator.SimpleStrategy > cannot be cast to org.apache.cassandra.locator.NetworkTopologyStrategy > at > org.apache.cassandra.db.ConsistencyLevel.eachQuorumForRead(ConsistencyLevel.java:103) > at > org.apache.cassandra.db.ConsistencyLevel.eachQuorumForWrite(ConsistencyLevel.java:112) > at > org.apache.cassandra.locator.ReplicaPlans$2.select(ReplicaPlans.java:409) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:353) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:348) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:341) > at > org.apache.cassandra.locator.ReplicaPlans.forWrite(ReplicaPlans.java:330) > 
at > org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1171) > at > org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:713) > at > org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:951) > at > org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:475) > at > org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:453) > at > org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:216) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:247) > at > org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:233) > at > org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:108) > at > org.apache.cassandra.transport.Message$Request.execute(Message.java:253) > at > org.apache.cassandra.transport.Message$Dispatcher.processRequest(Message.java:725) > at > org.apache.cassandra.transport.Message$Dispatcher.lambda$channelRead0$0(Message.java:630) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:165) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {code} > > --> > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L103 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
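The traceback shows an unconditional cast of the keyspace's replication strategy to NetworkTopologyStrategy inside the EACH_QUORUM path. The guard pattern a fix would likely take is sketched below in Python with hypothetical class and function names (the real code is Java, in ConsistencyLevel/ReplicaPlans); a type check turns the opaque ClassCastException into either graceful handling or a clear error:

```python
class SimpleStrategy:
    """Stand-in for org.apache.cassandra.locator.SimpleStrategy."""

class NetworkTopologyStrategy:
    """Stand-in: holds per-datacenter replication factors."""
    def __init__(self, dc_rf):
        self.dc_rf = dc_rf  # e.g. {"dc1": 3, "dc2": 3}

def each_quorum_replicas(strategy):
    # The crash came from casting to NetworkTopologyStrategy without
    # checking the strategy type. Guarding first lets EACH_QUORUM fail
    # with a meaningful message (or fall back) for SimpleStrategy.
    if isinstance(strategy, NetworkTopologyStrategy):
        # Quorum per datacenter: floor(rf/2) + 1, summed over DCs.
        return sum(rf // 2 + 1 for rf in strategy.dc_rf.values())
    raise ValueError(
        "EACH_QUORUM requires NetworkTopologyStrategy, got "
        + type(strategy).__name__)

# Two DCs at RF 3 each: quorum of 2 per DC, 4 replicas total.
assert each_quorum_replicas(NetworkTopologyStrategy({"dc1": 3, "dc2": 3})) == 4
try:
    each_quorum_replicas(SimpleStrategy())
except ValueError as e:
    assert "NetworkTopologyStrategy" in str(e)
```

Whether the actual patch rejects the query, falls back to a non-DC-aware quorum, or validates at keyspace creation is a design choice for the ticket; the sketch only shows the guard that avoids the raw cast failure.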
[jira] [Updated] (CASSANDRA-15670) Transient Replication: unable to insert data when the keyspace is configured with the SimpleStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg updated CASSANDRA-15670: -- Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear Impact(13164) Complexity: Normal Discovered By: User Report Severity: Normal Assignee: (was: Francisco Fernandez) Status: Open (was: Triage Needed)
[jira] [Updated] (CASSANDRA-15670) Transient Replication: unable to insert data when the keyspace is configured with the SimpleStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg updated CASSANDRA-15670: -- Fix Version/s: (was: 4.0-rc) 4.0-beta
[jira] [Updated] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhaoYang updated CASSANDRA-15783: - Test and Documentation Plan: added dtests. [Patch|https://github.com/apache/cassandra/pull/587] [Dtest|https://github.com/apache/cassandra-dtest/pull/68] [Circle|https://circleci.com/workflow-run/e10d1061-4480-4ac3-9468-add94a9b3108] was:added dtests
[jira] [Updated] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhaoYang updated CASSANDRA-15783: - Authors: ZhaoYang (was: Ekaterina Dimitrova) Test and Documentation Plan: added dtests Status: Patch Available (was: Open) [~e.dimitrova] thanks for the report! > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-alpha > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104704#comment-17104704 ] ZhaoYang commented on CASSANDRA-15783: -- bq. However, here, the issue went unnoticed because the test only exercised STCS and not LCS. If the test would've exercised both strategies then the issue would've surfaced earlier when we added ZCS to LCS. Before CASSANDRA-15657, yes: if we had had an LCS test, the issue would have been caught. But after 15657, both STCS and LCS stream the entire sstable, so they work exactly the same way. Anyway, I have added an LCS test. > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-alpha > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests
[jira] [Comment Edited] (CASSANDRA-15229) BufferPool Regression
[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104629#comment-17104629 ] ZhaoYang edited comment on CASSANDRA-15229 at 5/11/20, 4:08 PM: {quote}I would just be looking to smooth out the random distribution of sizes used with e.g. a handful of queues each containing a single size of buffer and at most a handful of items each. {quote} It looks simpler in its initial form, but I am wondering whether it will eventually grow/evolve into another buffer pool. {quote}so if you are willing to at least enable the behaviour only for the ChunkCache so this change cannot have any unintended negative effect for those users not expected to benefit, my main concern will be alleviated. {quote} +1, partially freed chunk recirculation is only enabled for permanent pool, not for temporary pool. [Patch|https://github.com/apache/cassandra/pull/535/files] / [Circle|https://circleci.com/workflow-run/096afbe1-ec99-4d5f-bdaa-06f538b8280f]: * Initialize 2 buffer pool instances, one for chunk cache (default 512mb) called {{"Permanent Pool"}}, one for network (default 128mb) called {{"Temporary Pool"}}. So they won't interfere each other. * Improve buffer pool metrics to track: ** {{"overflowSize"}} - buffer size that is allocated outside of buffer pool. ** {{"UsedSize"}} - buffer size that is currently being allocated. * Allow partially freed chunk to be recycled in Permanent Pool to improve cache utilization due to chunk cache holding buffer for arbitrary time period. Note that due to various allocation sizes, fragmentation still exists in partially freed chunk. was (Author: jasonstack): bq. I would just be looking to smooth out the random distribution of sizes used with e.g. a handful of queues each containing a single size of buffer and at most a handful of items each. It looks simpler in its initial form, but I am wondering whether it will eventually grow/evolve into another buffer pool. bq. 
so if you are willing to at least enable the behaviour only for the ChunkCache so this change cannot have any unintended negative effect for those users not expected to benefit, my main concern will be alleviated. +1, partially freed chunk recirculation is only enabled for permanent pool, not for temporary pool. [Patch|https://github.com/apache/cassandra/pull/535/files] / [Circle|https://circleci.com/workflow-run/096afbe1-ec99-4d5f-bdaa-06f538b8280f]: * Initiate 2 buffer pool instances, one for chunk cache (default 512mb) called {{"Permanent Pool"}}, one for network (default 128mb) called {{"Temporary Pool"}}. So they won't interfere each other. * Improve buffer pool metrics to track: {{"overflowSize"}} - buffer size that is allocated outside of buffer pool and {{"UsedSize"}} - buffer size that is currently being allocated. * Allow partially freed chunk to be recycled in Permanent Pool to improve cache utilization due to chunk cache holding buffer for arbitrary time period. Note that due to various allocation sizes, fragmentation still exists in partially freed chunk. > BufferPool Regression > - > > Key: CASSANDRA-15229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15229 > Project: Cassandra > Issue Type: Bug > Components: Local/Caching >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 4.0-beta > > Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, > 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, > 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, > 15229-unsafe.png > > > The BufferPool was never intended to be used for a {{ChunkCache}}, and we > need to either change our behaviour to handle uncorrelated lifetimes or use > something else. This is particularly important with the default chunk size > for compressed sstables being reduced. 
If we address the problem, we should > also utilise the BufferPool for native transport connections like we do for > internode messaging, and reduce the number of pooling solutions we employ. > Probably the best thing to do is to improve BufferPool’s behaviour when used > for things with uncorrelated lifetimes, which essentially boils down to > tracking those chunks that have not been freed and re-circulating them when > we run out of completely free blocks. We should probably also permit > instantiating separate {{BufferPool}}, so that we can insulate internode > messaging from the {{ChunkCache}}, or at least have separate memory bounds > for each, and only share fully-freed chunks. > With these improvements we can also safely increase the {{BufferPool}} chunk > size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce > the amount of
[jira] [Commented] (CASSANDRA-15789) Rows can get duplicated in mixed major-version clusters and after full upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104630#comment-17104630 ] Alex Petrov commented on CASSANDRA-15789: - +1 with some minor comments: * {{nowInSeconds}} [here|https://github.com/apache/cassandra/commit/966ae03e778742a94fbbecffe07d977c3a39f70b#diff-984dca80cb7e39e7a99f4928ae4b3ec8R55] seems to be unused * [this|https://github.com/apache/cassandra/compare/trunk...krummas:15789-3.0#diff-984dca80cb7e39e7a99f4928ae4b3ec8R52] can be just {{flush()}} * [here|https://github.com/apache/cassandra/compare/trunk...krummas:15789-3.0#diff-32fe9b86f85fea958f137ab7862ec522R98], we will be logging unconditionally, even if we have sent snapshot messages. On a somewhat related note, we can also use {{CompactionIteratorTest#duplicateRowsTest}} to verify that throttling works by just clearing {{sentMessages}} and making sure we don't issue it again if there's one more duplicate. * I'm not 100% sure which level is best for in-jvm dtests. Should we keep {{DEBUG}} or should we switch to {{ERROR}}? * should we add some information that shows the {{Row/RT/Row}} sandwich, like the one in the description? It might make it easier for people to read in the future. * in {{PartitionUpdateTest}}, {{testDuplicate}} and {{testMerge}} seem to be specific to this issue, but we don't have any ticket-specific information there. Should we add some motivation/information? In fact, we may consider adding some fuzz tests to cover even more scenarios in the future. * There are some (possibly intended) {{printlns}} in {{assertCommandIssued}} Regarding duplicate elimination in {{PartitionUpdate}}, since duplicates will still be detected and snapshotted, I think this is fine. However, I can imagine a scenario where an erroneous duplicate row can result in data resurrection. 
But given that duplicate rows are by no means correct behaviour, and we already know at least three ways that could lead to such behaviour (12144, 14008, and this issue), merging them into one seems to be a reasonable thing to do, but it doesn't always guarantee the behaviour one would otherwise expect from the database. It might be good to make it configurable and/or disabled by default. > Rows can get duplicated in mixed major-version clusters and after full upgrade > -- > > Key: CASSANDRA-15789 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15789 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination, Local/Memtable, Local/SSTable >Reporter: Aleksey Yeschenko >Assignee: Marcus Eriksson >Priority: Normal > > In a mixed 2.X/3.X major version cluster a sequence of row deletes, > collection overwrites, paging, and read repair can cause 3.X nodes to split > individual rows into several rows with identical clustering. This happens due > to 2.X paging and RT semantics, and a 3.X {{LegacyLayout}} deficiency. > To reproduce, set up a 2-node mixed major version cluster with the following > table: > {code} > CREATE TABLE distributed_test_keyspace.tlb ( > pk int, > ck int, > v map<text, text>, > PRIMARY KEY (pk, ck) > ); > {code} > 1. Using either node as the coordinator, delete the row with ck=2 using > timestamp 1 > {code} > DELETE FROM tbl USING TIMESTAMP 1 WHERE pk = 1 AND ck = 2; > {code} > 2. Using either node as the coordinator, insert the following 3 rows: > {code} > INSERT INTO tbl (pk, ck, v) VALUES (1, 1, {'e':'f'}) USING TIMESTAMP 3; > INSERT INTO tbl (pk, ck, v) VALUES (1, 2, {'g':'h'}) USING TIMESTAMP 3; > INSERT INTO tbl (pk, ck, v) VALUES (1, 3, {'i':'j'}) USING TIMESTAMP 3; > {code} > 3. Flush the table on both nodes > 4. Using the 2.2 node as the coordinator, force read repair by querying the > table with page size = 2: > > {code} > SELECT * FROM tbl; > {code} > 5. 
Overwrite the row with ck=2 using timestamp 5: > {code} > INSERT INTO tbl (pk, ck, v) VALUES (1, 2, {'k':'l'}) USING TIMESTAMP 5; > {code} > 6. Query the 3.0 node and observe the split row: > {code} > cqlsh> select * from distributed_test_keyspace.tlb ; > pk | ck | v > ++ > 1 | 1 | {'e': 'f'} > 1 | 2 | {'g': 'h'} > 1 | 2 | {'k': 'l'} > 1 | 3 | {'i': 'j'} > {code} > This happens because the read to query the second page ends up generating the > following mutation for the 3.0 node: > {code} > ColumnFamily(tbl -{deletedAt=-9223372036854775808, localDeletion=2147483647, > ranges=[2:v:_-2:v:!, deletedAt=2, localDeletion=1588588821] > [2:v:!-2:!, deletedAt=1, localDeletion=1588588821] > [3:v:_-3:v:!, deletedAt=2, localDeletion=1588588821]}- > [2:v:63:false:1@3,]) > {code} > Which on 3.0 side gets incorrectly deserialized as > {code} >
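The "merging duplicates into one" discussed in the review above amounts to reconciling rows that share a clustering key, cell by cell, with the highest write timestamp winning. A toy sketch of that merge (names and the cell representation here are invented for illustration; real reconciliation in Cassandra also has to account for row and complex-column deletion times, which is exactly why naive merging can resurrect data):

```python
def merge_duplicate_rows(rows):
    """Collapse rows that share a clustering key into a single row,
    keeping the cell with the highest write timestamp on conflict.
    rows: iterable of (clustering, {cell_name: (value, timestamp)})."""
    merged = {}
    for clustering, cells in rows:
        target = merged.setdefault(clustering, {})
        for name, (value, ts) in cells.items():
            if name not in target or ts > target[name][1]:
                target[name] = (value, ts)
    return merged


# The split row from the reproduction: two physical rows with ck=2.
split = [
    (2, {'g': ('h', 3)}),   # half written at timestamp 3
    (2, {'k': ('l', 5)}),   # half written by the timestamp-5 overwrite
]
merged = merge_duplicate_rows(split)
# -> {2: {'g': ('h', 3), 'k': ('l', 5)}}
```

Note how the merged row keeps the 'g' map entry that the timestamp-5 overwrite of the whole collection should have shadowed: the data-resurrection risk raised in the comment, since a cell-wise merge alone cannot see the collection tombstone.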
[jira] [Updated] (CASSANDRA-15229) BufferPool Regression
[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhaoYang updated CASSANDRA-15229: - Test and Documentation Plan: added unit tests and tested performance. Status: Patch Available (was: In Progress) bq. I would just be looking to smooth out the random distribution of sizes used with e.g. a handful of queues each containing a single size of buffer and at most a handful of items each. It looks simpler in its initial form, but I am wondering whether it will eventually grow/evolve into another buffer pool. bq. so if you are willing to at least enable the behaviour only for the ChunkCache so this change cannot have any unintended negative effect for those users not expected to benefit, my main concern will be alleviated. +1, recirculation of partially freed chunks is only enabled for the permanent pool, not for the temporary pool. [Patch|https://github.com/apache/cassandra/pull/535/files]/[Circle|https://circleci.com/workflow-run/096afbe1-ec99-4d5f-bdaa-06f538b8280f]: * Initialize 2 buffer pool instances, one for the chunk cache (default 512MB) called {{"Permanent Pool"}} and one for networking (default 128MB) called {{"Temporary Pool"}}, so they won't interfere with each other. * Improve buffer pool metrics to track: {{"overflowSize"}} - buffer size allocated outside of the buffer pool, and {{"UsedSize"}} - buffer size currently allocated. * Allow partially freed chunks to be recycled in the Permanent Pool to improve cache utilization, since the chunk cache holds buffers for arbitrary periods of time. Note that due to varying allocation sizes, fragmentation still exists in partially freed chunks. 
> BufferPool Regression > - > > Key: CASSANDRA-15229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15229 > Project: Cassandra > Issue Type: Bug > Components: Local/Caching >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 4.0-beta > > Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, > 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, > 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, > 15229-unsafe.png > > > The BufferPool was never intended to be used for a {{ChunkCache}}, and we > need to either change our behaviour to handle uncorrelated lifetimes or use > something else. This is particularly important with the default chunk size > for compressed sstables being reduced. If we address the problem, we should > also utilise the BufferPool for native transport connections like we do for > internode messaging, and reduce the number of pooling solutions we employ. > Probably the best thing to do is to improve BufferPool’s behaviour when used > for things with uncorrelated lifetimes, which essentially boils down to > tracking those chunks that have not been freed and re-circulating them when > we run out of completely free blocks. We should probably also permit > instantiating separate {{BufferPool}}, so that we can insulate internode > messaging from the {{ChunkCache}}, or at least have separate memory bounds > for each, and only share fully-freed chunks. > With these improvements we can also safely increase the {{BufferPool}} chunk > size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce > the amount of global coordination and per-allocation overhead. We don’t need > 1KiB granularity for allocations, nor 16 byte granularity for tiny > allocations. > - > Since CASSANDRA-5863, chunk cache is implemented to use buffer pool. 
When the > local pool is full, one of its chunks is evicted and only put back into the > global pool when all buffers in the evicted chunk are released. But because of the > chunk cache, buffers can be held for long periods of time, preventing the evicted > chunk from being recycled even though most of the space in the evicted chunk is free. > There are two things that need to be improved: > 1. An evicted chunk with free space should be recycled to the global pool, even if > it's not fully free. This is doable in 4.0. > 2. Reduce fragmentation caused by differing buffer sizes. With #1, a partially > freed chunk will be available for allocation, but the "holes" in the partially > freed chunk have different sizes. We should consider allocating a fixed > buffer size, which is unlikely to make it into 4.0.
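The recycling idea described above - handing out evicted chunks again once some, but not necessarily all, of their buffers have been released - can be sketched with a toy model. This is purely illustrative: the class and field names ({{ToyGlobalPool}}, {{free_slots}}, etc.) are invented here, and the real {{BufferPool}} tracks chunks, slots, and thread-local pools very differently.

```python
class Chunk:
    """Toy chunk: a fixed number of equally sized buffer slots."""
    SLOTS = 4

    def __init__(self):
        self.free_slots = set(range(self.SLOTS))

    @property
    def fully_free(self):
        return len(self.free_slots) == self.SLOTS


class ToyGlobalPool:
    """Hands whole chunks to local pools. Evicted chunks whose buffers are
    still held (e.g. by the chunk cache) sit in `evicted`; with
    recycle_partial=True (the "Permanent Pool" behaviour), they can be
    handed out again as soon as *some* of their buffers come back."""

    def __init__(self, num_chunks, recycle_partial):
        self.free_chunks = [Chunk() for _ in range(num_chunks)]
        self.evicted = []
        self.recycle_partial = recycle_partial

    def take_chunk(self):
        if self.free_chunks:
            return self.free_chunks.pop()
        if self.recycle_partial:
            for chunk in self.evicted:
                if chunk.free_slots:          # partially freed is enough
                    self.evicted.remove(chunk)
                    return chunk
        return None  # caller allocates outside the pool ("overflowSize")

    def evict(self, chunk):
        if chunk.fully_free:
            self.free_chunks.append(chunk)
        else:
            self.evicted.append(chunk)

    def release_buffer(self, chunk, slot):
        chunk.free_slots.add(slot)
        if chunk in self.evicted and chunk.fully_free:
            self.evicted.remove(chunk)
            self.free_chunks.append(chunk)


# The chunk cache holds one buffer from an otherwise-free evicted chunk:
pool = ToyGlobalPool(num_chunks=1, recycle_partial=True)
chunk = pool.take_chunk()
held = chunk.free_slots.pop()        # chunk cache keeps this buffer
pool.evict(chunk)                    # evicted while mostly free
assert pool.take_chunk() is chunk    # recycled despite the held buffer

strict = ToyGlobalPool(num_chunks=1, recycle_partial=False)
c = strict.take_chunk()
c.free_slots.pop()
strict.evict(c)
assert strict.take_chunk() is None   # old behaviour: chunk stuck until fully free
```

The two asserts at the end contrast the proposed "Permanent Pool" behaviour with the old one, where a single long-held buffer pins the whole chunk out of circulation.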
[jira] [Comment Edited] (CASSANDRA-15229) BufferPool Regression
[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104629#comment-17104629 ] ZhaoYang edited comment on CASSANDRA-15229 at 5/11/20, 4:06 PM: bq. I would just be looking to smooth out the random distribution of sizes used with e.g. a handful of queues each containing a single size of buffer and at most a handful of items each. It looks simpler in its initial form, but I am wondering whether it will eventually grow/evolve into another buffer pool. bq. so if you are willing to at least enable the behaviour only for the ChunkCache so this change cannot have any unintended negative effect for those users not expected to benefit, my main concern will be alleviated. +1, partially freed chunk recirculation is only enabled for permanent pool, not for temporary pool. [Patch|https://github.com/apache/cassandra/pull/535/files] / [Circle|https://circleci.com/workflow-run/096afbe1-ec99-4d5f-bdaa-06f538b8280f]: * Initiate 2 buffer pool instances, one for chunk cache (default 512mb) called {{"Permanent Pool"}}, one for network (default 128mb) called {{"Temporary Pool"}}. So they won't interfere each other. * Improve buffer pool metrics to track: {{"overflowSize"}} - buffer size that is allocated outside of buffer pool and {{"UsedSize"}} - buffer size that is currently being allocated. * Allow partially freed chunk to be recycled in Permanent Pool to improve cache utilization due to chunk cache holding buffer for arbitrary time period. Note that due to various allocation sizes, fragmentation still exists in partially freed chunk. was (Author: jasonstack): bq. I would just be looking to smooth out the random distribution of sizes used with e.g. a handful of queues each containing a single size of buffer and at most a handful of items each. It looks simpler in its initial form, but I am wondering whether it will eventually grow/evolve into another buffer pool. bq. 
so if you are willing to at least enable the behaviour only for the ChunkCache so this change cannot have any unintended negative effect for those users not expected to benefit, my main concern will be alleviated. +1, partially freed chunk recirculation is only enabled for permanent pool, not for temporary pool. [Patch|https://github.com/apache/cassandra/pull/535/files]/[Circle|https://circleci.com/workflow-run/096afbe1-ec99-4d5f-bdaa-06f538b8280f]: * Initiate 2 buffer pool instances, one for chunk cache (default 512mb) called {{"Permanent Pool"}}, one for network (default 128mb) called {{"Temporary Pool"}}. So they won't interfere each other. * Improve buffer pool metrics to track: {{"overflowSize"}} - buffer size that is allocated outside of buffer pool and {{"UsedSize"}} - buffer size that is currently being allocated. * Allow partially freed chunk to be recycled in Permanent Pool to improve cache utilization due to chunk cache holding buffer for arbitrary time period. Note that due to various allocation sizes, fragmentation still exists in partially freed chunk. > BufferPool Regression > - > > Key: CASSANDRA-15229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15229 > Project: Cassandra > Issue Type: Bug > Components: Local/Caching >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 4.0-beta > > Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, > 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, > 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, > 15229-unsafe.png > > > The BufferPool was never intended to be used for a {{ChunkCache}}, and we > need to either change our behaviour to handle uncorrelated lifetimes or use > something else. This is particularly important with the default chunk size > for compressed sstables being reduced. 
If we address the problem, we should > also utilise the BufferPool for native transport connections like we do for > internode messaging, and reduce the number of pooling solutions we employ. > Probably the best thing to do is to improve BufferPool’s behaviour when used > for things with uncorrelated lifetimes, which essentially boils down to > tracking those chunks that have not been freed and re-circulating them when > we run out of completely free blocks. We should probably also permit > instantiating separate {{BufferPool}}, so that we can insulate internode > messaging from the {{ChunkCache}}, or at least have separate memory bounds > for each, and only share fully-freed chunks. > With these improvements we can also safely increase the {{BufferPool}} chunk > size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce > the amount of global coordination
[jira] [Assigned] (CASSANDRA-15685) flaky testWithMismatchingPending - org.apache.cassandra.distributed.test.PreviewRepairTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova reassigned CASSANDRA-15685: --- Assignee: Ekaterina Dimitrova > flaky testWithMismatchingPending - > org.apache.cassandra.distributed.test.PreviewRepairTest > -- > > Key: CASSANDRA-15685 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15685 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Kevin Gallardo >Assignee: Ekaterina Dimitrova >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Time Spent: 10m > Remaining Estimate: 0h > > Observed in: > https://app.circleci.com/pipelines/github/newkek/cassandra/34/workflows/1c6b157d-13c3-48a9-85fb-9fe8c153256b/jobs/191/tests > Failure: > {noformat} > testWithMismatchingPending - > org.apache.cassandra.distributed.test.PreviewRepairTest > junit.framework.AssertionFailedError > at > org.apache.cassandra.distributed.test.PreviewRepairTest.testWithMismatchingPending(PreviewRepairTest.java:97) > {noformat} > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15685] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-13047) Point cqlsh help to the new doc
[ https://issues.apache.org/jira/browse/CASSANDRA-13047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi reassigned CASSANDRA-13047: --- Assignee: Berenguer Blasi > Point cqlsh help to the new doc > --- > > Key: CASSANDRA-13047 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13047 > Project: Cassandra > Issue Type: Bug > Components: Legacy/CQL >Reporter: Sylvain Lebresne >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0 > > > Cqlsh still points to the "old" textile CQL doc, but that's not really > maintain anymore now that we have the new doc (which include everything the > old doc had and more). We should update cqlsh to point to the new doc so we > can remove the old one completely. > Any takers? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15299) CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta
[ https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104465#comment-17104465 ] Sam Tunnicliffe edited comment on CASSANDRA-15299 at 5/11/20, 1:39 PM: --- bq. find out that the _uncompressed_ size is above the limit, split it into 128 KiB chunks, then compress each chunk separately and wrap it into its own (uncontained) outer frame? Yes, exactly. bq. uncontained frames are never recoverable when compression is enabled Large messages are never recoverable when a corrupt payload is detected. Self-contained frames *could* be recoverable as long as only the payload is corrupt, but like we discussed earlier this is more complicated here than in the internode case due to the payload potentially containing multiple stream ids, so we may as well close the connection whenever a corrupt frame is encountered. was (Author: beobal): > find out that the _uncompressed_ size is above the limit, split it into 128 >KiB chunks, then compress each chunk separately and wrap it into its own >(uncontained) outer frame? Yes, exactly. > uncontained frames are never recoverable when compression is enabled Large messages are never recoverable when a corrupt payload is detected. Self-contained frames *could* be recoverable as long as only the payload is corrupt, but like we discussed earlier this is more complicated here than in the internode case due to the payload potentially containing multiple stream ids, so we may as well close the connection whenever a corrupt frame is encountered. 
> CASSANDRA-13304 follow-up: improve checksumming and compression in protocol > v5-beta > --- > > Key: CASSANDRA-15299 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15299 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Client >Reporter: Aleksey Yeschenko >Assignee: Sam Tunnicliffe >Priority: Normal > Labels: protocolv5 > Fix For: 4.0-beta > > > CASSANDRA-13304 made an important improvement to our native protocol: it > introduced checksumming/CRC32 to request and response bodies. It’s an > important step forward, but it doesn’t cover the entire stream. In > particular, the message header is not covered by a checksum or a crc, which > poses a correctness issue if, for example, {{streamId}} gets corrupted. > Additionally, we aren’t quite using CRC32 correctly, in two ways: > 1. We are calculating the CRC32 of the *decompressed* value instead of > computing the CRC32 on the bytes written on the wire - losing the properties > of the CRC32. In some cases, due to this sequencing, attempting to decompress > a corrupt stream can cause a segfault by LZ4. > 2. When using CRC32, the CRC32 value is written in the incorrect byte order, > also losing some of the protections. > See https://users.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf for > explanation for the two points above. > Separately, there are some long-standing issues with the protocol - since > *way* before CASSANDRA-13304. Importantly, both checksumming and compression > operate on individual message bodies rather than frames of multiple complete > messages. In reality, this has several important additional downsides. To > name a couple: > # For compression, we are getting poor compression ratios for smaller > messages - when operating on tiny sequences of bytes. In reality, for most > small requests and responses we are discarding the compressed value as it’d > be bigger than the uncompressed one - incurring both redundant allocations > and compressions. 
> # For checksumming and CRC32 we pay a high overhead price for small messages. > 4 bytes extra is *a lot* for an empty write response, for example. > To address the correctness issue of {{streamId}} not being covered by the > checksum/CRC32 and the inefficiency in compression and checksumming/CRC32, we > should switch to a framing protocol with multiple messages in a single frame. > I suggest we reuse the framing protocol recently implemented for internode > messaging in CASSANDRA-15066 to the extent that its logic can be borrowed, > and that we do it before native protocol v5 graduates from beta. See > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderCrc.java > and > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderLZ4.java. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
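Point 1 of the description - compute the CRC32 over the bytes actually written to the wire (the compressed payload), and verify it *before* handing the payload to the decompressor - can be demonstrated in a few lines. This is a sketch of the principle only, not the actual v5 frame layout: the `frame`/`unframe` names, the zlib compressor (the real protocol uses LZ4), and the trailing little-endian CRC placement are all illustrative assumptions.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Compress, then CRC32 the bytes that will actually hit the wire."""
    body = zlib.compress(payload)
    crc = zlib.crc32(body) & 0xFFFFFFFF
    # Byte order must be fixed and agreed by both ends (point 2 above);
    # "<I" here is an arbitrary but consistent choice for the sketch.
    return body + struct.pack("<I", crc)


def unframe(wire: bytes) -> bytes:
    body, (crc,) = wire[:-4], struct.unpack("<I", wire[-4:])
    # Verify before decompressing: a corrupt stream never reaches the
    # decompressor, avoiding the LZ4-segfault scenario described above.
    if zlib.crc32(body) & 0xFFFFFFFF != crc:
        raise ValueError("corrupt frame")
    return zlib.decompress(body)


wire = frame(b"some response body")
assert unframe(wire) == b"some response body"

corrupt = bytes([wire[0] ^ 0xFF]) + wire[1:]
try:
    unframe(corrupt)   # rejected by the CRC check, not by the decompressor
    raise AssertionError("corruption not detected")
except ValueError:
    pass
```

Checksumming the compressed bytes means any wire-level corruption is caught by the cheap CRC comparison instead of surfacing as undefined behaviour inside the decompressor.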
[jira] [Commented] (CASSANDRA-15299) CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta
[ https://issues.apache.org/jira/browse/CASSANDRA-15299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104465#comment-17104465 ] Sam Tunnicliffe commented on CASSANDRA-15299: - > find out that the _uncompressed_ size is above the limit, split it into 128 >KiB chunks, then compress each chunk separately and wrap it into its own >(uncontained) outer frame? Yes, exactly. > uncontained frames are never recoverable when compression is enabled Large messages are never recoverable when a corrupt payload is detected. Self-contained frames *could* be recoverable as long as only the payload is corrupt, but like we discussed earlier this is more complicated here than in the internode case due to the payload potentially containing multiple stream ids, so we may as well close the connection whenever a corrupt frame is encountered. > CASSANDRA-13304 follow-up: improve checksumming and compression in protocol > v5-beta > --- > > Key: CASSANDRA-15299 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15299 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Client >Reporter: Aleksey Yeschenko >Assignee: Sam Tunnicliffe >Priority: Normal > Labels: protocolv5 > Fix For: 4.0-beta > > > CASSANDRA-13304 made an important improvement to our native protocol: it > introduced checksumming/CRC32 to request and response bodies. It’s an > important step forward, but it doesn’t cover the entire stream. In > particular, the message header is not covered by a checksum or a crc, which > poses a correctness issue if, for example, {{streamId}} gets corrupted. > Additionally, we aren’t quite using CRC32 correctly, in two ways: > 1. We are calculating the CRC32 of the *decompressed* value instead of > computing the CRC32 on the bytes written on the wire - losing the properties > of the CRC32. In some cases, due to this sequencing, attempting to decompress > a corrupt stream can cause a segfault by LZ4. > 2. 
When using CRC32, the CRC32 value is written in the incorrect byte order, > also losing some of the protections. > See https://users.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf for > explanation for the two points above. > Separately, there are some long-standing issues with the protocol - since > *way* before CASSANDRA-13304. Importantly, both checksumming and compression > operate on individual message bodies rather than frames of multiple complete > messages. In reality, this has several important additional downsides. To > name a couple: > # For compression, we are getting poor compression ratios for smaller > messages - when operating on tiny sequences of bytes. In reality, for most > small requests and responses we are discarding the compressed value as it’d > be bigger than the uncompressed one - incurring both redundant allocations > and compressions. > # For checksumming and CRC32 we pay a high overhead price for small messages. > 4 bytes extra is *a lot* for an empty write response, for example. > To address the correctness issue of {{streamId}} not being covered by the > checksum/CRC32 and the inefficiency in compression and checksumming/CRC32, we > should switch to a framing protocol with multiple messages in a single frame. > I suggest we reuse the framing protocol recently implemented for internode > messaging in CASSANDRA-15066 to the extent that its logic can be borrowed, > and that we do it before native protocol v5 graduates from beta. See > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderCrc.java > and > https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderLZ4.java.
[jira] [Comment Edited] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104181#comment-17104181 ] ZhaoYang edited comment on CASSANDRA-15783 at 5/11/20, 12:32 PM: - {quote}Could you also update the Python dtests so they’ll check with both STCS and LCS. {quote} The difference is not about compaction strategy, but the streaming method. I have moved the transient replica repair dtests into {{"TestTransientReplicationRepairStreamEntireSSTable"}} and created another class, {{"TestTransientReplicationRepairLegacyStreaming"}}, with "stream_entire_sstables=false". WDYT? [patch|https://github.com/apache/cassandra/pull/587/files]: mark "isTransient=false" for entire-streaming sstables and fix some typos in the transient replica documentation. [dtest|https://github.com/apache/cassandra-dtest/pull/68]: add legacy streaming for transient replica repair tests. [CI|https://circleci.com/workflow-run/d9a14206-5acd-4a66-a0f2-5d8c37c0db8d] was (Author: jasonstack): {quote}Could you also update the Python dtests so they’ll check with both STCS and LCS. {quote} The difference is not about compaction strategy, but the streaming method. I have moved the transient replica repair dtests into {{"TestTransientReplicationRepairStreamEntireSSTable"}} and created another class, {{"TestTransientReplicationRepairLegacyStreaming"}}, with "stream_entire_sstables=false". WDYT? [patch|https://github.com/apache/cassandra/pull/587/files]: mark "isTransient=false" for entire-streaming sstables and fix some typos in the transient replica documentation. [dtest|https://github.com/apache/cassandra-dtest/pull/68]: add legacy streaming for transient replica repair tests. 
[CI|https://circleci.com/workflow-run/6b3e7023-4f1d-4611-bfe9-7b63367a78fa] > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-alpha > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down
[ https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-15795: Component/s: (was: Consistency/Coordination) (was: Cluster/Gossip) > Cannot read data from a 3-node cluster which has two nodes down > --- > > Key: CASSANDRA-15795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15795 > Project: Cassandra > Issue Type: Bug >Reporter: YCozy >Priority: Normal > > I start up a 3-node cluster and write a row with 'replication_factor': '3'. > The consistency level is ONE. > Then I kill two nodes and try to read the row I just inserted via cqlsh, > but cqlsh returns NoHostAvailable. > I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.
[jira] [Commented] (CASSANDRA-15795) Cannot read data from a 3-node cluster which has two nodes down
[ https://issues.apache.org/jira/browse/CASSANDRA-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104365#comment-17104365 ] ZhaoYang commented on CASSANDRA-15795: -- I cannot reproduce it with CCM on the latest cassandra-3.11: * ccm create -n 3 apollo --install-dir=TO_CASSANDRA && ccm start * cqlsh> create keyspace ks WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 3}; cqlsh> use ks; cqlsh:ks> create table cf (key int primary key, value int); cqlsh:ks> insert into cf (key, value) values(1,1); * ccm node2 stop && ccm node3 stop * cqlsh> select * from ks.cf > Cannot read data from a 3-node cluster which has two nodes down > --- > > Key: CASSANDRA-15795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15795 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip, Consistency/Coordination >Reporter: YCozy >Priority: Normal > > I start up a 3-node cluster and write a row with 'replication_factor': '3'. > The consistency level is ONE. > Then I kill two nodes and try to read the row I just inserted via cqlsh, > but cqlsh returns NoHostAvailable. > I found this issue in Cassandra 3.11.5, and it can also be reproduced in the newest 3.11.6.
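For context on why the report is surprising: with RF=3 and consistency level ONE, a read needs only a single live replica, so killing two of three nodes should still leave the query servable. A minimal, hypothetical sketch of the replica-count arithmetic (simplified single-datacenter view; `required_responses` is an illustration, not Cassandra's actual implementation):

```python
def required_responses(consistency: str, replication_factor: int) -> int:
    """How many replicas must respond for a read to succeed
    (simplified single-datacenter sketch, illustration only)."""
    if consistency == "ONE":
        return 1
    if consistency == "TWO":
        return 2
    if consistency == "QUORUM":
        return replication_factor // 2 + 1
    if consistency == "ALL":
        return replication_factor
    raise ValueError(f"unhandled consistency level: {consistency}")

# With RF=3 and two of three nodes down, one replica is still alive, so a
# CL=ONE read should succeed -- which is why NoHostAvailable is unexpected
# here and points at something other than plain replica availability.
```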
[jira] [Updated] (CASSANDRA-15503) Slow query log indicates opposite LTE when GTE operator
[ https://issues.apache.org/jira/browse/CASSANDRA-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-15503: -- Source Control Link: https://github.com/apache/cassandra/commit/406a8596eb7ad18079df20121521b1c659063ef4 Resolution: Fixed Status: Resolved (was: Ready to Commit) > Slow query log indicates opposite LTE when GTE operator > --- > > Key: CASSANDRA-15503 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15503 > Project: Cassandra > Issue Type: Bug > Components: Observability/Logging >Reporter: Wallace Baggaley >Assignee: Andres de la Peña >Priority: Normal > Fix For: 3.11.7, 4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > The slow query log indicates a '<=' when a '>=' operator was sent. This > appears to be a logging-only issue, but it threw off development for a day > while figuring this out. Please fix. > How to reproduce: set the slow query log timeout to 1 millisecond. > In cqlsh run > {noformat} > CREATE TABLE atable ( > id text, > timestamp timestamp, > PRIMARY KEY ((id), timestamp) > ) WITH CLUSTERING ORDER BY (timestamp DESC); > insert into atable (id, timestamp) VALUES ( '1',1); > insert into atable (id, timestamp) VALUES ( '2',2); > insert into atable (id, timestamp) VALUES ( '3',3); > insert into atable (id, timestamp) VALUES ( '4',4); > insert into atable (id, timestamp) VALUES ( '5',5); > insert into atable (id, timestamp) VALUES ( '6',6); > insert into atable (id, timestamp) VALUES ( '7',7); > insert into atable (id, timestamp) VALUES ( '8',8); > insert into atable (id, timestamp) VALUES ( '9',9); > insert into atable (id, timestamp) VALUES ( '10',10); > insert into atable (id, timestamp) VALUES ( '11',11); > select * from atable where timestamp >= '1970-01-01 00:00:00.006+0000' allow > filtering; > {noformat} > In the logs it prints: > {noformat} > DEBUG 1 operations were slow in the last 5003 msecs: > , > time 7 msec - slow timeout 1 msec > {noformat} > But the query works appropriately and returns > {noformat} > id | timestamp > ----+--------------------------------- > 6 | 1970-01-01 00:00:00.006000+0000 > 7 | 1970-01-01 00:00:00.007000+0000 > 9 | 1970-01-01 00:00:00.009000+0000 > 10 | 1970-01-01 00:00:00.010000+0000 > 8 | 1970-01-01 00:00:00.008000+0000 > 11 | 1970-01-01 00:00:00.011000+0000 > (6 rows) > {noformat}
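The bug and its fix come down to operator selection for clustering columns with reversed (DESC) ordering: the stored order is inverted, so the operator printed in the slow query log has to be flipped back to match the user's logical query. A minimal Python sketch of that selection logic (the function name is hypothetical; the real change lives in Slices.java and DataRange.java):

```python
def format_bound(is_start: bool, inclusive: bool, is_reversed: bool) -> str:
    """Pick the comparison operator to print for a slice bound.

    A bound that is a start in stored order is a logical lower bound only
    when the clustering order is not reversed; with CLUSTERING ORDER BY
    (... DESC) the roles swap, which is what the unpatched code missed
    when it printed '<=' for a '>=' restriction.
    """
    if is_start != is_reversed:   # logical lower bound
        return ">=" if inclusive else ">"
    return "<=" if inclusive else "<"

# With CLUSTERING ORDER BY (timestamp DESC), a stored start bound is the
# user's *upper* bound, so it must print as '<='; without DESC it prints '>='.
```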
[jira] [Updated] (CASSANDRA-15503) Slow query log indicates opposite LTE when GTE operator
[ https://issues.apache.org/jira/browse/CASSANDRA-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-15503: -- Reviewers: Berenguer Blasi, Andres de la Peña (was: Andres de la Peña, Berenguer Blasi) Berenguer Blasi, Andres de la Peña (was: Berenguer Blasi) Status: Review In Progress (was: Patch Available)
[jira] [Updated] (CASSANDRA-15503) Slow query log indicates opposite LTE when GTE operator
[ https://issues.apache.org/jira/browse/CASSANDRA-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-15503: -- Status: Ready to Commit (was: Review In Progress)
[jira] [Commented] (CASSANDRA-15503) Slow query log indicates opposite LTE when GTE operator
[ https://issues.apache.org/jira/browse/CASSANDRA-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104349#comment-17104349 ] Andres de la Peña commented on CASSANDRA-15503: --- Thanks for the review. Committed to 3.11 branch as [406a8596eb7ad18079df20121521b1c659063ef4|https://github.com/apache/cassandra/commit/406a8596eb7ad18079df20121521b1c659063ef4] and merged up to [trunk|https://github.com/apache/cassandra/commit/5f61e94ea5d25aae1eb96e74512e55edef1cef14]. Dtest changes committed as [10ff82bf779289da913b40c1058fd85bd748c986|https://github.com/apache/cassandra-dtest/commit/10ff82bf779289da913b40c1058fd85bd748c986].
[cassandra-dtest] branch master updated: Add tests for CASSANDRA-15503
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git The following commit(s) were added to refs/heads/master by this push: new 10ff82b Add tests for CASSANDRA-15503 10ff82b is described below commit 10ff82bf779289da913b40c1058fd85bd748c986 Author: Andrés de la Peña AuthorDate: Mon May 11 11:59:52 2020 +0100 Add tests for CASSANDRA-15503 --- cql_test.py | 342 +--- 1 file changed, 259 insertions(+), 83 deletions(-) diff --git a/cql_test.py b/cql_test.py index 659cbae..dcfc161 100644 --- a/cql_test.py +++ b/cql_test.py @@ -1040,52 +1040,7 @@ class TestCQLSlowQuery(CQLTester): node = cluster.nodelist()[0] session = self.patient_cql_connection(node) -create_ks(session, 'ks', 1) -session.execute(""" -CREATE TABLE test1 ( -id int, -col int, -val text, -PRIMARY KEY(id, col) -); -""") - -for i in range(100): -session.execute("INSERT INTO test1 (id, col, val) VALUES (1, {}, 'foo')".format(i)) - -# only check debug logs because at INFO level the no-spam logger has unpredictable results -mark = node.mark_log(filename='debug.log') - -session.execute(SimpleStatement("SELECT * from test1", -consistency_level=ConsistencyLevel.ONE, -retry_policy=FallthroughRetryPolicy())) - -node.watch_log_for(["operations were slow", "SELECT \* FROM ks.test1"], - from_mark=mark, filename='debug.log', timeout=60) -mark = node.mark_log(filename='debug.log') - -session.execute(SimpleStatement("SELECT * from test1 where id = 1", -consistency_level=ConsistencyLevel.ONE, -retry_policy=FallthroughRetryPolicy())) - -node.watch_log_for(["operations were slow", "SELECT \* FROM ks.test1"], - from_mark=mark, filename='debug.log', timeout=60) -mark = node.mark_log(filename='debug.log') - -session.execute(SimpleStatement("SELECT * from test1 where id = 1", -consistency_level=ConsistencyLevel.ONE, -retry_policy=FallthroughRetryPolicy())) - -node.watch_log_for(["operations were slow", 
"SELECT \* FROM ks.test1"], - from_mark=mark, filename='debug.log', timeout=60) -mark = node.mark_log(filename='debug.log') - -session.execute(SimpleStatement("SELECT * from test1 where token(id) < 0", -consistency_level=ConsistencyLevel.ONE, -retry_policy=FallthroughRetryPolicy())) - -node.watch_log_for(["operations were slow", "SELECT \* FROM ks.test1"], - from_mark=mark, filename='debug.log', timeout=60) +self._assert_logs_slow_queries(node, session) def test_remote_query(self): """ @@ -1120,42 +1075,7 @@ class TestCQLSlowQuery(CQLTester): session = self.patient_exclusive_cql_connection(node1) -create_ks(session, 'ks', 1) -session.execute(""" -CREATE TABLE test2 ( -id int, -col int, -val text, -PRIMARY KEY(id, col) -); -""") - -for i, j in itertools.product(list(range(100)), list(range(10))): -session.execute("INSERT INTO test2 (id, col, val) VALUES ({}, {}, 'foo')".format(i, j)) - -# only check debug logs because at INFO level the no-spam logger has unpredictable results -mark = node2.mark_log(filename='debug.log') -session.execute(SimpleStatement("SELECT * from test2", -consistency_level=ConsistencyLevel.ONE, -retry_policy=FallthroughRetryPolicy())) -node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2"], -from_mark=mark, filename='debug.log', timeout=60) - - -mark = node2.mark_log(filename='debug.log') -session.execute(SimpleStatement("SELECT * from test2 where id = 1", -consistency_level=ConsistencyLevel.ONE, -retry_policy=FallthroughRetryPolicy())) -node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2 WHERE id = 1"], -from_mark=mark, filename='debug.log', timeout=60) - - -mark = node2.mark_log(filename='debug.log') -session.execute(SimpleStatement("SELECT * from test2 where token(id) <= 0", -
[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git commit 5f61e94ea5d25aae1eb96e74512e55edef1cef14 Merge: 66eae58 406a859 Author: Andrés de la Peña AuthorDate: Mon May 11 11:45:50 2020 +0100 Merge branch 'cassandra-3.11' into trunk # Conflicts: # CHANGES.txt # src/java/org/apache/cassandra/db/DataRange.java CHANGES.txt | 1 + src/java/org/apache/cassandra/db/DataRange.java | 11 +++ src/java/org/apache/cassandra/db/Slices.java | 16 ++-- .../cassandra/db/filter/ClusteringIndexNamesFilter.java | 4 ++-- 4 files changed, 24 insertions(+), 8 deletions(-) diff --cc CHANGES.txt index 3e7343c,46625b3..e6acf40 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,29 -1,6 +1,30 @@@ -3.11.7 +4.0-alpha5 + * Only calculate dynamicBadnessThreshold once per loop in DynamicEndpointSnitch (CASSANDRA-15798) + * Cleanup redundant nodetool commands added in 4.0 (CASSANDRA-15256) + * Update to Python driver 3.23 for cqlsh (CASSANDRA-15793) + * Add tunable initial size and growth factor to RangeTombstoneList (CASSANDRA-15763) + * Improve debug logging in SSTableReader for index summary (CASSANDRA-15755) + * bin/sstableverify should support user provided token ranges (CASSANDRA-15753) + * Improve logging when mutation passed to commit log is too large (CASSANDRA-14781) + * replace LZ4FastDecompressor with LZ4SafeDecompressor (CASSANDRA-15560) + * Fix buffer pool NPE with concurrent release due to in-progress tiny pool eviction (CASSANDRA-15726) + * Avoid race condition when completing stream sessions (CASSANDRA-15666) + * Flush with fast compressors by default (CASSANDRA-15379) + * Fix CqlInputFormat regression from the switch to system.size_estimates (CASSANDRA-15637) + * Allow sending Entire SSTables over SSL (CASSANDRA-15740) + * Fix CQLSH UTF-8 encoding issue for Python 2/3 compatibility (CASSANDRA-15739) + * Fix batch statement preparation when multiple tables and parameters 
are used (CASSANDRA-15730) + * Fix regression with traceOutgoingMessage printing message size (CASSANDRA-15687) + * Ensure repaired data tracking reads a consistent amount of data across replicas (CASSANDRA-15601) + * Fix CQLSH to avoid arguments being evaluated (CASSANDRA-15660) + * Correct Visibility and Improve Safety of Methods in LatencyMetrics (CASSANDRA-15597) + * Allow cqlsh to run with Python2.7/Python3.6+ (CASSANDRA-15659,CASSANDRA-15573) + * Improve logging around incremental repair (CASSANDRA-15599) + * Do not check cdc_raw_directory filesystem space if CDC disabled (CASSANDRA-15688) + * Replace array iterators with get by index (CASSANDRA-15394) + * Minimize BTree iterator allocations (CASSANDRA-15389) +Merged from 3.11: + * Fix CQL formatting of read command restrictions for slow query log (CASSANDRA-15503) - * Allow sstableloader to use SSL on the native port (CASSANDRA-14904) Merged from 3.0: * liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted (CASSANDRA-15674) * Fix Debian init start/stop (CASSANDRA-15770) diff --cc src/java/org/apache/cassandra/db/DataRange.java index 420f3da,c77d9dc..4db4198 --- a/src/java/org/apache/cassandra/db/DataRange.java +++ b/src/java/org/apache/cassandra/db/DataRange.java @@@ -274,15 -274,16 +274,16 @@@ public class DataRang return sb.toString(); } -private void appendClause(PartitionPosition pos, StringBuilder sb, CFMetaData metadata, boolean isStart, boolean isInclusive) +private void appendClause(PartitionPosition pos, StringBuilder sb, TableMetadata metadata, boolean isStart, boolean isInclusive) { sb.append("token("); - sb.append(ColumnDefinition.toCQLString(metadata.partitionKeyColumns())); +sb.append(ColumnMetadata.toCQLString(metadata.partitionKeyColumns())); - sb.append(") ").append(getOperator(isStart, isInclusive)).append(" "); + sb.append(") "); if (pos instanceof DecoratedKey) { + sb.append(getOperator(isStart, isInclusive)).append(" "); 
sb.append("token("); -appendKeyString(sb, metadata.getKeyValidator(), ((DecoratedKey)pos).getKey()); +appendKeyString(sb, metadata.partitionKeyType, ((DecoratedKey)pos).getKey()); sb.append(")"); } else diff --cc src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java index f25dc91,6c7e14b..63815a1 --- a/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java +++ b/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java @@@ -189,9 -189,9 +189,9 @@@ public class ClusteringIndexNamesFilte return sb.append(')').toString(); } -public String toCQLString(CFMetaData metadata)
[cassandra] branch cassandra-3.11 updated: Fix CQL formatting of read command restrictions for slow query log
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a commit to branch cassandra-3.11 in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/cassandra-3.11 by this push: new 406a859 Fix CQL formatting of read command restrictions for slow query log 406a859 is described below commit 406a8596eb7ad18079df20121521b1c659063ef4 Author: Andrés de la Peña AuthorDate: Mon May 11 11:14:54 2020 +0100 Fix CQL formatting of read command restrictions for slow query log patch by Andres de la Peña; reviewed by Berenguer Blasi for CASSANDRA-15503 --- CHANGES.txt | 1 + src/java/org/apache/cassandra/db/DataRange.java | 11 +++ src/java/org/apache/cassandra/db/Slices.java | 16 ++-- .../cassandra/db/filter/ClusteringIndexNamesFilter.java | 4 ++-- 4 files changed, 24 insertions(+), 8 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index c326801..46625b3 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.11.7 + * Fix CQL formatting of read command restrictions for slow query log (CASSANDRA-15503) * Allow sstableloader to use SSL on the native port (CASSANDRA-14904) Merged from 3.0: * liveDiskSpaceUsed and totalDiskSpaceUsed get corrupted if IndexSummaryRedistribution gets interrupted (CASSANDRA-15674) diff --git a/src/java/org/apache/cassandra/db/DataRange.java b/src/java/org/apache/cassandra/db/DataRange.java index d2f9c76..c77d9dc 100644 --- a/src/java/org/apache/cassandra/db/DataRange.java +++ b/src/java/org/apache/cassandra/db/DataRange.java @@ -67,7 +67,7 @@ public class DataRange */ public static DataRange allData(IPartitioner partitioner) { -return forTokenRange(new Range(partitioner.getMinimumToken(), partitioner.getMinimumToken())); +return forTokenRange(new Range<>(partitioner.getMinimumToken(), partitioner.getMinimumToken())); } /** @@ -105,7 +105,7 @@ public class DataRange */ public static DataRange allData(IPartitioner partitioner, ClusteringIndexFilter filter) { -return 
new DataRange(Range.makeRowRange(new Range(partitioner.getMinimumToken(), partitioner.getMinimumToken())), filter); +return new DataRange(Range.makeRowRange(new Range<>(partitioner.getMinimumToken(), partitioner.getMinimumToken())), filter); } /** @@ -278,16 +278,19 @@ public class DataRange { sb.append("token("); sb.append(ColumnDefinition.toCQLString(metadata.partitionKeyColumns())); -sb.append(") ").append(getOperator(isStart, isInclusive)).append(" "); +sb.append(") "); if (pos instanceof DecoratedKey) { +sb.append(getOperator(isStart, isInclusive)).append(" "); sb.append("token("); appendKeyString(sb, metadata.getKeyValidator(), ((DecoratedKey)pos).getKey()); sb.append(")"); } else { -sb.append(((Token.KeyBound)pos).getToken()); +Token.KeyBound keyBound = (Token.KeyBound) pos; +sb.append(getOperator(isStart, isStart == keyBound.isMinimumBound)).append(" "); +sb.append(keyBound.getToken()); } } diff --git a/src/java/org/apache/cassandra/db/Slices.java b/src/java/org/apache/cassandra/db/Slices.java index b3fd20a..93dcab9 100644 --- a/src/java/org/apache/cassandra/db/Slices.java +++ b/src/java/org/apache/cassandra/db/Slices.java @@ -613,6 +613,8 @@ public abstract class Slices implements Iterable } else { +boolean isReversed = column.isReversedType(); + // As said above, we assume (without checking) that this means all ComponentOfSlice for this column // are the same, so we only bother about the first. if (first.startValue != null) @@ -620,14 +622,24 @@ public abstract class Slices implements Iterable if (needAnd) sb.append(" AND "); needAnd = true; -sb.append(column.name).append(first.startInclusive ? " >= " : " > ").append(column.type.getString(first.startValue)); +sb.append(column.name); +if (isReversed) +sb.append(first.startInclusive ? " <= " : " < "); +else +sb.append(first.startInclusive ? " >= " : " > "); +sb.append(column.type.getString(first.startValue)); } if (first.endValue != null) { if (needAnd) sb.append(" AND "); needAnd = true; -
[cassandra] branch trunk updated (66eae58 -> 5f61e94)
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a change to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git. from 66eae58 Only calculate dynamicBadnessThreshold once per loop in DynamicEndpointSnitch new 406a859 Fix CQL formatting of read command restrictions for slow query log new 5f61e94 Merge branch 'cassandra-3.11' into trunk The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGES.txt | 1 + src/java/org/apache/cassandra/db/DataRange.java | 11 +++ src/java/org/apache/cassandra/db/Slices.java | 16 ++-- .../cassandra/db/filter/ClusteringIndexNamesFilter.java | 4 ++-- 4 files changed, 24 insertions(+), 8 deletions(-)
[jira] [Commented] (CASSANDRA-15323) MessagingServiceTest failed with method listenRequiredSecureConnectionWithBroadcastAddr ON MAC OS
[ https://issues.apache.org/jira/browse/CASSANDRA-15323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104208#comment-17104208 ] Angelo Polo commented on CASSANDRA-15323: - Adding here for reference: the problem and fix provided in the description apply to FreeBSD as well. > MessagingServiceTest failed with method > listenRequiredSecureConnectionWithBroadcastAddr on Mac OS > -- > > Key: CASSANDRA-15323 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15323 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: maxwellguo >Assignee: maxwellguo >Priority: Normal > Fix For: 4.0 > > Attachments: exception.png > > > When I ran the unit tests on Mac OS for the cassandra-4.0 tag, I found that > MessagingServiceTest failed on the method > listenRequiredSecureConnectionWithBroadcastAddr. > The cause is that Mac OS cannot connect to the IP > address 127.0.0.2 by default, > so running: ant test -Dtest.name=MessagingServiceTest > -Dtest.methods=listenRequiredSecureConnectionWithBroadcastAddr > produces a bind exception: cannot assign requested address. > !exception.png! > To fix it, add the 127.0.0.2 loopback alias with: > sudo ifconfig lo0 alias 127.0.0.2 netmask 0xffffffff > after which the unit test runs successfully.
[jira] [Comment Edited] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104191#comment-17104191 ] Dinesh Joshi edited comment on CASSANDRA-15783 at 5/11/20, 8:14 AM: bq. The difference is not about compaction strategy, but the streaming method.. I understand the issue is related to the streaming method. However, here, the issue went unnoticed because the test only exercised STCS and not LCS. If the test would've exercised both strategies then the issue would've surfaced earlier when we added ZCS to LCS. was (Author: djoshi3): The difference is not about compaction strategy, but the streaming method.. I understand the issue is related to the streaming method. However, here, the issue went unnoticed because the test only exercised STCS and not LCS. If the test would've exercised both strategies then the issue would've surfaced earlier when we added ZCS to LCS.
[jira] [Comment Edited] (CASSANDRA-15783) test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
[ https://issues.apache.org/jira/browse/CASSANDRA-15783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104181#comment-17104181 ] ZhaoYang edited comment on CASSANDRA-15783 at 5/11/20, 8:04 AM: {quote}Could you also update the Python dtests so they’ll check with both STCS and LCS. {quote} The difference is not about compaction strategy, but the streaming method. I have moved the transient replica repair dtests into {{"TestTransientReplicationRepairStreamEntireSSTable"}} and created another class, {{"TestTransientReplicationRepairLegacyStreaming"}}, with "stream_entire_sstables=false". WDYT? [patch|https://github.com/apache/cassandra/pull/587/files]: mark "isTransient=false" for entire-streaming sstables and fix some typos in the transient replica documentation. [dtest|https://github.com/apache/cassandra-dtest/pull/68]: add legacy streaming for transient replica repair tests. [CI|https://circleci.com/workflow-run/6b3e7023-4f1d-4611-bfe9-7b63367a78fa] was (Author: jasonstack): {quote}Could you also update the Python dtests so they’ll check with both STCS and LCS. {quote} The difference is not about compaction strategy, but the streaming method.. I has moved transient replica repair dtests into {{"TestTransientReplicationRepairStreamEntireSSTable"}} and created another class {{"TestTransientReplicationRepairLegacyStreaming"}} with "stream_entire_sstables=false".. WDYT? [patch|https://github.com/apache/cassandra/pull/587/files]: mark "isTransient=false" for ZCS sstables and fixed some typos in transient replica document. [dtest|https://github.com/apache/cassandra-dtest/pull/68]: add legacy streaming for transient replica repair tests. 
[CI|https://circleci.com/workflow-run/6b3e7023-4f1d-4611-bfe9-7b63367a78fa] > test_optimized_primary_range_repair - > transient_replication_test.TestTransientReplication > - > > Key: CASSANDRA-15783 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15783 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-alpha > > Time Spent: 20m > Remaining Estimate: 0h > > Dtest failure. > Example: > https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/118/workflows/9e57522d-52fa-4d44-88d8-5cec0e87f517/jobs/585/tests
[jira] [Updated] (CASSANDRA-15800) BinLog deadlock on stopping when the sample queue is full
[ https://issues.apache.org/jira/browse/CASSANDRA-15800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-15800: -- Resolution: Not A Bug Status: Resolved (was: Open) I was not reading the code carefully enough. The consumer thread in the {{BinLog}} does keep draining the queue once {{shouldContinue}} is set to false, so the {{NO_OP}} object can eventually be put and {{stop()}} can proceed. My test was also not constructed properly (it never started the consumer thread), which is why {{stop()}} appeared to block. > BinLog deadlock on stopping when the sample queue is full > - > > Key: CASSANDRA-15800 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15800 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Tools, Observability/Logging, Tool/fql >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > A deadlock can happen when 1) the BinLog is being stopped and 2) the BinLog's > internal sample queue is full. > When stopping, BinLog first sets the flag shouldContinue to false so that the > internal consumer thread stops consuming. This can leave the queue full. > Then the BinLog puts one extra NO_OP object into the sample queue. However, > the queue is already full, so the put operation blocks and the stop method > never returns. > The result is a deadlock. > BinLog is used by Cassandra 4.0 features such as audit logging and full > query logging. > If such a deadlock happens, the thread cannot be joined and the referenced > items in the queue are never released, hence a memory leak.
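The drain-on-stop behavior behind the "Not A Bug" resolution can be sketched in Python. This is a minimal illustration, not Cassandra's actual Java implementation; the class name {{BinLogSketch}} and its members are made up:

```python
import queue
import threading

NO_OP = object()  # sentinel telling the consumer to exit

class BinLogSketch:
    """Producer/consumer with a bounded queue, modeled on the BinLog pattern."""

    def __init__(self, capacity=2):
        self.q = queue.Queue(maxsize=capacity)
        self.consumer = threading.Thread(target=self._run)
        self.consumer.start()

    def _run(self):
        # The consumer keeps draining until it sees the NO_OP sentinel,
        # even after stop() has been requested. This draining is what lets
        # the blocking put() in stop() complete instead of deadlocking.
        while True:
            item = self.q.get()
            if item is NO_OP:
                return
            # ... write item to the log ...

    def put(self, item):
        self.q.put(item)  # blocks while the queue is full

    def stop(self):
        # Would block forever on a full queue if the consumer stopped
        # draining; since it does not, NO_OP eventually gets through.
        self.q.put(NO_OP)
        self.consumer.join()

log = BinLogSketch(capacity=2)
for i in range(10):          # producer outruns the queue capacity
    log.put(i)
log.stop()                   # returns: the consumer drains up to NO_OP
print("stopped cleanly")
```

The reported deadlock reproduces only if the consumer thread never runs, which is exactly the flaw in the test mentioned in the resolution comment.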
[jira] [Comment Edited] (CASSANDRA-15725) Add support for adding custom Verbs
[ https://issues.apache.org/jira/browse/CASSANDRA-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104143#comment-17104143 ] Marcus Eriksson edited comment on CASSANDRA-15725 at 5/11/20, 7:19 AM: --- Good point regarding the id conflicts; I pushed a simpler commit that just makes sure that {{minCustomId > max}} was (Author: krummas): good point regarding the id conflicts, pushed a simpler commit that just makes sure that `minCustomId > max` > Add support for adding custom Verbs > --- > > Key: CASSANDRA-15725 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15725 > Project: Cassandra > Issue Type: Improvement > Components: Messaging/Internode >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 4.0-alpha > > Attachments: feedback_15725.patch > > > It should be possible to safely add custom/internal Verbs - without risking > conflicts when new ones are added.
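The {{minCustomId > max}} guard discussed above can be sketched as a standalone check. A hypothetical Python illustration, not the actual Java patch; {{BUILTIN_VERB_IDS}} and {{register_custom_verbs}} are made-up stand-ins:

```python
# Stand-in for the ids of the built-in Verb enum constants.
BUILTIN_VERB_IDS = [0, 1, 2, 3, 4, 5]

def register_custom_verbs(custom_ids):
    """Accept custom verb ids only if every one is above the built-in range.

    Requiring min(custom) > max(builtin) leaves room to append new
    built-in verbs later without colliding with custom ones.
    """
    min_custom = min(custom_ids)
    max_builtin = max(BUILTIN_VERB_IDS)
    if min_custom <= max_builtin:
        raise ValueError(
            f"minCustomId {min_custom} must be > max built-in id {max_builtin}")
    return sorted(custom_ids)

print(register_custom_verbs([100, 101]))  # [100, 101]
```

A call like {{register_custom_verbs([3, 100])}} fails the guard, which is the conflict scenario the simpler commit rules out.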
[jira] [Created] (CASSANDRA-15802) Commented-out lines that end in a semicolon cause an error.
null created CASSANDRA-15802: - Summary: Commented-out lines that end in a semicolon cause an error. Key: CASSANDRA-15802 URL: https://issues.apache.org/jira/browse/CASSANDRA-15802 Project: Cassandra Issue Type: Bug Components: CQL/Interpreter, CQL/Syntax Reporter: null Attachments: cqlsh.png Commented-out lines that end in a semicolon cause an error. For example: /* describe keyspaces; */ This produces an error: SyntaxException: line 2:22 no viable alternative at input ' (...* describe keyspaces;...) It works as expected if you use the syntax: -- describe keyspaces; Environment: python:3.7.7-slim-stretch (docker image) I found that this was first seen and patched here, but the bug appears to have resurfaced: https://issues.apache.org/jira/browse/CASSANDRA-2488