[jira] [Commented] (CASSANDRA-16101) Make sure we don't throw any uncaught exceptions during in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193998#comment-17193998 ]

Marcus Eriksson commented on CASSANDRA-16101:
---------------------------------------------

bq. can we also have a way for the test to add filters to exclude expected exceptions?

I did this first, but then thought we could just catch any exceptions and ignore them if expected. Do you have an example where this would not be enough?

> Make sure we don't throw any uncaught exceptions during in-jvm dtests
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-16101
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16101
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Test/dtest/java
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>
> We should assert that we don't throw any uncaught exceptions when running
> in-jvm dtests

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16101) Make sure we don't throw any uncaught exceptions during in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-16101:
----------------------------------------
    Reviewers: Alex Petrov, David Capwell  (was: Alex Petrov)
[jira] [Commented] (CASSANDRA-15949) NPE thrown while updating speculative execution time if table is removed during task execution
[ https://issues.apache.org/jira/browse/CASSANDRA-15949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193979#comment-17193979 ]

Caleb Rackliffe commented on CASSANDRA-15949:
---------------------------------------------

[~dcapwell] [~jmeredithco] I think I can now see a sequence that produces the error above. Here goes...

1. We drop a keyspace, which hits {{Schema#dropKeyspace()}}.
2. This winds around a bit, but finally clears the keyspace name from the {{keyspaceInstances}} field of {{Schema}}. However, the {{keyspaces}} field still thinks the keyspace is present right before {{dropKeyspace()}} proceeds to {{unload()}}.
3. At this point, the speculative retry threshold task decides it's a good time to run. It hits {{Keyspace.open()}} and sees that, according to {{Schema#getKeyspaceInstance()}}, the keyspace doesn't exist!
4. We move into the {{Keyspace}} constructor just in time to get a reference to a {{KeyspaceMetadata}} from {{keyspaces}} in {{Schema}} that thinks the table in question still exists.
5. Immediately after this happens, the original thread continues into {{unload()}} and drops the hammer on everything, including {{metadataRefs}}.
6. The task thread wakes up and proceeds in the {{Keyspace}} constructor to try to get a {{TableMetadataRef}}, but of course, it's gone.

If the sequence above is coherent, I think it means the [current patch|https://github.com/apache/cassandra/pull/733/files] is at least an improvement, given that it stops the {{Keyspace}} constructor before it proceeds to {{initCf}}, and doesn't kill all future executions of the threshold update task either. My only concern is that we might still be able to hit {{initCf()}} if we get a {{TableMetadataRef}} _just_ before {{unload()}} blows it away, which seems like it would create a new {{ColumnFamilyStore}}.

So, naïve question: why do we allow the speculative retry threshold updater task to create keyspaces at all, ever? It seems like we could approach this an entirely different way, by having something like {{Keyspace.allExisting()}} that just uses the non-null results of {{Schema#getKeyspaceInstance()}} instead of {{Keyspace.open()}}. Even if one of those instances is in the process of being removed, updating the thresholds on a doomed CFS is harmless. (Also, we can avoid a new esoteric bit of logging trying to explain how our schema updates work.)

> NPE thrown while updating speculative execution time if table is removed
> during task execution
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15949
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15949
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Other
>            Reporter: Jon Meredith
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.0-beta
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> CASSANDRA-14338 fixed the scheduling of the speculation retry threshold
> calculation, but if the task happens to be scheduled while a table is being
> dropped, it triggers an NPE.
>
> ERROR 2020-07-14T11:34:55,762 [OptionalTasks:1] org.apache.cassandra.service.CassandraDaemon:446 - Exception in thread Thread[OptionalTasks:1,5,main]
> java.lang.NullPointerException: null
> 	at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:444) ~[cassandra-4.0.0.jar:4.0.0]
> 	at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:346) ~[cassandra-4.0.0.jar:4.0.0]
> 	at org.apache.cassandra.db.Keyspace.open(Keyspace.java:139) ~[cassandra-4.0.0.jar:4.0.0]
> 	at org.apache.cassandra.db.Keyspace.open(Keyspace.java:116) ~[cassandra-4.0.0.jar:4.0.0]
> 	at org.apache.cassandra.db.Keyspace$1.apply(Keyspace.java:102) ~[cassandra-4.0.0.jar:4.0.0]
> 	at org.apache.cassandra.db.Keyspace$1.apply(Keyspace.java:99) ~[cassandra-4.0.0.jar:4.0.0]
> 	at com.google.common.collect.Iterables$5.lambda$forEach$0(Iterables.java:704) ~[guava-27.0-jre.jar:?]
> 	at com.google.common.collect.IndexedImmutableSet.forEach(IndexedImmutableSet.java:45) ~[guava-27.0-jre.jar:?]
> 	at com.google.common.collect.Iterables$5.forEach(Iterables.java:704) ~[guava-27.0-jre.jar:?]
> 	at org.apache.cassandra.service.CassandraDaemon.lambda$setup$2(CassandraDaemon.java:412) ~[cassandra-4.0.0.jar:4.0.0]
> 	at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) [cassandra-4.0.0.jar:4.0.0]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
> 	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
> 	at
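The {{Keyspace.allExisting()}} idea floated in the comment above can be sketched in isolation. Everything below ({{SchemaModel}}, the string-valued maps, {{allExisting()}}) is an illustrative stand-in for {{Schema}} and {{Keyspace}}, not Cassandra's real API: the point is that iterating only already-open instances never re-creates a keyspace mid-drop, so the race with {{unload()}} cannot occur.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for Schema: keyspace *instances* may be cleared before metadata is.
class SchemaModel {
    final Map<String, String> keyspaceInstances = new ConcurrentHashMap<>(); // name -> "instance"
    final Map<String, String> keyspaces = new ConcurrentHashMap<>();         // name -> metadata

    // Analogue of Keyspace.open(): creates an instance if absent (the problematic path,
    // because it can resurrect a keyspace that is being dropped concurrently).
    String open(String name) {
        return keyspaceInstances.computeIfAbsent(name, n -> "instance-" + n);
    }

    // Analogue of the proposed Keyspace.allExisting(): only non-null, already-open instances.
    List<String> allExisting() {
        return new ArrayList<>(keyspaceInstances.values());
    }
}

public class AllExistingSketch {
    public static void main(String[] args) {
        SchemaModel schema = new SchemaModel();
        schema.keyspaces.put("ks1", "meta-ks1");
        schema.keyspaceInstances.put("ks1", "instance-ks1");
        // "dropped" still has metadata, but its instance was already cleared by dropKeyspace().
        schema.keyspaces.put("dropped", "meta-dropped");

        // The threshold task iterates only existing instances; it never re-creates "dropped",
        // so it cannot race with unload() inside the Keyspace constructor.
        List<String> targets = schema.allExisting();
        System.out.println(targets);
    }
}
```

Updating thresholds on an instance that is mid-drop remains harmless here, which is exactly the argument made above.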
[jira] [Updated] (CASSANDRA-15164) Overflowed Partition Cell Histograms Can Prevent Compactions from Executing
[ https://issues.apache.org/jira/browse/CASSANDRA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-15164:
----------------------------------------
    Fix Version/s: 4.0-beta

> Overflowed Partition Cell Histograms Can Prevent Compactions from Executing
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15164
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15164
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL/Interpreter
>            Reporter: Ankur Jha
>            Assignee: Caleb Rackliffe
>            Priority: Urgent
>              Labels: compaction, partition
>             Fix For: 4.0-beta
>
> Hi, we are running a 6 node Cassandra cluster in production with 3 seed nodes,
> but since last night one of our seed nodes has been continuously throwing an
> error like this:
>
> cassandra.protocol.ServerError: message="java.lang.IllegalStateException: Unable to compute ceiling for max
> when histogram overflowed">
>
> To keep the cluster up and running, I drained this node.
> Can somebody help me out with this? Any help or lead would be appreciated.
>
> Note: We are using Cassandra version 3.7
[jira] [Commented] (CASSANDRA-15164) Overflowed Partition Cell Histograms Can Prevent Compactions from Executing
[ https://issues.apache.org/jira/browse/CASSANDRA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193934#comment-17193934 ]

Caleb Rackliffe commented on CASSANDRA-15164:
---------------------------------------------

Let's revisit the stack trace from CASSANDRA-15326, which is almost certainly the same here:

{noformat}
Exception in thread Thread[CompactionExecutor:113041,1,main]
java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
	at org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:231) ~[apache-cassandra-3.11.4.jar:3.11.4]
	at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:220) ~[apache-cassandra-3.11.4.jar:3.11.4]
	at org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:115) ~[apache-cassandra-3.11.4.jar:3.11.4]
	at org.apache.cassandra.io.sstable.format.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1926) ~[apache-cassandra-3.11.4.jar:3.11.4]
	at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:424) ~[apache-cassandra-3.11.4.jar:3.11.4]
	at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:99) ~[apache-cassandra-3.11.4.jar:3.11.4]
	at org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:183) ~[apache-cassandra-3.11.4.jar:3.11.4]
	at org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:153) ~[apache-cassandra-3.11.4.jar:3.11.4]
{noformat}

All our compaction strategies, at some point, want to know the average ratio of droppable tombstones to cells for the partitions in an SSTable. However, if we ever have more than about 1.9 billion cells in a partition, the {{EstimatedHistogram}} that tracks this will overflow. Then, when compaction attempts to get the mean number of cells per partition, {{EstimatedHistogram}} throws an {{IllegalStateException}} that aborts the compaction attempt. This can continue indefinitely.

In C* 4.0, full checksum validation for metadata components exists, but it's also possible that, in previous versions, the serialization/deserialization cycle for {{EstimatedHistogram}} could introduce corruption that breaks the mean calculation.
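The failure mode described above can be sketched with a simplified model of a bucketed histogram. This is an illustrative toy, not the real {{org.apache.cassandra.utils.EstimatedHistogram}} (which uses geometrically-spaced offsets and many more buckets): values larger than the last bucket offset land in an overflow bucket, and once that bucket is non-empty, computing the mean is refused.

```java
// Toy model of a bucketed histogram with an overflow bucket; illustrative only.
public class OverflowingHistogram {
    private final long[] offsets = {1, 10, 100, 1000};           // upper bound per bucket
    private final long[] buckets = new long[offsets.length + 1]; // last slot = overflow

    public void add(long value) {
        for (int i = 0; i < offsets.length; i++) {
            if (value <= offsets[i]) { buckets[i]++; return; }
        }
        buckets[buckets.length - 1]++; // too large to represent: overflow bucket
    }

    public boolean isOverflowed() {
        return buckets[buckets.length - 1] > 0;
    }

    // Mirrors the failing behavior: once overflowed, the mean is undefined.
    public long mean() {
        if (isOverflowed())
            throw new IllegalStateException("Unable to compute ceiling for max when histogram overflowed");
        long count = 0, sum = 0;
        for (int i = 0; i < offsets.length; i++) {
            count += buckets[i];
            sum += buckets[i] * offsets[i]; // use the bucket ceiling as the representative value
        }
        return count == 0 ? 0 : sum / count;
    }

    public static void main(String[] args) {
        OverflowingHistogram h = new OverflowingHistogram();
        h.add(5);
        h.add(50);
        System.out.println(h.mean()); // prints 55: both values fit in buckets
        h.add(1_000_000);             // analogous to a >1.9B-cell partition
        try {
            h.mean();
        } catch (IllegalStateException e) {
            System.out.println("compaction attempt aborted: " + e.getMessage());
        }
    }
}
```

Because the overflow bucket never drains on its own, every subsequent call to {{mean()}} fails the same way, matching the "can continue indefinitely" observation above.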
[jira] [Updated] (CASSANDRA-15949) NPE thrown while updating speculative execution time if table is removed during task execution
[ https://issues.apache.org/jira/browse/CASSANDRA-15949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Capwell updated CASSANDRA-15949:
--------------------------------------
    Reviewers: David Capwell
       Status: Review In Progress  (was: Patch Available)
[jira] [Updated] (CASSANDRA-15164) Overflowed Partition Cell Histograms Can Prevent Compactions from Executing
[ https://issues.apache.org/jira/browse/CASSANDRA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-15164:
----------------------------------------
    Summary: Overflowed Partition Cell Histograms Can Prevent Compactions from Executing  (was: Unable to compute ceiling for max when histogram overflowed">)
[jira] [Commented] (CASSANDRA-16101) Make sure we don't throw any uncaught exceptions during in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193925#comment-17193925 ]

David Capwell commented on CASSANDRA-16101:
-------------------------------------------

LGTM, but I left a comment in the cassandra commit. If a test causes an uncaught exception to be thrown, it will now fail; can we also have a way for the test to add filters to exclude expected exceptions?
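The filter idea raised in this comment can be sketched generically. The names below ({{UncaughtExceptionGuard}}, {{expect}}, {{assertNoUnexpected}}) are stand-ins, not the in-jvm dtest API: install a default uncaught-exception handler that records throwables, let a test register predicates for exceptions it expects, and assert at teardown that nothing unexpected remains.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Predicate;

// Illustrative uncaught-exception collector for tests; not the actual dtest API.
public class UncaughtExceptionGuard {
    private final List<Throwable> uncaught = new CopyOnWriteArrayList<>();
    private final List<Predicate<Throwable>> expected = new CopyOnWriteArrayList<>();

    public void install() {
        // Fires for any thread that dies without its own handler.
        Thread.setDefaultUncaughtExceptionHandler((thread, t) -> uncaught.add(t));
    }

    // A test can declare exceptions it expects, e.g. by class or message.
    public void expect(Predicate<Throwable> filter) {
        expected.add(filter);
    }

    // Called at test teardown: fail if any recorded exception matched no filter.
    public void assertNoUnexpected() {
        for (Throwable t : uncaught)
            if (expected.stream().noneMatch(p -> p.test(t)))
                throw new AssertionError("unexpected uncaught exception", t);
    }

    public static void main(String[] args) throws InterruptedException {
        UncaughtExceptionGuard guard = new UncaughtExceptionGuard();
        guard.install();
        guard.expect(t -> t instanceof IllegalStateException); // test-declared filter

        Thread worker = new Thread(() -> { throw new IllegalStateException("expected during shutdown"); });
        worker.start();
        worker.join();

        guard.assertNoUnexpected(); // passes: the ISE matched a filter
        System.out.println("no unexpected uncaught exceptions");
    }
}
```

The alternative discussed above (catch exceptions at the source and ignore expected ones) trades this per-test filter registry for a single centralized check; both approaches hinge on the same predicate-matching step shown here.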
[jira] [Updated] (CASSANDRA-16057) Should update in-jvm dtest to expose stdout and stderr for nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Capwell updated CASSANDRA-16057:
--------------------------------------
    Reviewers: David Capwell
       Status: Review In Progress  (was: Patch Available)

Overall LGTM, only minor comments put in the PRs. [~ifesdjeen] could you review as well? I thought we were going to release 0.0.5 soon, so it would be nice to get this in.

> Should update in-jvm dtest to expose stdout and stderr for nodetool
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-16057
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16057
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Test/dtest/java
>            Reporter: David Capwell
>            Assignee: Yifan Cai
>            Priority: Normal
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Many nodetool commands output to stdout or stderr, so running nodetool using
> in-jvm dtest should expose that to tests.
[jira] [Comment Edited] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193913#comment-17193913 ]

Ekaterina Dimitrova edited comment on CASSANDRA-16063 at 9/11/20, 12:00 AM:
----------------------------------------------------------------------------

So the reason for [CommitLog.instance.start()|#L214] being set before the startup checks

{code:java}
checkSystemKeyspaceState, checkDatacenter, checkRack, checkLegacyAuthTables{code}

is the following dependency:
* In order to read from a table, we first have to create a {{ColumnFamilyStore}}.
* {{ColumnFamilyStore}}, in order to function "normally", has to create a memtable.
* In order to create a memtable, we have to get the current position from the commit log.

I'll have to think in detail about how we can work around this, but at first glance it doesn't look trivial to me(?) and I am not sure whether it will qualify for beta(?). I will get back to this again tomorrow.
> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-16063
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/CQL
>            Reporter: Sylvain Lebresne
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.0-beta
>
> The code to handle compact tables has been removed from 4.0, and the intended
> upgrade path to 4.0 for users having compact tables on 3.x is that they must
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables
> *before* attempting the upgrade.
>
> Obviously, some users won't read the upgrade instructions (or will miss a table)
> and may try upgrading despite still having compact tables. If they do so, the
> intent is that the node will _not_ start, with a message clearly indicating
> the pre-upgrade step the user has missed. The user will then downgrade the
> node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and
> then upgrade again.
>
> But while 4.0 does currently fail startup when finding any compact tables
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called {{SystemKeyspace.persistLocalMetadata()}}
> and {{SystemKeyspaceMigrator40.migrate()}}, which will write to the commit log,
> and even possibly flush new {{na}}-format sstables. As a result, a user
> might not be able to seamlessly restart the node on 3.x (to drop compact
> storage on the appropriate tables).
>
> Basically, we should make sure the check for compact tables done at 4.0
> startup is done as a {{StartupCheck}}, before the node does anything.
>
> We should also add a test for this (checking that if you try upgrading to 4.0
> with compact storage, you can downgrade back with no intervention whatsoever).
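The fix proposed in the description, failing before the node mutates anything, can be sketched as a startup-check pipeline that runs ahead of the commit log and sstable machinery. This is a self-contained illustration; the interface and names below are modeled on, but not copied from, Cassandra's actual {{StartupCheck}} API:

```java
import java.util.List;

// Illustrative startup-check pipeline: every check runs before any state is mutated.
public class StartupCheckSketch {
    interface StartupCheck {
        void execute() throws IllegalStateException;
    }

    // Hypothetical check: scan on-disk schema for compact tables before touching the commit log.
    static StartupCheck checkNoCompactTables(List<String> compactTablesOnDisk) {
        return () -> {
            if (!compactTablesOnDisk.isEmpty())
                throw new IllegalStateException(
                    "Compact tables detected (" + compactTablesOnDisk + "); run " +
                    "ALTER ... DROP COMPACT STORAGE on 3.x before upgrading to 4.0");
        };
    }

    static void startup(List<StartupCheck> checks) {
        for (StartupCheck check : checks)
            check.execute(); // abort startup on the first failure, before any writes
        System.out.println("startup checks passed; safe to open commit log");
    }

    public static void main(String[] args) {
        try {
            startup(List.of(checkNoCompactTables(List.of("ks.legacy_table"))));
        } catch (IllegalStateException e) {
            System.out.println("node refused to start: " + e.getMessage());
        }
    }
}
```

Because the check fires before anything writes to the commit log or flushes sstables, a 3.x downgrade stays possible, which is the whole point of the ticket.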
[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193913#comment-17193913 ]

Ekaterina Dimitrova commented on CASSANDRA-16063:
-------------------------------------------------

So the reason for [CommitLog.instance.start()|#L214] being set before the startup checks

{code:java}
checkSystemKeyspaceState, checkDatacenter, checkRack, checkLegacyAuthTables{code}

is the following dependency:
* In order to read from a table, we first have to create a {{ColumnFamilyStore}}.
* {{ColumnFamilyStore}}, in order to function "normally", has to create a memtable.
* In order to create a memtable, we have to get the current position from the commit log.

I'll have to think in detail about how we can work around this, but it doesn't look trivial, and I'm not sure whether it will qualify for beta. I will get back to this again tomorrow.
[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging
[ https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193902#comment-17193902 ]

Adam Holmberg commented on CASSANDRA-15958:
-------------------------------------------

Thanks for kicking that off.

> org.apache.cassandra.net.ConnectionTest testMessagePurging
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-15958
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15958
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: David Capwell
>            Assignee: Adam Holmberg
>            Priority: Normal
>             Fix For: 4.0-beta
>
> Build: https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
> Build: https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/
>
> java.util.concurrent.TimeoutException
> 	at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258)
> 	at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143)
> 	at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268)
> 	at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236)
> 	at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679)
[jira] [Commented] (CASSANDRA-15160) Add flag to ignore unreplicated keyspaces during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193900#comment-17193900 ]

David Capwell commented on CASSANDRA-15160:
-------------------------------------------

Finished testing. What I see is that with the flag you can get the same behavior as 2.1, and without the flag (default) we keep the current behavior. LGTM +1.

It would be nice to add the tests below, as they cover the different conditions where the neighbors can be empty. It would also be nice if the Python dtests could be jvm dtests, but I think we need https://issues.apache.org/jira/browse/CASSANDRA-16120 merged/released before we could, so the Python dtests are fine by me.

{code}
package org.apache.cassandra.distributed.test;

import java.io.IOException;

import org.junit.Test;

import org.apache.cassandra.distributed.Cluster;
import org.apache.cassandra.distributed.api.ConsistencyLevel;
import org.apache.cassandra.distributed.api.IInvokableInstance;
import org.apache.cassandra.distributed.shared.Versions;
import org.assertj.core.api.Assertions;

public class RepairFilteringTest extends TestBaseImpl
{
    private static final Versions VERSIONS = Versions.find();
    //private static final Versions.Version VERSION = VERSIONS.getLatest(Versions.Major.v22);
    private static final Versions.Version VERSION = VERSIONS.getLatest(Versions.Major.v4);

    @Test
    public void dcFilterOnEmptyDC() throws IOException
    {
        try (Cluster cluster = Cluster.build().withVersion(VERSION).withRacks(2, 1, 2).start())
        {
            // 1-2 : datacenter1
            // 3-4 : datacenter2
            cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE + " WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1':2, 'datacenter2':0}");
            cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (id int PRIMARY KEY, i int)");
            for (int i = 0; i < 10; i++)
                cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (id, i) VALUES (?, ?)", ConsistencyLevel.ALL, i, i);
            cluster.forEach(i -> i.flush(KEYSPACE));

            // choose a node in the DC that doesn't have any replicas
            IInvokableInstance node = cluster.get(3);
            Assertions.assertThat(node.config().localDatacenter()).isEqualTo("datacenter2");
            // fails with "the local data center must be part of the repair"
            node.nodetoolResult("repair", "-full",
                                "-dc", "datacenter1", "-dc", "datacenter2",
                                "--ignore-unreplicated-keyspaces",
                                "-st", "0", "-et", "1000",
                                KEYSPACE, "tbl")
                .asserts().success();
        }
    }

    @Test
    public void hostFilterDifferentDC() throws IOException
    {
        try (Cluster cluster = Cluster.build().withVersion(VERSION).withRacks(2, 1, 2).start())
        {
            // 1-2 : datacenter1
            // 3-4 : datacenter2
            cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE + " WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1':2, 'datacenter2':0}");
            cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (id int PRIMARY KEY, i int)");
            for (int i = 0; i < 10; i++)
                cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (id, i) VALUES (?, ?)", ConsistencyLevel.ALL, i, i);
            cluster.forEach(i -> i.flush(KEYSPACE));

            // choose a node in the DC that doesn't have any replicas
            IInvokableInstance node = cluster.get(3);
            Assertions.assertThat(node.config().localDatacenter()).isEqualTo("datacenter2");
            // fails with "Specified hosts [127.0.0.3, 127.0.0.1] do not share range (0,1000] needed for repair. Either restrict repair
            // ranges with -st/-et options, or specify one of the neighbors that share this range with this node: [].. Check the logs
            // on the repair participants for further details"
            node.nodetoolResult("repair", "-full",
                                "-hosts", cluster.get(1).broadcastAddress().getAddress().getHostAddress(),
                                "-hosts", node.broadcastAddress().getAddress().getHostAddress(),
                                "--ignore-unreplicated-keyspaces",
                                "-st", "0", "-et", "1000",
                                KEYSPACE, "tbl")
                .asserts().success();
        }
    }

    @Test
    public void emptyDC() throws IOException
    {
        try (Cluster cluster = Cluster.build().withVersion(VERSION).withRacks(2, 1, 2).start())
        {
            // 1-2 : datacenter1
            // 3-4 : datacenter2
            cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE + " WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1':2, 'datacenter2':0}");
{code}
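The decision the flag controls can also be sketched as a toy coordinator-side filter. This is illustrative only, not the real repair code path; the behavior it models is the one described in the comment above: with {{--ignore-unreplicated-keyspaces}}, a keyspace whose local replica set is empty is skipped (the 2.1-style behavior) instead of failing the repair (the current default).

```java
import java.util.List;
import java.util.Map;

// Toy model of the coordinator-side decision; not the real repair implementation.
public class UnreplicatedKeyspaceFilter {
    // replicas: keyspace -> replica endpoints for the repaired range
    static boolean shouldRepair(String keyspace, Map<String, List<String>> replicas,
                                boolean ignoreUnreplicated) {
        List<String> endpoints = replicas.getOrDefault(keyspace, List.of());
        if (!endpoints.isEmpty())
            return true; // normal case: neighbors exist, run repair
        if (ignoreUnreplicated)
            return false; // skip the keyspace silently
        throw new IllegalArgumentException("Nothing to repair for " + keyspace + " in the requested range");
    }

    public static void main(String[] args) {
        Map<String, List<String>> replicas = Map.of(
            "replicated_ks", List.of("127.0.0.1", "127.0.0.2"),
            "unreplicated_ks", List.of());
        System.out.println(shouldRepair("replicated_ks", replicas, false));  // repaired either way
        System.out.println(shouldRepair("unreplicated_ks", replicas, true)); // skipped, not failed
    }
}
```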
[jira] [Updated] (CASSANDRA-15164) Unable to compute ceiling for max when histogram overflowed">
[ https://issues.apache.org/jira/browse/CASSANDRA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-15164:
----------------------------------------
     Bug Category: Parent values: Correctness(12982)
                   Level 1 values: Recoverable Corruption / Loss(12986)
    Discovered By: User Report
           Status: Open  (was: Triage Needed)
[jira] [Updated] (CASSANDRA-15164) Unable to compute ceiling for max when histogram overflowed">
[ https://issues.apache.org/jira/browse/CASSANDRA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe updated CASSANDRA-15164: Labels: compaction partition (was: ) > Unable to compute ceiling for max when histogram overflowed"> > - > > Key: CASSANDRA-15164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15164 > Project: Cassandra > Issue Type: Bug > Components: CQL/Interpreter >Reporter: Ankur Jha >Assignee: Caleb Rackliffe >Priority: Urgent > Labels: compaction, partition > > Hi, we are running 6 node Cassandra cluster in production with 3 seed node > but from last night one of our seed nodes is continuously throwing an error > like this;- > cassandra.protocol.ServerError: message="java.lang.IllegalStateException: Unable to compute ceiling for max > when histogram overflowed"> > For a cluster to be up and running I Drained this node. > Can somebody help me out with this? > > Any help or lead would be appreciated > > Note : We are using Cassandra version 3.7 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15164) Unable to compute ceiling for max when histogram overflowed">
[ https://issues.apache.org/jira/browse/CASSANDRA-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe reassigned CASSANDRA-15164: --- Assignee: Caleb Rackliffe > Unable to compute ceiling for max when histogram overflowed"> > - > > Key: CASSANDRA-15164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15164 > Project: Cassandra > Issue Type: Bug > Components: CQL/Interpreter >Reporter: Ankur Jha >Assignee: Caleb Rackliffe >Priority: Urgent > > Hi, we are running 6 node Cassandra cluster in production with 3 seed node > but from last night one of our seed nodes is continuously throwing an error > like this;- > cassandra.protocol.ServerError: message="java.lang.IllegalStateException: Unable to compute ceiling for max > when histogram overflowed"> > For a cluster to be up and running I Drained this node. > Can somebody help me out with this? > > Any help or lead would be appreciated > > Note : We are using Cassandra version 3.7 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15326) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe reassigned CASSANDRA-15326: --- Assignee: Caleb Rackliffe > Unable to compute ceiling for max when histogram overflowed > --- > > Key: CASSANDRA-15326 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15326 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Batch Log >Reporter: HsuML >Assignee: Caleb Rackliffe >Priority: Normal > > I have 9 cassandra nodes. When I create a keyspace that has 15 tables and the > record count of all table are about 8.2 billion. I imported data through my > java loaders, and I found out the system.log has error message. What happened > and how can I solve the error? > Exception in thread Thread[CompactionExecutor:113041,1,main] > java.lang.IllegalStateException: Unable to compute ceiling for max when > histogram overflowed > at > org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:231) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:220) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:115) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.io.sstable.format.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1926) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:424) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:99) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:183) > 
~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:153) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:260) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_191] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ~[na:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_191] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) > [apache-cassandra-3.11.4.jar:3.11.4] > at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
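The mechanism behind the quoted {{IllegalStateException}} can be illustrated in miniature: an estimated histogram records values under fixed bucket offsets, and once a value lands past the largest offset it goes into an unbounded overflow bucket, after which neither the mean nor a ceiling for the max can be computed. A hedged sketch with hypothetical names, NOT Cassandra's actual {{EstimatedHistogram}}:

```java
// Toy model of why mean()/max computations fail once a histogram overflows.
// Hypothetical class, NOT Cassandra's actual EstimatedHistogram.
public class MiniHistogram
{
    private final long[] bucketOffsets = {1, 2, 4, 8, 16}; // upper bound of each bucket
    private final long[] buckets = new long[bucketOffsets.length + 1]; // last slot = overflow

    public void add(long value)
    {
        for (int i = 0; i < bucketOffsets.length; i++)
        {
            if (value <= bucketOffsets[i])
            {
                buckets[i]++;
                return;
            }
        }
        buckets[buckets.length - 1]++; // larger than the largest bucket: overflow
    }

    public boolean isOverflowed()
    {
        return buckets[buckets.length - 1] > 0;
    }

    // Once any value has overflowed, that bucket has no upper bound,
    // so a bucket-weighted mean is no longer well-defined.
    public double mean()
    {
        if (isOverflowed())
            throw new IllegalStateException("Unable to compute ceiling for max when histogram overflowed");
        long count = 0, weighted = 0;
        for (int i = 0; i < bucketOffsets.length; i++)
        {
            count += buckets[i];
            weighted += buckets[i] * bucketOffsets[i];
        }
        return count == 0 ? 0 : (double) weighted / count;
    }

    public static void main(String[] args)
    {
        MiniHistogram h = new MiniHistogram();
        h.add(3);
        h.add(12);
        System.out.println("mean = " + h.mean());
        h.add(1_000_000); // larger than any bucket offset
        System.out.println("overflowed = " + h.isOverflowed());
    }
}
```

Compaction hits this shape of failure because sstable stats histograms are read on every background-compaction candidate check, so a single overflowed histogram keeps resurfacing in the logs.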
[jira] [Assigned] (CASSANDRA-15326) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Rackliffe reassigned CASSANDRA-15326: --- Assignee: (was: Caleb Rackliffe) > Unable to compute ceiling for max when histogram overflowed > --- > > Key: CASSANDRA-15326 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15326 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Batch Log >Reporter: HsuML >Priority: Normal > > I have 9 cassandra nodes. When I create a keyspace that has 15 tables and the > record count of all table are about 8.2 billion. I imported data through my > java loaders, and I found out the system.log has error message. What happened > and how can I solve the error? > Exception in thread Thread[CompactionExecutor:113041,1,main] > java.lang.IllegalStateException: Unable to compute ceiling for max when > histogram overflowed > at > org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:231) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:220) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:115) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.io.sstable.format.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1926) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:424) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:99) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:183) > ~[apache-cassandra-3.11.4.jar:3.11.4] 
> at > org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:153) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:260) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_191] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ~[na:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_191] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) > [apache-cassandra-3.11.4.jar:3.11.4] > at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15406) Show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193867#comment-17193867 ] David Capwell commented on CASSANDRA-15406: --- thanks, I think I can start EOD or tomorrow. > Show the progress of data streaming and index build > > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0, 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > I found that we should supply a command to show the progress of streaming > when we do the operation of bootstrap/move/decommission/removenode. While > data is streaming, nobody knows which step the program is in, so I think a > command to show the joining/leaving node's progress is needed. > > PR [https://github.com/apache/cassandra/pull/711] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging
[ https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193861#comment-17193861 ] Yifan Cai commented on CASSANDRA-15958: --- Unit test and dtest passed. [https://app.circleci.com/pipelines/github/yifan-c/cassandra/96/workflows/ddf05402-e19d-4a74-a14b-90a105dea475] +1 to the patch. > org.apache.cassandra.net.ConnectionTest testMessagePurging > -- > > Key: CASSANDRA-15958 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15958 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: David Capwell >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > Build: > https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/ > Build: > https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/ > java.util.concurrent.TimeoutException > at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258) > at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143) > at > org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268) > at > org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236) > at > org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15160) Add flag to ignore unreplicated keyspaces during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193842#comment-17193842 ] David Capwell commented on CASSANDRA-15160: --- Speaking to Marcus more, it turns out that this was a regression in 3.0+, as this behavior was valid in 2.1:

{code}
package org.apache.cassandra.distributed.upgrade;

import java.io.IOException;

import org.junit.Test;

import org.apache.cassandra.distributed.Cluster;
import org.apache.cassandra.distributed.api.ConsistencyLevel;
import org.apache.cassandra.distributed.api.IInvokableInstance;
import org.apache.cassandra.distributed.shared.Versions;
import org.assertj.core.api.Assertions;

public class RepairFilteringTest extends UpgradeTestBase
{
    @Test
    public void emptyDC() throws IOException
    {
        Versions versions = Versions.find();
        Versions.Version version = versions.getLatest(Versions.Major.v22);
        try (Cluster cluster = Cluster.build().withVersion(version).withRacks(2, 1, 2).start())
        {
            // 1-2 : datacenter1
            // 3-4 : datacenter2
            cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE + " WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1':2, 'datacenter2':0}");
            cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (id int PRIMARY KEY, i int)");
            for (int i = 0; i < 10; i++)
                cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (id, i) VALUES (?, ?)", ConsistencyLevel.ALL, i, i);
            cluster.forEach(i -> i.flush(KEYSPACE));
            // choose a node in the DC that doesn't have any replicas
            IInvokableInstance node = cluster.get(3);
            Assertions.assertThat(node.config().localDatacenter()).isEqualTo("datacenter2");
            // fails with [2020-09-10 11:30:04,139] Repair command #1 failed with error Nothing to repair for (0,1000] in distributed_test_keyspace - aborting. Check the logs on the repair participants for further details
            node.nodetoolResult("repair", "-full", "-st", "0", "-et", "1000", KEYSPACE, "tbl")
                .asserts().success();
        }
    }
}
{code}

> Add flag to ignore unreplicated keyspaces during repair > --- > > Key: CASSANDRA-15160 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15160 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Repair >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > > When a repair is triggered on a node in 'dc2' for a keyspace with replication > factor {'dc1':3, 'dc2':0} we just ignore the repair in versions < 4. In 4.0 > we fail the repair to make sure the operator does not think the keyspace is > fully repaired. > There might be tooling that relies on the old behaviour though, so we should > add a flag to ignore those unreplicated keyspaces > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16122) Blog / Docs - Create Cassandra on Kubernetes Content
Rahul Singh created CASSANDRA-16122: --- Summary: Blog / Docs - Create Cassandra on Kubernetes Content Key: CASSANDRA-16122 URL: https://issues.apache.org/jira/browse/CASSANDRA-16122 Project: Cassandra Issue Type: New Feature Components: Documentation/Blog, Documentation/Website Reporter: Rahul Singh Assignee: Rahul Singh The Cassandra Kubernetes SIG met and determined that the next best outcome of the group is to create definitive documentation that the current contributors to the existing operators can get behind as a clear guide to Cassandra on Kubernetes that is published on the cassandra.apache.org site. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16122) Blog / Docs - Create Cassandra on Kubernetes Content
[ https://issues.apache.org/jira/browse/CASSANDRA-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rahul Singh updated CASSANDRA-16122: Description: The Cassandra Kubernetes SIG met and determined that the next best outcome of the group is to create definitive documentation that the current contributors to the existing operators can get behind as a clear guide to Cassandra on Kubernetes that is published on the cassandra.apache.org site. Outline / Collaboration Google Doc : [https://docs.google.com/document/d/15sZhAL-m9-a-6iuWmXdyKxsGraaJY9M6_xf9Qj2gSw4/edit?userstoinvite=jim.dickin...@datastax.com=5f5a72f7] was:The Cassandra Kubernetes SIG met and determined that the next best outcome of the group is to create definitive documentation that the current contributors to the existing operators can get behind as a clear guide to Cassandra on Kubernetes that is published on the cassandra.apache.org site. > Blog / Docs - Create Cassandra on Kubernetes Content > - > > Key: CASSANDRA-16122 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16122 > Project: Cassandra > Issue Type: New Feature > Components: Documentation/Blog, Documentation/Website >Reporter: Rahul Singh >Assignee: Rahul Singh >Priority: Normal > > The Cassandra Kubernetes SIG met and determined that the next best outcome of > the group is to create definitive documentation that the current contributors > to the existing operators can get behind as a clear guide to Cassandra on > Kubernetes that is published on the cassandra.apache.org site. > > Outline / Collaboration Google Doc : > [https://docs.google.com/document/d/15sZhAL-m9-a-6iuWmXdyKxsGraaJY9M6_xf9Qj2gSw4/edit?userstoinvite=jim.dickin...@datastax.com=5f5a72f7] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15326) Unable to compute ceiling for max when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193838#comment-17193838 ] Caleb Rackliffe commented on CASSANDRA-15326: - [~mlhsu] When you say "through my java loaders", what particular tooling would that be? Thanks. > Unable to compute ceiling for max when histogram overflowed > --- > > Key: CASSANDRA-15326 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15326 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Batch Log >Reporter: HsuML >Priority: Normal > > I have 9 cassandra nodes. When I create a keyspace that has 15 tables and the > record count of all table are about 8.2 billion. I imported data through my > java loaders, and I found out the system.log has error message. What happened > and how can I solve the error? > Exception in thread Thread[CompactionExecutor:113041,1,main] > java.lang.IllegalStateException: Unable to compute ceiling for max when > histogram overflowed > at > org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:231) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:220) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.io.sstable.metadata.StatsMetadata.getEstimatedDroppableTombstoneRatio(StatsMetadata.java:115) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.io.sstable.format.SSTableReader.getEstimatedDroppableTombstoneRatio(SSTableReader.java:1926) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.AbstractCompactionStrategy.worthDroppingTombstones(AbstractCompactionStrategy.java:424) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundSSTables(SizeTieredCompactionStrategy.java:99) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > 
org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy.getNextBackgroundTask(SizeTieredCompactionStrategy.java:183) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:153) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:260) > ~[apache-cassandra-3.11.4.jar:3.11.4] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_191] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ~[na:1.8.0_191] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_191] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) > [apache-cassandra-3.11.4.jar:3.11.4] > at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16119) MockSchema's SSTableReader creation leaks FileHandle and Channel instances
[ https://issues.apache.org/jira/browse/CASSANDRA-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-16119: Fix Version/s: (was: 4.0-beta) 4.0-beta3 Since Version: 4.0-beta2 Source Control Link: https://github.com/apache/cassandra/commit/54d297a192ca452dab5640f33fd6c22fd31e2f9c Resolution: Fixed Status: Resolved (was: Ready to Commit) And committed in {{54d297a192ca452dab5640f33fd6c22fd31e2f9c}}, thanks! > MockSchema's SSTableReader creation leaks FileHandle and Channel instances > -- > > Key: CASSANDRA-16119 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16119 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 4.0-beta3 > > Time Spent: 10m > Remaining Estimate: 0h > > {{MockSchema}} creates {{SSTableReader}} instances for testing, but when it > does, it doesn’t seem to ever close the {{FileHandle}} and {{Channel}} > instances from which copies are made for the actual readers. ({{FileHandle}} > itself also internally copies the channel on creation.) This can trigger leak > detection, although perhaps not reliably, from tests like > {{AntiCompactionTest}}. A couple well-placed {{try-with-resources}} blocks > should help us avoid this (and shouldn't risk closing anything too early, > since the close methods for handles and channels seem only to do reference > bookkeeping anyway). 
> Example: > {noformat} > [junit-timeout] ERROR 16:35:47,747 LEAK DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@487c0fdb) to class > org.apache.cassandra.io.util.FileHandle$Cleanup@2072030898:/var/folders/4d/zfjs7m7s6x5_l93k33r5k668gn/T/mocksegmentedfile0tmp > was not released before the reference was garbage collected > [junit-timeout] ERROR 16:35:47,747 Allocate trace > org.apache.cassandra.utils.concurrent.Ref$State@487c0fdb: > [junit-timeout] Thread[main,5,main] > [junit-timeout] at java.lang.Thread.getStackTrace(Thread.java:1559) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref$Debug.(Ref.java:249) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref$State.(Ref.java:179) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref.(Ref.java:101) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.SharedCloseableImpl.(SharedCloseableImpl.java:30) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle.(FileHandle.java:74) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle.(FileHandle.java:50) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:389) > [junit-timeout] at > org.apache.cassandra.schema.MockSchema.sstable(MockSchema.java:124) > [junit-timeout] at > org.apache.cassandra.schema.MockSchema.sstable(MockSchema.java:83) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
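The fix described in this ticket follows a general try-with-resources pattern: when a resource's {{close()}} only does reference bookkeeping, the intermediate handle can be scoped to the block that copies from it without closing the copy too early. A minimal illustrative sketch, with a hypothetical {{CountedResource}} standing in for {{FileHandle}}/{{ChannelProxy}} (NOT Cassandra's actual API):

```java
// Illustrative resource whose close() only does reference bookkeeping,
// standing in for FileHandle/ChannelProxy; NOT Cassandra's actual API.
class CountedResource implements AutoCloseable
{
    static int openCount = 0;

    CountedResource() { openCount++; }

    // Readers take their own copy of the handle, so closing the
    // original does not invalidate the copy.
    CountedResource copy() { return new CountedResource(); }

    @Override
    public void close() { openCount--; }
}

public class LeakDemo
{
    // Leaky shape: the intermediate handle is never closed, so its reference
    // is only reclaimed when GC (and the leak detector) finds it.
    static CountedResource leaky()
    {
        CountedResource intermediate = new CountedResource();
        return intermediate.copy();
    }

    // Fixed shape: try-with-resources closes the intermediate as soon as the
    // copy has been taken; the returned copy itself stays open.
    static CountedResource fixed()
    {
        try (CountedResource intermediate = new CountedResource())
        {
            return intermediate.copy();
        }
    }

    public static void main(String[] args)
    {
        try (CountedResource r = fixed())
        {
            System.out.println("open while in use: " + CountedResource.openCount);
        }
        System.out.println("open after close: " + CountedResource.openCount);
    }
}
```

Note that the {{return}} inside the try block is safe: the copy is created before the implicit {{close()}} runs, which is why scoping the builder and intermediate handle this way cannot close anything the reader still needs.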
[jira] [Updated] (CASSANDRA-16119) MockSchema's SSTableReader creation leaks FileHandle and Channel instances
[ https://issues.apache.org/jira/browse/CASSANDRA-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-16119: Reviewers: Marcus Eriksson, Sam Tunnicliffe (was: Marcus Eriksson) > MockSchema's SSTableReader creation leaks FileHandle and Channel instances > -- > > Key: CASSANDRA-16119 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16119 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 4.0-beta3 > > Time Spent: 10m > Remaining Estimate: 0h > > {{MockSchema}} creates {{SSTableReader}} instances for testing, but when it > does, it doesn’t seem to ever close the {{FileHandle}} and {{Channel}} > instances from which copies are made for the actual readers. ({{FileHandle}} > itself also internally copies the channel on creation.) This can trigger leak > detection, although perhaps not reliably, from tests like > {{AntiCompactionTest}}. A couple well-placed {{try-with-resources}} blocks > should help us avoid this (and shouldn't risk closing anything too early, > since the close methods for handles and channels seem only to do reference > bookkeeping anyway). 
> Example: > {noformat} > [junit-timeout] ERROR 16:35:47,747 LEAK DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@487c0fdb) to class > org.apache.cassandra.io.util.FileHandle$Cleanup@2072030898:/var/folders/4d/zfjs7m7s6x5_l93k33r5k668gn/T/mocksegmentedfile0tmp > was not released before the reference was garbage collected > [junit-timeout] ERROR 16:35:47,747 Allocate trace > org.apache.cassandra.utils.concurrent.Ref$State@487c0fdb: > [junit-timeout] Thread[main,5,main] > [junit-timeout] at java.lang.Thread.getStackTrace(Thread.java:1559) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref$Debug.(Ref.java:249) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref$State.(Ref.java:179) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref.(Ref.java:101) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.SharedCloseableImpl.(SharedCloseableImpl.java:30) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle.(FileHandle.java:74) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle.(FileHandle.java:50) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:389) > [junit-timeout] at > org.apache.cassandra.schema.MockSchema.sstable(MockSchema.java:124) > [junit-timeout] at > org.apache.cassandra.schema.MockSchema.sstable(MockSchema.java:83) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15861) Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure
[ https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193818#comment-17193818 ] ZhaoYang commented on CASSANDRA-15861: -- Thanks for the review and feedback > Mutating sstable component may race with entire-sstable-streaming(ZCS) > causing checksum validation failure > -- > > Key: CASSANDRA-15861 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15861 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair, Consistency/Streaming, > Local/Compaction >Reporter: ZhaoYang >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0-beta3 > > > Flaky dtest: [test_dead_sync_initiator - > repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/] > {code:java|title=stacktrace} > Unexpected error found in node logs (see stdout for full details). Errors: > [ERROR [Stream-Deserializer-127.0.0.1:7000-570871f3] 2020-06-03 04:05:19,081 > CassandraEntireSSTableStreamReader.java:145 - [Stream > 6f1c3360-a54f-11ea-a808-2f23710fdc90] Error while reading sstable from stream > for table = keyspace1.standard1 > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db > at > org.apache.cassandra.io.sstable.metadata.MetadataSerializer.maybeValidateChecksum(MetadataSerializer.java:219) > at > org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:198) > at > org.apache.cassandra.io.sstable.metadata.MetadataSerializer.deserialize(MetadataSerializer.java:129) > at > org.apache.cassandra.io.sstable.metadata.MetadataSerializer.mutate(MetadataSerializer.java:226) > at > 
org.apache.cassandra.db.streaming.CassandraEntireSSTableStreamReader.read(CassandraEntireSSTableStreamReader.java:140) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.read(CassandraIncomingFile.java:78) > at > org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:49) > at > org.apache.cassandra.streaming.messages.IncomingStreamMessage$1.deserialize(IncomingStreamMessage.java:36) > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:49) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:181) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.IOException: Checksums do not match for > /home/cassandra/cassandra/cassandra-dtest/tmp/dtest-te4ty0r9/test/node3/data0/keyspace1/standard1-5f5ab140a54f11eaa8082f23710fdc90/na-2-big-Statistics.db > {code} > > In the above test, it executes "nodetool repair" on node1 and kills node2 > during repair. At the end, node3 reports checksum validation failure on > sstable transferred from node1. > {code:java|title=what happened} > 1. When repair started on node1, it performs anti-compaction which modifies > sstable's repairAt to 0 and pending repair id to session-id. > 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be > transferred to node3. > 3. Before node1 actually sends the files to node3, node2 is killed and node1 > starts to broadcast repair-failure-message to all participants in > {{CoordinatorSession#fail}} > 4. Node1 receives its own repair-failure-message and fails its local repair > sessions at {{LocalSessions#failSession}} which triggers async background > compaction. > 5. 
Node1's background compaction will mutate sstable's repairAt to 0 and > pending repair id to null via > {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no more > in-progress repair. > 6. Node1 actually sends the sstable to node3 where the sstable's STATS > component size is different from the original size recorded in the manifest. > 7. At the end, node3 reports checksum validation failure when it tries to > mutate sstable level and "isTransient" attribute in > {{CassandraEntireSSTableStreamReader#read}}. > {code} > Currently, entire-sstable-streaming requires sstable components to be > immutable, because \{{ComponentManifest}} > with component sizes are sent before sending actual files. This isn't a > problem in legacy streaming as STATS file length didn't matter. > > Ideally it will be great to make sstable STATS metadata immutable, just like
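The seven steps above amount to a time-of-check/time-of-use race: the sender snapshots component sizes (and implicitly their checksums) into the manifest, the STATS component is mutated afterwards, and the receiver validates the streamed bytes against the stale snapshot. A toy model of the resulting mismatch, with hypothetical file contents (NOT Cassandra's actual ComponentManifest format):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Toy model of the entire-sstable-streaming race: the manifest captures the
// component's length/checksum, the component is mutated before transfer, and
// the receiver's validation against the stale manifest then fails.
// Hypothetical contents; NOT Cassandra's actual streaming code.
public class ManifestRaceDemo
{
    static long checksum(byte[] data)
    {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args)
    {
        // Steps 1-2: anti-compaction marks the sstable pending repair, then the
        // sender records the STATS component's length/checksum in the manifest.
        byte[] stats = "repairedAt=0,pendingRepair=<session-id>".getBytes(StandardCharsets.UTF_8);
        int manifestLength = stats.length;
        long manifestChecksum = checksum(stats);

        // Steps 4-5: the repair session fails and background compaction
        // rewrites the component, clearing the pending repair id.
        stats = "repairedAt=0,pendingRepair=null".getBytes(StandardCharsets.UTF_8);

        // Steps 6-7: the bytes actually streamed no longer match the manifest.
        System.out.println("length matches:   " + (stats.length == manifestLength));
        System.out.println("checksum matches: " + (checksum(stats) == manifestChecksum));
    }
}
```

This is why the ticket argues for making the STATS component effectively immutable (or re-snapshotting it at send time): any mutation between manifest creation and transfer invalidates the receiver's validation.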
[cassandra] branch trunk updated: Make sure MockSchema.sstable() disposes of its FileHandles properly
This is an automated email from the ASF dual-hosted git repository. samt pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new 54d297a Make sure MockSchema.sstable() disposes of its FileHandles properly 54d297a is described below commit 54d297a192ca452dab5640f33fd6c22fd31e2f9c Author: Caleb Rackliffe AuthorDate: Wed Sep 9 16:35:38 2020 -0500 Make sure MockSchema.sstable() disposes of its FileHandles properly Patch by Caleb Rackcliffe; reviewed by Marcus Eriksson and Sam Tunnicliffe for CASSANDRA-16119 --- .../org/apache/cassandra/schema/MockSchema.java| 48 +++--- 1 file changed, 25 insertions(+), 23 deletions(-) diff --git a/test/unit/org/apache/cassandra/schema/MockSchema.java b/test/unit/org/apache/cassandra/schema/MockSchema.java index 40b0f87..5ce8520 100644 --- a/test/unit/org/apache/cassandra/schema/MockSchema.java +++ b/test/unit/org/apache/cassandra/schema/MockSchema.java @@ -111,35 +111,37 @@ public class MockSchema } } // .complete() with size to make sstable.onDiskLength work -@SuppressWarnings("resource") -FileHandle fileHandle = new FileHandle.Builder(new ChannelProxy(tempFile)).bufferSize(size).complete(size); -if (size > 0) +try (FileHandle.Builder builder = new FileHandle.Builder(new ChannelProxy(tempFile)).bufferSize(size); + FileHandle fileHandle = builder.complete(size)) { -try +if (size > 0) { -File file = new File(descriptor.filenameFor(Component.DATA)); -try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) +try { -raf.setLength(size); +File file = new File(descriptor.filenameFor(Component.DATA)); +try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) +{ +raf.setLength(size); +} +} +catch (IOException e) +{ +throw new RuntimeException(e); } } -catch (IOException e) -{ -throw new RuntimeException(e); -} +SerializationHeader header = SerializationHeader.make(cfs.metadata(), Collections.emptyList()); +StatsMetadata 
metadata = (StatsMetadata) new MetadataCollector(cfs.metadata().comparator) + .finalizeMetadata(cfs.metadata().partitioner.getClass().getCanonicalName(), 0.01f, UNREPAIRED_SSTABLE, null, false, header) + .get(MetadataType.STATS); +SSTableReader reader = SSTableReader.internalOpen(descriptor, components, cfs.metadata, + fileHandle.sharedCopy(), fileHandle.sharedCopy(), indexSummary.sharedCopy(), + new AlwaysPresentFilter(), 1L, metadata, SSTableReader.OpenReason.NORMAL, header); +reader.first = readerBounds(firstToken); +reader.last = readerBounds(lastToken); +if (!keepRef) +reader.selfRef().release(); +return reader; } -SerializationHeader header = SerializationHeader.make(cfs.metadata(), Collections.emptyList()); -StatsMetadata metadata = (StatsMetadata) new MetadataCollector(cfs.metadata().comparator) - .finalizeMetadata(cfs.metadata().partitioner.getClass().getCanonicalName(), 0.01f, UNREPAIRED_SSTABLE, null, false, header) - .get(MetadataType.STATS); -SSTableReader reader = SSTableReader.internalOpen(descriptor, components, cfs.metadata, - fileHandle.sharedCopy(), fileHandle.sharedCopy(), indexSummary.sharedCopy(), - new AlwaysPresentFilter(), 1L, metadata, SSTableReader.OpenReason.NORMAL, header); -reader.first = readerBounds(firstToken); -reader.last = readerBounds(lastToken); -if (!keepRef) -reader.selfRef().release(); -return reader; } public static ColumnFamilyStore newCFS() - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
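The shape of the fix in the diff above is the standard try-with-resources pattern: instead of building the `FileHandle` free-standing (and suppressing the resource warning), both the builder and the handle are scoped so they are released even if the body throws. A minimal stand-alone illustration with generic `AutoCloseable`s, not the Cassandra classes:

```java
public class TryWithResourcesDemo
{
    static final StringBuilder events = new StringBuilder();

    // A toy resource that records when it is opened and closed.
    static class Handle implements AutoCloseable
    {
        final String name;
        Handle(String name) { this.name = name; events.append("open:").append(name).append(' '); }
        @Override public void close() { events.append("close:").append(name).append(' '); }
    }

    public static void main(String[] args)
    {
        // Both resources are closed in reverse declaration order when the
        // block exits, whether it completes normally or throws.
        try (Handle builder = new Handle("builder");
             Handle fileHandle = new Handle("fileHandle"))
        {
            events.append("use ");
        }
        System.out.println(events.toString().trim());
        // prints: open:builder open:fileHandle use close:fileHandle close:builder
    }
}
```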
[jira] [Commented] (CASSANDRA-15958) org.apache.cassandra.net.ConnectionTest testMessagePurging
[ https://issues.apache.org/jira/browse/CASSANDRA-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193813#comment-17193813 ] Yifan Cai commented on CASSANDRA-15958: --- Not seeing the link to the CI run. Attaching one: [https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=C-15958] > org.apache.cassandra.net.ConnectionTest testMessagePurging > -- > > Key: CASSANDRA-15958 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15958 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: David Capwell >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > Build: > https://ci-cassandra.apache.org/job/Cassandra-trunk-test/196/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/ > Build: > https://ci-cassandra.apache.org/job/Cassandra-trunk-test/194/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessagePurging/ > java.util.concurrent.TimeoutException > at org.apache.cassandra.net.AsyncPromise.get(AsyncPromise.java:258) > at org.apache.cassandra.net.FutureDelegate.get(FutureDelegate.java:143) > at > org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:268) > at > org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:236) > at > org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:679) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15160) Add flag to ignore unreplicated keyspaces during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193791#comment-17193791 ] David Capwell commented on CASSANDRA-15160: --- when I run the tests on 4.0 emptyDC and hostFilterDifferentDC fail with a different message ("Repair command #1 failed with error Endpoints can not be empty"), but they still fail (behavior of 3.0). > Add flag to ignore unreplicated keyspaces during repair > --- > > Key: CASSANDRA-15160 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15160 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Repair >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > > When a repair is triggered on a node in 'dc2' for a keyspace with replication > factor {'dc1':3, 'dc2':0} we just ignore the repair in versions < 4. In 4.0 > we fail the repair to make sure the operator does not think the keyspace is > fully repaired. > There might be tooling that relies on the old behaviour though, so we should > add a flag to ignore those unreplicated keyspaces > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs
[ https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193788#comment-17193788 ] Yifan Cai edited comment on CASSANDRA-16120 at 9/10/20, 6:38 PM: - Also reviewed the PR to trunk. Added a few nits. I also verified the changes locally by 1. Check out the API PR and install API snapshot to maven local 2. Check out the trunk PR and run test using {code:none} ant testsome -Dtest.name=org.apache.cassandra.distributed.test.JVMDTestTest -Dtest.methods=instanceLogs ... BUILD SUCCESSFUL Total time: 53 seconds {code} The new test `org.apache.cassandra.distributed.test.JVMDTestTest#instanceLogs` passes. was (Author: yifanc): Also reviewed the PR to trunk. Added a few nits. I also verified the changes locally by 1. Check out the API PR and install API snapshot to maven local 2. Check out the trunk PR and run test using {code:plaintext} ant testsome -Dtest.name=org.apache.cassandra.distributed.test.JVMDTestTest -Dtest.methods=instanceLogs ... BUILD SUCCESSFUL Total time: 53 seconds {code} The new test `org.apache.cassandra.distributed.test.JVMDTestTest#instanceLogs` passes. > Add ability for jvm-dtest to grep instance logs > --- > > Key: CASSANDRA-16120 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16120 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-beta > > Time Spent: 50m > Remaining Estimate: 0h > > One of the main gaps between python dtest and jvm dtest is python dtest > supports the ability to grep the logs of an instance; we need this capability > as some tests require validating logs were triggered. 
> Pydocs for common log methods > {code} > | grep_log(self, expr, filename='system.log', from_mark=None) > | Returns a list of lines matching the regular expression in parameter > | in the Cassandra log of this node > | > | grep_log_for_errors(self, filename='system.log') > | Returns a list of errors with stack traces > | in the Cassandra log of this node > | > | grep_log_for_errors_from(self, filename='system.log', seek_start=0) > {code} > {code} > | watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, > verbose=False, filename='system.log') > | Watch the log until one or more (regular) expression are found. > | This methods when all the expressions have been found or the method > | timeouts (a TimeoutError is then raised). On successful completion, > | a list of pair (line matched, match object) is returned. > {code} > Below is a POC showing a way to do such logic > {code} > package org.apache.cassandra.distributed.test; > import java.io.BufferedReader; > import java.io.FileInputStream; > import java.io.IOException; > import java.io.InputStreamReader; > import java.io.UncheckedIOException; > import java.nio.charset.StandardCharsets; > import java.util.Iterator; > import java.util.Spliterator; > import java.util.Spliterators; > import java.util.regex.Matcher; > import java.util.regex.Pattern; > import java.util.stream.Stream; > import java.util.stream.StreamSupport; > import com.google.common.io.Closeables; > import org.junit.Test; > import org.apache.cassandra.distributed.Cluster; > import org.apache.cassandra.utils.AbstractIterator; > public class AllTheLogs extends TestBaseImpl > { >@Test >public void test() throws IOException >{ >try (final Cluster cluster = init(Cluster.build(1).start())) >{ >String tag = System.getProperty("cassandra.testtag", > "cassandra.testtag_IS_UNDEFINED"); >String suite = System.getProperty("suitename", > "suitename_IS_UNDEFINED"); >String log = String.format("build/test/logs/%s/TEST-%s.log", tag, > suite); >grep(log, 
"Enqueuing flush of tables").forEach(l -> > System.out.println("I found the thing: " + l)); >} >} >private static Stream grep(String file, String regex) throws > IOException >{ >return grep(file, Pattern.compile(regex)); >} >private static Stream grep(String file, Pattern regex) throws > IOException >{ >BufferedReader reader = new BufferedReader(new InputStreamReader(new > FileInputStream(file), StandardCharsets.UTF_8)); >Iterator it = new AbstractIterator() >{ >protected String computeNext() >{ >try >{ >String s; >while ((s = reader.readLine()) != null) >{ >Matcher m = regex.matcher(s); >if (m.find()) >
[jira] [Commented] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs
[ https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193788#comment-17193788 ] Yifan Cai commented on CASSANDRA-16120: --- Also reviewed the PR to trunk. Added a few nits. I also verified the changes locally by 1. Check out the API PR and install API snapshot to maven local 2. Check out the trunk PR and run test using {code:plaintext} ant testsome -Dtest.name=org.apache.cassandra.distributed.test.JVMDTestTest -Dtest.methods=instanceLogs ... BUILD SUCCESSFUL Total time: 53 seconds {code} The new test `org.apache.cassandra.distributed.test.JVMDTestTest#instanceLogs` passes. > Add ability for jvm-dtest to grep instance logs > --- > > Key: CASSANDRA-16120 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16120 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-beta > > Time Spent: 50m > Remaining Estimate: 0h > > One of the main gaps between python dtest and jvm dtest is python dtest > supports the ability to grep the logs of an instance; we need this capability > as some tests require validating logs were triggered. > Pydocs for common log methods > {code} > | grep_log(self, expr, filename='system.log', from_mark=None) > | Returns a list of lines matching the regular expression in parameter > | in the Cassandra log of this node > | > | grep_log_for_errors(self, filename='system.log') > | Returns a list of errors with stack traces > | in the Cassandra log of this node > | > | grep_log_for_errors_from(self, filename='system.log', seek_start=0) > {code} > {code} > | watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, > verbose=False, filename='system.log') > | Watch the log until one or more (regular) expression are found. 
> | This methods when all the expressions have been found or the method > | timeouts (a TimeoutError is then raised). On successful completion, > | a list of pair (line matched, match object) is returned. > {code} > Below is a POC showing a way to do such logic > {code} > package org.apache.cassandra.distributed.test; > import java.io.BufferedReader; > import java.io.FileInputStream; > import java.io.IOException; > import java.io.InputStreamReader; > import java.io.UncheckedIOException; > import java.nio.charset.StandardCharsets; > import java.util.Iterator; > import java.util.Spliterator; > import java.util.Spliterators; > import java.util.regex.Matcher; > import java.util.regex.Pattern; > import java.util.stream.Stream; > import java.util.stream.StreamSupport; > import com.google.common.io.Closeables; > import org.junit.Test; > import org.apache.cassandra.distributed.Cluster; > import org.apache.cassandra.utils.AbstractIterator; > public class AllTheLogs extends TestBaseImpl > { >@Test >public void test() throws IOException >{ >try (final Cluster cluster = init(Cluster.build(1).start())) >{ >String tag = System.getProperty("cassandra.testtag", > "cassandra.testtag_IS_UNDEFINED"); >String suite = System.getProperty("suitename", > "suitename_IS_UNDEFINED"); >String log = String.format("build/test/logs/%s/TEST-%s.log", tag, > suite); >grep(log, "Enqueuing flush of tables").forEach(l -> > System.out.println("I found the thing: " + l)); >} >} >private static Stream grep(String file, String regex) throws > IOException >{ >return grep(file, Pattern.compile(regex)); >} >private static Stream grep(String file, Pattern regex) throws > IOException >{ >BufferedReader reader = new BufferedReader(new InputStreamReader(new > FileInputStream(file), StandardCharsets.UTF_8)); >Iterator it = new AbstractIterator() >{ >protected String computeNext() >{ >try >{ >String s; >while ((s = reader.readLine()) != null) >{ >Matcher m = regex.matcher(s); >if (m.find()) >return s; >} 
>reader.close(); >return endOfData(); >} >catch (IOException e) >{ >Closeables.closeQuietly(reader); >throw new UncheckedIOException(e); >} >} >}; >return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, > Spliterator.ORDERED), false); >} > } > {code} > And > {code} > @Test >public void
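The POC above leans on Cassandra's `AbstractIterator` and manual reader management; the same log-grepping helper can be written dependency-free with `Files.lines`, which handles closing the underlying reader. This is a sketch, not the API that landed for CASSANDRA-16120 — the class and file names below are illustrative:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LogGrep
{
    /** Returns the lines of {@code file} matching {@code regex}, like ccm's grep_log. */
    public static List<String> grep(Path file, String regex) throws IOException
    {
        Pattern pattern = Pattern.compile(regex);
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8))
        {
            return lines.filter(l -> pattern.matcher(l).find())
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path log = Files.createTempFile("system", ".log");
        Files.write(log, List.of("INFO  Enqueuing flush of tables",
                                 "DEBUG something else"));
        // prints: I found the thing: INFO  Enqueuing flush of tables
        grep(log, "Enqueuing flush").forEach(l -> System.out.println("I found the thing: " + l));
        Files.deleteIfExists(log);
    }
}
```

Collecting into a `List` (rather than returning a lazy `Stream` as the POC does) trades laziness for not having to worry about who closes the reader.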
[jira] [Commented] (CASSANDRA-15160) Add flag to ignore unreplicated keyspaces during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193787#comment-17193787 ] David Capwell commented on CASSANDRA-15160: --- left comments on the commits, but I am unable to replicate the defined behavior in 3.0; ran the below tests in 3.0 and the unreplicated case all failed for me. {code} package org.apache.cassandra.distributed.test; import java.io.IOException; import org.junit.Test; import org.apache.cassandra.distributed.Cluster; import org.apache.cassandra.distributed.api.ConsistencyLevel; import org.apache.cassandra.distributed.api.IInvokableInstance; import org.assertj.core.api.Assertions; public class RepairFilteringTest extends TestBaseImpl { @Test public void dcFilterOnEmptyDC() throws IOException { try (Cluster cluster = Cluster.build().withRacks(2, 1, 2).start()) { // 1-2 : datacenter1 // 3-4 : datacenter2 cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE + " WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1':2, 'datacenter2':0}"); cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (id int PRIMARY KEY, i int)"); for (int i = 0; i < 10; i++) cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (id, i) VALUES (?, ?)", ConsistencyLevel.ALL, i, i); cluster.forEach(i -> i.flush(KEYSPACE)); // choose a node in the DC that doesn't have any replicas IInvokableInstance node = cluster.get(3); Assertions.assertThat(node.config().localDatacenter()).isEqualTo("datacenter2"); // fails with "the local data center must be part of the repair" node.nodetoolResult("repair", "-full", "-dc", "datacenter1", "-st", "0", "-et", "1000", KEYSPACE, "tbl") .asserts().failure().errorContains("the local data center must be part of the repair"); } } @Test public void hostFilterDifferentDC() throws IOException { try (Cluster cluster = Cluster.build().withRacks(2, 1, 2).start()) { // 1-2 : datacenter1 // 3-4 : datacenter2 cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE 
+ " WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1':2, 'datacenter2':0}"); cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (id int PRIMARY KEY, i int)"); for (int i = 0; i < 10; i++) cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (id, i) VALUES (?, ?)", ConsistencyLevel.ALL, i, i); cluster.forEach(i -> i.flush(KEYSPACE)); // choose a node in the DC that doesn't have any replicas IInvokableInstance node = cluster.get(3); Assertions.assertThat(node.config().localDatacenter()).isEqualTo("datacenter2"); // fails with "Specified hosts [127.0.0.3, 127.0.0.1] do not share range (0,1000] needed for repair. Either restrict repair ranges with -st/-et options, or specify one of the neighbors that share this range with this node: [].. Check the logs on the repair participants for further details" node.nodetoolResult("repair", "-full", "-hosts", cluster.get(1).broadcastAddress().getAddress().getHostAddress(), "-hosts", node.broadcastAddress().getAddress().getHostAddress(), "-st", "0", "-et", "1000", KEYSPACE, "tbl") .asserts().failure().errorContains("do not share range (0,1000] needed for repair"); } } @Test public void emptyDC() throws IOException { try (Cluster cluster = Cluster.build().withRacks(2, 1, 2).start()) { // 1-2 : datacenter1 // 3-4 : datacenter2 cluster.schemaChange("CREATE KEYSPACE " + KEYSPACE + " WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1':2, 'datacenter2':0}"); cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (id int PRIMARY KEY, i int)"); for (int i = 0; i < 10; i++) cluster.coordinator(1).execute("INSERT INTO " + KEYSPACE + ".tbl (id, i) VALUES (?, ?)", ConsistencyLevel.ALL, i, i); cluster.forEach(i -> i.flush(KEYSPACE)); // choose a node in the DC that doesn't have any replicas IInvokableInstance node = cluster.get(3); Assertions.assertThat(node.config().localDatacenter()).isEqualTo("datacenter2"); // fails with [2020-09-10 11:30:04,139] Repair command #1 failed with error 
Nothing to repair for (0,1000] in distributed_test_keyspace - aborting. Check the logs on the repair participants for further details node.nodetoolResult("repair", "-full", "-st", "0", "-et", "1000",
[jira] [Commented] (CASSANDRA-15160) Add flag to ignore unreplicated keyspaces during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193786#comment-17193786 ] David Capwell commented on CASSANDRA-15160: --- Curious, if https://issues.apache.org/jira/browse/CASSANDRA-16120 lands would we need python dtest to implement these test? > Add flag to ignore unreplicated keyspaces during repair > --- > > Key: CASSANDRA-15160 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15160 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Repair >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > > When a repair is triggered on a node in 'dc2' for a keyspace with replication > factor {'dc1':3, 'dc2':0} we just ignore the repair in versions < 4. In 4.0 > we fail the repair to make sure the operator does not think the keyspace is > fully repaired. > There might be tooling that relies on the old behaviour though, so we should > add a flag to ignore those unreplicated keyspaces > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15406) Show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193770#comment-17193770 ] Stefan Miklosovic commented on CASSANDRA-15406: --- [~blerer] [~dcapwell] I have rebased this on top of current trunk, which now includes CASSANDRA-15861. It is the same PR. The Jenkins build is here: https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/16/ > Show the progress of data streaming and index build > > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0, 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > I found that we should supply a command to show the progress of streaming > when we do the operation of bootstrap/move/decommission/removenode. When > streaming data, nobody knows which step the program is in, so I think a > command to show the joining/leaving node's progress is needed. > > PR [https://github.com/apache/cassandra/pull/711] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15406) Show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-15406: -- Description: I found that we should supply a command to show the progress of streaming when we do the operation of bootstrap/move/decommission/removenode. For when do data streaming , noboday knows which steps there program are in , so I think a command to show the joing/leaving node's is needed . PR [https://github.com/apache/cassandra/pull/711] was: I found that we should supply a command to show the progress of streaming when we do the operation of bootstrap/move/decommission/removenode. For when do data streaming , noboday knows which steps there program are in , so I think a command to show the joing/leaving node's is needed . PR [https://github.com/apache/cassandra/pull/558] > Show the progress of data streaming and index build > > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0, 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > I found that we should supply a command to show the progress of streaming > when we do the operation of bootstrap/move/decommission/removenode. For when > do data streaming , noboday knows which steps there program are in , so I > think a command to show the joing/leaving node's is needed . > > PR [https://github.com/apache/cassandra/pull/711] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16057) Should update in-jvm dtest to expose stdout and stderr for nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16057: -- Source Control Link: https://github.com/apache/cassandra/pull/749 > Should update in-jvm dtest to expose stdout and stderr for nodetool > --- > > Key: CASSANDRA-16057 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16057 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: Yifan Cai >Priority: Normal > > Many nodetool commands output to stdout or stderr so running nodetool using > in-jvm dtest should expose that to tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16057) Should update in-jvm dtest to expose stdout and stderr for nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193755#comment-17193755 ] Yifan Cai commented on CASSANDRA-16057: --- Adding the PR to trunk: https://github.com/apache/cassandra/pull/749 (The CI won't run until the above PR to the API is merged and a new release is out) The PR does mainly two things: - Makes the {{out/err}} streams in nodetool and its commands pluggable. {{ConsoleOutputProvider}} is introduced to contain both streams; a bunch of files are touched because of it. - Adds {{CapturingConsoleOutputProvider}} to save the outputs and pass them to {{NodeToolResult}}. {{NodeToolTest.java}} shows how to assert on the captured logs in the nodetool result. > Should update in-jvm dtest to expose stdout and stderr for nodetool > --- > > Key: CASSANDRA-16057 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16057 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: Yifan Cai >Priority: Normal > > Many nodetool commands output to stdout or stderr so running nodetool using > in-jvm dtest should expose that to tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
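The capture mechanism described in the comment above can be sketched generically: route a command's output through injected `PrintStream`s backed by byte buffers, then hand the captured text back to the test. Note `CapturingOutput` and `runCommand` below are hypothetical stand-ins, not the `ConsoleOutputProvider`/`CapturingConsoleOutputProvider` classes from the actual patch:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class CapturingOutput
{
    private final ByteArrayOutputStream outBytes = new ByteArrayOutputStream();
    private final ByteArrayOutputStream errBytes = new ByteArrayOutputStream();
    public final PrintStream out;
    public final PrintStream err;

    public CapturingOutput() throws UnsupportedEncodingException
    {
        // autoflush=true so captured text is visible as soon as it is printed
        out = new PrintStream(outBytes, true, "UTF-8");
        err = new PrintStream(errBytes, true, "UTF-8");
    }

    public String stdout() { return new String(outBytes.toByteArray(), StandardCharsets.UTF_8); }
    public String stderr() { return new String(errBytes.toByteArray(), StandardCharsets.UTF_8); }

    // A command that takes its streams as parameters instead of writing to
    // System.out/System.err directly is trivially capturable.
    static void runCommand(PrintStream out, PrintStream err)
    {
        out.println("Repair completed");
        err.println("warning: example diagnostic");
    }

    public static void main(String[] args) throws UnsupportedEncodingException
    {
        CapturingOutput capture = new CapturingOutput();
        runCommand(capture.out, capture.err);
        if (!capture.stdout().contains("Repair completed"))
            throw new AssertionError("stdout not captured");
        System.out.println("captured stderr: " + capture.stderr().trim());
    }
}
```

Making the streams constructor parameters (rather than swapping `System.out` globally) keeps concurrent in-jvm dtest instances from stepping on each other's output.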
[jira] [Updated] (CASSANDRA-16057) Should update in-jvm dtest to expose stdout and stderr for nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-16057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16057: -- Test and Documentation Plan: jvm dtest Status: Patch Available (was: Open) > Should update in-jvm dtest to expose stdout and stderr for nodetool > --- > > Key: CASSANDRA-16057 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16057 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: Yifan Cai >Priority: Normal > > Many nodetool commands output to stdout or stderr so running nodetool using > in-jvm dtest should expose that to tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs
[ https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-16120: -- Reviewers: Yifan Cai, Yifan Cai (was: Yifan Cai) Yifan Cai, Yifan Cai Status: Review In Progress (was: Patch Available) > Add ability for jvm-dtest to grep instance logs > --- > > Key: CASSANDRA-16120 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16120 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-beta > > Time Spent: 0.5h > Remaining Estimate: 0h > > One of the main gaps between python dtest and jvm dtest is python dtest > supports the ability to grep the logs of an instance; we need this capability > as some tests require validating logs were triggered. > Pydocs for common log methods > {code} > | grep_log(self, expr, filename='system.log', from_mark=None) > | Returns a list of lines matching the regular expression in parameter > | in the Cassandra log of this node > | > | grep_log_for_errors(self, filename='system.log') > | Returns a list of errors with stack traces > | in the Cassandra log of this node > | > | grep_log_for_errors_from(self, filename='system.log', seek_start=0) > {code} > {code} > | watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, > verbose=False, filename='system.log') > | Watch the log until one or more (regular) expression are found. > | This methods when all the expressions have been found or the method > | timeouts (a TimeoutError is then raised). On successful completion, > | a list of pair (line matched, match object) is returned. 
> {code} > Below is a POC showing a way to do such logic > {code} > package org.apache.cassandra.distributed.test; > import java.io.BufferedReader; > import java.io.FileInputStream; > import java.io.IOException; > import java.io.InputStreamReader; > import java.io.UncheckedIOException; > import java.nio.charset.StandardCharsets; > import java.util.Iterator; > import java.util.Spliterator; > import java.util.Spliterators; > import java.util.regex.Matcher; > import java.util.regex.Pattern; > import java.util.stream.Stream; > import java.util.stream.StreamSupport; > import com.google.common.io.Closeables; > import org.junit.Test; > import org.apache.cassandra.distributed.Cluster; > import org.apache.cassandra.utils.AbstractIterator; > public class AllTheLogs extends TestBaseImpl > { >@Test >public void test() throws IOException >{ >try (final Cluster cluster = init(Cluster.build(1).start())) >{ >String tag = System.getProperty("cassandra.testtag", > "cassandra.testtag_IS_UNDEFINED"); >String suite = System.getProperty("suitename", > "suitename_IS_UNDEFINED"); >String log = String.format("build/test/logs/%s/TEST-%s.log", tag, > suite); >grep(log, "Enqueuing flush of tables").forEach(l -> > System.out.println("I found the thing: " + l)); >} >} >private static Stream grep(String file, String regex) throws > IOException >{ >return grep(file, Pattern.compile(regex)); >} >private static Stream grep(String file, Pattern regex) throws > IOException >{ >BufferedReader reader = new BufferedReader(new InputStreamReader(new > FileInputStream(file), StandardCharsets.UTF_8)); >Iterator it = new AbstractIterator() >{ >protected String computeNext() >{ >try >{ >String s; >while ((s = reader.readLine()) != null) >{ >Matcher m = regex.matcher(s); >if (m.find()) >return s; >} >reader.close(); >return endOfData(); >} >catch (IOException e) >{ >Closeables.closeQuietly(reader); >throw new UncheckedIOException(e); >} >} >}; >return 
StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, > Spliterator.ORDERED), false); >} > } > {code} > And > {code} > @Test >public void test() throws IOException >{ >try (final Cluster cluster = init(Cluster.build(1).start())) >{ >String tag = System.getProperty("cassandra.testtag", > "cassandra.testtag_IS_UNDEFINED"); >String suite = System.getProperty("suitename", > "suitename_IS_UNDEFINED"); >//TODO missing way to get node id >
[jira] [Commented] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs
[ https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193730#comment-17193730 ] Yifan Cai commented on CASSANDRA-16120: --- Pretty excited for being able to asserting on the instance logs. I took a first look at the PR to the jvm dtest API repo. Overall it looks good. Added some suggestions for expanding the {{LogAction}} API. > Add ability for jvm-dtest to grep instance logs > --- > > Key: CASSANDRA-16120 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16120 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-beta > > Time Spent: 0.5h > Remaining Estimate: 0h > > One of the main gaps between python dtest and jvm dtest is python dtest > supports the ability to grep the logs of an instance; we need this capability > as some tests require validating logs were triggered. > Pydocs for common log methods > {code} > | grep_log(self, expr, filename='system.log', from_mark=None) > | Returns a list of lines matching the regular expression in parameter > | in the Cassandra log of this node > | > | grep_log_for_errors(self, filename='system.log') > | Returns a list of errors with stack traces > | in the Cassandra log of this node > | > | grep_log_for_errors_from(self, filename='system.log', seek_start=0) > {code} > {code} > | watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, > verbose=False, filename='system.log') > | Watch the log until one or more (regular) expression are found. > | This methods when all the expressions have been found or the method > | timeouts (a TimeoutError is then raised). On successful completion, > | a list of pair (line matched, match object) is returned. 
> {code}
> Below is a POC showing a way to do such logic
> {code}
> package org.apache.cassandra.distributed.test;
>
> import java.io.BufferedReader;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStreamReader;
> import java.io.UncheckedIOException;
> import java.nio.charset.StandardCharsets;
> import java.util.Iterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.stream.Stream;
> import java.util.stream.StreamSupport;
>
> import com.google.common.io.Closeables;
> import org.junit.Test;
>
> import org.apache.cassandra.distributed.Cluster;
> import org.apache.cassandra.utils.AbstractIterator;
>
> public class AllTheLogs extends TestBaseImpl
> {
>     @Test
>     public void test() throws IOException
>     {
>         try (final Cluster cluster = init(Cluster.build(1).start()))
>         {
>             String tag = System.getProperty("cassandra.testtag", "cassandra.testtag_IS_UNDEFINED");
>             String suite = System.getProperty("suitename", "suitename_IS_UNDEFINED");
>             String log = String.format("build/test/logs/%s/TEST-%s.log", tag, suite);
>             grep(log, "Enqueuing flush of tables").forEach(l -> System.out.println("I found the thing: " + l));
>         }
>     }
>
>     private static Stream<String> grep(String file, String regex) throws IOException
>     {
>         return grep(file, Pattern.compile(regex));
>     }
>
>     private static Stream<String> grep(String file, Pattern regex) throws IOException
>     {
>         BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8));
>         Iterator<String> it = new AbstractIterator<String>()
>         {
>             protected String computeNext()
>             {
>                 try
>                 {
>                     String s;
>                     while ((s = reader.readLine()) != null)
>                     {
>                         Matcher m = regex.matcher(s);
>                         if (m.find())
>                             return s;
>                     }
>                     reader.close();
>                     return endOfData();
>                 }
>                 catch (IOException e)
>                 {
>                     Closeables.closeQuietly(reader);
>                     throw new UncheckedIOException(e);
>                 }
>             }
>         };
>         return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
>     }
> }
> {code}
> And
> {code}
> @Test
>     public void test() throws IOException
>     {
>         try (final Cluster cluster = init(Cluster.build(1).start()))
>         {
>             String tag = System.getProperty("cassandra.testtag", "cassandra.testtag_IS_UNDEFINED");
>             String suite =
[jira] [Commented] (CASSANDRA-16120) Add ability for jvm-dtest to grep instance logs
[ https://issues.apache.org/jira/browse/CASSANDRA-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193716#comment-17193716 ] David Capwell commented on CASSANDRA-16120: --- trunk and dtest patches are up. > Add ability for jvm-dtest to grep instance logs > --- > > Key: CASSANDRA-16120 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16120 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest/java >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-beta > > Time Spent: 20m > Remaining Estimate: 0h > > One of the main gaps between python dtest and jvm dtest is python dtest > supports the ability to grep the logs of an instance; we need this capability > as some tests require validating logs were triggered. > Pydocs for common log methods > {code} > | grep_log(self, expr, filename='system.log', from_mark=None) > | Returns a list of lines matching the regular expression in parameter > | in the Cassandra log of this node > | > | grep_log_for_errors(self, filename='system.log') > | Returns a list of errors with stack traces > | in the Cassandra log of this node > | > | grep_log_for_errors_from(self, filename='system.log', seek_start=0) > {code} > {code} > | watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, > verbose=False, filename='system.log') > | Watch the log until one or more (regular) expression are found. > | This methods when all the expressions have been found or the method > | timeouts (a TimeoutError is then raised). On successful completion, > | a list of pair (line matched, match object) is returned. 
> {code}
> Below is a POC showing a way to do such logic
> {code}
> package org.apache.cassandra.distributed.test;
>
> import java.io.BufferedReader;
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.io.InputStreamReader;
> import java.io.UncheckedIOException;
> import java.nio.charset.StandardCharsets;
> import java.util.Iterator;
> import java.util.Spliterator;
> import java.util.Spliterators;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.stream.Stream;
> import java.util.stream.StreamSupport;
>
> import com.google.common.io.Closeables;
> import org.junit.Test;
>
> import org.apache.cassandra.distributed.Cluster;
> import org.apache.cassandra.utils.AbstractIterator;
>
> public class AllTheLogs extends TestBaseImpl
> {
>     @Test
>     public void test() throws IOException
>     {
>         try (final Cluster cluster = init(Cluster.build(1).start()))
>         {
>             String tag = System.getProperty("cassandra.testtag", "cassandra.testtag_IS_UNDEFINED");
>             String suite = System.getProperty("suitename", "suitename_IS_UNDEFINED");
>             String log = String.format("build/test/logs/%s/TEST-%s.log", tag, suite);
>             grep(log, "Enqueuing flush of tables").forEach(l -> System.out.println("I found the thing: " + l));
>         }
>     }
>
>     private static Stream<String> grep(String file, String regex) throws IOException
>     {
>         return grep(file, Pattern.compile(regex));
>     }
>
>     private static Stream<String> grep(String file, Pattern regex) throws IOException
>     {
>         BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8));
>         Iterator<String> it = new AbstractIterator<String>()
>         {
>             protected String computeNext()
>             {
>                 try
>                 {
>                     String s;
>                     while ((s = reader.readLine()) != null)
>                     {
>                         Matcher m = regex.matcher(s);
>                         if (m.find())
>                             return s;
>                     }
>                     reader.close();
>                     return endOfData();
>                 }
>                 catch (IOException e)
>                 {
>                     Closeables.closeQuietly(reader);
>                     throw new UncheckedIOException(e);
>                 }
>             }
>         };
>         return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
>     }
> }
> {code}
> And
> {code}
> @Test
>     public void test() throws IOException
>     {
>         try (final Cluster cluster = init(Cluster.build(1).start()))
>         {
>             String tag = System.getProperty("cassandra.testtag", "cassandra.testtag_IS_UNDEFINED");
>             String suite = System.getProperty("suitename", "suitename_IS_UNDEFINED");
>             //TODO missing way to get node id
>             //cluster.get(1);
>             String log =
[jira] [Updated] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-16063: Status: In Progress (was: Patch Available) > Fix user experience when upgrading to 4.0 with compact tables > - > > Key: CASSANDRA-16063 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16063 > Project: Cassandra > Issue Type: Bug > Components: Legacy/CQL >Reporter: Sylvain Lebresne >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > > The code to handle compact tables has been removed from 4.0, and the intended > upgrade path to 4.0 for users having compact tables on 3.x is that they must > execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables > *before* attempting the upgrade. > Obviously, some users won't read the upgrade instructions (or miss a table) > and may try upgrading despite still having compact tables. If they do so, the > intent is that the node will _not_ start, with a message clearly indicating > the pre-upgrade step the user has missed. The user will then downgrade back > the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and > then upgrade again. > But while 4.0 does currently fail startup when finding any compact tables > with a decent message, I believe the check is done too late during startup. > Namely, that check is done as we read the tables schema, so within > [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241]. > But by then, we've _at least_ called > {{SystemKeyspace.persistLocalMetadata()}} and > {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, > and even possibly flush new {{na}} format sstables. As a result, a user > might not be able to seamlessly restart the node on 3.x (to drop compact > storage on the appropriate tables). 
> Basically, we should make sure the check for compact tables done at 4.0 > startup is done as a {{StartupCheck}}, before the node does anything. > We should also add a test for this (checking that if you try upgrading to 4.0 > with compact storage, you can downgrade back with no intervention whatsoever). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193653#comment-17193653 ] Ekaterina Dimitrova commented on CASSANDRA-16063: - As we talked on Slack, the startup checks happen after the [CommitLog.instance.start()|#L214] (where a new empty segment is created) but before processing the old ones. So with this patch we had to get rid of the empty segment. As suggested by you, I will double check today why as part of CASSANDRA-15295, [CommitLog.instance.start()|#L214] was set before the startup checks and not after them and try to work it out in favor of getting rid of the flag and guarantee downgrade is gonna be seamless for the user in all cases. > Fix user experience when upgrading to 4.0 with compact tables > - > > Key: CASSANDRA-16063 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16063 > Project: Cassandra > Issue Type: Bug > Components: Legacy/CQL >Reporter: Sylvain Lebresne >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > > The code to handle compact tables has been removed from 4.0, and the intended > upgrade path to 4.0 for users having compact tables on 3.x is that they must > execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables > *before* attempting the upgrade. > Obviously, some users won't read the upgrade instructions (or miss a table) > and may try upgrading despite still having compact tables. If they do so, the > intent is that the node will _not_ start, with a message clearly indicating > the pre-upgrade step the user has missed. The user will then downgrade back > the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and > then upgrade again. > But while 4.0 does currently fail startup when finding any compact tables > with a decent message, I believe the check is done too late during startup. 
> Namely, that check is done as we read the tables schema, so within > [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241]. > But by then, we've _at least_ called > {{SystemKeyspace.persistLocalMetadata()}} and > {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, > and even possibly flush new {{na}} format sstables. As a result, a user > might not be able to seamlessly restart the node on 3.x (to drop compact > storage on the appropriate tables). > Basically, we should make sure the check for compact tables done at 4.0 > startup is done as a {{StartupCheck}}, before the node does anything. > We should also add a test for this (checking that if you try upgrading to 4.0 > with compact storage, you can downgrade back with no intervention whatsoever). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
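The {{StartupCheck}} approach described in the ticket can be sketched as follows. This is a self-contained illustration, not the actual patch: the real {{org.apache.cassandra.service.StartupCheck}} interface and the way table schemas are read differ, and every name below is a hypothetical stand-in.

```java
import java.util.List;

// Illustrative sketch: run the compact-tables check as a startup check, before the
// commit log is started or system keyspaces are written, so a failed check leaves
// nothing behind that would block a downgrade to 3.x.
public class CompactTablesCheckSketch
{
    // Minimal stand-in for Cassandra's startup-check contract.
    interface StartupCheck
    {
        void execute() throws IllegalStateException;
    }

    // Hypothetical summary of a table read from the serialized schema,
    // cheap enough to inspect before any other startup work happens.
    record TableInfo(String keyspace, String table, boolean compact) {}

    static StartupCheck checkForCompactTables(List<TableInfo> tables)
    {
        return () ->
        {
            long compact = tables.stream().filter(TableInfo::compact).count();
            if (compact > 0)
                throw new IllegalStateException(
                    compact + " compact table(s) found; run ALTER ... DROP COMPACT STORAGE on 3.x before upgrading");
        };
    }
}
```

The key property is that the check only reads; a node failing it has made no writes, so restarting on 3.x needs no intervention.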
[jira] [Updated] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-14801: - Status: Review In Progress (was: Changes Suggested) > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
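The failure mode described in CASSANDRA-14801 can be illustrated with a small sketch. The types below are hypothetical stand-ins, not Cassandra's real {{Replica}}/{{RangesAtEndpoint}} classes: with full replica objects, two adjacent range movements can leave the same endpoint pending twice for one token, whereas the old narrowing to a {{Set}} of endpoints deduplicated silently.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the regression: duplicate pending endpoints survive when
// full replica objects are kept instead of a set of endpoint addresses.
public class PendingRangesSketch
{
    record Replica(String endpoint, String range, boolean full) {}

    // Two adjacent movements both make node "A" pending for the same token.
    static List<Replica> pendingReplicas()
    {
        return List.of(new Replica("A", "(0,100]", true),
                       new Replica("A", "(100,200]", true));
    }

    // The pre-4.0 behaviour: narrowing to endpoints collapses the duplicate.
    static Set<String> narrowedEndpoints(List<Replica> replicas)
    {
        Set<String> endpoints = new HashSet<>();
        for (Replica r : replicas)
            endpoints.add(r.endpoint());
        return endpoints;
    }
}
```

Constructing an endpoints-for-token view that forbids duplicate endpoints then fails on the first list but not the second, which is the cluster-wide write failure the ticket describes.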
[jira] [Commented] (CASSANDRA-16066) Update and rework cassandra-website material to work with Antora
[ https://issues.apache.org/jira/browse/CASSANDRA-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193632#comment-17193632 ] Anthony Grasso commented on CASSANDRA-16066: [~mck] and I have had a chat to summarise what we think the goals are for the tooling of the website and in-tree docs. What we came up with so far is as follows. Goals immediate and long term: * Tooling can generate entire website including in-tree docs and push to {{cassandra-website}} staging branch. * Tooling provides a preview mode for (in-tree) docs when editing. * Address legalities of conflicting licenses (unable to store Antora's UI bundle and generation in-tree). * Generated content is never committed to the master branch of the {{cassandra-website}} (relevant to Cassandra releases). * Continue to maintain staging branch and live branch in {{cassandra-website}}. * Cassandra website CI only triggers if there are changes in the website repository or when Cassandra in-tree docs change (can be made to listen for changes in multiple repositories). * Ability to generate in-tree docs without using Antora so it can be bundled in with a Cassandra release (Simple AsciiDoc build as a starting point). > Update and rework cassandra-website material to work with Antora > > > Key: CASSANDRA-16066 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16066 > Project: Cassandra > Issue Type: Task > Components: Documentation/Website >Reporter: Anthony Grasso >Priority: Normal > Attachments: image-2020-09-05-13-24-13-606.png, > image-2020-09-06-07-17-14-705.png > > > *We want to modernise the way the website is built* > Rework the cassandra-website repository to generate a UI bundle containing > resources that Antora will use when generating the Cassandra documents and > website. > *The existing method is starting to become dated* > The documentation and website templates are currently in markdown format. 
> Sphinx is used to generate the Cassandra documentation and Jekyll generates > the website content. One of the major issues with the existing website > tooling is that the live preview server (render site as it is being updated) > is really slow. There is a preview server that is really fast, however it is > unable to detect changes to the content and render automatically. > *We are migrating the docs to be rendered with Antora* > The work in CASSANDRA-16029 is converting the document templates to AsciiDoc > format. Sphinx is being replaced by Antora to generate the documentation > content. This change has two advantages: > * More flexibility if the Apache Cassandra documentation look and feel needs > to be updated or redesigned. > * More modern look and feel to the documentation. > *We can use Antora to generate the website as well* > Antora could also be used to generate the Cassandra website content. As > suggested on the [mailing > list|https://www.mail-archive.com/dev@cassandra.apache.org/msg15577.html] > this would require the existing markdown templates to be converted to > AsciiDoc as well. > *Antora needs a UI bundle to style content* > For Antora to generate the document content and potentially the website > content it requires a UI bundle (ui-bundle.zip). The UI bundle contains the > HTML templates (layouts, partials, and helpers), CSS, JavaScript, fonts, and > (site-wide) images. As such, it provides both the visual theme and user > interactions for the documentation. Effectively the UI bundle is the > templates and styling that are applied to the documentation and website > content. > *The [cassandra-website|https://github.com/apache/cassandra-website] > repository can be used to generate the UI bundle* > All the resources associated with templating and styling the documentation > and website can be placed in the > [cassandra-website|https://github.com/apache/cassandra-website] repository. 
> In this case the repository would serve two purposes: > * Generation of the UI bundle resources. > * Serve the production website content. > *The [cassandra|https://github.com/apache/cassandra] repository would contain > the documentation material, while the [cassandra-website|https://github.com/apache/cassandra-website] repository would contain the rest of the non-versioned pages* > * All other material that is used to generate documentation would live in > the [cassandra|https://github.com/apache/cassandra] repository. In this case > Antora would run on the > [cassandra/doc|https://github.com/apache/cassandra/doc] directory and pull in > a UI bundle released on the GitHub > [cassandra-website|https://github.com/apache/cassandra-website] repository. > The content generated by Antora using the site.yml file located in this repo > can be used to preview the docs for review. > * All other non-versioned material,
[jira] [Updated] (CASSANDRA-15393) Add byte array backed cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-15393: Reviewers: Caleb Rackliffe, Marcus Eriksson (was: Caleb Rackliffe, Marcus Eriksson, Sam Tunnicliffe) > Add byte array backed cells > --- > > Key: CASSANDRA-15393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > Time Spent: 20m > Remaining Estimate: 0h > > We currently materialize all values as on heap byte buffers. Byte buffers > have a fairly high overhead given how frequently they’re used, and on the > compaction and local read path we don’t do anything that needs them. Use of > byte buffer methods only happens on the coordinator. Using cells that are > backed by byte arrays instead in these situations reduces compaction and read > garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15393) Add byte array backed cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193628#comment-17193628 ] Marcus Eriksson commented on CASSANDRA-15393: - +1 assuming clean test run and Calebs last nits fixed > Add byte array backed cells > --- > > Key: CASSANDRA-15393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > Time Spent: 20m > Remaining Estimate: 0h > > We currently materialize all values as on heap byte buffers. Byte buffers > have a fairly high overhead given how frequently they’re used, and on the > compaction and local read path we don’t do anything that needs them. Use of > byte buffer methods only happens on the coordinator. Using cells that are > backed by byte arrays instead in these situations reduces compaction and read > garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
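The idea behind byte-array-backed cells, abstracting how a cell's value is accessed so the compaction and local read path can use plain {{byte[]}} while the coordinator keeps {{ByteBuffer}}, can be sketched roughly as below. Cassandra's real abstraction ({{ValueAccessor<V>}} and backing cell classes) is much richer; this is only an illustration of the design choice, not the committed API.

```java
import java.nio.ByteBuffer;

// Illustrative sketch: one accessor interface, two backings. The byte[] backing
// avoids the ByteBuffer wrapper object and its position/limit bookkeeping, which
// is where the compaction/read-path garbage savings come from.
public class CellValueSketch
{
    interface ValueAccessor<V>
    {
        int size(V value);
        byte[] toArray(V value);
    }

    static final ValueAccessor<byte[]> ARRAY = new ValueAccessor<>()
    {
        public int size(byte[] v) { return v.length; }
        public byte[] toArray(byte[] v) { return v; } // no copy, no wrapper garbage
    };

    static final ValueAccessor<ByteBuffer> BUFFER = new ValueAccessor<>()
    {
        public int size(ByteBuffer v) { return v.remaining(); }
        public byte[] toArray(ByteBuffer v)
        {
            byte[] out = new byte[v.remaining()];
            v.duplicate().get(out); // duplicate() so the caller's position is untouched
            return out;
        }
    };
}
```

Code written against the accessor is agnostic to the backing, so only the coordinator path needs to pay the buffer cost.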
[jira] [Updated] (CASSANDRA-16121) Circleci should run cqlshlib tests as well
[ https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-16121: Bug Category: Parent values: Correctness(12982) Complexity: Normal Discovered By: User Report Severity: Normal Status: Open (was: Triage Needed) > Circleci should run cqlshlib tests as well > -- > > Key: CASSANDRA-16121 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16121 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > Currently circleci is not running cqlshlib tests. This resulted in some bugs > not being caught before committing. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16121) Circleci should run cqlshlib tests as well
[ https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-16121: Fix Version/s: 4.0-beta > Circleci should run cqlshlib tests as well > -- > > Key: CASSANDRA-16121 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16121 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > Currently circleci is not running cqlshlib tests. This resulted in some bugs > not being caught before committing. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16121) Circleci should run cqlshlib tests as well
Berenguer Blasi created CASSANDRA-16121: --- Summary: Circleci should run cqlshlib tests as well Key: CASSANDRA-16121 URL: https://issues.apache.org/jira/browse/CASSANDRA-16121 Project: Cassandra Issue Type: Bug Components: Test/unit Reporter: Berenguer Blasi Assignee: Berenguer Blasi Currently circleci is not running cqlshlib tests. This resulted in some bugs not being caught before committing. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15406) Show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193502#comment-17193502 ] Benjamin Lerer commented on CASSANDRA-15406: Some of the SSTable components are not immutable. The statistics component contains the sstable level, which can change. The index summary can also change when an index summary redistribution is performed. The redistribution resizes the index summaries in order to give more memory to hot sstables and less memory to cold sstables. > Show the progress of data streaming and index build > > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0, 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > I found that we should supply a command to show the progress of streaming > when we do the operation of bootstrap/move/decommission/removenode. When doing data streaming, nobody knows which step the program is in, so I think a command to show the joining/leaving node's progress is needed. > > PR [https://github.com/apache/cassandra/pull/558] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
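The index summary redistribution mentioned in the comment above is essentially downsampling: a cold sstable's summary keeps only every Nth partition-index entry, trading lookup precision for memory. The sketch below is illustrative only; Cassandra's real {{IndexSummaryRedistribution}} operates on serialized summaries, not a simple list.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of why an index summary is mutable even though the sstable
// data file is not: redistribution can rebuild a smaller summary in place.
public class SummaryDownsampleSketch
{
    // Keep every `factor`-th summary entry (entry = offset into the partition index).
    static List<Long> downsample(List<Long> summaryOffsets, int factor)
    {
        List<Long> smaller = new ArrayList<>();
        for (int i = 0; i < summaryOffsets.size(); i += factor)
            smaller.add(summaryOffsets.get(i));
        return smaller;
    }
}
```

A lookup against the smaller summary still works; it just scans a longer stretch of the partition index, which is acceptable for cold sstables.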
[jira] [Updated] (CASSANDRA-13935) Indexes and UDTs creation should have IF NOT EXISTS on its String representation
[ https://issues.apache.org/jira/browse/CASSANDRA-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-13935: -- Reviewers: Andres de la Peña, Benjamin Lerer (was: Benjamin Lerer) > Indexes and UDTs creation should have IF NOT EXISTS on its String > representation > > > Key: CASSANDRA-13935 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13935 > Project: Cassandra > Issue Type: Bug > Components: Feature/2i Index, Legacy/CQL > Environment: Ubuntu 16.04.2 LTS > java version "1.8.0_144" > Java(TM) SE Runtime Environment (build 1.8.0_144-b01) > Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode) >Reporter: Javier Canillas >Assignee: Stefan Miklosovic >Priority: Low > Fix For: 4.0-beta > > Attachments: 13935-3.0.txt, 13935-3.11.txt, 13935-trunk.txt > > Time Spent: 20m > Remaining Estimate: 0h > > I came across something that bothers me a lot. I'm using snapshots to backup > data from my Cassandra cluster in case something really bad happens (like > dropping a table or a keyspace). > Exercising the recovery actions from those backups, I discover that the > schema put on the file "schema.cql" as a result of the snapshot has the > "CREATE IF NOT EXISTS" for the table, but not for the indexes. > When restoring from snapshots, and relying on the execution of these schemas > to build up the table structure, everything seems fine for tables without > secondary indexes, but for the ones that make use of them, the execution of > these statements fail miserably. 
> Here I paste a generated schema.cql content for a table with indexes: > CREATE TABLE IF NOT EXISTS keyspace1.table1 ( > id text PRIMARY KEY, > content text, > last_update_date date, > last_update_date_time timestamp) > WITH ID = f1045fc0-2f59-11e7-95ec-295c3c064920 > AND bloom_filter_fp_chance = 0.01 > AND dclocal_read_repair_chance = 0.1 > AND crc_check_chance = 1.0 > AND default_time_to_live = 864 > AND gc_grace_seconds = 864000 > AND min_index_interval = 128 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE' > AND caching = { 'keys': 'NONE', 'rows_per_partition': 'NONE' } > AND compaction = { 'max_threshold': '32', 'min_threshold': '4', > 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' } > AND compression = { 'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor' } > AND cdc = false > AND extensions = { }; > CREATE INDEX table1_last_update_date_idx ON keyspace1.table1 > (last_update_date); > I think the last part should be: > CREATE INDEX IF NOT EXISTS table1_last_update_date_idx ON keyspace1.table1 > (last_update_date); > // edit by Stefan Miklosovic > PR: https://github.com/apache/cassandra/pull/731 > I have added UDTs as part of this patch as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15406) Show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193488#comment-17193488 ] Stefan Miklosovic commented on CASSANDRA-15406: --- [~blerer] interesting, I was under the impression that an SSTable is an immutable construct. How can the reported size differ over time? > Show the progress of data streaming and index build > > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0, 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > I found that we should supply a command to show the progress of streaming > when we do the operation of bootstrap/move/decommission/removenode. When doing data streaming, nobody knows which step the program is in, so I think a command to show the joining/leaving node's progress is needed. > > PR [https://github.com/apache/cassandra/pull/558] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15902) OOM because repair session thread not closed when terminating repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swen Fuhrmann updated CASSANDRA-15902: -- Test and Documentation Plan: * Add unit test exposing the issue * For trunk, add only regression test as unit test Status: Patch Available (was: Open) > OOM because repair session thread not closed when terminating repair > > > Key: CASSANDRA-15902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15902 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Swen Fuhrmann >Assignee: Swen Fuhrmann >Priority: Normal > Fix For: 3.0.x, 3.11.x > > Attachments: heap-mem-histo.txt, repair-terminated.txt > > > In our cluster, after a while some nodes slowly run out of memory. On > those nodes we observed that Cassandra Reaper terminates repairs with a JMX > call to {{StorageServiceMBean.forceTerminateAllRepairSessions()}} because > the timeout of 30 min is reached. > In the memory heap dump we see a lot of instances of > {{io.netty.util.concurrent.FastThreadLocalThread}} occupying most of the memory: > {noformat} > 119 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by > "sun.misc.Launcher$AppClassLoader @ 0x51a80" occupy 8.445.684.480 (93,96 > %) bytes. 
{noformat} > In the thread dump we see a lot of repair threads: > {noformat} > grep "Repair#" threaddump.txt | wc -l > 50 {noformat} > > The repair jobs are waiting for the validation to finish: > {noformat} > "Repair#152:1" #96170 daemon prio=5 os_prio=0 tid=0x12fc5000 > nid=0x542a waiting on condition [0x7f81ee414000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x0007939bcfc8> (a > com.google.common.util.concurrent.AbstractFuture$Sync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > at > com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:137) > at > com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1509) > at org.apache.cassandra.repair.RepairJob.run(RepairJob.java:160) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) > at > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$13/480490520.run(Unknown > Source) > at java.lang.Thread.run(Thread.java:748) {noformat} > > That's the line where the threads are stuck: > {noformat} > // Wait for validation to complete > Futures.getUnchecked(validations); {noformat} > > The call to 
{{StorageServiceMBean.forceTerminateAllRepairSessions()}} stops > the thread pool executor. It looks like futures which are in progress > will therefore never be completed, so the repair thread waits forever and > never finishes. > > Environment: > Cassandra version: 3.11.4 and 3.11.6 > Cassandra Reaper: 1.4.0 > JVM memory settings: > {noformat} > -Xms11771M -Xmx11771M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 > -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat} > on another cluster with the same issue: > {noformat} > -Xms31744M -Xmx31744M -XX:+UseG1GC -XX:MaxGCPauseMillis=100 > -XX:+ParallelRefProcEnabled -XX:MaxMetaspaceSize=100M {noformat} > Java Runtime: > {noformat} > openjdk version "1.8.0_212" > OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03) > OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode) > {noformat} > > The same issue is described in this comment: > https://issues.apache.org/jira/browse/CASSANDRA-14355?focusedCommentId=16992973&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16992973 > As suggested in the comments I created this new specific ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005)
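The hang described above (a repair thread parked in {{Futures.getUnchecked(...)}} on a validation future that is never completed once its executor is stopped) can be illustrated with a small standalone sketch. This is not the actual Cassandra repair code: the executor, task, latch, and timeout below are hypothetical stand-ins. It shows the safer pattern of interrupting running work on shutdown and waiting with a bound, so the waiter observes the termination instead of parking forever:

```java
import java.util.concurrent.*;

public class RepairShutdownSketch {
    // Returns a short label describing what the waiting "repair thread" observes.
    public static String awaitValidation() throws InterruptedException {
        ExecutorService validationExecutor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);

        Future<String> validation = validationExecutor.submit(() -> {
            started.countDown();
            Thread.sleep(10_000); // stand-in for a long-running validation
            return "done";
        });

        started.await(); // make sure the task is actually running before we terminate

        // Terminating: shutdownNow() interrupts the running task, so the future
        // completes exceptionally instead of leaving its waiters parked forever.
        validationExecutor.shutdownNow();

        try {
            // Bounded wait instead of an unbounded Futures.getUnchecked(...)
            return validation.get(1, TimeUnit.SECONDS);
        } catch (ExecutionException | CancellationException e) {
            return "terminated";
        } catch (TimeoutException e) {
            return "timeout";
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(awaitValidation()); // prints "terminated"
    }
}
```

The eventual fix in the ticket may instead complete or cancel the outstanding validation futures when a session is force-terminated; the sketch only demonstrates why an unbounded wait on a future owned by a stopped executor never returns.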
[jira] [Updated] (CASSANDRA-15902) OOM because repair session thread not closed when terminating repair
[ https://issues.apache.org/jira/browse/CASSANDRA-15902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-15902: Bug Category: Parent values: Degradation(12984) Level 1 values: Resource Management(12995) Complexity: Normal Discovered By: User Report Severity: Normal Status: Open (was: Triage Needed) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CASSANDRA-16119) MockSchema's SSTableReader creation leaks FileHandle and Channel instances
[ https://issues.apache.org/jira/browse/CASSANDRA-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193459#comment-17193459 ] Marcus Eriksson edited comment on CASSANDRA-16119 at 9/10/20, 8:28 AM: --- lgtm, +1 was (Author: krummas): lgtm > MockSchema's SSTableReader creation leaks FileHandle and Channel instances > -- > > Key: CASSANDRA-16119 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16119 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Fix For: 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > {{MockSchema}} creates {{SSTableReader}} instances for testing, but when it > does, it doesn’t seem to ever close the {{FileHandle}} and {{Channel}} > instances from which copies are made for the actual readers. ({{FileHandle}} > itself also internally copies the channel on creation.) This can trigger leak > detection, although perhaps not reliably, from tests like > {{AntiCompactionTest}}. A couple well-placed {{try-with-resources}} blocks > should help us avoid this (and shouldn't risk closing anything too early, > since the close methods for handles and channels seem only to do reference > bookkeeping anyway). 
> Example: > {noformat} > [junit-timeout] ERROR 16:35:47,747 LEAK DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@487c0fdb) to class > org.apache.cassandra.io.util.FileHandle$Cleanup@2072030898:/var/folders/4d/zfjs7m7s6x5_l93k33r5k668gn/T/mocksegmentedfile0tmp > was not released before the reference was garbage collected > [junit-timeout] ERROR 16:35:47,747 Allocate trace > org.apache.cassandra.utils.concurrent.Ref$State@487c0fdb: > [junit-timeout] Thread[main,5,main] > [junit-timeout] at java.lang.Thread.getStackTrace(Thread.java:1559) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref$Debug.<init>(Ref.java:249) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref$State.<init>(Ref.java:179) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.Ref.<init>(Ref.java:101) > [junit-timeout] at > org.apache.cassandra.utils.concurrent.SharedCloseableImpl.<init>(SharedCloseableImpl.java:30) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle.<init>(FileHandle.java:74) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle.<init>(FileHandle.java:50) > [junit-timeout] at > org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:389) > [junit-timeout] at > org.apache.cassandra.schema.MockSchema.sstable(MockSchema.java:124) > [junit-timeout] at > org.apache.cassandra.schema.MockSchema.sstable(MockSchema.java:83) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
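The try-with-resources shape of the fix suggested in the ticket can be sketched with plain JDK types (a `FileChannel` on a temp file stands in for Cassandra's `FileHandle`/`Channel`, whose close methods only do reference bookkeeping):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TryWithResourcesSketch {
    // Opens a channel, uses it, and relies on try-with-resources to close it
    // even if the body throws -- the shape of the fix proposed for MockSchema.
    public static long sizeAndClose(Path p) throws IOException {
        try (FileChannel channel = FileChannel.open(p, StandardOpenOption.READ)) {
            return channel.size();
        } // channel.close() runs here on every path, so no handle is leaked
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mocksegmentedfile", "tmp");
        Files.write(tmp, new byte[] {1, 2, 3});
        System.out.println(sizeAndClose(tmp)); // prints 3
        Files.delete(tmp);
    }
}
```

Because closing only decrements a reference count in the real code, closing the intermediate handle and channel after the reader has copied them does not invalidate the copies, which is why the ticket notes there is no risk of closing anything too early.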
[jira] [Updated] (CASSANDRA-16119) MockSchema's SSTableReader creation leaks FileHandle and Channel instances
[ https://issues.apache.org/jira/browse/CASSANDRA-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-16119: Reviewers: Marcus Eriksson (was: Marcus Eriksson) Status: Review In Progress (was: Patch Available) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CASSANDRA-16119) MockSchema's SSTableReader creation leaks FileHandle and Channel instances
[ https://issues.apache.org/jira/browse/CASSANDRA-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-16119: Status: Ready to Commit (was: Review In Progress) lgtm -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CASSANDRA-15406) Show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193448#comment-17193448 ] Benjamin Lerer commented on CASSANDRA-15406: [~stefan.miklosovic] the changes in CASSANDRA-15861 should have improved the situation on the server side. The size might not always be accurate for zero copy streaming, if the sstable was modified between the streaming plan creation and the sstable transfer, but hopefully that should not happen too often. > Show the progress of data streaming and index build > > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0, 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > I found that we should supply a command to show the progress of streaming > when we do the operation of bootstrap/move/decommission/removenode. When > doing data streaming, nobody knows which step the program is in, so I think > a command to show the joining/leaving node's progress is needed. > > PR [https://github.com/apache/cassandra/pull/558] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13935) Indexes and UDTs creation should have IF NOT EXISTS on its String representation
[ https://issues.apache.org/jira/browse/CASSANDRA-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193445#comment-17193445 ] Benjamin Lerer commented on CASSANDRA-13935: CI results look good. +1 on my side > Indexes and UDTs creation should have IF NOT EXISTS on its String > representation > > > Key: CASSANDRA-13935 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13935 > Project: Cassandra > Issue Type: Bug > Components: Feature/2i Index, Legacy/CQL > Environment: Ubuntu 16.04.2 LTS > java version "1.8.0_144" > Java(TM) SE Runtime Environment (build 1.8.0_144-b01) > Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode) >Reporter: Javier Canillas >Assignee: Stefan Miklosovic >Priority: Low > Fix For: 4.0-beta > > Attachments: 13935-3.0.txt, 13935-3.11.txt, 13935-trunk.txt > > Time Spent: 20m > Remaining Estimate: 0h > > I came across something that bothers me a lot. I'm using snapshots to back up > data from my Cassandra cluster in case something really bad happens (like > dropping a table or a keyspace). > Exercising the recovery actions from those backups, I discovered that the > schema put in the file "schema.cql" as a result of the snapshot has the > "CREATE IF NOT EXISTS" for the table, but not for the indexes. > When restoring from snapshots, and relying on the execution of these schemas > to build up the table structure, everything seems fine for tables without > secondary indexes, but for the ones that make use of them, the execution of > these statements fails miserably. 
> Here I paste a generated schema.cql content for a table with indexes: > CREATE TABLE IF NOT EXISTS keyspace1.table1 ( > id text PRIMARY KEY, > content text, > last_update_date date, > last_update_date_time timestamp) > WITH ID = f1045fc0-2f59-11e7-95ec-295c3c064920 > AND bloom_filter_fp_chance = 0.01 > AND dclocal_read_repair_chance = 0.1 > AND crc_check_chance = 1.0 > AND default_time_to_live = 864 > AND gc_grace_seconds = 864000 > AND min_index_interval = 128 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE' > AND caching = { 'keys': 'NONE', 'rows_per_partition': 'NONE' } > AND compaction = { 'max_threshold': '32', 'min_threshold': '4', > 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' } > AND compression = { 'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor' } > AND cdc = false > AND extensions = { }; > CREATE INDEX table1_last_update_date_idx ON keyspace1.table1 > (last_update_date); > I think the last part should be: > CREATE INDEX IF NOT EXISTS table1_last_update_date_idx ON keyspace1.table1 > (last_update_date); > // edit by Stefan Miklosovic > PR: https://github.com/apache/cassandra/pull/731 > I have added UDTs as part of this patch as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org