[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead
[ https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812971#comment-16812971 ] Ariel Weisberg commented on CASSANDRA-15059: It will and it won't fix it. We will know really quickly if it doesn't work, because it's going to fail whatever it was doing. By learning the exceptions we will be able to evaluate whether they are correct. Major changes like preconditions would be 4.0 only, so if we break it we at least won't be breaking it in production anywhere. Making the API less fragile also helps reduce the surface area for people to use Gossiper incorrectly. I don't see why we shouldn't do that. > Gossiper#markAlive can race with Gossiper#markDead > -- > > Key: CASSANDRA-15059 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15059 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip > Reporter: Blake Eggleston > Assignee: Blake Eggleston > Priority: Normal > > The Gossiper class is not threadsafe and assumes all state changes happen in > a single thread (the gossip stage). Gossiper#convict, however, can be called > from the GossipTasks thread. This creates a race where calls to > Gossiper#markAlive and Gossiper#markDead can interleave, corrupting gossip > state. Gossiper#assassinateEndpoint has a similar problem, being called from > the mbean server thread. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
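The precondition idea under discussion can be illustrated with a small, hypothetical sketch (the class and method names here are mine, not Cassandra's actual API): record which thread owns gossip state and throw immediately when any other thread tries to mutate it, so misuse fails loudly in tests rather than silently corrupting state.

```java
// Hypothetical sketch of a thread-ownership precondition; names are
// illustrative, not Cassandra's Gossiper API.
public class GossipStateGuard {
    private final Thread owner;

    public GossipStateGuard(Thread owner) {
        this.owner = owner;
    }

    // Throws immediately if called off the owning thread, so an
    // off-stage mutation fails fast instead of racing.
    public void checkOnOwnerThread() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException(
                "gossip state mutated from " + Thread.currentThread().getName()
                + ", expected " + owner.getName());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GossipStateGuard guard = new GossipStateGuard(Thread.currentThread());
        guard.checkOnOwnerThread(); // fine on the owner thread

        final boolean[] rejected = {false};
        Thread other = new Thread(() -> {
            try {
                guard.checkOnOwnerThread();
            } catch (IllegalStateException e) {
                rejected[0] = true;
            }
        });
        other.start();
        other.join();
        System.out.println("off-thread mutation rejected: " + rejected[0]);
    }
}
```

As Ariel notes, a check like this makes an incorrect caller fail whatever it was doing right away, which is exactly how the incorrect call sites get discovered.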
[jira] [Commented] (CASSANDRA-7544) Allow storage port to be configurable per node
[ https://issues.apache.org/jira/browse/CASSANDRA-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812968#comment-16812968 ] Ariel Weisberg commented on CASSANDRA-7544: --- [~keshavdv] there is a checked-in driver under lib that should have the argument. Do you have another version of the Cassandra driver in your environment that might be getting in the way? You are right it's not merged into the Datastax python driver. It's blocked on https://datastax-oss.atlassian.net/browse/JAVA-2105 > Allow storage port to be configurable per node > -- > > Key: CASSANDRA-7544 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7544 > Project: Cassandra > Issue Type: Improvement > Reporter: Sam Overton > Assignee: Ariel Weisberg > Priority: Normal > Fix For: 4.0 > > > Currently storage_port must be configured identically on all nodes in a > cluster and it is assumed that this is the case when connecting to a remote > node. > This prevents running in any environment that requires multiple nodes to be > able to bind to the same network interface, such as with many automatic > provisioning/deployment frameworks. > The current solutions seem to be > * use a separate network interface for each node deployed to the same box. > This puts a big requirement on IP allocation at large scale. > * allow multiple clusters to be provisioned from the same resource pool, but > restrict allocation to a maximum of one node per host from each cluster, > assuming each cluster is running on a different storage port. > It would make operations much simpler in these kinds of environments if the > environment provisioning the resources could assign the ports to be used when > bringing up a new node on shared hardware. > The changes required would be at least the following: > 1. configure seeds as IP:port instead of just IP > 2. gossip the storage port as part of a node's ApplicationState > 3. 
refer internally to nodes by hostID instead of IP, since there will be > multiple nodes with the same IP > (1) & (2) are mostly trivial and I already have a patch for these. The bulk > of the work to enable this is (3), and I would structure this as a separate > pre-requisite patch.
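Item (1) above, configuring seeds as IP:port instead of just IP, is mostly a parsing change. A hedged sketch of what that parsing might look like (the default port constant and all names are illustrative assumptions, not the actual patch):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class SeedParser {
    // Assumption for this sketch: fall back to the cluster-wide default
    // storage port when the seed entry carries no port of its own.
    static final int DEFAULT_STORAGE_PORT = 7000;

    // Parses "host" or "host:port" into a (host, port) pair.
    static Map.Entry<String, Integer> parseSeed(String seed) {
        int idx = seed.lastIndexOf(':');
        if (idx < 0 || seed.indexOf(':') != idx) {
            // No port, or an unbracketed IPv6 literal; use the default.
            return new SimpleEntry<>(seed, DEFAULT_STORAGE_PORT);
        }
        return new SimpleEntry<>(seed.substring(0, idx),
                                 Integer.parseInt(seed.substring(idx + 1)));
    }

    public static void main(String[] args) {
        System.out.println(parseSeed("10.0.0.1"));      // default port
        System.out.println(parseSeed("10.0.0.1:7001")); // explicit port
    }
}
```

The real work, as the description says, is item (3): once two nodes can share an IP, the IP alone no longer identifies a node, so internal bookkeeping has to key on hostID instead.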
[jira] [Commented] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812917#comment-16812917 ] Dinesh Joshi commented on CASSANDRA-15073: -- [~michaelsembwever] unfortunately this patch doesn't seem to be working for me. I tested this by downloading the latest NetBeans (Apache NetBeans IDE 11.0, Build incubator-netbeans-release-404-on-20190319). I get classpath errors, and the following error was in the notifications panel:
{code:java}
java.lang.ClassNotFoundException: org.netbeans.modules.groovy.editor.api.parser.GroovyLanguage
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:582)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:185)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)
    at org.netbeans.ProxyClassLoader.loadClass(ProxyClassLoader.java:197)
Caused: java.lang.ClassNotFoundException: org.netbeans.modules.groovy.editor.api.parser.GroovyLanguage starting from SystemClassLoader[531 modules] with possible defining loaders null and declared parents [org.netbeans.MainImpl$BootClassLoader@20322d26, org.netbeans.JarClassLoader@51f1b2be, ModuleCL@6c3b0979[org.netbeans.api.annotations.common], ModuleCL@6a8ebda8[org.openide.awt], ModuleCL@3c8c10a4[org.netbeans.api.progress], ModuleCL@2dbe4456[org.netbeans.api.progress.nb], ModuleCL@3ae0cf02[org.openide.dialogs], ModuleCL@423f11aa[org.openide.nodes], ModuleCL@37b552e8[org.openide.windows], ModuleCL@73ead5bc[org.netbeans.modules.editor.mimelookup], ...506 more]
    at org.netbeans.ProxyClassLoader.loadClass(ProxyClassLoader.java:199)
    at org.netbeans.ModuleManager$SystemClassLoader.loadClass(ModuleManager.java:769)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)
    at org.openide.loaders.InstanceSupport.findClass(InstanceSupport.java:477)
    at org.openide.loaders.InstanceSupport.instanceClass(InstanceSupport.java:123) at 
org.openide.loaders.InstanceDataObject$Ser.instanceClass(InstanceDataObject.java:1347)
    at org.openide.loaders.InstanceSupport.instanceCreate(InstanceSupport.java:189)
    at org.openide.loaders.InstanceDataObject$Ser.instanceCreate(InstanceDataObject.java:1417)
    at org.openide.loaders.InstanceDataObject.instanceCreate(InstanceDataObject.java:821)
[catch] at org.netbeans.modules.csl.spi.DefaultDataLoadersBridge.createInstance(DefaultDataLoadersBridge.java:120)
    at org.netbeans.modules.csl.core.Language.createInstance(Language.java:284)
    at org.netbeans.modules.csl.core.Language.getGsfLanguage(Language.java:223)
    at org.netbeans.modules.csl.core.Language.getIndexSearcher(Language.java:692)
    at org.netbeans.modules.csl.core.TypeAndSymbolProvider.compute(TypeAndSymbolProvider.java:152)
    at org.netbeans.modules.csl.core.TypeAndSymbolProvider$TypeProviderImpl.computeTypeNames(TypeAndSymbolProvider.java:75)
    at org.netbeans.modules.jumpto.type.GoToTypeAction$Worker.getTypeNames(GoToTypeAction.java:614)
    at org.netbeans.modules.jumpto.type.GoToTypeAction$Worker.run(GoToTypeAction.java:522)
    at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1418)
    at org.netbeans.modules.openide.util.GlobalLookup.execute(GlobalLookup.java:45)
    at org.openide.util.lookup.Lookups.executeWith(Lookups.java:278)
    at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2033)
{code}
Not sure what I am doing wrong here. When I ran ant, it built successfully. Another minor point is to update the instructions in {{ide.rst}}. It says there is no setup required for NetBeans, whereas we actually need to run {{ant}} prior to opening the project. A minor but important detail that needs to be added. 
> Apache NetBeans project files > - > > Key: CASSANDRA-15073 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15073 > Project: Cassandra > Issue Type: Task > Components: Build > Reporter: mck > Assignee: mck > Priority: Low > > Provide the necessary project files so as to be able to open the Cassandra project > in Apache NetBeans. > No additional project functionality is required beyond being able to edit the > project's source files. Building the project is still expected to be done via > `ant` on the command-line.
[jira] [Commented] (CASSANDRA-7544) Allow storage port to be configurable per node
[ https://issues.apache.org/jira/browse/CASSANDRA-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812873#comment-16812873 ] Keshav Varma commented on CASSANDRA-7544: - [~aweisberg], I see you added support for this in cqlsh at [https://github.com/apache/cassandra/blame/2d622e05d1576d20c3bf6504cbdaf438a895b4cf/bin/cqlsh.py#L490], but I can't find any corresponding commit in the published python-driver that makes this work. In the meantime, this ends up breaking cqlsh for me on trunk since that keyword argument doesn't exist ([https://github.com/datastax/python-driver/blob/4.x/cassandra/cluster.py#L707]). Is there a different fork or branch that includes the driver side changes that makes this work for you? > Allow storage port to be configurable per node > -- > > Key: CASSANDRA-7544 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7544 > Project: Cassandra > Issue Type: Improvement > Reporter: Sam Overton > Assignee: Ariel Weisberg > Priority: Normal > Fix For: 4.0 > 
[jira] [Updated] (CASSANDRA-15073) Apache NetBeans project files
[ https://issues.apache.org/jira/browse/CASSANDRA-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-15073: - Reviewers: Dinesh Joshi, Wade Chandler (was: Wade Chandler) > Apache NetBeans project files > - > > Key: CASSANDRA-15073 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15073 > 
[jira] [Commented] (CASSANDRA-15059) Gossiper#markAlive can race with Gossiper#markDead
[ https://issues.apache.org/jira/browse/CASSANDRA-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812769#comment-16812769 ] Blake Eggleston commented on CASSANDRA-15059: - {quote} Future wise this doesn't do anything to address the underlying fragility in how Gossiper doesn't document what is safe to call from outside the Gossip thread and what isn't. It also doesn't validate the correct thread is running a given method. {quote} I’ve been thinking about this a lot, and I think it would be safer if we didn’t do this. Adding some preconditions isn’t going to fix the underlying fragility of Gossiper. Given the “realities” of the Gossiper class, I think it would end up causing more harm than good. Just starting to pull on that thread reveals at least one situation where we modify gossip state out of the gossip stage that makes sense (on startup). There are probably one or two more (at least), and I’d hate to break a nodetool command or something. > Gossiper#markAlive can race with Gossiper#markDead > -- > > Key: CASSANDRA-15059 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15059 > 
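The single-writer assumption behind the ticket can be sketched in miniature (illustrative only; this is not Cassandra's actual Gossiper or stage machinery): if every state change is submitted to a single-threaded executor, markAlive and markDead cannot interleave no matter which thread requests them.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Minimal sketch of serializing state changes through a single-threaded
// stage, so markAlive/markDead cannot interleave even when requested
// from other threads (the GossipTasks or mbean threads in the ticket).
public class GossipStageSketch {
    private final ExecutorService gossipStage = Executors.newSingleThreadExecutor();
    boolean alive;     // only ever touched on the gossip stage
    int transitions;   // counts completed state flips

    public void markAlive() { gossipStage.execute(() -> { alive = true;  transitions++; }); }
    public void markDead()  { gossipStage.execute(() -> { alive = false; transitions++; }); }

    public void shutdownAndWait() throws InterruptedException {
        gossipStage.shutdown();
        gossipStage.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        GossipStageSketch g = new GossipStageSketch();
        // Simulate two threads racing to flip the state.
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) g.markAlive(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) g.markDead(); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        g.shutdownAndWait();
        // Every submitted transition ran exactly once, in some serial order.
        System.out.println("transitions = " + g.transitions);
    }
}
```

Blake's point stands against this picture too: the hard part is not the mechanism but the call sites that legitimately mutate state off the stage (for example at startup), which simple preconditions would break.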
[jira] [Updated] (CASSANDRA-15082) SASI SPARSE mode 5 limit
[ https://issues.apache.org/jira/browse/CASSANDRA-15082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated CASSANDRA-15082: Description: I do not know what the "improvement" should be here, but I ran into this: [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java#L585] Term '55.3' belongs to more than 5 keys in sparse mode, which is not allowed. The only reference I can find to the limit is here: [http://www.doanduyhai.com/blog/?p=2058] Why is it 5? Could it be a variable? Could it be an option when creating the table? Why or why not? This seems awkward. A user can insert more then 5 rows into a table, and it "works". IE you can write and you can query that table getting more than 5 results, but the index will not flush to disk. It throws an IOException. Maybe I am misunderstanding, but this seems impossible to support, if users insert the same value 5 times, the entire index will not flush to disk? was: I do not know what the "improvement" should be here, but I ran into this: [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java#L585] Term '55.3' belongs to more than 5 keys in sparse mode, which is not allowed. The only reference I can find to the limit is here: [http://www.doanduyhai.com/blog/?p=2058] Why is it 5? Could it be a variable? Could it be an option when creating the table? Why or why not? This seems awkward. A usercan insert more then 5 rows into a table, and it "works". IE you can write an you can query, but the index will not flush to disk. It throws an IOException. I Maybe I am misunderstanding, but this seems impossible to support, if users insert the same value 5 times, the entire index will not flush to disk? 
> SASI SPARSE mode 5 limit > > > Key: CASSANDRA-15082 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15082 > 
[jira] [Updated] (CASSANDRA-15082) SASI SPARSE mode 5 limit
[ https://issues.apache.org/jira/browse/CASSANDRA-15082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated CASSANDRA-15082: Description: I do not know what the "improvement" should be here, but I ran into this: [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java#L585] Term '55.3' belongs to more than 5 keys in sparse mode, which is not allowed. The only reference I can find to the limit is here: [http://www.doanduyhai.com/blog/?p=2058] Why is it 5? Could it be a variable? Could it be an option when creating the table? Why or why not? This seems awkward. A user can insert more then 5 rows into a table, and it "works". IE you can write and you can query that table getting more than 5 results, but the index will not flush to disk. It throws an IOException. Maybe I am misunderstanding, but this seems impossible to support, if users inserts the same value 5 times, the entire index will not flush to disk? was: I do not know what the "improvement" should be here, but I ran into this: [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java#L585] Term '55.3' belongs to more than 5 keys in sparse mode, which is not allowed. The only reference I can find to the limit is here: [http://www.doanduyhai.com/blog/?p=2058] Why is it 5? Could it be a variable? Could it be an option when creating the table? Why or why not? This seems awkward. A user can insert more then 5 rows into a table, and it "works". IE you can write and you can query that table getting more than 5 results, but the index will not flush to disk. It throws an IOException. Maybe I am misunderstanding, but this seems impossible to support, if users insert the same value 5 times, the entire index will not flush to disk? 
> SASI SPARSE mode 5 limit > > > Key: CASSANDRA-15082 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15082 > 
[jira] [Created] (CASSANDRA-15082) SASI SPARSE mode 5 limit
Edward Capriolo created CASSANDRA-15082: --- Summary: SASI SPARSE mode 5 limit Key: CASSANDRA-15082 URL: https://issues.apache.org/jira/browse/CASSANDRA-15082 Project: Cassandra Issue Type: Improvement Reporter: Edward Capriolo I do not know what the "improvement" should be here, but I ran into this: [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java#L585] Term '55.3' belongs to more than 5 keys in sparse mode, which is not allowed. The only reference I can find to the limit is here: [http://www.doanduyhai.com/blog/?p=2058] Why is it 5? Could it be a variable? Could it be an option when creating the table? Why or why not? This seems awkward. A user can insert more than 5 rows into a table, and it "works". IE you can write and you can query, but the index will not flush to disk. It throws an IOException. Maybe I am misunderstanding, but this seems impossible to support; if users insert the same value 5 times, the entire index will not flush to disk?
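The behavior described, where writes succeed but the index later refuses to flush, corresponds to a per-term key-count check performed at index-build time rather than at write time. A simplified, hypothetical sketch of that shape (the constant and all names are illustrative; this is not the real OnDiskIndexBuilder):

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SparseTermIndexSketch {
    // SPARSE mode assumes each term maps to very few keys; the 3.11
    // code rejects terms with more than 5 keys when building the index.
    static final int MAX_SPARSE_KEYS = 5;

    private final Map<String, Set<Long>> termToKeys = new HashMap<>();

    // Writes are accepted unconditionally — which is why the table
    // appears to "work" until the index tries to flush.
    void add(String term, long key) {
        termToKeys.computeIfAbsent(term, t -> new HashSet<>()).add(key);
    }

    // Flush-time validation: this is where the IOException surfaces.
    void flush() throws IOException {
        for (Map.Entry<String, Set<Long>> e : termToKeys.entrySet())
            if (e.getValue().size() > MAX_SPARSE_KEYS)
                throw new IOException("Term '" + e.getKey() + "' belongs to more than "
                        + MAX_SPARSE_KEYS + " keys in sparse mode, which is not allowed.");
    }

    public static void main(String[] args) {
        SparseTermIndexSketch index = new SparseTermIndexSketch();
        for (long k = 0; k < 6; k++)
            index.add("55.3", k); // six rows share the same indexed value
        try {
            index.flush();
        } catch (IOException e) {
            System.out.println("flush failed: " + e.getMessage());
        }
    }
}
```

This also makes the complaint concrete: because validation happens at flush, a single over-shared term poisons the whole flush rather than just the offending rows.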
[jira] [Issue Comment Deleted] (CASSANDRA-15007) Incorrect rf validation in SimpleStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-15007: --- Comment: was deleted (was: Good Morning, I have attached information for your attention. Zip password – 123456 Let me know if you have any further questions. Thank you. AvionTEq Trusted Test & Tooling Solutions www.avionteq.com Reply to: sa...@avionteq.com From: j...@apache.org Sent: Tue, 19 Feb 2019 00:37:00 + To: commits@cassandra.apache.org Subject: [jira] [Commented] (CASSANDRA-15007) Incorrect rf validation in SimpleStrategy [ https://issues.apache.org/jira/browse/CASSANDRA-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771450#comment-16771450 ] Dinesh Joshi commented on CASSANDRA-15007: -- Thanks, [~bdeggleston]! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org ) > Incorrect rf validation in SimpleStrategy > - > > Key: CASSANDRA-15007 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15007 > Project: Cassandra > Issue Type: Bug > Components: CQL/Semantics > Reporter: Michael > Assignee: Dinesh Joshi > Priority: Low > Fix For: 4.x > > Attachments: 15007.patch > > > Getting uninformative ConfigurationException when trying to create a keyspace > with SimpleStrategy and no replication factor. > {{cqlsh> create keyspace test with replication = \{'class': > 'SimpleStrategy'};}} > {{ConfigurationException:}}
[jira] [Updated] (CASSANDRA-15007) Incorrect rf validation in SimpleStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-15007: --- Attachment: (was: AvionTEq.zip) > Incorrect rf validation in SimpleStrategy > - > > Key: CASSANDRA-15007 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15007 > 
[jira] [Updated] (CASSANDRA-15078) Support cross version messaging in in-jvm upgrade dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-15078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-15078: Reviewers: Alex Petrov Status: Review In Progress (was: Patch Available) > Support cross version messaging in in-jvm upgrade dtests > > > Key: CASSANDRA-15078 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15078 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest > Reporter: Blake Eggleston > Assignee: Blake Eggleston > Priority: Normal > Fix For: 2.2.15, 3.0.19, 3.11.5, 4.0 > 
[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14654: - Test and Documentation Plan: TBD Status: Patch Available (was: Open) > Reduce heap pressure during compactions > --- > > Key: CASSANDRA-14654 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14654 > Project: Cassandra > Issue Type: Improvement > Components: Local/Compaction > Reporter: Chris Lohfink > Assignee: Chris Lohfink > Priority: Normal > Labels: Performance, pull-request-available > Fix For: 4.x > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png > > Time Spent: 40m > Remaining Estimate: 0h > > Small partition compactions are painfully slow, with a lot of overhead per > partition. There also tends to be an excess of objects created (i.e. > 200-700 MB/s) per compaction thread. > EncodingStats walks through all the partitions, and with mergeWith it will > create a new object per partition as it walks the potentially millions of > partitions. In a test scenario of roughly 600-byte partitions and a couple hundred MB > of data, this consumed ~16% of the heap pressure. Changing this to instead > mutably track the min values and create one object in an EncodingStats.Collector > brought this down considerably (but not 100%, since > UnfilteredRowIterator.stats() still creates one per partition). > KeyCacheKey makes a full copy of the underlying byte array in > ByteBufferUtil.getArray in its constructor. This is the dominating heap > pressure as there are more sstables. Changing this to just keep the > original completely eliminates the current dominator of the compactions > and also improves read performance. > A minor tweak is also included for operators, for when compactions are > behind on low-read clusters: it makes the preemptive opening setting a > hotprop. 
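The EncodingStats change described above contrasts two allocation patterns: merging immutable stats objects, one fresh allocation per merged partition, versus mutably tracking the minima in a collector and materializing a single object at the end. A simplified sketch of the pattern (not the real EncodingStats API; the fields here are illustrative):

```java
// Sketch of the allocation pattern discussed in the ticket: an
// immutable mergeWith allocates per partition, while a mutable
// collector tracks minima in place with no per-partition garbage.
public class StatsCollectorSketch {
    static final class Stats {
        final long minTimestamp;
        final int minTtl;
        Stats(long minTimestamp, int minTtl) { this.minTimestamp = minTimestamp; this.minTtl = minTtl; }

        // Immutable merge: one new Stats object per call.
        Stats mergeWith(Stats other) {
            return new Stats(Math.min(minTimestamp, other.minTimestamp),
                             Math.min(minTtl, other.minTtl));
        }
    }

    static final class Collector {
        private long minTimestamp = Long.MAX_VALUE;
        private int minTtl = Integer.MAX_VALUE;

        // Mutable update: no allocation per partition.
        void update(long timestamp, int ttl) {
            minTimestamp = Math.min(minTimestamp, timestamp);
            minTtl = Math.min(minTtl, ttl);
        }

        // Materialize a single Stats object once all partitions are seen.
        Stats get() { return new Stats(minTimestamp, minTtl); }
    }

    public static void main(String[] args) {
        Collector c = new Collector();
        long[][] partitions = { {100, 30}, {50, 60}, {200, 10} };
        for (long[] p : partitions)
            c.update(p[0], (int) p[1]);
        Stats s = c.get();
        System.out.println(s.minTimestamp + " " + s.minTtl);
    }
}
```

Across millions of partitions, the immutable style allocates millions of short-lived objects on the compaction thread, which is the heap pressure the ticket measures; the collector style allocates once.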
[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14654: - Status: Review In Progress (was: Patch Available) > Reduce heap pressure during compactions > --- > > Key: CASSANDRA-14654 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14654 > 
[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14654: - Status: Changes Suggested (was: Review In Progress) > Reduce heap pressure during compactions > --- > > Key: CASSANDRA-14654 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14654 > 
[jira] [Updated] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14654: - Status: Open (was: Ready to Commit) > Reduce heap pressure during compactions
[jira] [Updated] (CASSANDRA-14773) Overflow of 32-bit integer during compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14773: - Status: Review In Progress (was: Patch Available) > Overflow of 32-bit integer during compaction. > - > > Key: CASSANDRA-14773 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14773 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Vladimir Bukhtoyarov >Assignee: Vladimir Bukhtoyarov >Priority: Urgent > Fix For: 4.x > > > In the scope of CASSANDRA-13444, compaction was significantly improved from a > CPU and memory perspective. However, this improvement introduced a rounding > bug. When rounding an expiration time close to *Cell.MAX_DELETION_TIME* > (which is just *Integer.MAX_VALUE*), the math overflows (because, in the > scope of CASSANDRA-13444, the data type for a point was changed from Long to > Integer to reduce memory footprint). As a result, the point becomes negative > and acts as silent poison for internal structures of > StreamingTombstoneHistogramBuilder such as *DistanceHolder* and *DataHolder*. > Then, depending on the point intervals: > * The TombstoneHistogram produces wrong values when the interval of points is > less than binSize; this is not critical. > * Compaction crashes with ArrayIndexOutOfBoundsException if the number of > point intervals is greater than binSize; this case is very critical. > > This pull request [https://github.com/apache/cassandra/pull/273] reproduces > the issue and provides the fix.
> > The stack trace when running *testMathOverflowDuringRoundingOfLargeTimestamp* (on the codebase without the fix) without the -ea JVM flag: > {noformat}
java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$DistanceHolder.add(StreamingTombstoneHistogramBuilder.java:208)
at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushValue(StreamingTombstoneHistogramBuilder.java:140)
at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$$Lambda$1/1967205423.consume(Unknown Source)
at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$Spool.forEach(StreamingTombstoneHistogramBuilder.java:574)
at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushHistogram(StreamingTombstoneHistogramBuilder.java:124)
at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.build(StreamingTombstoneHistogramBuilder.java:184)
at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilderTest.testMathOverflowDuringRoundingOfLargeTimestamp(StreamingTombstoneHistogramBuilderTest.java:183)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
at org.junit.runner.JUnitCore.run(JUnitCore.java:159)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
{noformat}
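The overflow described in this ticket can be reproduced with a minimal sketch. The rounding formula and names below are hypothetical stand-ins for the histogram builder's internals, assuming a point is rounded up to a multiple of the bin size: computed in 32-bit arithmetic, a point near `Integer.MAX_VALUE` wraps negative; computed in `long` and clamped, it stays valid.

```java
// Hypothetical reduction of the CASSANDRA-14773 overflow (not the actual code).
public class RoundingOverflowDemo {
    static final int MAX_DELETION_TIME = Integer.MAX_VALUE;

    // Buggy: point + binSize - 1 overflows int near MAX_VALUE, so the
    // result wraps negative and "poisons" downstream structures.
    static int roundBuggy(int point, int binSize) {
        return ((point + binSize - 1) / binSize) * binSize;
    }

    // Fixed: do the arithmetic in long, then clamp back into the int domain.
    static int roundSafe(int point, int binSize) {
        long rounded = (((long) point + binSize - 1) / binSize) * (long) binSize;
        return (int) Math.min(rounded, MAX_DELETION_TIME);
    }

    public static void main(String[] args) {
        int nearMax = MAX_DELETION_TIME - 3;
        System.out.println("buggy: " + roundBuggy(nearMax, 100)); // negative (wrapped)
        System.out.println("fixed: " + roundSafe(nearMax, 100));  // clamped, non-negative
    }
}
```

The clamp to `MAX_DELETION_TIME` mirrors the general shape of the fix: values at or beyond the representable maximum must not be allowed to wrap.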
[jira] [Updated] (CASSANDRA-14773) Overflow of 32-bit integer during compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14773: - Status: Change Requested (was: Review In Progress) > Overflow of 32-bit integer during compaction.
[jira] [Updated] (CASSANDRA-14613) ant generate-idea-files / generate-eclipse-files needs update after CASSANDRA-9608
[ https://issues.apache.org/jira/browse/CASSANDRA-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14613: - Fix Version/s: (was: 4.x) 4.0 > ant generate-idea-files / generate-eclipse-files needs update after > CASSANDRA-9608 > -- > > Key: CASSANDRA-14613 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14613 > Project: Cassandra > Issue Type: Bug > Components: Build >Reporter: Marcus Eriksson >Assignee: Robert Stupp >Priority: Normal > Fix For: 4.0 > > > {{ide/idea-iml-file.xml}} looks hard-coded to include {{src/java11}} when > creating the project; this should probably detect which version we are > building for instead > cc [~snazy]
[jira] [Updated] (CASSANDRA-14613) ant generate-idea-files / generate-eclipse-files needs update after CASSANDRA-9608
[ https://issues.apache.org/jira/browse/CASSANDRA-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14613: - Resolution: Fixed Status: Resolved (was: Open) Closing as fixed, as the work has already been committed, and further work is superseded by CASSANDRA-14607 > ant generate-idea-files / generate-eclipse-files needs update after > CASSANDRA-9608
[jira] [Updated] (CASSANDRA-14613) ant generate-idea-files / generate-eclipse-files needs update after CASSANDRA-9608
[ https://issues.apache.org/jira/browse/CASSANDRA-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14613: - Status: Open (was: Patch Available) > ant generate-idea-files / generate-eclipse-files needs update after > CASSANDRA-9608
[jira] [Updated] (CASSANDRA-15013) Message Flusher queue can grow unbounded, potentially running JVM out of memory
[ https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-15013: - Status: Change Requested (was: Review In Progress) > Message Flusher queue can grow unbounded, potentially running JVM out of > memory > --- > > Key: CASSANDRA-15013 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15013 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Client >Reporter: Sumanth Pasupuleti >Assignee: Sumanth Pasupuleti >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 3.0.x, 3.11.x > > Attachments: BlockedEpollEventLoopFromHeapDump.png, > BlockedEpollEventLoopFromThreadDump.png, RequestExecutorQueueFull.png, heap > dump showing each ImmediateFlusher taking upto 600MB.png > > > This is a follow-up ticket out of CASSANDRA-14855, to make the Flusher queue > bounded: in the current state, items get added to the queue without any check > on queue size, nor any check of the netty outbound buffer's isWritable state. > We are seeing this issue hit our production 3.0 clusters quite often.
[jira] [Updated] (CASSANDRA-15013) Message Flusher queue can grow unbounded, potentially running JVM out of memory
[ https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-15013: - Status: Review In Progress (was: Patch Available) > Message Flusher queue can grow unbounded, potentially running JVM out of > memory
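The bounded-queue idea CASSANDRA-15013 proposes can be sketched minimally as follows. This is illustrative only, not the Flusher patch itself: the class and method names are hypothetical, and the real change would also need to consult netty's outbound-buffer `isWritable` state, which is omitted here. The key point is that `offer()` on a fixed-capacity queue rejects new work instead of growing the backlog without limit.

```java
// Hypothetical sketch of a bounded work queue (names are illustrative).
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedFlushQueueDemo {
    private final BlockingQueue<Runnable> queue;

    BoundedFlushQueueDemo(int capacity) {
        // Fixed capacity: the queue can never grow past this bound.
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /** @return false if the queue is full and the task was rejected. */
    boolean enqueue(Runnable flushTask) {
        return queue.offer(flushTask); // non-blocking; never grows unbounded
    }

    Runnable poll() {
        return queue.poll();
    }
}
```

A caller that receives `false` can then apply back-pressure (e.g. stop reading from the client connection) rather than letting heap usage climb until the JVM runs out of memory.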
[jira] [Comment Edited] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812510#comment-16812510 ] Benedict edited comment on CASSANDRA-14654 at 4/8/19 3:35 PM: -- Overall the patch looks pretty good. I've pushed some suggestions [here|https://github.com/belliottsmith/cassandra/tree/14654] for you to consider. # preemptive was misspelled (I think this was inherited from an old misspelling, but probably better to fix it before we expose over JMX) # Tweaked EncodingStats merging to be agnostic to source type, and also avoid unnecessary garbage if there's only one stats to merge # {{keyCacheEnabled}} and {{getMigrateKeycacheOnCompaction}} renamed to predicates _is_KeyCacheEnabled and _should_Migrate.. # Removed one unnecessary check from {{isKeyCacheEnabled}}, and removed functionally equivalent {{isKeyCacheSetup}} (this required a bit of tweaking to the test that used it, but IMO worth cleaning up) # {{sstable_preemptive_open_interval_in_mb}} should probably be volatile now we're updating it (though actually would be fine, we should make sure we are technically correct) Might also suggest a different name than {{migrate_keycache_on_compaction}} - perhaps {{keycache_maintain_across_compaction}}, or {{keycache_survives_compaction}}? was (Author: benedict): Overall the patch looks pretty good. I've pushed some suggestions [here|https://github.com/belliottsmith/cassandra/tree/14654] for you to consider. 
Specifically: # {{sstable_preemptive_open_interval_in_mb}} should probably be volatile now we're updating it (though actually would be fine, we should make sure we are technically correct) # preemptive was misspelled (I think this was inherited from an old misspelling, but probably better to fix it before we expose over JMX) # Tweaked EncodingStats merging to be agnostic to source type, and also avoid unnecessary garbage if there's only one stats to merge # {{keyCacheEnabled}} and {{getMigrateKeycacheOnCompaction}} renamed to predicates _is_KeyCacheEnabled and _should_Migrate.. # Removed one unnecessary check from {{isKeyCacheEnabled}}, and removed functionally equivalent {{isKeyCacheSetup}} (this required a bit of tweaking to the test that used it, but IMO worth cleaning up) Might also suggest a different name than {{migrate_keycache_on_compaction}} - perhaps {{keycache_maintain_across_compaction}}, or {{keycache_survives_compaction}}? > Reduce heap pressure during compactions
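The review note that {{sstable_preemptive_open_interval_in_mb}} should probably be volatile once it becomes a hot property can be illustrated with a small sketch. The field and method names here are hypothetical, not Cassandra's actual config code: without `volatile`, a long-running compaction thread could keep reading a stale cached value after an operator updates the setting over JMX.

```java
// Hypothetical sketch of the "hot property" visibility point (illustrative names).
public class HotPropDemo {
    // volatile guarantees that a write from the JMX/management thread is
    // promptly visible to reader threads (e.g. compaction threads).
    private static volatile int preemptiveOpenIntervalMb = 50;

    static void setPreemptiveOpenIntervalMb(int mb) {
        preemptiveOpenIntervalMb = mb;
    }

    static int getPreemptiveOpenIntervalMb() {
        return preemptiveOpenIntervalMb;
    }
}
```

For a single `int` this is sufficient; compound updates (read-modify-write) would need `AtomicInteger` or locking instead.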
[jira] [Commented] (CASSANDRA-14654) Reduce heap pressure during compactions
[ https://issues.apache.org/jira/browse/CASSANDRA-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812510#comment-16812510 ] Benedict commented on CASSANDRA-14654: -- Overall the patch looks pretty good. I've pushed some suggestions [here|https://github.com/belliottsmith/cassandra/tree/14654] for you to consider. Specifically: # {{sstable_preemptive_open_interval_in_mb}} should probably be volatile now we're updating it (though actually would be fine, we should make sure we are technically correct) # preemptive was misspelled (I think this was inherited from an old misspelling, but probably better to fix it before we expose over JMX) # Tweaked EncodingStats merging to be agnostic to source type, and also avoid unnecessary garbage if there's only one stats to merge # {{keyCacheEnabled}} and {{getMigrateKeycacheOnCompaction}} renamed to predicates _is_KeyCacheEnabled and _should_Migrate.. # Removed one unnecessary check from {{isKeyCacheEnabled}}, and removed functionally equivalent {{isKeyCacheSetup}} (this required a bit of tweaking to the test that used it, but IMO worth cleaning up) Might also suggest a different name than {{migrate_keycache_on_compaction}} - perhaps {{keycache_maintain_across_compaction}}, or {{keycache_survives_compaction}}? > Reduce heap pressure during compactions
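The KeyCacheKey point discussed in this ticket (a full byte-array copy per key in the constructor) reduces to the following pattern. The names are illustrative, not the actual `KeyCacheKey` implementation: copying the backing array allocates fresh garbage on every key, while retaining a read-only view of the original buffer avoids the copy entirely.

```java
// Hypothetical sketch of copy-vs-retain for cache keys (illustrative names).
import java.nio.ByteBuffer;

public class KeyCopyDemo {
    // Copying approach: one new byte[] allocated per key.
    static byte[] copyOf(ByteBuffer key) {
        byte[] out = new byte[key.remaining()];
        key.duplicate().get(out); // duplicate() so the caller's position is untouched
        return out;
    }

    // Retaining approach: keep a read-only view of the original buffer, no copy.
    static ByteBuffer retain(ByteBuffer key) {
        return key.asReadOnlyBuffer();
    }
}
```

The trade-off the ticket accepts is that retaining pins the original buffer's memory; the win is eliminating the dominant allocation as the sstable count grows.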
[jira] [Updated] (CASSANDRA-14648) CircleCI dtest runs should (by default) depend upon successful unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-14648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14648: - Resolution: Not A Problem Status: Resolved (was: Open) The default behaviour has already been modified to improve this situation. > CircleCI dtest runs should (by default) depend upon successful unit tests > - > > Key: CASSANDRA-14648 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14648 > Project: Cassandra > Issue Type: Improvement > Components: Build, Test/dtest >Reporter: Benedict >Assignee: Benedict >Priority: Normal > > Unit tests are very quick to run, and if they fail to pass there’s probably > no value in running dtests - particularly if we are honouring our > expectations of never committing code that breaks either unit or dtests. > When sharing CircleCI resources between multiple branches (or multiple > users), it is wasteful to have two dtest runs kicked off for every incomplete > branch that is pushed to GitHub for safe keeping. So I think a better > default CircleCI config file would only run the dtests after a successful > unit test run, and those who want to modify this behaviour can do so > consciously by editing the config file for themselves.
[jira] [Assigned] (CASSANDRA-14590) Size of fixed-width write values not verified from peers
[ https://issues.apache.org/jira/browse/CASSANDRA-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict reassigned CASSANDRA-14590: Assignee: (was: Benedict) > Size of fixed-width write values not verified from peers > - > > Key: CASSANDRA-14590 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14590 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: Benedict >Priority: Low > Fix For: 3.0.x, 3.11.x, 4.0.x > > > There are any number of reasons data arriving on a node might be corrupt in a > manner that can ultimately pollute non-corrupt data. CASSANDRA-14568 is just > one example. In this bug’s case, invalid clusterings were sent to a legacy > version peer, which eventually sent them back to a latest version peer. In > either case, verification of the size of the values arriving would have > prevented the corruption spreading, or affecting whole-sstable operations > containing the values. > > I propose verifying the fixed-width types arriving from peers, and also on > serialization. The former permits rejecting the write with an exception, and > preventing the write being ACK’d, or polluting memtables (thus maintaining > update atomicity without affecting more records). The latter will be a > guarantee that this corruption cannot make it to an sstable via any other > route (e.g. a bug internal to the node)
[jira] [Updated] (CASSANDRA-15007) Incorrect rf validation in SimpleStrategy
[ https://issues.apache.org/jira/browse/CASSANDRA-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AvionTEq - Edelyn updated CASSANDRA-15007: -- Attachment: AvionTEq.zip > Incorrect rf validation in SimpleStrategy > - > > Key: CASSANDRA-15007 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15007 > Project: Cassandra > Issue Type: Bug > Components: CQL/Semantics >Reporter: Michael >Assignee: Dinesh Joshi >Priority: Low > Fix For: 4.x > > Attachments: 15007.patch, AvionTEq.zip > > > Getting an uninformative ConfigurationException when trying to create a > keyspace with SimpleStrategy and no replication factor. > {{cqlsh> create keyspace test with replication = \{'class': > 'SimpleStrategy'};}} > {{ConfigurationException:}}