[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506130#comment-15506130 ]

Benedict commented on CASSANDRA-12668:
--------------------------------------

So, just to crystallise my thoughts on this. Given this presents with contention, there are in all likelihood three possible causes:

* B-Tree is just inherently worse here than SnapTreeMap was
* The extra work now done during memtable partition update increases the time taken, and hence the number of concurrent operations
* One of the many other changes is increasing the cost of contention, either by increasing the number of concurrent operations (e.g. changes in threading or increased concurrent_writes) or by increasing the size of the partitions in a given memtable (e.g. fewer memtable flushes)

Ruling out the first one is probably most straightforwardly done by swapping SnapTreeMap back in with a patch and running your tests. This wouldn't be trivial, but should also be far from terribly difficult, and would avoid striking around in the dark.

Otherwise, the suggestion I made in CASSANDRA-7546 stands as my preferred solution to this kind of problem: deferred updates under contention. If a swap of head fails, just tag the update onto a queue, and have all readers merge the queue before responding, with all writers merging all deferred updates along with their own. Readers may also update the b-tree with the result of their merge if there's no contention.

> Memtable Contention in 2.1
> --------------------------
>
>                 Key: CASSANDRA-12668
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12668
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: sankalp kohli
>
> We added a new Btree implementation in 2.1 which causes write performance to
> go down in Cassandra if there is a lot of contention in the memtable for a CQL
> partition. Upgrading a cluster from 2.0 to 2.1 with contention causes the
> cluster to fall apart due to GC. We tried making the defaults added in
> CASSANDRA-7546 configurable but that did not help. Is there any way to fix
> this issue?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
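
For readers who want something concrete, the deferred-update idea in the comment above can be sketched roughly as follows. This is an illustration only and not Cassandra code: the class name `DeferredUpdateMap`, the `State` and `Pending` types, and the use of a copied `TreeMap` as a stand-in for the b-tree are all assumptions. For simplicity the deferred updates live in an immutable list published through the same atomic reference as the tree, rather than in a separate queue, and a commutative sum stands in for cell reconciliation.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of deferred updates under contention; not Cassandra code. */
final class DeferredUpdateMap {
    /** One deferred update, held in an immutable singly linked list. */
    private static final class Pending {
        final String key;
        final int delta;
        final Pending next;
        Pending(String key, int delta, Pending next) {
            this.key = key;
            this.delta = delta;
            this.next = next;
        }
    }

    /** Immutable snapshot: a tree never mutated after publication, plus updates not yet merged into it. */
    private static final class State {
        final TreeMap<String, Integer> tree;
        final Pending pending;
        State(TreeMap<String, Integer> tree, Pending pending) {
            this.tree = tree;
            this.pending = pending;
        }
    }

    private final AtomicReference<State> head =
            new AtomicReference<>(new State(new TreeMap<>(), null));

    /** Copies the tree and folds every deferred update into the copy (the expensive step). */
    private static TreeMap<String, Integer> merge(State s) {
        TreeMap<String, Integer> copy = new TreeMap<>(s.tree);
        for (Pending p = s.pending; p != null; p = p.next)
            copy.merge(p.key, p.delta, Integer::sum);
        return copy;
    }

    /** Writer: try one full merge-and-swap; if the swap of head fails, defer instead of rebuilding. */
    void add(String key, int delta) {
        State seen = head.get();
        TreeMap<String, Integer> merged = merge(seen);
        merged.merge(key, delta, Integer::sum);
        if (!head.compareAndSet(seen, new State(merged, null)))
            defer(key, delta);
    }

    /** Tags the update onto the pending list; a failed attempt here wastes only one small node. */
    void defer(String key, int delta) {
        while (true) {
            State seen = head.get();
            State next = new State(seen.tree, new Pending(key, delta, seen.pending));
            if (head.compareAndSet(seen, next))
                return;
        }
    }

    /** Reader: merge deferred updates before responding, and publish the result if uncontended. */
    int get(String key) {
        State seen = head.get();
        if (seen.pending == null)
            return seen.tree.getOrDefault(key, 0);
        TreeMap<String, Integer> merged = merge(seen);
        head.compareAndSet(seen, new State(merged, null)); // best effort; failure is harmless
        return merged.getOrDefault(key, 0);
    }

    /** Number of updates currently deferred (for inspection only). */
    int pendingCount() {
        int n = 0;
        for (Pending p = head.get().pending; p != null; p = p.next)
            n++;
        return n;
    }
}
```

Because every published `State` is immutable, each update is visible exactly once, either in the tree or in the pending list, whichever snapshot a reader happens to see. Whether this trade is worthwhile in the real memtable would still need measuring.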
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505044#comment-15505044 ]

sankalp kohli commented on CASSANDRA-12668:
-------------------------------------------

We tried fanout = 4 in tests and found no change. I will update once we test the actual use case.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504996#comment-15504996 ]

Benedict commented on CASSANDRA-12668:
--------------------------------------

To respond to your edit:

bq. Is there any warning in NEWS.txt

You're only just bringing this particular aspect of the tradeoff to light now. Tradeoffs happen whether we know about them or not, and the fact it's taken so long to come to light suggests it was perhaps a perfectly reasonable hidden tradeoff.

But, anyway, we can't talk about a tradeoff meaningfully until we know both sides of the trade. Let's leave that discussion until we actually know the cause. If you're right about the cause, the absolute worst case scenario IMO is to provide a patch to make the collection pluggable.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504958#comment-15504958 ]

Benedict commented on CASSANDRA-12668:
--------------------------------------

(breaking out of nesting hell)

I don't really have the inclination for a philosophical debate about what should or should not be prioritised. However, clusters falling over due to runaway compaction have been a more prevalent problem in the past, and this change was designed to help fix that. Which one is more important, I cannot say with certainty, but it's never as black and white as "never make anything worse for anyone" - e.g. thrift users are getting, well, short thrift (hehe) at present.

bq. These are not statements

Erm... Anyway. The important qualifier is that they were qualitative/correlative, i.e. lacking any numbers for comparison or direct mechanisms/causes for the action, and lacking any information from which we could make guesses about them. In other words, it's much too speculative to be reaching the conclusions you have about the cause, or to be having this detailed a discussion at present, really.

There are two ways to attack this: inwards or outwards. Either try to provide all of the contextual information to find avenues to explore via cluster testing, or isolate the b-tree and snaptree to see how they compare under varying levels of contention (with semantics as in the database, i.e. CAS-swapped head).
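
As a starting point for the "isolate the data structure" approach suggested above, a contention probe around a CAS-swapped head might look like the sketch below. The class name `CasContentionProbe` and its `run` method are hypothetical, and the harness only counts failed swaps; it does not measure allocation or latency, and it does not include either the b-tree or SnapTreeMap. Any immutable structure can be plugged in through the `update` function, which must return a new instance rather than mutate its argument.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical harness: counts failed CAS attempts on a shared head under contention. */
final class CasContentionProbe {
    /**
     * Runs {@code threads} workers, each applying {@code update} {@code opsPerThread} times
     * with copy-on-write semantics, and returns the total number of failed swaps.
     */
    static <T> long run(AtomicReference<T> head, UnaryOperator<T> update,
                        int threads, int opsPerThread) {
        AtomicLong failures = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int op = 0; op < opsPerThread; op++) {
                    while (true) {
                        T seen = head.get();
                        T next = update.apply(seen);      // work that is discarded if the swap fails
                        if (head.compareAndSet(seen, next))
                            break;
                        failures.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers)
                w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return failures.get();
    }
}
```

Running the same update workload against each candidate structure, at several thread counts, would give a first comparison of how often work is thrown away. A proper benchmark framework would be needed for anything beyond a rough signal.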
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504919#comment-15504919 ]

sankalp kohli commented on CASSANDRA-12668:
-------------------------------------------

"Lower throughput != unusable"

The cluster was working in 2.0 and is no longer working in 2.1. That is unusable. Applications need a certain throughput, otherwise they will have an ever-increasing backlog.

"Everything is about trade-offs, as was the swapping of the data structure in the first place. There's rarely a 100% free lunch."

A tradeoff where a use case is made unusable is not a tradeoff we should make.

"Just some fairly broad qualitative/correlative statements."

These are not statements but practical experience with a cluster which has been made unusable in 2.1. I will provide the additional information you are asking for.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504747#comment-15504747 ]

Benedict commented on CASSANDRA-12668:
--------------------------------------

Come on, let's avoid hyperbole. Lower throughput != unusable, and if you were to reduce the size of your memtables the increased GC burden would probably be coped with too. Everything is about trade-offs, as was the swapping of the data structure in the first place. There's rarely a 100% free lunch.

Still, much more information is needed before anything informative can be said. You've still given very few details; we still have nothing about the size of the partitions, their number, the rate of updates (total and per partition), the size of the datums, or the total size provided for the memtables. Nor do we have the configuration parameters, such as flush queue size, memtable_cleanup_threshold and concurrent_writes. No profiling information, no heap numbers. Just some fairly broad qualitative/correlative statements.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504699#comment-15504699 ]

sankalp kohli commented on CASSANDRA-12668:
-------------------------------------------

The assertions are from a real cluster as well, but we also did the same work in testing.

"Of course, it may well also be that for your test case it is inherently worse; not every use case can be improved."

I agree you cannot improve every use case, but here we have made a use case worse.

The reason for the lower throughput is the locking added in 7546. But the root cause is still the memtable Btree change.

I will try with a different fan-factor and see if it helps.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504578#comment-15504578 ]

Benedict commented on CASSANDRA-12668:
--------------------------------------

There's a big difference between a test and a real-life cluster though - are all your assertions wrt the testing only? The amount of contention, the size and number of your partitions, and the number of non-contending operations are all hugely important to the emergent behaviour here.

You also seem to be switching between discussing throughput and GC burden. The lower throughput may be because the 7546 synchronous behaviour kicks in - which it does once it detects ~10MB/s of waste, which depending on your test could easily be triggered.

Of course, it may well also be that for your test case it is inherently worse; not every use case can be improved. While the SnapTreeMap pointer was always updated via CoW, the tree itself was only modified internally, so fewer nodes would be discarded on a failed modification. It was also only a binary tree, potentially cutting garbage. However, I recall each node occupying ~100 bytes, which is much more than the b-tree per-item overhead, and I recall it being slower to boot (meaning more overlapping operations).

You could try reducing the fan-factor in the b-tree to reduce the time of updates, and the cost of failed updates, which might cause the behaviour to tend closer to that of a snaptree (e.g. a fan factor of 4 would average a ternary tree, as opposed to its current default of 16).
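
To make the "synchronous behaviour kicks in once it detects ~10MB/s of waste" point above more concrete, here is a minimal sketch of a waste-rate trigger. It is not the CASSANDRA-7546 implementation: the class `WasteTracker`, its fixed one-second window, the sticky switch, and the injectable clock are all assumptions made for illustration. Only the ~10MB/s figure is taken from the comment.

```java
import java.util.function.LongSupplier;

/** Hypothetical waste-rate trigger; the real CASSANDRA-7546 tracker may work differently. */
final class WasteTracker {
    /** ~10MB/s, the figure quoted in the comment; treated here as a per-window budget. */
    static final long THRESHOLD_BYTES_PER_SECOND = 10L * 1024 * 1024;
    private static final long WINDOW_NANOS = 1_000_000_000L;

    private final LongSupplier nanoClock;   // injectable so the behaviour can be tested
    private long windowStartNanos;
    private long wastedInWindow;
    private boolean synchronous;            // sticky: once set, updates stay synchronous

    WasteTracker(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.windowStartNanos = nanoClock.getAsLong();
    }

    /**
     * Records bytes discarded by a failed copy-on-write update and reports
     * whether callers should now fall back to synchronous (locked) updates.
     */
    synchronized boolean recordWaste(long bytes) {
        long now = nanoClock.getAsLong();
        if (now - windowStartNanos >= WINDOW_NANOS) {
            // Start a fresh one-second window and forget earlier waste.
            windowStartNanos = now;
            wastedInWindow = 0;
        }
        wastedInWindow += bytes;
        if (wastedInWindow > THRESHOLD_BYTES_PER_SECOND)
            synchronous = true;
        return synchronous;
    }

    synchronized boolean isSynchronous() {
        return synchronous;
    }
}
```

A writer would call `recordWaste` after each failed swap with an estimate of the discarded allocation, for example `new WasteTracker(System::nanoTime)`. With a trigger like this, a tight test loop hammering a few partitions could plausibly cross the threshold quickly, which is consistent with the lower throughput described in the thread.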
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504509#comment-15504509 ]

Brandon Williams commented on CASSANDRA-12668:
----------------------------------------------

My suggestion would be to ttop it (https://github.com/aragozin/jvm-tools) to see where the garbage is coming from.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504501#comment-15504501 ]

sankalp kohli commented on CASSANDRA-12668:
-------------------------------------------

By "always synchronous" I assume you mean always locking instead of using CAS?

We did a test where you always write to a few CQL partitions simultaneously to create contention. We have seen that 2.0 has a higher throughput than 2.1, and looking at allocations points to this memtable issue. Then we made the configuration changes added in 7546 to always lock, and that reduced the throughput.

The heap dumps did not indicate that the memtable is smaller in 2.1 than in 2.0, so I don't think this is an issue.

Apart from the testing, the only clusters where this is an issue are those with contention, hence this change is the issue.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504471#comment-15504471 ]

sankalp kohli commented on CASSANDRA-12668:
-------------------------------------------

The cluster was doing constant Java GC and was not able to stay up.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504466#comment-15504466 ]

Benedict commented on CASSANDRA-12668:
--------------------------------------

Did configuring 7546 to *always* use synchronous behaviour not resolve the problem? If not, it doesn't seem like contention was the problem.

The decline in write performance from contention is somewhat unrelated to GC - certainly it will produce a GC burden, but that's only a small portion of its effect. The B-Tree can also incur a higher GC overhead anyway, although in reality I would expect it to be hard to spot.

What makes you pin the blame here specifically? Is the only indication GC failure? There were of course many other changes in 2.1 that could be impacting things wrt GC. Something as simple as memtable space being accurately calculated may now be putting your heap under more pressure than it had been under 2.0 (which may have never successfully calculated its multiplier - this was quite common - in which case memtable occupancy was a fraction of that specified).

All things considered, a bit more investigation (or information from your investigation) is needed before anything can helpfully be said.
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504453#comment-15504453 ]

Brandon Williams commented on CASSANDRA-12668:
----------------------------------------------

Can you add a little more color to 'fall apart'?
[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504422#comment-15504422 ]

sankalp kohli commented on CASSANDRA-12668:
-------------------------------------------

cc [~benedict] [~brandon.williams]