[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506130#comment-15506130
 ] 

Benedict commented on CASSANDRA-12668:
--

So, just to crystallise my thoughts on this.  Given this presents with 
contention, there are in all likelihood three possible causes:

* The B-Tree is just inherently worse here than SnapTreeMap was
* The extra work now done during memtable partition update increases the time 
taken, and hence the number of concurrent operations
* One of the many other changes increases the cost of contention, either by 
increasing the number of concurrent operations (e.g. changes in threading or 
increased concurrent_writes) or by increasing the size of the partitions in a 
given memtable (e.g. fewer memtable flushes)

Ruling out the first one is probably most straightforwardly done by swapping 
SnapTreeMap back in with a patch and running your tests.  This wouldn't be 
trivial, but it shouldn't be terribly difficult either, and would avoid 
stumbling around in the dark.

Otherwise, the suggestion I made in CASSANDRA-7546 stands as my preferred 
solution to this kind of problem: deferred updates under contention.  If a swap 
of head fails, just tag the update onto a queue, and have all readers merge the 
queue before responding, with all writers merging all deferred updates along 
with their own.  Readers may also update the b-tree with the result of their 
merge if there's no contention.
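
A minimal Java sketch of the deferred-update idea described above.  It is 
illustrative only and is not Cassandra code: the class DeferredUpdateMap, its 
State and Pending types, the defer() helper, and the use of a copied 
java.util.TreeMap as a stand-in for the b-tree are all assumptions made for 
the example.  To keep the tree and the queue consistent, the sketch holds both 
behind a single AtomicReference, which is a simplification of the separate 
queue suggested above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

class DeferredUpdateMap {
    /** Immutable linked list of updates that lost the CAS race, newest first. */
    private static final class Pending {
        final String key, value;
        final Pending next;
        Pending(String key, String value, Pending next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    /** Immutable snapshot: a tree never mutated once published, plus deferred updates. */
    private static final class State {
        final TreeMap<String, String> tree;
        final Pending pending;
        State(TreeMap<String, String> tree, Pending pending) {
            this.tree = tree; this.pending = pending;
        }
    }

    private final AtomicReference<State> head =
            new AtomicReference<>(new State(new TreeMap<>(), null));

    /** Copies the tree and applies deferred updates oldest-first, so later writes win. */
    private static TreeMap<String, String> merge(State s) {
        TreeMap<String, String> copy = new TreeMap<>(s.tree);
        Deque<Pending> oldestFirst = new ArrayDeque<>();
        for (Pending p = s.pending; p != null; p = p.next) oldestFirst.push(p);
        while (!oldestFirst.isEmpty()) {
            Pending p = oldestFirst.pop();
            copy.put(p.key, p.value);
        }
        return copy;
    }

    /** Writer: one expensive attempt; on failure, defer instead of rebuilding. */
    void put(String key, String value) {
        State cur = head.get();
        TreeMap<String, String> merged = merge(cur);   // includes others' deferred updates
        merged.put(key, value);
        if (!head.compareAndSet(cur, new State(merged, null))) defer(key, value);
    }

    /** Tags the update onto the pending list; each retry is O(1). */
    void defer(String key, String value) {
        while (true) {
            State cur = head.get();
            State next = new State(cur.tree, new Pending(key, value, cur.pending));
            if (head.compareAndSet(cur, next)) return;
        }
    }

    /** Reader: merges pending updates before answering, publishing the result if uncontended. */
    String get(String key) {
        State cur = head.get();
        if (cur.pending == null) return cur.tree.get(key);
        TreeMap<String, String> merged = merge(cur);
        head.compareAndSet(cur, new State(merged, null)); // best effort, ignore failure
        return merged.get(key);
    }

    public static void main(String[] args) {
        DeferredUpdateMap m = new DeferredUpdateMap();
        m.put("k", "v1");
        m.defer("k", "v2");   // simulate a writer that lost the race
        System.out.println(m.get("k"));
    }
}
```

A writer that loses the race pays only a cheap retry to enqueue, rather than 
rebuilding its copy of the tree; the cost of the merge moves to the next 
successful writer or reader.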


> Memtable Contention in 2.1
> --
>
> Key: CASSANDRA-12668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12668
> Project: Cassandra
> Issue Type: Bug
> Reporter: sankalp kohli
>
> We added a new Btree implementation in 2.1 which causes write performance to 
> go down in Cassandra if there is a lot of contention in the memtable for a CQL 
> partition. Upgrading a cluster from 2.0 to 2.1 with contention causes the 
> cluster to fall apart due to GC. We tried making the defaults added in 
> CASSANDRA-7546 configurable but that did not help. Is there any way to fix 
> this issue?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15505044#comment-15505044
 ] 

sankalp kohli commented on CASSANDRA-12668:
---

We tried fanout = 4 in tests and found no change. I will update once we have 
tested the actual use case. 



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504996#comment-15504996
 ] 

Benedict commented on CASSANDRA-12668:
--

To respond to your edit:

bq. Is there any warning in NEWS.txt 

You're only just bringing this particular aspect of tradeoff to light now.  
Tradeoffs happen whether we know about them or not, and the fact it's taken so 
long to come to light suggests it was perhaps a perfectly reasonable hidden 
tradeoff.

But, anyway, we can't talk about a tradeoff meaningfully until we know both 
sides of the trade.  Let's leave that discussion until we actually know the 
cause.

If you're right about the cause, the absolute worst case scenario IMO is to 
provide a patch to make the collection pluggable.



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504958#comment-15504958
 ] 

Benedict commented on CASSANDRA-12668:
--

(breaking out of nesting hell)

I don't really have the inclination for a philosophical debate about what 
should or should not be prioritised.  However, clusters falling over due to 
runaway compaction were a more prevalent problem in the past, and that is what 
this change was designed to help fix.  Which one is more important, I cannot 
say with certainty, but it's never as black and white as "never make anything 
worse for anyone" - e.g. thrift users are getting, well, short thrift (hehe) 
at present.

bq. These are not statements

Erm... 

Anyway.  The important qualifier is that they were qualitative/correlative, 
i.e. lacking any numbers for comparison or direct mechanisms/causes for the 
action, nor any information by which we could make any guesses as to such.  
i.e., it's much too speculative to be reaching the conclusions you have about 
cause, or having this detailed a discussion at present, really.

There are two ways to attack this: inwards or outwards.  Either try to provide 
all of the contextual information to find avenues to explore via cluster 
testing, or isolate the b-tree and snaptree to see how they compare under 
varying levels of contention (with the same semantics as in the database, i.e. 
a CAS-swapped head).
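
A rough Java sketch of what such an isolated comparison could look like.  The 
harness, its names (CasContentionHarness, Result, copyAndPut) and the 
full-copy TreeMap used as the structure under test are assumptions for 
illustration; to compare the real structures, the insert function would be 
replaced with copy-on-write updates of the b-tree and of SnapTreeMap.

```java
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiFunction;

class CasContentionHarness {
    /** Outcome of one run: final structure, wasted attempts, wall-clock time. */
    static final class Result<T> {
        final T finalState;
        final long failedCas;
        final long elapsedNanos;
        Result(T finalState, long failedCas, long elapsedNanos) {
            this.finalState = finalState; this.failedCas = failedCas; this.elapsedNanos = elapsedNanos;
        }
    }

    /** Stand-in copy-on-write insert: copies the whole map, then adds the key. */
    static TreeMap<Integer, Integer> copyAndPut(TreeMap<Integer, Integer> m, Integer k) {
        TreeMap<Integer, Integer> copy = new TreeMap<>(m);
        copy.put(k, k);
        return copy;
    }

    /** All threads insert distinct keys into one CAS-swapped head. */
    static <T> Result<T> run(T empty, BiFunction<T, Integer, T> insert,
                             int threads, int opsPerThread) {
        AtomicReference<T> head = new AtomicReference<>(empty);
        AtomicLong failed = new AtomicLong();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * opsPerThread;
            workers[t] = new Thread(() -> {
                try {
                    start.await();               // release all writers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < opsPerThread; i++) {
                    while (true) {
                        T cur = head.get();
                        T next = insert.apply(cur, base + i);
                        if (head.compareAndSet(cur, next)) break;
                        failed.incrementAndGet(); // the work in 'next' is garbage
                    }
                }
            });
            workers[t].start();
        }
        long t0 = System.nanoTime();
        start.countDown();
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return new Result<>(head.get(), failed.get(), System.nanoTime() - t0);
    }

    public static void main(String[] args) {
        for (int threads : new int[] {1, 2, 4, 8}) {
            Result<TreeMap<Integer, Integer>> r =
                    run(new TreeMap<Integer, Integer>(), CasContentionHarness::copyAndPut, threads, 500);
            System.out.printf("threads=%d failedCas=%d elapsedMs=%d%n",
                    threads, r.failedCas, r.elapsedNanos / 1_000_000);
        }
    }
}
```

The failed-CAS count is a proxy for wasted allocation; a serious comparison 
would use a proper benchmarking harness and measure allocation directly.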



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504919#comment-15504919
 ] 

sankalp kohli commented on CASSANDRA-12668:
---

"Lower throughput != unusable"

The cluster was working in 2.0 and is no longer working in 2.1. That is 
unusable. Applications need a certain throughput; otherwise they will have an 
ever-increasing backlog. 

"Everything is about trade-offs, as was the swapping of the data structure in 
the first place. There's rarely a 100% free lunch."
The tradeoff where a use case is made unusable is not a trade off we should 
make. 


"Just some fairly broad qualitative/correlative statements."
These are not statements but practical experience with a cluster which has been 
made unusable in 2.1. I will provide the additional information you are asking 
for.  



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504747#comment-15504747
 ] 

Benedict commented on CASSANDRA-12668:
--

Come on, let's avoid hyperbole.

Lower throughput != unusable, and if you were to reduce the size of your 
memtables the increased GC burden would probably be coped with too.  Everything 
is about trade-offs, as was the swapping of the data structure in the first 
place.  There's rarely a 100% free lunch.

Still, much more information is needed before anything informative can be said. 
You've still given very few details; we still have nothing about the size of 
the partitions, their number, the rate of updates (total and per partition), 
the size of the datums, or the total size provided for the memtables.  Nothing 
about configuration parameters such as flush queue size, 
memtable_cleanup_threshold and concurrent_writes.  No profiling information, 
no heap numbers.  Just some fairly broad qualitative/correlative statements.








[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504699#comment-15504699
 ] 

sankalp kohli commented on CASSANDRA-12668:
---

The assertions are from a real cluster as well, but we also reproduced the same 
workload in testing. 

"Of course, it may well also be that for your test case it is inherently worse; 
not every use case can be improved."
I agree you cannot improve every use case, but here we have made a use case 
worse. 

The reason for the lower throughput is the locking added in 7546. But the root 
cause is still the memtable Btree change. 

I will try with a different fan-factor and see if it helps.  



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504578#comment-15504578
 ] 

Benedict commented on CASSANDRA-12668:
--

There's a big difference between a test and a real life cluster though - are 
all your assertions wrt the testing only?  The amount of contention, size and 
number of your partitions as well as the number of non-contending operations 
are all hugely important to the emergent behaviour here.  

You also seem to be switching between discussing throughput and GC burden. The 
lower throughput may be because the 7546 synchronous behaviour kicks in - which 
it does once it detects ~10MB/s of waste, which depending on your test could 
easily be triggered.
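
A simplified Java sketch of that kind of fallback, for readers unfamiliar with 
7546.  It is not the actual Cassandra implementation: the class 
WasteAwarePartition, the one-second window, the per-entry byte estimate and 
the copied TreeMap standing in for the b-tree are all assumptions made for 
illustration.  Only the idea is taken from the ticket: count the work thrown 
away by failed CAS attempts, and switch to locking once the rate of waste 
crosses a threshold.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

class WasteAwarePartition {
    private static final long WINDOW_NANOS = 1_000_000_000L;  // assumed 1s window
    private static final long ASSUMED_BYTES_PER_ENTRY = 64;   // made-up size estimate

    private final AtomicReference<TreeMap<String, String>> head =
            new AtomicReference<>(new TreeMap<>());
    private final long wasteLimitBytesPerWindow;
    private final AtomicLong wastedInWindow = new AtomicLong();
    private volatile long windowStart = System.nanoTime();
    private volatile boolean synchronous = false;

    WasteAwarePartition(long wasteLimitBytesPerWindow) {
        this.wasteLimitBytesPerWindow = wasteLimitBytesPerWindow;
    }

    boolean isSynchronous() { return synchronous; }

    /** Accounts for discarded work; the window reset is racy but only approximate anyway. */
    void recordWaste(long bytes) {
        long now = System.nanoTime();
        if (now - windowStart > WINDOW_NANOS) {
            windowStart = now;
            wastedInWindow.set(0);
        }
        if (wastedInWindow.addAndGet(bytes) > wasteLimitBytesPerWindow) {
            synchronous = true;   // stays on once tripped
        }
    }

    /** One copy-on-write attempt; a failure means the copy was wasted. */
    private boolean tryOnce(String key, String value) {
        TreeMap<String, String> cur = head.get();
        TreeMap<String, String> next = new TreeMap<>(cur);
        next.put(key, value);
        if (head.compareAndSet(cur, next)) return true;
        recordWaste((cur.size() + 1) * ASSUMED_BYTES_PER_ENTRY);
        return false;
    }

    void put(String key, String value) {
        while (true) {
            if (synchronous) {
                // Writers queue on the lock; the CAS can still fail against
                // stragglers that started before the switch, so keep retrying.
                synchronized (this) {
                    while (!tryOnce(key, value)) { /* retry under lock */ }
                }
                return;
            }
            if (tryOnce(key, value)) return;
        }
    }

    String get(String key) { return head.get().get(key); }

    public static void main(String[] args) {
        // ~10MB/s, the figure quoted above
        WasteAwarePartition p = new WasteAwarePartition(10L * 1024 * 1024);
        p.put("k", "v");
        System.out.println(p.get("k") + " synchronous=" + p.isSynchronous());
    }
}
```

Once the lock is engaged, writers no longer discard each other's work, but 
they also serialise, which is consistent with throughput dropping in a test 
that hammers a few partitions.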

Of course, it may well also be that for your test case it is inherently worse; 
not every use case can be improved.

While the SnapTreeMap pointer was always updated via CoW, the tree itself was 
only modified internally, so fewer nodes would be discarded on a failed 
modification.  It was also only a binary tree, potentially cutting garbage.  
However, I recall each node occupying ~100 bytes, which is much more than the 
b-tree per-item overhead, and I recall it being slower to boot (meaning more 
overlapping operations).

You could try reducing the fan-factor in the b-tree to reduce the time of 
updates, and cost of failed updates, which might cause the behaviour to tend 
closer to that of a snaptree (e.g. a fan factor of 4 would average a ternary 
tree, as opposed to its current default of 16).







[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504509#comment-15504509
 ] 

Brandon Williams commented on CASSANDRA-12668:
--

My suggestion would be to ttop it (https://github.com/aragozin/jvm-tools) to 
see where the garbage is coming from.



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504501#comment-15504501
 ] 

sankalp kohli commented on CASSANDRA-12668:
---

By "always synchronous" I assume you mean always locking instead of using CAS? 

We did a test where you always write to a few CQL partitions simultaneously to 
create contention. We saw that 2.0 has a higher throughput than 2.1, and 
looking at allocations points to this memtable issue.

Then we made the configuration changes added in 7546 to always lock, and that 
reduced the throughput. 

The heap dumps did not indicate that the memtable is smaller in 2.1 than in 
2.0, so I don't think this is an issue. 

Apart from the testing, the only clusters where this is an issue are those 
where we have contention, and hence this change is the issue. 



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504471#comment-15504471
 ] 

sankalp kohli commented on CASSANDRA-12668:
---

The cluster was doing constant Java GC and was not able to stay up. 



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504466#comment-15504466
 ] 

Benedict commented on CASSANDRA-12668:
--

Did configuring 7546 to *always* synchronous behaviour not resolve the problem? 
 If not, it doesn't seem like contention was the problem.

The decline in write performance from contention is somewhat unrelated to GC - 
certainly it will produce a GC burden, but that's only a small portion of its 
effect.  The B-Tree can incur a higher GC overhead anyway, also, although in 
reality I would expect it to be hard to spot.  

What makes you pin the blame here specifically?  Is the only indication GC 
failure?

There were of course many other changes in 2.1 that could be impacting things 
wrt GC.  Something as simple as memtable space being accurately calculated may 
now be putting your heap under more pressure than it had been under in 2.0 
(which may have never successfully calculated its multiplier - this was quite 
common - in which case memtable occupancy was a fraction of that specified).

All things considered, a bit more investigation (or information from your 
investigation) is needed before anything can helpfully be said.



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504453#comment-15504453
 ] 

Brandon Williams commented on CASSANDRA-12668:
--

Can you add a little more color to 'fall apart'?



[jira] [Commented] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504422#comment-15504422
 ] 

sankalp kohli commented on CASSANDRA-12668:
---

cc [~benedict] [~brandon.williams]
