[jira] [Comment Edited] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506130#comment-15506130
 ] 

Benedict edited comment on CASSANDRA-12668 at 9/20/16 9:24 AM:
---

So, just to crystallise my thoughts on this: given that this presents with 
contention, there are in all likelihood three possible causes:

* The B-Tree is simply inherently worse here than SnapTreeMap was
* The extra work now done (besides the b-tree) during a memtable partition 
update increases the time taken, and hence the number of concurrent operations
* One of the many other changes increases the cost of contention, either by 
increasing the number of concurrent operations (e.g. changes in threading or 
an increased concurrent_writes) or by increasing the size of the partitions in 
a given memtable (e.g. fewer memtable flushes)
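
To make the last two causes concrete, here is a toy model of an optimistic 
copy-on-write update loop.  It is only a sketch under stated assumptions, not 
Cassandra's memtable code: the class name OptimisticPartitionSketch, the 
wastedCopies counter, and the use of a fully copied TreeMap in place of a 
b-tree are all hypothetical.  It shows that every failed swap throws away a 
copy whose cost grows with the partition size, and that anything lengthening 
the update widens the window in which another writer can win the race.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

/** Toy model only: a copied TreeMap stands in for the partition's tree. */
final class OptimisticPartitionSketch {
    // Current snapshot of the partition; replaced wholesale by compare-and-set.
    private final AtomicReference<TreeMap<String, Long>> head =
            new AtomicReference<>(new TreeMap<String, Long>());
    // Copies that were built and then discarded because the swap failed.
    private final AtomicLong wastedCopies = new AtomicLong();

    /** Applies one cell (name -> timestamp), retrying until the swap succeeds. */
    void write(String cell, long timestamp) {
        while (true) {
            TreeMap<String, Long> current = head.get();
            // The cost of building the copy grows with the partition size.
            TreeMap<String, Long> next = new TreeMap<>(current);
            next.merge(cell, timestamp, Long::max); // keep the newest timestamp
            if (head.compareAndSet(current, next)) {
                return;
            }
            // Another writer won the race: this copy becomes garbage.
            wastedCopies.incrementAndGet();
        }
    }

    int size() {
        return head.get().size();
    }

    long wastedCopies() {
        return wastedCopies.get();
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticPartitionSketch partition = new OptimisticPartitionSketch();
        Thread[] writers = new Thread[4];
        for (int t = 0; t < writers.length; t++) {
            final int id = t;
            writers[t] = new Thread(() -> {
                for (int i = 0; i < 1_000; i++) {
                    partition.write("cell-" + id + "-" + i, i);
                }
            });
            writers[t].start();
        }
        for (Thread writer : writers) {
            writer.join();
        }
        // The wasted count varies from run to run; it depends on thread timing.
        System.out.println("cells=" + partition.size()
                + " wastedCopies=" + partition.wastedCopies());
    }
}
```

With a single writer the counter stays at zero; with several writers on one 
partition it typically climbs, and each wasted copy is extra garbage for the 
collector.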

Ruling out the first one is probably most straightforwardly done by swapping 
SnapTreeMap back in with a patch and running your tests.  This wouldn't be 
trivial, but it should also be far from terribly difficult, and it would avoid 
stabbing around in the dark.

Otherwise, the suggestion I made in CASSANDRA-7546 stands as my preferred 
solution to this kind of problem: deferred updates under contention.  If a swap 
of the head fails, just tag the update onto a queue, and have all readers merge 
the queue before responding, with all writers merging all deferred updates 
along with their own.  Readers may also update the b-tree with the result of 
their merge if there's no contention.
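
A minimal sketch of how deferred updates under contention might look, assuming 
a last-write-wins merge of cell timestamps.  Everything named here is 
hypothetical and is not taken from Cassandra: the class 
DeferredUpdatePartitionSketch, its methods, and the immutable sorted map that 
stands in for the b-tree.  A writer attempts the swap once and queues its 
update on failure; readers and later writers fold the queue in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch: an immutable sorted map stands in for the b-tree. */
final class DeferredUpdatePartitionSketch {
    // Current immutable snapshot ("head"); replaced by compare-and-set.
    private final AtomicReference<Map<String, Long>> head =
            new AtomicReference<>(Collections.<String, Long>emptyMap());
    // Updates whose swap of head failed; callers must not mutate them later.
    private final ConcurrentLinkedQueue<Map<String, Long>> deferred =
            new ConcurrentLinkedQueue<>();

    /** Newest timestamp wins; idempotent, so merging an update twice is harmless. */
    static Map<String, Long> merge(Map<String, Long> base, Map<String, Long> update) {
        TreeMap<String, Long> out = new TreeMap<>(base);
        update.forEach((cell, ts) -> out.merge(cell, ts, Long::max));
        return out;
    }

    /** Writer: merge own update plus all deferred ones, attempt the swap once. */
    void write(Map<String, Long> update) {
        Map<String, Long> current = head.get();
        List<Map<String, Long>> pending = new ArrayList<>(deferred);
        Map<String, Long> next = merge(current, update);
        for (Map<String, Long> p : pending) {
            next = merge(next, p);
        }
        if (head.compareAndSet(current, Collections.unmodifiableMap(next))) {
            // Dequeue only after the swap, so a deferred update is always
            // visible in either the queue or the head.
            deferred.removeAll(pending);
        } else {
            defer(update); // swap failed: no retry, leave it for others to merge
        }
    }

    /** The "swap failed" path: tag the update onto the queue. */
    void defer(Map<String, Long> update) {
        deferred.add(update);
    }

    /** Reader: merge the queue before responding, publishing the result if uncontended. */
    Map<String, Long> read() {
        // Snapshot the queue BEFORE reading head: an update missing from the
        // queue was removed after a successful swap, so head already has it.
        List<Map<String, Long>> pending = new ArrayList<>(deferred);
        Map<String, Long> current = head.get();
        if (pending.isEmpty()) {
            return current;
        }
        Map<String, Long> view = current;
        for (Map<String, Long> p : pending) {
            view = merge(view, p);
        }
        view = Collections.unmodifiableMap(view);
        if (head.compareAndSet(current, view)) {
            deferred.removeAll(pending);
        }
        return view;
    }

    int deferredCount() {
        return deferred.size();
    }

    public static void main(String[] args) {
        DeferredUpdatePartitionSketch p = new DeferredUpdatePartitionSketch();
        p.write(Collections.singletonMap("a", 1L));
        p.defer(Collections.singletonMap("b", 2L)); // simulate a failed swap
        System.out.println(p.read());
    }
}
```

The sketch relies on the merge being idempotent: an update can be folded into 
the head while still sitting in the queue, so seeing it twice must be 
harmless.  A contended writer does a bounded amount of work instead of 
spinning, at the price of readers doing some merging.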



was (Author: benedict):
So, just to crystallise my thoughts on this: given that this presents with 
contention, there are in all likelihood three possible causes:

* The B-Tree is simply inherently worse here than SnapTreeMap was
* The extra work now done during a memtable partition update increases the 
time taken, and hence the number of concurrent operations
* One of the many other changes increases the cost of contention, either by 
increasing the number of concurrent operations (e.g. changes in threading or 
an increased concurrent_writes) or by increasing the size of the partitions in 
a given memtable (e.g. fewer memtable flushes)

Ruling out the first one is probably most straightforwardly done by swapping 
SnapTreeMap back in with a patch and running your tests.  This wouldn't be 
trivial, but it should also be far from terribly difficult, and it would avoid 
stabbing around in the dark.

Otherwise, the suggestion I made in CASSANDRA-7546 stands as my preferred 
solution to this kind of problem: deferred updates under contention.  If a swap 
of the head fails, just tag the update onto a queue, and have all readers merge 
the queue before responding, with all writers merging all deferred updates 
along with their own.  Readers may also update the b-tree with the result of 
their merge if there's no contention.


> Memtable Contention in 2.1
> --
>
> Key: CASSANDRA-12668
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12668
> Project: Cassandra
> Issue Type: Bug
> Reporter: sankalp kohli
>
> We added a new B-tree implementation in 2.1 which causes write performance to 
> go down in Cassandra if there is a lot of contention in the memtable for a CQL 
> partition. Upgrading a cluster from 2.0 to 2.1 with contention causes the 
> cluster to fall apart due to GC. We tried making the defaults added in 
> CASSANDRA-7546 configurable, but that did not help. Is there any way to fix 
> this issue?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504919#comment-15504919
 ] 

sankalp kohli edited comment on CASSANDRA-12668 at 9/19/16 10:48 PM:
-

"Lower throughput != unusable"

The cluster was working in 2.0 and is no longer working in 2.1. That is 
unusable. Applications need a certain throughput; otherwise they will have an 
ever-increasing backlog. 

"Everything is about trade-offs, as was the swapping of the data structure in 
the first place. There's rarely a 100% free lunch."
A trade-off that makes a use case unusable is not a trade-off we should make. 
Is there any warning in NEWS.txt that says this trade-off was made? 


"Just some fairly broad qualitative/correlative statements."
These are not statements but practical experience with a cluster that has been 
made unusable in 2.1. I will provide the additional information you are asking 
for.  


was (Author: kohlisankalp):
"Lower throughput != unusable"

The cluster was working in 2.0 and is no longer working in 2.1. That is 
unusable. Applications need a certain throughput; otherwise they will have an 
ever-increasing backlog. 

"Everything is about trade-offs, as was the swapping of the data structure in 
the first place. There's rarely a 100% free lunch."
A trade-off that makes a use case unusable is not a trade-off we should make. 


"Just some fairly broad qualitative/correlative statements."
These are not statements but practical experience with a cluster that has been 
made unusable in 2.1. I will provide the additional information you are asking 
for.  



[jira] [Comment Edited] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504699#comment-15504699
 ] 

sankalp kohli edited comment on CASSANDRA-12668 at 9/19/16 9:14 PM:


The assertions are from a real cluster, but we also did the same work in 
testing. 

"Of course, it may well also be that for your test case it is inherently worse; 
not every use case can be improved."
I agree you cannot improve every use case, but here we have made a use case 
unusable. 

The lower throughput is due to the locking added in CASSANDRA-7546, but the 
root cause is still the memtable B-tree change. 

I will try a different fan factor and see if it helps.  


was (Author: kohlisankalp):
The assertions are from a real cluster, but we also did the same work in 
testing. 

"Of course, it may well also be that for your test case it is inherently worse; 
not every use case can be improved."
I agree you cannot improve every use case, but here we have made a use case 
worse. 

The lower throughput is due to the locking added in CASSANDRA-7546, but the 
root cause is still the memtable B-tree change. 

I will try a different fan factor and see if it helps.  



[jira] [Comment Edited] (CASSANDRA-12668) Memtable Contention in 2.1

2016-09-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504471#comment-15504471
 ] 

sankalp kohli edited comment on CASSANDRA-12668 at 9/19/16 7:49 PM:


The cluster was doing constant Java GC and was not able to stay up. Is there 
anything else you are looking for?


was (Author: kohlisankalp):
The cluster was doing constant Java GC and was not able to stay up. 
