[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog

2018-08-21 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14631:
-
Labels: blog  (was: )

> Add RSS support for Cassandra blog
> --
>
> Key: CASSANDRA-14631
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14631
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jacques-Henri Berthemet
>Assignee: Jeff Beck
>Priority: Major
>  Labels: blog
> Attachments: 14631-site.txt, Screen Shot 2018-08-17 at 5.32.08 
> PM.png, Screen Shot 2018-08-17 at 5.32.25 PM.png
>
>
> It would be convenient to add RSS support to the Cassandra blog:
> [http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html]
> And maybe also for other resources like new versions, but this ticket is 
> about the blog.
>  
> {quote}From: Scott Andreas
> Sent: Wednesday, August 08, 2018 6:53 PM
> To: [d...@cassandra.apache.org|mailto:d...@cassandra.apache.org]
> Subject: Re: Apache Cassandra Blog is now live
>  
> Please feel free to file a ticket (label: Documentation and Website).
>  
> It looks like Jekyll, the static site generator used to build the website, 
> has a plugin that generates Atom feeds if someone would like to work on 
> adding one: [https://github.com/jekyll/jekyll-feed]
> {quote}
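> The jekyll-feed plugin linked above is typically enabled with a couple of
> lines; a minimal sketch, assuming a standard Jekyll setup (the exact layout
> of the Cassandra website repo may differ):
> {code}
> # Gemfile
> gem 'jekyll-feed'
>
> # _config.yml
> plugins:
>   - jekyll-feed
> {code}
> With that in place, jekyll-feed serves an Atom feed at /feed.xml by default,
> and adding the {% feed_meta %} tag to the site's head template emits the
> <link> element that lets feed readers auto-discover it.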



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"

2018-08-21 Thread Jeremy Hanna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-14661:
-
Labels: blog  (was: )

> Blog Post: "Testing Apache Cassandra 4.0"
> -
>
> Key: CASSANDRA-14661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: C. Scott Andreas
>Assignee: C. Scott Andreas
>Priority: Minor
>  Labels: blog
> Attachments: CASSANDRA-14661.diff, rendered.png
>
>
> This is a blog post highlighting some of the approaches being used to test 
> Apache Cassandra 4.0. The patch attached applies as an SVN diff to the 
> website repo (outside the project's primary Git repo).
> SVN patch containing the post and rendered screenshot attached.
>  
>  






[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster

2018-08-21 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14660:
---
Fix Version/s: (was: 4.0.x)
   4.x
   3.11.x
   3.0.x

> Improve TokenMetaData cache populating performance for large cluster
> 
>
> Key: CASSANDRA-14660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
> Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>Reporter: Pengchao Wang
>Priority: Critical
>  Labels: Performance
> Fix For: 3.0.x, 3.11.x, 4.x
>
> Attachments: 14660-trunk.txt, TokenMetaDataBenchmark.java
>
>
> TokenMetaData#cachedOnlyTokenMap is a method C* uses to get a consistent 
> token and topology view on coordinators without paying the read-lock cost. 
> Upon first read, the method acquires a synchronized lock, generates a copy of 
> the major token metadata structures, and caches it; upon every token metadata 
> change (due to gossip), the cache is cleared and the next read takes care of 
> repopulating it.
> For small to medium-sized clusters this strategy works pretty well, but large 
> clusters can suffer from the locking since cache population is much slower. 
> On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15), each 
> cache population takes about 500~700ms, and during that time no requests can 
> go through because the synchronized lock is held. This caused waves of 
> timeout errors when large amounts of gossip messages propagated across the 
> cluster, such as during a cluster restart.
> Based on profiling, we found that the cost mostly comes from copying 
> tokenToEndpointMap. It is a SortedBiMultiValueMap built from a forward map 
> using TreeMap and a reverse map using Guava's TreeMultimap. There is an 
> optimization in TreeMap that reduces copying complexity from O(N*log(N)) to 
> O(N) when copying from already-ordered data, but Guava's TreeMultimap copying 
> misses that optimization, making it ~10 times slower than it needs to be at 
> our cluster size.
> The patch attached to the issue replaces the reverse TreeMultimap with a 
> vanilla TreeMap in SortedBiMultiValueMap so we can copy it in O(N) time.
> I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
> simulates a large cluster and measures the average latency of TokenMetaData 
> cache population.
> Benchmark results before and after the patch:
> {code:java}
> trunk: 
> before 100ms, after 13ms
> 3.0.x: 
> before 199ms, after 15ms
>  {code}
> (On 3.0.x even the forward TreeMap copy is slow; the O(N*log(N)) to O(N) 
> optimization is not applied because the key comparator is created 
> dynamically, so TreeMap cannot determine that the source and destination are 
> in the same order.)






[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster

2018-08-21 Thread Jeff Jirsa (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14660:
---
Labels: Performance  (was: )

> Improve TokenMetaData cache populating performance for large cluster
> 
>
> Key: CASSANDRA-14660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
> Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>Reporter: Pengchao Wang
>Priority: Critical
>  Labels: Performance
> Fix For: 3.0.x, 3.11.x, 4.x
>
> Attachments: 14660-trunk.txt, TokenMetaDataBenchmark.java
>
>
> TokenMetaData#cachedOnlyTokenMap is a method C* uses to get a consistent 
> token and topology view on coordinators without paying the read-lock cost. 
> Upon first read, the method acquires a synchronized lock, generates a copy of 
> the major token metadata structures, and caches it; upon every token metadata 
> change (due to gossip), the cache is cleared and the next read takes care of 
> repopulating it.
> For small to medium-sized clusters this strategy works pretty well, but large 
> clusters can suffer from the locking since cache population is much slower. 
> On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15), each 
> cache population takes about 500~700ms, and during that time no requests can 
> go through because the synchronized lock is held. This caused waves of 
> timeout errors when large amounts of gossip messages propagated across the 
> cluster, such as during a cluster restart.
> Based on profiling, we found that the cost mostly comes from copying 
> tokenToEndpointMap. It is a SortedBiMultiValueMap built from a forward map 
> using TreeMap and a reverse map using Guava's TreeMultimap. There is an 
> optimization in TreeMap that reduces copying complexity from O(N*log(N)) to 
> O(N) when copying from already-ordered data, but Guava's TreeMultimap copying 
> misses that optimization, making it ~10 times slower than it needs to be at 
> our cluster size.
> The patch attached to the issue replaces the reverse TreeMultimap with a 
> vanilla TreeMap in SortedBiMultiValueMap so we can copy it in O(N) time.
> I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
> simulates a large cluster and measures the average latency of TokenMetaData 
> cache population.
> Benchmark results before and after the patch:
> {code:java}
> trunk: 
> before 100ms, after 13ms
> 3.0.x: 
> before 199ms, after 15ms
>  {code}
> (On 3.0.x even the forward TreeMap copy is slow; the O(N*log(N)) to O(N) 
> optimization is not applied because the key comparator is created 
> dynamically, so TreeMap cannot determine that the source and destination are 
> in the same order.)






[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"

2018-08-21 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14661:
-
  Reviewer: Nate McCall
Attachment: CASSANDRA-14661.diff
Status: Patch Available  (was: Open)

> Blog Post: "Testing Apache Cassandra 4.0"
> -
>
> Key: CASSANDRA-14661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: C. Scott Andreas
>Assignee: C. Scott Andreas
>Priority: Minor
> Attachments: CASSANDRA-14661.diff, rendered.png
>
>
> This is a blog post highlighting some of the approaches being used to test 
> Apache Cassandra 4.0. The patch attached applies as an SVN diff to the 
> website repo (outside the project's primary Git repo).
> SVN patch containing the post and rendered screenshot attached.
>  
>  






[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"

2018-08-21 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14661:
-
Attachment: (was: CASSANDRA-14661.diff)

> Blog Post: "Testing Apache Cassandra 4.0"
> -
>
> Key: CASSANDRA-14661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: C. Scott Andreas
>Assignee: C. Scott Andreas
>Priority: Minor
> Attachments: CASSANDRA-14661.diff, rendered.png
>
>
> This is a blog post highlighting some of the approaches being used to test 
> Apache Cassandra 4.0. The patch attached applies as an SVN diff to the 
> website repo (outside the project's primary Git repo).
> SVN patch containing the post and rendered screenshot attached.
>  
>  






[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"

2018-08-21 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14661:
-
Flags: Patch

> Blog Post: "Testing Apache Cassandra 4.0"
> -
>
> Key: CASSANDRA-14661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: C. Scott Andreas
>Assignee: C. Scott Andreas
>Priority: Minor
> Attachments: CASSANDRA-14661.diff, rendered.png
>
>
> This is a blog post highlighting some of the approaches being used to test 
> Apache Cassandra 4.0. The patch attached applies as an SVN diff to the 
> website repo (outside the project's primary Git repo).
> SVN patch containing the post and rendered screenshot attached.
>  
>  






[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"

2018-08-21 Thread C. Scott Andreas (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

C. Scott Andreas updated CASSANDRA-14661:
-
Attachment: CASSANDRA-14661.diff

> Blog Post: "Testing Apache Cassandra 4.0"
> -
>
> Key: CASSANDRA-14661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: C. Scott Andreas
>Assignee: C. Scott Andreas
>Priority: Minor
> Attachments: CASSANDRA-14661.diff, rendered.png
>
>
> This is a blog post highlighting some of the approaches being used to test 
> Apache Cassandra 4.0. The patch attached applies as an SVN diff to the 
> website repo (outside the project's primary Git repo).
> SVN patch containing the post and rendered screenshot attached.
>  
>  






[jira] [Created] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"

2018-08-21 Thread C. Scott Andreas (JIRA)
C. Scott Andreas created CASSANDRA-14661:


 Summary: Blog Post: "Testing Apache Cassandra 4.0"
 Key: CASSANDRA-14661
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
 Project: Cassandra
  Issue Type: Improvement
  Components: Documentation and Website
Reporter: C. Scott Andreas
Assignee: C. Scott Andreas
 Attachments: CASSANDRA-14661.diff, rendered.png

This is a blog post highlighting some of the approaches being used to test 
Apache Cassandra 4.0. The patch attached applies as an SVN diff to the website 
repo (outside the project's primary Git repo).

SVN patch containing the post and rendered screenshot attached.

 

 






[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster

2018-08-21 Thread Pengchao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengchao Wang updated CASSANDRA-14660:
--
Attachment: 14660-trunk.txt
Status: Patch Available  (was: Open)

> Improve TokenMetaData cache populating performance for large cluster
> 
>
> Key: CASSANDRA-14660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
> Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>Reporter: Pengchao Wang
>Priority: Critical
> Fix For: 4.0.x
>
> Attachments: 14660-trunk.txt, TokenMetaDataBenchmark.java
>
>
> TokenMetaData#cachedOnlyTokenMap is a method C* uses to get a consistent 
> token and topology view on coordinators without paying the read-lock cost. 
> Upon first read, the method acquires a synchronized lock, generates a copy of 
> the major token metadata structures, and caches it; upon every token metadata 
> change (due to gossip), the cache is cleared and the next read takes care of 
> repopulating it.
> For small to medium-sized clusters this strategy works pretty well, but large 
> clusters can suffer from the locking since cache population is much slower. 
> On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15), each 
> cache population takes about 500~700ms, and during that time no requests can 
> go through because the synchronized lock is held. This caused waves of 
> timeout errors when large amounts of gossip messages propagated across the 
> cluster, such as during a cluster restart.
> Based on profiling, we found that the cost mostly comes from copying 
> tokenToEndpointMap. It is a SortedBiMultiValueMap built from a forward map 
> using TreeMap and a reverse map using Guava's TreeMultimap. There is an 
> optimization in TreeMap that reduces copying complexity from O(N*log(N)) to 
> O(N) when copying from already-ordered data, but Guava's TreeMultimap copying 
> misses that optimization, making it ~10 times slower than it needs to be at 
> our cluster size.
> The patch attached to the issue replaces the reverse TreeMultimap with a 
> vanilla TreeMap in SortedBiMultiValueMap so we can copy it in O(N) time.
> I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
> simulates a large cluster and measures the average latency of TokenMetaData 
> cache population.
> Benchmark results before and after the patch:
> {code:java}
> trunk: 
> before 100ms, after 13ms
> 3.0.x: 
> before 199ms, after 15ms
>  {code}
> (On 3.0.x even the forward TreeMap copy is slow; the O(N*log(N)) to O(N) 
> optimization is not applied because the key comparator is created 
> dynamically, so TreeMap cannot determine that the source and destination are 
> in the same order.)






[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster

2018-08-21 Thread Pengchao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengchao Wang updated CASSANDRA-14660:
--
Description: 
TokenMetaData#cachedOnlyTokenMap is a method C* uses to get a consistent token 
and topology view on coordinators without paying the read-lock cost. Upon first 
read, the method acquires a synchronized lock, generates a copy of the major 
token metadata structures, and caches it; upon every token metadata change (due 
to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium-sized clusters this strategy works pretty well, but large 
clusters can suffer from the locking since cache population is much slower. On 
one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15), each cache 
population takes about 500~700ms, and during that time no requests can go 
through because the synchronized lock is held. This caused waves of timeout 
errors when large amounts of gossip messages propagated across the cluster, 
such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying 
tokenToEndpointMap. It is a SortedBiMultiValueMap built from a forward map 
using TreeMap and a reverse map using Guava's TreeMultimap. There is an 
optimization in TreeMap that reduces copying complexity from O(N*log(N)) to 
O(N) when copying from already-ordered data, but Guava's TreeMultimap copying 
misses that optimization, making it ~10 times slower than it needs to be at 
our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a 
vanilla TreeMap in SortedBiMultiValueMap so we can copy it in O(N) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
simulates a large cluster and measures the average latency of TokenMetaData 
cache population.

Benchmark results before and after the patch:
{code:java}
trunk: 
before 100ms, after 13ms
3.0.x: 
before 199ms, after 15ms
 {code}
(On 3.0.x even the forward TreeMap copy is slow; the O(N*log(N)) to O(N) 
optimization is not applied because the key comparator is created dynamically, 
so TreeMap cannot determine that the source and destination are in the same 
order.)

  was:
TokenMetaData#cachedOnlyTokenMap is a method C* uses to get a consistent token 
and topology view on coordinators without paying the read-lock cost. Upon first 
read, the method acquires a synchronized lock, generates a copy of the major 
token metadata structures, and caches it; upon every token metadata change (due 
to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium-sized clusters this strategy works pretty well, but large 
clusters can suffer from the locking since cache population is much slower. On 
one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15), each cache 
population takes about 500~700ms, and during that time no requests can go 
through because the synchronized lock is held. This caused waves of timeout 
errors when large amounts of gossip messages propagated across the cluster, 
such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying 
tokenToEndpointMap. It is a SortedBiMultiValueMap built from a forward map 
using TreeMap and a reverse map using Guava's TreeMultimap. There is an 
optimization in TreeMap that reduces copying complexity from O(n*log(n)) to 
O(n) when copying from already-ordered data, but Guava's TreeMultimap copying 
misses that optimization, making it ~10 times slower than it needs to be at 
our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a 
vanilla TreeMap in SortedBiMultiValueMap so we can copy it in O(n) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
simulates a large cluster and measures the average latency of TokenMetaData 
cache population.

Benchmark results before and after the patch:
{code:java}
trunk: 
before 100ms, after 13ms
3.0.x: 
before 199ms, after 15ms
 {code}
(On 3.0.x even the forward TreeMap copy is slow; the O(n*log(n)) to O(n) 
optimization is not applied because the key comparator is created dynamically, 
so TreeMap cannot determine that the source and destination are in the same 
order.)


> Improve TokenMetaData cache populating performance for large cluster
> 
>
> Key: CASSANDRA-14660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
> Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>Reporter: Pengchao Wang
>Priority: Critical
> Fix For: 4.0.x
>
> Attachments: 

[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster

2018-08-21 Thread Pengchao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengchao Wang updated CASSANDRA-14660:
--
Description: 
TokenMetaData#cachedOnlyTokenMap is a method C* used to get a consistent token 
and topology view on coordinations without paying read lock cost. Upon first 
read the method acquire a synchronize lock and generate a copy of major token 
meta data structures and cached it, and upon every token meta data changes(due 
to gossip changes), the cache get cleared and next read will taking care of 
cache population.

For small to medium size clusters this strategy works pretty well. But large 
clusters can actually be suffered from the locking since cache populating is 
much slower. On one of our largest cluster (~1000 nodes,  125k tokens, C* 
3.0.15)  each cache population take about 500~700ms, and during that there are 
no requests can go through since synchronize lock was acquired. This caused 
waves of timeouts errors when there are large amount gossip messages 
propagating cross the cluster, such as in the case of cluster restarting.

Base on profiling we found that the cost mostly comes from copying 
tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map use 
TreeMap and a reverse map use guava TreeMultiMap. There is an optimization in 
TreeMap helps reduce copying complexity from O(n*log(n)) to O(n) when copying 
from already ordered data. But guava's TreeMultiMap copying missed that 
optimization and make it ~10 times slower than it actually need to be on our 
size of cluster.

The patch attached to the issue replace the reverse TreeMultiMap to a 
vanilla TreeMap> in SortedBiMultiValueMap to make sure we can 
copy it O(n) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
simulates a large cluster then measures average latency for TokenMetaData cache 
populating.

Benchmark result before and after that patch:
{code:java}
trunk: 
before 100ms, after 13ms
3.0.x: 
before 199ms, after 15ms
 {code}
(On 3.0.x even the forward TreeMap copying is slow, the O(n*log(n)) to O(n) 
optimization is not applied because the key comparator is dynamically created 
and TreeMap cannot determine the source and dest are in same order)

  was:
TokenMetaData#cachedOnlyTokenMap is a method C* used to get a consistent token 
and topology view on coordinations without paying read lock cost. Upon first 
read the method acquire a synchronize lock and generate a copy of major token 
meta data structures and cached it, and upon every token meta data changes(due 
to gossip changes), the cache get cleared and next read will taking care of 
cache population.

 

For small to medium size clusters this strategy works pretty well. But large 
clusters can actually be suffered from the locking since cache populating is 
much slower. On one of our largest cluster (~1000 nodes,  125k tokens, C* 
3.0.15)  each cache population take about 500~700ms, and during that there are 
no requests can go through since synchronize lock was acquired. This caused 
waves of timeouts errors when there are large amount gossip messages 
propagating cross the cluster, such as in the case of cluster restarting.

 

Base on profiling we found that the cost mostly comes from copying 
tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map use 
TreeMap and a reverse map use guava TreeMultiMap. There is an optimization in 
TreeMap helps reduce copying complexity from O(n*log(n)) to O(n) when copying 
from already ordered data. But guava's TreeMultiMap copying missed that 
optimization and make it ~10 times slower than it actually need to be on our 
size of cluster.

 

The patch attached to the issue replace the reverse TreeMultiMap to a 
vanilla TreeMap> in SortedBiMultiValueMap to make sure we can 
copy it O(n) time.

 

I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
simulates a large cluster then measures average latency for TokenMetaData cache 
populating.

Benchmark result before and after that patch:

 
{code:java}
trunk: 
before 100ms, after 13ms
3.0.x: 
before 199ms, after 15ms
{code}
 

 

(On 3.0.x even the forward TreeMap copying is slow, the O(n*log(n)) to O(n) 
optimization is not applied because the key comparator is dynamically created 
and TreeMap cannot determine the source and dest are in same order)


> Improve TokenMetaData cache populating performance for large cluster
> 
>
> Key: CASSANDRA-14660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
> Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>Reporter: Pengchao Wang
>Priority: Critical
> Fix For: 4.0.x
>
> 

[jira] [Created] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster

2018-08-21 Thread Pengchao Wang (JIRA)
Pengchao Wang created CASSANDRA-14660:
-

 Summary: Improve TokenMetaData cache populating performance for 
large cluster
 Key: CASSANDRA-14660
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
 Project: Cassandra
  Issue Type: Improvement
  Components: Coordination
 Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
Reporter: Pengchao Wang
 Fix For: 4.0.x
 Attachments: TokenMetaDataBenchmark.java

TokenMetaData#cachedOnlyTokenMap is a method C* uses to get a consistent token 
and topology view on coordinators without paying the read-lock cost. Upon first 
read, the method acquires a synchronized lock, generates a copy of the major 
token metadata structures, and caches it; upon every token metadata change (due 
to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium-sized clusters this strategy works pretty well, but large 
clusters can suffer from the locking since cache population is much slower. On 
one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15), each cache 
population takes about 500~700ms, and during that time no requests can go 
through because the synchronized lock is held. This caused waves of timeout 
errors when large amounts of gossip messages propagated across the cluster, 
such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying 
tokenToEndpointMap. It is a SortedBiMultiValueMap built from a forward map 
using TreeMap and a reverse map using Guava's TreeMultimap. There is an 
optimization in TreeMap that reduces copying complexity from O(n*log(n)) to 
O(n) when copying from already-ordered data, but Guava's TreeMultimap copying 
misses that optimization, making it ~10 times slower than it needs to be at 
our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a 
vanilla TreeMap in SortedBiMultiValueMap so we can copy it in O(n) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which 
simulates a large cluster and measures the average latency of TokenMetaData 
cache population.

Benchmark results before and after the patch:
{code:java}
trunk: 
before 100ms, after 13ms
3.0.x: 
before 199ms, after 15ms
{code}
(On 3.0.x even the forward TreeMap copy is slow; the O(n*log(n)) to O(n) 
optimization is not applied because the key comparator is created dynamically, 
so TreeMap cannot determine that the source and destination are in the same 
order.)
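The TreeMap optimization referred to in the description is java.util.TreeMap's 
sorted bulk-load path: constructing a TreeMap from another SortedMap with a 
compatible comparator builds the tree linearly instead of re-inserting entry by 
entry. A minimal self-contained sketch of the two copy paths (illustrative 
only; this is not the attached patch, and the class/method names are invented):

```java
import java.util.Map;
import java.util.TreeMap;

public class SortedCopyDemo {
    // Fast path: TreeMap's SortedMap constructor trusts the source ordering
    // and builds the new tree in O(N) via its internal buildFromSorted.
    static TreeMap<Integer, String> copySorted(TreeMap<Integer, String> src) {
        return new TreeMap<>(src);
    }

    // Slow path: entry-by-entry insertion re-runs comparisons on every put,
    // i.e. O(N*log(N)) -- effectively what copying the reverse Guava
    // TreeMultimap costs, per the profiling described above.
    static TreeMap<Integer, String> copyEntryByEntry(TreeMap<Integer, String> src) {
        TreeMap<Integer, String> dst = new TreeMap<>();
        for (Map.Entry<Integer, String> e : src.entrySet()) {
            dst.put(e.getKey(), e.getValue());
        }
        return dst;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> forward = new TreeMap<>();
        for (int i = 0; i < 100_000; i++) {
            forward.put(i, "endpoint-" + i);
        }
        // Both paths yield the same map; only the copy cost differs.
        System.out.println(copySorted(forward).equals(copyEntryByEntry(forward)));
    }
}
```

Note that the fast path only triggers when TreeMap can see that the source is 
a SortedMap with an equal comparator, which is exactly the check that fails on 
3.0.x when the comparator is created dynamically.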






[jira] [Commented] (CASSANDRA-14648) CircleCI dtest runs should (by default) depend upon successful unit tests

2018-08-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588112#comment-16588112
 ] 

Dinesh Joshi commented on CASSANDRA-14648:
--

Automatically running CircleCI jobs due to a git push is a waste of resources.

[~aweisberg] fyi I have experienced 40-90 min wait times. I'm waiting on a 
build that has been queued for 46 minutes as I type.

> CircleCI dtest runs should (by default) depend upon successful unit tests
> -
>
> Key: CASSANDRA-14648
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14648
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Unit tests are very quick to run, and if they fail to pass there’s probably 
> no value in running dtests - particularly if we are honouring our 
> expectations of never committing code that breaks either unit or dtests.
> When sharing CircleCI resources between multiple branches (or multiple 
> users), it is wasteful to have two dtest runs kicked off for every incomplete 
> branch that is pushed to GitHub for safekeeping. So I think a better 
> default CircleCI config file would only run the dtests after a successful 
> unit test run, and those who want to modify this behaviour can do so 
> consciously by editing the config file for themselves.
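A minimal sketch of what such gating could look like, assuming CircleCI 2.0 workflow syntax; the job names `utests` and `dtests` are hypothetical stand-ins for whatever the project's config defines:

```yaml
workflows:
  version: 2
  build_and_test:
    jobs:
      - utests
      - dtests:
          requires:
            - utests   # dtests only start after unit tests pass
```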






[jira] [Commented] (CASSANDRA-14659) Disable old protocol versions on demand

2018-08-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588063#comment-16588063
 ] 

Dinesh Joshi commented on CASSANDRA-14659:
--

DTest PR is here: 
https://github.com/apache/cassandra-dtest/compare/master...dineshjoshi:14659-master?expand=1

> Disable old protocol versions on demand
> ---
>
> Key: CASSANDRA-14659
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14659
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: usability
>
> This patch allows operators to disable older protocol versions on demand. 
> To use it, set {{native_transport_allow_older_protocols}} to false or run 
> nodetool disableolderprotocolversions. Cassandra will reject requests 
> from clients coming in on any version except the current version. This will 
> help operators selectively reject connections from clients that do not 
> support the latest protocol.
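The gating described amounts to a simple version check at connection time. A hedged sketch of that logic — the constants and method names below are invented for illustration, not Cassandra's actual API:

```java
public class ProtocolGate {
    // Hypothetical stand-ins for the real protocol handling.
    static final int CURRENT_VERSION = 4;
    static volatile boolean allowOlderProtocols = true;

    /** When older protocols are disabled, only the current version is accepted. */
    static boolean accepts(int clientVersion) {
        if (allowOlderProtocols)
            return clientVersion <= CURRENT_VERSION;
        return clientVersion == CURRENT_VERSION;
    }

    public static void main(String[] args) {
        // Corresponds to running `nodetool disableolderprotocolversions`.
        allowOlderProtocols = false;
        System.out.println(accepts(3)); // old client: rejected
        System.out.println(accepts(4)); // current client: accepted
    }
}
```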






[jira] [Updated] (CASSANDRA-10726) Read repair inserts should not be blocking

2018-08-21 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10726:
--
Fix Version/s: (was: 4.x)
   4.0

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.0
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.
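The blocking behavior at issue can be sketched with a latch standing in for replica acknowledgements — illustrative only, not Cassandra's code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BlockingRepairSketch {
    public static void main(String[] args) throws InterruptedException {
        // One latch count per out-of-date replica we sent a repair mutation to.
        CountDownLatch acks = new CountDownLatch(1);

        // Simulated replica ack arriving asynchronously.
        new Thread(acks::countDown).start();

        // Blocking behavior: the read cannot complete until the repair writes
        // are acknowledged -- if this await times out, the whole read fails
        // with a timeout, which is exactly the problem the ticket describes.
        boolean acked = acks.await(2, TimeUnit.SECONDS);
        System.out.println(acked ? "read completes" : "read times out");
    }
}
```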






[jira] [Updated] (CASSANDRA-14659) Disable old protocol versions on demand

2018-08-21 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14659:
-
Description: This patch allows operators to disable older protocol 
versions on demand. To use it, set 
{{native_transport_allow_older_protocols}} to false or run nodetool 
disableolderprotocolversions. Cassandra will reject requests from clients coming 
in on any version except the current version. This will help operators 
selectively reject connections from clients that do not support the latest 
protocol.  (was: This patch allows the operators to disable older protocol 
versions on demand. To use it, you can set 
native_transport_honor_older_protocols to false or use nodetool 
disableolderprotocolversions. Cassandra will reject requests from client coming 
in on any version except the current version. This will help operators 
selectively reject connections from clients that do not support the latest 
protoocol.)

> Disable old protocol versions on demand
> ---
>
> Key: CASSANDRA-14659
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14659
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: usability
>
> This patch allows operators to disable older protocol versions on demand. 
> To use it, set {{native_transport_allow_older_protocols}} to false or run 
> nodetool disableolderprotocolversions. Cassandra will reject requests 
> from clients coming in on any version except the current version. This will 
> help operators selectively reject connections from clients that do not 
> support the latest protocol.






[jira] [Updated] (CASSANDRA-13630) support large internode messages with netty

2018-08-21 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-13630:
-
Reviewer: Dinesh Joshi  (was: Ariel Weisberg)

> support large internode messages with netty
> ---
>
> Key: CASSANDRA-13630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13630
> Project: Cassandra
>  Issue Type: Task
>  Components: Streaming and Messaging
>Reporter: Jason Brown
>Assignee: Jason Brown
>Priority: Major
> Fix For: 4.0
>
>
> As part of CASSANDRA-8457, we decided to punt on large messages to reduce the 
> scope of that ticket. However, we still need that functionality to ship a 
> correctly operating internode messaging subsystem.






[jira] [Commented] (CASSANDRA-13630) support large internode messages with netty

2018-08-21 Thread Jason Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587988#comment-16587988
 ] 

Jason Brown commented on CASSANDRA-13630:
-

Picking this back up after a year, I realize that my previous solution only 
solved part of the problem. I solved the "don't allocate an enormous buffer" 
problem, but I was still allocating an enormous buffer's worth of memory at 
the same time, albeit across multiple buffers. I believe this is ultimately 
what [~aweisberg]'s concerns with the previous solution encompassed, and I 
fully agree. Further, the previous patch attempted to do more than just solve 
the large buffer problem: it also optimized allocating small buffers. With this 
new insight, that optimization is best left to a separate ticket.

Thus, the new solution focuses only on the large buffer problem. The high-level 
overview of this patch is:
 - use the existing {{ByteBufDataOutputStreamPlus}} to chunk up the large 
message into small buffers, and use {{ByteBufDataOutputStreamPlus}}'s existing 
rate-limiting mechanism to make sure we don't keep too much outstanding data in 
the channel
 - rework the inbound side to allow blocking-style message deserialization
 - refactor to make serialization/deserialization code reusable, plus some 
cleanup

In order to support both serialization and deserialization of arbitrarily large 
messages with our blocking-style {{IVersionedSerializers}}, I need to perform 
those activities on a separate (background) thread. On the outbound side 
this is achieved with a new {{ChannelWriter}} subclass. On the inbound side, 
there is a fair bit of refactoring, but the thread for deserialization is in 
{{MessageInHandler.BlockingBufferHandler}}. Both of these "background threads" 
are implemented as {{ThreadExecutorServices}} so that if no large messages are 
being sent or received, the thread can be shut down (saving system 
resources).
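The chunking idea can be sketched with plain byte arrays standing in for Netty ByteBufs; the 4 KiB chunk size below is illustrative, not the patch's actual value:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkedWriteSketch {
    /** Split a large serialized message into bounded chunks so that no
     *  single enormous buffer is ever allocated at once. */
    static List<byte[]> chunk(byte[] message, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < message.length; off += chunkSize)
            chunks.add(Arrays.copyOfRange(message, off,
                                          Math.min(off + chunkSize, message.length)));
        return chunks;
    }

    public static void main(String[] args) {
        byte[] large = new byte[10_000];
        // 10,000 bytes in 4096-byte chunks -> 4096 + 4096 + 1808
        List<byte[]> chunks = chunk(large, 4096);
        System.out.println(chunks.size());
    }
}
```

In the actual patch, a rate limiter would bound how many such chunks sit unacknowledged in the channel at once.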

On the outbound side, it is easy to know if a specific 
{{OutboundMessagingConnection}} will be sending large messages, as we can look 
at its {{OutboundConnectionIdentifier}}. The inbound side does not have that 
luxury, and my previous patch attempted to do some overly clever things. The 
simpler solution is to add a flag to the internode messaging protocol that 
advises the receiving side that the connection will be used for large messages, 
and the receiver can setup appropriately. We already have a flags section in 
the internode messaging protocol's header, and many unused bits within that. 
Further, peers that are unaware of the new bit flag (i.e. any Cassandra version 
earlier than 4.0) will completely ignore the flag, as they do not attempt to 
interpret those bits. Thus, this change is rather safe, from a 
protocol/handshake perspective. In fact, I'd like to backport this protocol 
change to 3.0 and 3.11 to have the flag sent out on new connections. The flag 
will be completely ignored on those versions, except when, during a cluster 
upgrade, the 3.0 node connects to a 4.0 node, the 4.0 node will know that the 
connection will contain large messages and can setup the receive side 
appropriately. In no way would operators be required to minor upgrade to those 
versions of 3.X which contain the upgraded messaging version flag (before 
upgrading to 4.0), but it would help make the upgrade to 4.0 smoother from a 
performance/memory management perspective.
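The flag mechanics can be illustrated with plain bit masks — the bit positions below are hypothetical, not the protocol's actual header layout:

```java
public class ProtocolFlagSketch {
    // Hypothetical bit positions; the real layout lives in the internode
    // messaging protocol header, not here.
    static final int FLAG_COMPRESSED     = 1 << 2; // an existing flag
    static final int FLAG_LARGE_MESSAGES = 1 << 3; // the proposed new flag

    public static void main(String[] args) {
        int header = FLAG_COMPRESSED | FLAG_LARGE_MESSAGES;

        // A 4.0 peer inspects the new bit and sets up for large messages.
        System.out.println((header & FLAG_LARGE_MESSAGES) != 0);

        // An older peer only masks out the bits it knows about, so the
        // unknown bit is silently ignored and the handshake stays compatible.
        int knownBits = header & FLAG_COMPRESSED;
        System.out.println(knownBits == FLAG_COMPRESSED);
    }
}
```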

The other major aspect of this ticket was a refactoring, mostly to move the 
serialization/deserialization out of {{MessageOutHandler}}/{{MessageInHandler}} 
so that logic could be invoked outside the context of a netty handler. This 
also allowed me to clean up {{MessageIn}} and {{MessageOut}} as well. Note: 
I've eliminated {{BaseMessageInHandler}} and moved the version-specific 
message parsing into classes derived from the new 
{{MessageIn.MessageInProcessor}}. {{MessageInHandler}} now determines if it 
needs to do non-blocking or blocking deserialization, and handles the buffers 
appropriately. {{MessageInHandler}} now derives from 
{{ChannelInboundHandlerAdapter}}, so the error handling changed slightly. The 
refactorings also affected where the unit tests are laid out (corresponding to 
where the logic/code under test now lives), so I moved things around in there, 
as well.
||13630||
|[branch|https://github.com/jasobrown/cassandra/tree/13630]|
|[utests & 
dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/13630]|

I also needed to make a trivial [change to one 
dtest|https://github.com/jasobrown/cassandra-dtest/tree/13630].

To make the review easier, I've [opened a PR 
here|https://github.com/apache/cassandra/pull/253]

> support large internode messages with netty
> ---
>
> Key: CASSANDRA-13630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13630
>  

[jira] [Updated] (CASSANDRA-10726) Read repair inserts should not be blocking

2018-08-21 Thread Blake Eggleston (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-10726:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to trunk as 
[644676b088be5177ef1d0cdaf450306ea28d8a12|https://github.com/apache/cassandra/commit/644676b088be5177ef1d0cdaf450306ea28d8a12],
 thanks

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.






cassandra git commit: Improve read repair blocking behavior

2018-08-21 Thread bdeggleston
Repository: cassandra
Updated Branches:
  refs/heads/trunk 994da255c -> 644676b08


Improve read repair blocking behavior

Patch by Blake Eggleston; reviewed by Marcus Eriksson and Alex Petrov
for CASSANDRA-10726


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/644676b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/644676b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/644676b0

Branch: refs/heads/trunk
Commit: 644676b088be5177ef1d0cdaf450306ea28d8a12
Parents: 994da25
Author: Blake Eggleston 
Authored: Mon May 14 14:24:03 2018 -0700
Committer: Blake Eggleston 
Committed: Tue Aug 21 11:01:10 2018 -0700

--
 CHANGES.txt |   1 +
 .../apache/cassandra/db/ConsistencyLevel.java   |  10 +-
 .../cassandra/db/PartitionRangeReadCommand.java |   1 +
 .../org/apache/cassandra/db/ReadCommand.java|   1 +
 .../db/SinglePartitionReadCommand.java  |   1 +
 .../cassandra/metrics/ReadRepairMetrics.java|   5 +
 .../apache/cassandra/net/MessagingService.java  |   1 +
 .../apache/cassandra/service/StorageProxy.java  |  65 +++-
 .../service/reads/AbstractReadExecutor.java |  40 +-
 .../reads/repair/BlockingPartitionRepair.java   | 243 +
 .../reads/repair/BlockingReadRepair.java| 229 ++--
 .../reads/repair/BlockingReadRepairs.java   | 114 ++
 .../service/reads/repair/NoopReadRepair.java|  29 +-
 .../repair/PartitionIteratorMergeListener.java  |  13 +-
 .../service/reads/repair/ReadRepair.java|  44 ++-
 .../service/reads/repair/RepairListener.java|  34 --
 .../reads/repair/RowIteratorMergeListener.java  |  31 +-
 .../service/reads/DataResolverTest.java |  18 +-
 .../service/reads/repair/ReadRepairTest.java| 361 +++
 .../reads/repair/TestableReadRepair.java|  41 ++-
 20 files changed, 1046 insertions(+), 236 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 9fbaf25..b34979a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Improve read repair blocking behavior (CASSANDRA-10726)
  * Add a virtual table to expose settings (CASSANDRA-14573)
  * Fix up chunk cache handling of metrics (CASSANDRA-14628)
  * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/src/java/org/apache/cassandra/db/ConsistencyLevel.java
--
diff --git a/src/java/org/apache/cassandra/db/ConsistencyLevel.java 
b/src/java/org/apache/cassandra/db/ConsistencyLevel.java
index 8f3a51c..d37da0a 100644
--- a/src/java/org/apache/cassandra/db/ConsistencyLevel.java
+++ b/src/java/org/apache/cassandra/db/ConsistencyLevel.java
@@ -138,12 +138,20 @@ public enum ConsistencyLevel
 }
 }
 
+/**
+ * Determine if this consistency level meets or exceeds the consistency 
requirements of the given cl for the given keyspace
+ */
+public boolean satisfies(ConsistencyLevel other, Keyspace keyspace)
+{
+return blockFor(keyspace) >= other.blockFor(keyspace);
+}
+
 public boolean isDatacenterLocal()
 {
 return isDCLocal;
 }
 
-public boolean isLocal(InetAddressAndPort endpoint)
+public static boolean isLocal(InetAddressAndPort endpoint)
 {
 return 
DatabaseDescriptor.getLocalDataCenter().equals(DatabaseDescriptor.getEndpointSnitch().getDatacenter(endpoint));
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java
--
diff --git a/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java 
b/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java
index a6641d4..c312acc 100644
--- a/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java
+++ b/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java
@@ -24,6 +24,7 @@ import java.util.List;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.Iterables;
 
+import org.apache.cassandra.dht.Token;
 import org.apache.cassandra.schema.TableMetadata;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.db.filter.*;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/src/java/org/apache/cassandra/db/ReadCommand.java
--
diff --git a/src/java/org/apache/cassandra/db/ReadCommand.java 
b/src/java/org/apache/cassandra/db/ReadCommand.java
index 

[jira] [Commented] (CASSANDRA-14592) Reconcile should not be dependent on nowInSec

2018-08-21 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587746#comment-16587746
 ] 

Aleksey Yeschenko commented on CASSANDRA-14592:
---

+1

> Reconcile should not be dependent on nowInSec
> -
>
> Key: CASSANDRA-14592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14592
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
> Fix For: 4.0
>
>
> To have the arrival time of a mutation on a replica determine the 
> reconciliation priority seems to provide for unintuitive database behaviour.  
> It seems we should formalise our reconciliation logic in a manner that does 
> not depend on this, and modify our internal APIs to prevent this dependency.
>  
> Take the following example, where both writes have the same timestamp:
>  
> Write X with a value A, TTL of 1s
> Write Y with a value B, no TTL
>  
> If X and Y arrive on replicas in < 1s, X and Y are both live, so record Y 
> wins the reconciliation.  The value B appears in the database.
> However, if X and Y arrive on replicas in > 1s, X is now (effectively) a 
> tombstone.  This wins the reconciliation race, and NO value is the result.
>  
> Note that the weirdness of this is more pronounced than it might first 
> appear.  If write X gets stuck in hints for a period on the coordinator to 
> one replica, the value B appears in the database until the hint is replayed.  
> So now we’re in a very uncertain state - will hints get replayed or not?  If 
> they do, the value B will disappear; if they don’t it won’t.  This is despite 
> a QUORUM of replicas ACKing both writes, and a QUORUM of readers being 
> engaged on read; the database still changes state to the user suddenly at 
> some arbitrary future point in time.
>  
> It seems to me that a simple solution to this, is to permit TTL’d data to 
> always win a reconciliation against non-TTL’d data (of same timestamp), so 
> that we are consistent across TTLs being transformed into tombstones.
>  
> 4.0 seems like a good opportunity to fix this behaviour, and mention in 
> CHANGES.txt.
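The proposed tie-breaking rule can be sketched as follows; `Cell` here is an illustrative stand-in, not Cassandra's actual cell representation:

```java
public class ReconcileSketch {
    static final class Cell {
        final String value; final long timestamp; final boolean ttld;
        Cell(String value, long timestamp, boolean ttld) {
            this.value = value; this.timestamp = timestamp; this.ttld = ttld;
        }
    }

    /** Proposed rule from the ticket: on a timestamp tie, TTL'd data wins,
     *  so the outcome no longer depends on when the TTL happens to expire. */
    static Cell reconcile(Cell a, Cell b) {
        if (a.timestamp != b.timestamp)
            return a.timestamp > b.timestamp ? a : b;
        if (a.ttld != b.ttld)
            return a.ttld ? a : b;
        return a.value.compareTo(b.value) >= 0 ? a : b; // arbitrary value tiebreak
    }

    public static void main(String[] args) {
        Cell x = new Cell("A", 100, true);  // TTL'd write
        Cell y = new Cell("B", 100, false); // plain write, same timestamp
        // With this rule, X wins whether or not its TTL has expired yet.
        System.out.println(reconcile(x, y).value);
    }
}
```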






[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking

2018-08-21 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587730#comment-16587730
 ] 

Marcus Eriksson commented on CASSANDRA-10726:
-

this lgtm, +1

> Read repair inserts should not be blocking
> --
>
> Key: CASSANDRA-10726
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10726
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination
>Reporter: Richard Low
>Assignee: Blake Eggleston
>Priority: Major
> Fix For: 4.x
>
>
> Today, if there’s a digest mismatch in a foreground read repair, the insert 
> to update out of date replicas is blocking. This means, if it fails, the read 
> fails with a timeout. If a node is dropping writes (maybe it is overloaded or 
> the mutation stage is backed up for some other reason), all reads to a 
> replica set could fail. Further, replicas dropping writes get more out of 
> sync so will require more read repair.
> The comment on the code for why the writes are blocking is:
> {code}
> // wait for the repair writes to be acknowledged, to minimize impact on any 
> replica that's
> // behind on writes in case the out-of-sync row is read multiple times in 
> quick succession
> {code}
> but the bad side effect is that reads timeout. Either the writes should not 
> be blocking or we should return success for the read even if the write times 
> out.






[jira] [Commented] (CASSANDRA-14659) Disable old protocol versions on demand

2018-08-21 Thread Dinesh Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587684#comment-16587684
 ] 

Dinesh Joshi commented on CASSANDRA-14659:
--

||version||
|[branch|https://github.com/dineshjoshi/cassandra/tree/restrict-protocol-version]|
|[utests & 
dtests|https://circleci.com/gh/dineshjoshi/workflows/cassandra/tree/restrict-protocol-version]|

> Disable old protocol versions on demand
> ---
>
> Key: CASSANDRA-14659
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14659
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: usability
>
> This patch allows operators to disable older protocol versions on demand. 
> To use it, set native_transport_honor_older_protocols to false or use 
> nodetool disableolderprotocolversions. Cassandra will reject requests from 
> clients coming in on any version except the current version. This will help 
> operators selectively reject connections from clients that do not support the 
> latest protocol.






[jira] [Updated] (CASSANDRA-14659) Disable old protocol versions on demand

2018-08-21 Thread Dinesh Joshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-14659:
-
Labels: usability  (was: )
Status: Patch Available  (was: Open)

> Disable old protocol versions on demand
> ---
>
> Key: CASSANDRA-14659
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14659
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dinesh Joshi
>Assignee: Dinesh Joshi
>Priority: Major
>  Labels: usability
>
> This patch allows operators to disable older protocol versions on demand. 
> To use it, set native_transport_honor_older_protocols to false or use 
> nodetool disableolderprotocolversions. Cassandra will reject requests from 
> clients coming in on any version except the current version. This will help 
> operators selectively reject connections from clients that do not support the 
> latest protocol.






[jira] [Created] (CASSANDRA-14659) Disable old protocol versions on demand

2018-08-21 Thread Dinesh Joshi (JIRA)
Dinesh Joshi created CASSANDRA-14659:


 Summary: Disable old protocol versions on demand
 Key: CASSANDRA-14659
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14659
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Dinesh Joshi
Assignee: Dinesh Joshi


This patch allows operators to disable older protocol versions on demand. 
To use it, set native_transport_honor_older_protocols to false or use 
nodetool disableolderprotocolversions. Cassandra will reject requests from 
clients coming in on any version except the current version. This will help 
operators selectively reject connections from clients that do not support the 
latest protocol.






[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587656#comment-16587656
 ] 

Benedict commented on CASSANDRA-13651:
--

The latest, i.e. [~burmanm]'s.

It simplifies the code, and as I say, I'm not sure the original patch was as 
well justified as we thought - our benchmarking methodology was flawed at the 
time (using a single connection).

I suppose arguably there's value in maintaining the current behaviour for those 
users with a single connection and without epoll, but since epoll is now the 
default it's probably better to improve code clarity.

I'm open to dispute, of course, in which case we can revisit Corentin's patch 
(or try to reproduce the old benchmarks and see what we might be losing in 
modern C* in the worst case).  In this case, I would probably prefer to have a 
LegacyFlusher and a Flusher - the latter the cleaned code contributed by 
[~burmanm], the former the old unadulterated code, and to select between them 
based on the config property (with a default being determined by epoll usage 
status)
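That selection could look roughly like this; the names and config shape below are hypothetical, not code from the patch:

```java
import java.util.Optional;

public class FlusherSelection {
    enum Flusher { LEGACY, IMMEDIATE }

    /** Hypothetical selection logic: an explicit config property wins;
     *  otherwise the default is derived from whether epoll is in use. */
    static Flusher pick(Optional<Boolean> useLegacyConfig, boolean epollInUse) {
        return useLegacyConfig.orElse(!epollInUse) ? Flusher.LEGACY : Flusher.IMMEDIATE;
    }

    public static void main(String[] args) {
        System.out.println(pick(Optional.empty(), true));  // epoll default: new flusher
        System.out.println(pick(Optional.empty(), false)); // no epoll: legacy behavior
        System.out.println(pick(Optional.of(true), true)); // explicit config overrides
    }
}
```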

> Large amount of CPU used by epoll_wait(.., .., .., 0)
> -
>
> Key: CASSANDRA-13651
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13651
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Corentin Chary
>Assignee: Corentin Chary
>Priority: Major
> Fix For: 4.x
>
> Attachments: cpu-usage.png
>
>
> I was trying to profile Cassandra under my workload and I kept seeing this 
> backtrace:
> {code}
> epollEventLoopGroup-2-3 State: RUNNABLE CPU usage on sample: 240ms
> io.netty.channel.epoll.Native.epollWait0(int, long, int, int) Native.java 
> (native)
> io.netty.channel.epoll.Native.epollWait(int, EpollEventArray, int) 
> Native.java:111
> io.netty.channel.epoll.EpollEventLoop.epollWait(boolean) 
> EpollEventLoop.java:230
> io.netty.channel.epoll.EpollEventLoop.run() EpollEventLoop.java:254
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run() 
> SingleThreadEventExecutor.java:858
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run() 
> DefaultThreadFactory.java:138
> java.lang.Thread.run() Thread.java:745
> {code}
> At first I thought that the profiler might not be able to profile native code 
> properly, but I went further and realized that most of the CPU was used by 
> {{epoll_wait()}} calls with a timeout of zero.
> Here is the output of perf on this system, which confirms that most of the 
> overhead was with timeout == 0.
> {code}
> Samples: 11M of event 'syscalls:sys_enter_epoll_wait', Event count (approx.): 
> 11594448
> Overhead  Trace output
>   
>  ◆
>   90.06%  epfd: 0x0047, events: 0x7f5588c0c000, maxevents: 0x2000, 
> timeout: 0x   
> ▒
>5.77%  epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, 
> timeout: 0x   
> ▒
>1.98%  epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, 
> timeout: 0x03e8   
> ▒
>0.04%  epfd: 0x0003, events: 0x2f6af77b9c00, maxevents: 0x0020, 
> timeout: 0x   
> ▒
>0.04%  epfd: 0x002b, events: 0x121ebf63ac00, maxevents: 0x0040, 
> timeout: 0x   
> ▒
>0.03%  epfd: 0x0026, events: 0x7f51f80019c0, maxevents: 0x0020, 
> timeout: 0x   
> ▒
>0.02%  epfd: 0x0003, events: 0x7fe4d80019d0, maxevents: 0x0020, 
> timeout: 0x
> {code}
> Running this time with perf record -ag for call traces:
> {code}
> # Children  Self   sys   usr  Trace output
> 
> #         
> 
> #
>  8.61% 8.61% 0.00% 8.61%  epfd: 0x00a7, events: 
> 0x7fca452d6000, maxevents: 0x1000, timeout: 0x
> |
> ---0x1000200af313
>|  
>   

[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)

2018-08-21 Thread Jeff Jirsa (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587633#comment-16587633
 ] 

Jeff Jirsa commented on CASSANDRA-13651:


There are two patches here, [~iksaif]'s patch and [~burmanm]'s follow-up patch 
- which one did you each +1?


> Large amount of CPU used by epoll_wait(.., .., .., 0)
> -
>
> Key: CASSANDRA-13651
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13651
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Corentin Chary
>Assignee: Corentin Chary
>Priority: Major
> Fix For: 4.x
>
> Attachments: cpu-usage.png
>
>
> I was trying to profile Cassandra under my workload and I kept seeing this 
> backtrace:
> {code}
> epollEventLoopGroup-2-3 State: RUNNABLE CPU usage on sample: 240ms
> io.netty.channel.epoll.Native.epollWait0(int, long, int, int) Native.java 
> (native)
> io.netty.channel.epoll.Native.epollWait(int, EpollEventArray, int) 
> Native.java:111
> io.netty.channel.epoll.EpollEventLoop.epollWait(boolean) 
> EpollEventLoop.java:230
> io.netty.channel.epoll.EpollEventLoop.run() EpollEventLoop.java:254
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run() 
> SingleThreadEventExecutor.java:858
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run() 
> DefaultThreadFactory.java:138
> java.lang.Thread.run() Thread.java:745
> {code}
> At first I thought that the profiler might not be able to profile native code 
> properly, but I went further and realized that most of the CPU was used by 
> {{epoll_wait()}} calls with a timeout of zero.
> Here is the output of perf on this system, which confirms that most of the 
> overhead was with timeout == 0.
> {code}
> Samples: 11M of event 'syscalls:sys_enter_epoll_wait', Event count (approx.): 
> 11594448
> Overhead  Trace output
>   
>  ◆
>   90.06%  epfd: 0x0047, events: 0x7f5588c0c000, maxevents: 0x2000, 
> timeout: 0x   
> ▒
>5.77%  epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, 
> timeout: 0x   
> ▒
>1.98%  epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, 
> timeout: 0x03e8   
> ▒
>0.04%  epfd: 0x0003, events: 0x2f6af77b9c00, maxevents: 0x0020, 
> timeout: 0x   
> ▒
>0.04%  epfd: 0x002b, events: 0x121ebf63ac00, maxevents: 0x0040, 
> timeout: 0x   
> ▒
>0.03%  epfd: 0x0026, events: 0x7f51f80019c0, maxevents: 0x0020, 
> timeout: 0x   
> ▒
>0.02%  epfd: 0x0003, events: 0x7fe4d80019d0, maxevents: 0x0020, 
> timeout: 0x
> {code}
> Running this time with perf record -ag for call traces:
> {code}
> # Children  Self   sys   usr  Trace output
> #
>  8.61% 8.61% 0.00% 8.61%  epfd: 0x00a7, events: 
> 0x7fca452d6000, maxevents: 0x1000, timeout: 0x
> |
> ---0x1000200af313
>|  
> --8.61%--0x7fca6117bdac
>   0x7fca60459804
>   epoll_wait
>  2.98% 2.98% 0.00% 2.98%  epfd: 0x00a7, events: 
> 0x7fca452d6000, maxevents: 0x1000, timeout: 0x03e8
> |
> ---0x1000200af313
>0x7fca6117b830
>0x7fca60459804
>epoll_wait
> {code}
> That looks like a lot of CPU used to wait for nothing. I'm not sure if perf 
> reports a per-CPU percentage or a per-system percentage, but that would still 
> be 10% of the total CPU usage of Cassandra at the minimum.
> I went further and found the code responsible for all that: we schedule a lot 
> of {{Message::Flusher}} tasks with a deadline of 10 usec (5 per message, I think) 
> but netty+epoll only support
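The zero-timeout pattern in the quoted perf output can be reproduced with plain JDK NIO, independent of Cassandra or netty: on Linux, {{Selector.selectNow()}} boils down to an {{epoll_wait}} with a zero timeout, so calling it in a tight loop spins the CPU, while {{select(timeout)}} parks the thread. A minimal sketch (the class and method names below are mine, not from the ticket):

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class ZeroTimeoutPoll
{
    // Count how many zero-timeout polls fit in the given window: each
    // selectNow() returns immediately (epoll_wait with timeout 0 on Linux),
    // so the loop burns a full core for the whole window.
    static int busyPollCount(long windowNanos) throws IOException
    {
        try (Selector selector = Selector.open())
        {
            int spins = 0;
            long start = System.nanoTime();
            while (System.nanoTime() - start < windowNanos)
            {
                selector.selectNow(); // never blocks
                spins++;
            }
            return spins;
        }
    }

    // A blocking select with a real timeout parks the thread instead of spinning.
    static long blockingWaitMillis(long timeoutMillis) throws IOException
    {
        try (Selector selector = Selector.open())
        {
            long start = System.nanoTime();
            selector.select(timeoutMillis); // no channels registered, so it waits
            return (System.nanoTime() - start) / 1_000_000L;
        }
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println("zero-timeout polls in 50ms: " + busyPollCount(50_000_000L));
        System.out.println("blocking select(50) waited ~" + blockingWaitMillis(50) + " ms");
    }
}
```

Replacing zero-timeout wakeups with a real (even 1 ms) timeout, or with purely event-driven wakeups, is what removes the busy-wait overhead the reporter measured.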

[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)

2018-08-21 Thread Norman Maurer (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587629#comment-16587629
 ] 

Norman Maurer commented on CASSANDRA-13651:
---

I am not a committer, but as the netty project lead I am also +1 on this from 
the point of view of netty usage.

> Large amount of CPU used by epoll_wait(.., .., .., 0)
> -
>
> Key: CASSANDRA-13651
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13651
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Corentin Chary
>Assignee: Corentin Chary
>Priority: Major
> Fix For: 4.x
>
> Attachments: cpu-usage.png
>
>

[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587624#comment-16587624
 ] 

Benedict commented on CASSANDRA-13651:
--

So, I assume we're now defaulting to epoll in most cases, and this behaviour 
comes from a period where this wasn't the default (and was probably poorly 
justified at the time - AFAICR we used to benchmark with only a single 
connection, where this behaviour would be more beneficial).

It's a shame we no longer have any standard benchmarking tools for the project, 
but it seems we have multiple data points demonstrating a win (or no loss), and 
the code is simpler after the patch.

So, I'm +1 on the patch.  I will get a circleci run going shortly.

> Large amount of CPU used by epoll_wait(.., .., .., 0)
> -
>
> Key: CASSANDRA-13651
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13651
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Corentin Chary
>Assignee: Corentin Chary
>Priority: Major
> Fix For: 4.x
>
> Attachments: cpu-usage.png
>
>

[jira] [Updated] (CASSANDRA-14592) Reconcile should not be dependent on nowInSec

2018-08-21 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14592:
--
Reviewer: Aleksey Yeschenko

> Reconcile should not be dependent on nowInSec
> -
>
> Key: CASSANDRA-14592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14592
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
> Fix For: 4.0
>
>
> To have the arrival time of a mutation on a replica determine the 
> reconciliation priority seems to provide for unintuitive database behaviour.  
> It seems we should formalise our reconciliation logic in a manner that does 
> not depend on this, and modify our internal APIs to prevent this dependency.
>  
> Take the following example, where both writes have the same timestamp:
>  
> Write X with a value A, TTL of 1s
> Write Y with a value B, no TTL
>  
> If X and Y arrive on replicas in < 1s, X and Y are both live, so record Y 
> wins the reconciliation.  The value B appears in the database.
> However, if X and Y arrive on replicas in > 1s, X is now (effectively) a 
> tombstone.  This wins the reconciliation race, and NO value is the result.
>  
> Note that the weirdness of this is more pronounced than it might first 
> appear.  If write X gets stuck in hints for a period on the coordinator to 
> one replica, the value B appears in the database until the hint is replayed.  
> So now we’re in a very uncertain state - will hints get replayed or not?  If 
> they do, the value B will disappear; if they don’t it won’t.  This is despite 
> a QUORUM of replicas ACKing both writes, and a QUORUM of readers being 
> engaged on read; the database still changes state to the user suddenly at 
> some arbitrary future point in time.
>  
> It seems to me that a simple solution to this, is to permit TTL’d data to 
> always win a reconciliation against non-TTL’d data (of same timestamp), so 
> that we are consistent across TTLs being transformed into tombstones.
>  
> 4.0 seems like a good opportunity to fix this behaviour, and mention in 
> CHANGES.txt.
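The quoted race can be made concrete with a toy model of the legacy rule (hypothetical types, not Cassandra's actual Cell API): at equal timestamps a dead cell beats a live one, so the winner flips once nowInSec passes the TTL'd cell's expiry.

```java
public class ReconcileRace
{
    // Toy cell model: localDeletionTime is the second at which an expiring
    // cell becomes a tombstone; Integer.MAX_VALUE means "never expires".
    record Cell(long timestamp, String value, int localDeletionTime)
    {
        boolean isLiveAt(int nowInSec) { return nowInSec < localDeletionTime; }
    }

    // Simplified pre-4.0 behaviour the ticket objects to: the outcome depends
    // on nowInSec, i.e. on *when* a replica reconciles, not just on the writes.
    static Cell reconcile(Cell a, Cell b, int nowInSec)
    {
        if (a.timestamp() != b.timestamp())
            return a.timestamp() > b.timestamp() ? a : b;
        boolean aLive = a.isLiveAt(nowInSec);
        boolean bLive = b.isLiveAt(nowInSec);
        if (aLive != bLive)
            return aLive ? b : a; // a (now-)tombstone beats a live cell
        return a.value().compareTo(b.value()) >= 0 ? a : b;
    }

    public static void main(String[] args)
    {
        Cell x = new Cell(1, "A", 10);                // TTL'd write: dead from t=10
        Cell y = new Cell(1, "B", Integer.MAX_VALUE); // no TTL
        // Reconciled before X expires: both live, Y's value B survives.
        System.out.println(reconcile(x, y, 5).value());
        // Reconciled after X expires: X is now a tombstone and wins, B disappears.
        System.out.println(reconcile(x, y, 15) == x);
    }
}
```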



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14592) Reconcile should not be dependent on nowInSec

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587593#comment-16587593
 ] 

Benedict commented on CASSANDRA-14592:
--

Pushed an update that addresses (I think, it's been a while) Aleksey's offline 
review comments.

We collaborated to modify the reconcile semantics a little further, so that 
reconciliation is as consistent as possible.  The only remaining inconsistency 
arises when one cell is expiring and the other is a tombstone, and only at the 
point where both are logically tombstones.  Specifically, we now prefer:

# The most recent timestamp
# If either are a tombstone or expiring
## If one is regular, select the tombstone or expiring
## If one is expiring, select the tombstone
## The most recent deletion time
# The highest value (by raw ByteBuffer comparison)
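Read literally, that ordering can be sketched as a pure function over a toy cell model (hypothetical types, not Cassandra's real implementation; String comparison stands in for the raw ByteBuffer comparison):

```java
public class ReconcileOrder
{
    enum Kind { REGULAR, EXPIRING, TOMBSTONE }

    // Toy cell: deletionTime only matters for EXPIRING/TOMBSTONE cells.
    record Cell(long timestamp, Kind kind, int deletionTime, String value) {}

    static Cell reconcile(Cell a, Cell b)
    {
        // 1. The most recent timestamp wins.
        if (a.timestamp() != b.timestamp())
            return a.timestamp() > b.timestamp() ? a : b;
        // 2. If either cell is a tombstone or expiring:
        if (a.kind() != Kind.REGULAR || b.kind() != Kind.REGULAR)
        {
            // 2a. If one is regular, select the tombstone or expiring cell.
            if (a.kind() == Kind.REGULAR) return b;
            if (b.kind() == Kind.REGULAR) return a;
            // 2b. If one is expiring (and the other a tombstone), select the tombstone.
            if (a.kind() != b.kind())
                return a.kind() == Kind.TOMBSTONE ? a : b;
            // 2c. Otherwise, the most recent deletion time wins.
            if (a.deletionTime() != b.deletionTime())
                return a.deletionTime() > b.deletionTime() ? a : b;
        }
        // 3. The highest value wins.
        return a.value().compareTo(b.value()) >= 0 ? a : b;
    }

    public static void main(String[] args)
    {
        Cell exp  = new Cell(1, Kind.EXPIRING, 10, "A");
        Cell tomb = new Cell(1, Kind.TOMBSTONE, 5, "");
        System.out.println(reconcile(exp, tomb).kind()); // tombstone beats expiring
    }
}
```

Crucially, nowInSec appears nowhere in the decision, which is the point of the ticket.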

> Reconcile should not be dependent on nowInSec
> -
>
> Key: CASSANDRA-14592
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14592
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
> Fix For: 4.0
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14573) Expose settings in virtual table

2018-08-21 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587566#comment-16587566
 ] 

Aleksey Yeschenko commented on CASSANDRA-14573:
---

Committed the end version with some very minor tweaks on top as 
[994da255cec95982f52d20c91cb18eb7f9e45fc3|https://github.com/apache/cassandra/commit/994da255cec95982f52d20c91cb18eb7f9e45fc3]
 to 4.0.

Cheers.

> Expose settings in virtual table
> 
>
> Key: CASSANDRA-14573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14573
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>  Labels: pull-request-available, virtual-tables
> Fix For: 4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Allow both viewing what the settings are (currently impossible for some) and 
> allow changing some settings.
> Example:
> {code:java}
> UPDATE system_info.settings SET value = 'false' WHERE setting = 
> 'hinted_handoff_enabled';
> SELECT * FROM system_info.settings WHERE writable = True;
>  setting  | value  | writable
> --++--
>   batch_size_fail_threshold_in_kb | 50 | True
>   batch_size_warn_threshold_in_kb |  5 | True
>  cas_contention_timeout_in_ms |   1000 | True
>  compaction_throughput_mb_per_sec | 16 | True
> concurrent_compactors |  2 | True
>concurrent_validations | 2147483647 | True
>   counter_write_request_timeout_in_ms |   5000 | True
>hinted_handoff_enabled |  false | True
> hinted_handoff_throttle_in_kb |   1024 | True
>   incremental_backups |  false | True
>  inter_dc_stream_throughput_outbound_megabits_per_sec |200 | True
> phi_convict_threshold |8.0 | True
>   range_request_timeout_in_ms |  1 | True
>read_request_timeout_in_ms |   5000 | True
> request_timeout_in_ms |  1 | True
>   stream_throughput_outbound_megabits_per_sec |200 | True
>   tombstone_failure_threshold | 10 | True
>  tombstone_warn_threshold |   1000 | True
>truncate_request_timeout_in_ms |  6 | True
>   write_request_timeout_in_ms |   2000 | 
> True{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14573) Expose settings in virtual table

2018-08-21 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14573:
--
   Resolution: Fixed
Fix Version/s: (was: 4.x)
   4.0
   Status: Resolved  (was: Patch Available)

> Expose settings in virtual table
> 
>
> Key: CASSANDRA-14573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14573
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
>  Labels: pull-request-available, virtual-tables
> Fix For: 4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



cassandra git commit: Add a virtual table to expose settings [Forced Update!]

2018-08-21 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk e7e0e0b23 -> 994da255c (forced update)


Add a virtual table to expose settings

patch by Chris Lohfink; reviewed by Aleksey Yeschenko for
CASSANDRA-14573


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/994da255
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/994da255
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/994da255

Branch: refs/heads/trunk
Commit: 994da255cec95982f52d20c91cb18eb7f9e45fc3
Parents: 6d1446f
Author: Chris Lohfink 
Authored: Tue Aug 14 14:05:12 2018 -0500
Committer: Aleksey Yeshchenko 
Committed: Tue Aug 21 16:08:47 2018 +0100

--
 CHANGES.txt |   1 +
 .../cassandra/db/virtual/SettingsTable.java | 189 ++
 .../db/virtual/SystemViewsKeyspace.java |   1 +
 .../cassandra/db/virtual/SettingsTableTest.java | 245 +++
 4 files changed, 436 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/994da255/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index aeaf8ce..9fbaf25 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Add a virtual table to expose settings (CASSANDRA-14573)
  * Fix up chunk cache handling of metrics (CASSANDRA-14628)
  * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652)
  * Incomplete handling of exceptions when decoding incoming messages 
(CASSANDRA-14574)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/994da255/src/java/org/apache/cassandra/db/virtual/SettingsTable.java
--
diff --git a/src/java/org/apache/cassandra/db/virtual/SettingsTable.java 
b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java
new file mode 100644
index 000..34debc6
--- /dev/null
+++ b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java
@@ -0,0 +1,189 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.db.virtual;
+
+import java.lang.reflect.Field;
+import java.lang.reflect.Modifier;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.function.BiConsumer;
+import java.util.stream.Collectors;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Functions;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+
+import org.apache.cassandra.audit.AuditLogOptions;
+import org.apache.cassandra.config.*;
+import org.apache.cassandra.db.DecoratedKey;
+import org.apache.cassandra.db.marshal.UTF8Type;
+import org.apache.cassandra.dht.LocalPartitioner;
+import org.apache.cassandra.schema.TableMetadata;
+import org.apache.cassandra.transport.ServerError;
+
+final class SettingsTable extends AbstractVirtualTable
+{
+private static final String NAME = "name";
+private static final String VALUE = "value";
+
+@VisibleForTesting
+static final Map<String, Field> FIELDS =
+Arrays.stream(Config.class.getFields())
+  .filter(f -> !Modifier.isStatic(f.getModifiers()))
+  .collect(Collectors.toMap(Field::getName, Functions.identity()));
+
+@VisibleForTesting
+final Map<String, BiConsumer<SimpleDataSet, Field>> overrides =
+ImmutableMap.<String, BiConsumer<SimpleDataSet, Field>>builder()
+.put("audit_logging_options", this::addAuditLoggingOptions)
+.put("client_encryption_options", 
this::addEncryptionOptions)
+.put("server_encryption_options", 
this::addEncryptionOptions)
+.put("transparent_data_encryption_options", 
this::addTransparentEncryptionOptions)
+.build();
+
+private final Config config;
+
+SettingsTable(String keyspace)
+{
+this(keyspace, DatabaseDescriptor.getRawConfig());
+}
+
+SettingsTable(String keyspace, Config config)
+{
+super(TableMetadata.builder(keyspace, "settings")
+   .comment("current settings")

cassandra git commit: Add a virtual table to expose settings (CASSANDRA-14573)

2018-08-21 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 6d1446ff0 -> e7e0e0b23


Add a virtual table to expose settings (CASSANDRA-14573)

patch by Chris Lohfink; reviewed by Aleksey Yeschenko for
CASSANDRA-14573


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e7e0e0b2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e7e0e0b2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e7e0e0b2

Branch: refs/heads/trunk
Commit: e7e0e0b233acf3584147435d0989a9e2474c09e4
Parents: 6d1446f
Author: Chris Lohfink 
Authored: Tue Aug 14 14:05:12 2018 -0500
Committer: Aleksey Yeshchenko 
Committed: Tue Aug 21 15:57:40 2018 +0100

--
 CHANGES.txt |   1 +
 .../cassandra/db/virtual/SettingsTable.java | 189 ++
 .../db/virtual/SystemViewsKeyspace.java |   1 +
 .../cassandra/db/virtual/SettingsTableTest.java | 245 +++
 4 files changed, 436 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e7e0e0b2/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index aeaf8ce..9fbaf25 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Add a virtual table to expose settings (CASSANDRA-14573)
  * Fix up chunk cache handling of metrics (CASSANDRA-14628)
  * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652)
  * Incomplete handling of exceptions when decoding incoming messages 
(CASSANDRA-14574)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e7e0e0b2/src/java/org/apache/cassandra/db/virtual/SettingsTable.java
--
diff --git a/src/java/org/apache/cassandra/db/virtual/SettingsTable.java 
b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java
new file mode 100644
index 000..34debc6
--- /dev/null
+++ b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java
@@ -0,0 +1,189 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.db.virtual;
+
+import java.lang.reflect.Field;
+import java.lang.reflect.Modifier;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.function.BiConsumer;
+import java.util.stream.Collectors;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Functions;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+
+import org.apache.cassandra.audit.AuditLogOptions;
+import org.apache.cassandra.config.*;
+import org.apache.cassandra.db.DecoratedKey;
+import org.apache.cassandra.db.marshal.UTF8Type;
+import org.apache.cassandra.dht.LocalPartitioner;
+import org.apache.cassandra.schema.TableMetadata;
+import org.apache.cassandra.transport.ServerError;
+
+final class SettingsTable extends AbstractVirtualTable
+{
+private static final String NAME = "name";
+private static final String VALUE = "value";
+
+@VisibleForTesting
+static final Map<String, Field> FIELDS =
+Arrays.stream(Config.class.getFields())
+  .filter(f -> !Modifier.isStatic(f.getModifiers()))
+  .collect(Collectors.toMap(Field::getName, Functions.identity()));
+
+@VisibleForTesting
+final Map<String, BiConsumer<SimpleDataSet, Field>> overrides =
+ImmutableMap.<String, BiConsumer<SimpleDataSet, Field>>builder()
+.put("audit_logging_options", this::addAuditLoggingOptions)
+.put("client_encryption_options", 
this::addEncryptionOptions)
+.put("server_encryption_options", 
this::addEncryptionOptions)
+.put("transparent_data_encryption_options", 
this::addTransparentEncryptionOptions)
+.build();
+
+private final Config config;
+
+SettingsTable(String keyspace)
+{
+this(keyspace, DatabaseDescriptor.getRawConfig());
+}
+
+SettingsTable(String keyspace, Config config)
+{
+super(TableMetadata.builder(keyspace, "settings")
+   .comment("current 

[jira] [Commented] (CASSANDRA-9989) Optimise BTree.Buider

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587507#comment-16587507
 ] 

Benedict commented on CASSANDRA-9989:
-

Thanks Jay.  The patch looks good overall, but I have a few improvement comments, 
and some questions:

# How did you arrive at your {{childrenNum}} calculation, and are we certain it 
is correct?  This is pretty critical for correctness, and hard to test fully, 
so it would be nice to have some comments justifying it.
# Why decrement {{left}} instead of just counting up the number of values 
written?
# Why is TREE_SIZE indexed from 1, not 0?
# It would be nice if we removed MAX_DEPTH, and just truncated TREE_SIZE to the 
correct maximum in our static block

I'm also torn on the splitting of the last two nodes - this is consistent with 
the current {{NodeBuilder}} logic, but it does complicate the code a little 
versus evenly splitting the remaining size amongst all the children.

I've pushed a patch 
[here|https://github.com/belliottsmith/cassandra/tree/9989-suggest] with some 
tweaks to the {{LongBTreeTest}} to stress the new code paths more, and it 
would be great if we could run this against the final patch for a while, with a 
tweak to the parameters to increase further the ratio of tests that use this 
code path.

> Optimise BTree.Buider
> -
>
> Key: CASSANDRA-9989
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9989
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Benedict
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 4.x
>
> Attachments: 9989-trunk.txt
>
>
> BTree.Builder could reduce its copying, and exploit toArray more efficiently, 
> with some work. It's not very important right now because we don't make as 
> much use of its bulk-add methods as we otherwise might, however over time 
> this work will become more useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14409) Transient Replication: Support ring changes when transient replication is in use (add/remove node, change RF, add/remove DC)

2018-08-21 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587397#comment-16587397
 ] 

Alex Petrov commented on CASSANDRA-14409:
-

[Link for 
compare|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4],
 may be useful (at least until rebase).

Some preliminary comments (from looking at the code and running unit tests, 
without writing additional dtests so far):
  * 
[here|https://github.com/aweisberg/cassandra/commit/62578c85865c474774633f5337affa8c2ce0eb07#diff-350352a1ac9b039efbee2eeb8978a9c9R149]
 we can do just one iteration and add to one of the sets depending on whether 
the range is full or transient, similar change can be done in 
{{RangeStreamer#fetchAsync}} with {{remaining}} and {{skipped}}. Generally, 
{{fetchAsync}} uses a two-step logic which is quite hard to follow, since it 
becomes hard to track which ranges are actually making it to the end and does 
quite some unnecessary iteration.
  * 
[here|https://github.com/aweisberg/cassandra/commit/62578c85865c474774633f5337affa8c2ce0eb07#diff-d4e3b82e9bebfd2cb466b4a30af07fa4R1132],
 we can simplify the statement to {{... !needsCleanupTransient || 
!sstable.isRepaired()}}
  * 
[this|https://github.com/apache/cassandra/compare/trunk...aweisberg:14409-4#diff-fad052638059f53b1a6d479dbd05f2f2R323]
 might just use {{Replica#isLocal}}. Also, naming is inconsistent with 
{{isNotAlive}}, "is" is missing. Same inconsistency with 
{{endpointNotReplicatedAnymore}}.
  * maybe use a single debug statement 
[here|https://github.com/apache/cassandra/compare/trunk...aweisberg:14409-4#diff-fad052638059f53b1a6d479dbd05f2f2R299]
  * Might be worth removing the {{println}}s from {{MoveTransientTest}} 
  * Should we stick to {{workMap}} or {{rangeMap}} in {{RangeStreamer}}?
  * 
[here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-fad052638059f53b1a6d479dbd05f2f2R219]
 we might want to avoid iteration if tracing is not enabled
  * in Range streamer, we keep calling {{Keyspace#open}} many times even though 
we could just do it less times and pass a key space instance where applicable
  * duplicate import 
[here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-a4c2ce49cb3a3429a8d6376929a90f7fR33]
  * 
[this|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-433b489a9a55c01dc4b021b012af3af6R59]
 might need an extra comment, since this task is used in repair as well as in 
streaming. Same goes for 
[here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-eb7f85ba9cf3e87d842aad8b82557d19R82].
 It's not completely transparent why we strip node information and 
transientness. I'd rather fix the code that forces us to do it, or just use 
ranges if this information is not important.
  * 
[here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-ce3f6856b405c96859d9a50d9977e0b9R1285]
 (and in some other cases where a pair is used), I'd just use a specialised Pair 
class that gives a more precise definition of what left and right are, namely 
transient and full.
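A hypothetical sketch of such a specialised pair, purely illustrative (the class and field names are assumptions, not Cassandra's actual code): the point is that named fields document intent where a generic Pair's "left"/"right" do not.

```java
// Illustrative stand-in for a specialised Pair: the halves are named
// "transient" and "full" instead of "left" and "right".
final class TransientFullPair<T> {
    final T transientReplicas; // what a generic Pair would call "left"
    final T fullReplicas;      // what a generic Pair would call "right"

    TransientFullPair(T transientReplicas, T fullReplicas) {
        this.transientReplicas = transientReplicas;
        this.fullReplicas = fullReplicas;
    }

    @Override
    public String toString() {
        return "transient=" + transientReplicas + ", full=" + fullReplicas;
    }
}
```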
  * do we need to update legacy transferred ranges 
[here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-ce3f6856b405c96859d9a50d9977e0b9R1322]?
 However, this has been there since [59b5b6bef0fa76bf5740b688fcd4d9cf525760d0], so 
it is not directly related to this commit.
  * the optimised range fetch map calculation feels unfinished. We should change 
{{RangeFetchMapCalculator}} itself instead of unwrapping and then re-wrapping 
items around it. What it currently does looks very fragile.
  * {{RangeStreamer#calculateRangesToFetchWithPreferredEndpoints}} also looks 
unfinished, since it returns a {{ReplicaMultimap}}, but we 
always convert it to {{Multimap>}} 
in {{StorageService#calculateRangesToFetchWithPreferredEndpoints}} (same name, 
different return type, which makes it additionally difficult to follow), and 
then feed it back into {{RangeFetchMapCalculator}} and 
{{convertPreferredEndpointsToWorkMap}}. 
  * it seems that we keep splitting full and transient replicas in a bunch of 
places; maybe we should add an auxiliary method that performs this job 
efficiently behind a clear interface and use it everywhere?
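Such an auxiliary method could look roughly like the single-pass split below. This is a hedged sketch: {{Replica}} here is a minimal stand-in, not Cassandra's actual class, and the names are assumptions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Minimal stand-in for a replica that is either full or transient.
final class Replica {
    final String endpoint;
    final boolean full;
    Replica(String endpoint, boolean full) { this.endpoint = endpoint; this.full = full; }
    boolean isFull() { return full; }
}

// Splits replicas into full and transient buckets in one pass, so call
// sites stop re-filtering the same collection twice.
final class ReplicaSplit {
    final List<Replica> full = new ArrayList<>();
    final List<Replica> trans = new ArrayList<>();

    static ReplicaSplit of(Collection<Replica> replicas) {
        ReplicaSplit split = new ReplicaSplit();
        for (Replica r : replicas)                        // one iteration, two buckets
            (r.isFull() ? split.full : split.trans).add(r);
        return split;
    }
}
```

The same shape would serve the earlier bullet about doing one iteration instead of filtering the range set once per kind.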
  * in {{StorageService}}, the use of {{Map.Entry}} 
simultaneously with {{Pair}} really complicates the logic. 
Once again, this hints that we might need a specialised class for replica 
source/destination pairs.
  * in multiple places ({{StorageService}} and {{PendingRepairManager}}), we're 
still calling {{getAddressReplicas}}, which materialises a map only to get a 
{{ReplicaSet}} by address 
  * it looks like the distinction between transient and full 

[jira] [Updated] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table

2018-08-21 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14626:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Expose buffer cache metrics in caches virtual table
> ---
>
> Key: CASSANDRA-14626
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14626
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Benjamin Lerer
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache 
> metrics in the caches virtual table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table

2018-08-21 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587360#comment-16587360
 ] 

Aleksey Yeschenko commented on CASSANDRA-14626:
---

Committed as 
[6d1446ff062ac322b203e16ea0bf0ed8fd1fa5ca|https://github.com/apache/cassandra/commit/6d1446ff062ac322b203e16ea0bf0ed8fd1fa5ca]
 to 4.0, thanks.

> Expose buffer cache metrics in caches virtual table
> ---
>
> Key: CASSANDRA-14626
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14626
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Benjamin Lerer
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache 
> metrics in the caches virtual table.






cassandra git commit: Add chunks cache metrics to caches virtual table

2018-08-21 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk e6b8e7a72 -> 6d1446ff0


Add chunks cache metrics to caches virtual table

patch by Aleksey Yeschenko; reviewed by Benedict Elliott Smith for
CASSANDRA-14626


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6d1446ff
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6d1446ff
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6d1446ff

Branch: refs/heads/trunk
Commit: 6d1446ff062ac322b203e16ea0bf0ed8fd1fa5ca
Parents: e6b8e7a
Author: Aleksey Yeshchenko 
Authored: Wed Aug 8 16:51:11 2018 +0100
Committer: Aleksey Yeshchenko 
Committed: Tue Aug 21 13:35:41 2018 +0100

--
 CHANGES.txt   | 2 +-
 src/java/org/apache/cassandra/db/virtual/CachesTable.java | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/6d1446ff/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 5fa28f5..aeaf8ce 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -12,7 +12,7 @@
  * Remove hardcoded java11 jvm args in idea workspace files (CASSANDRA-14627)
 * Update netty to 4.1.28 (CASSANDRA-14633)
  * Add a virtual table to expose thread pools (CASSANDRA-14523)
- * Add a virtual table to expose caches (CASSANDRA-14538)
+ * Add a virtual table to expose caches (CASSANDRA-14538, CASSANDRA-14626)
  * Fix toDate function for timestamp arguments (CASSANDRA-14502)
  * Revert running dtests by default in circleci (CASSANDRA-14614)
  * Stream entire SSTables when possible (CASSANDRA-14556)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6d1446ff/src/java/org/apache/cassandra/db/virtual/CachesTable.java
--
diff --git a/src/java/org/apache/cassandra/db/virtual/CachesTable.java 
b/src/java/org/apache/cassandra/db/virtual/CachesTable.java
index e5f80f7..5a265e6 100644
--- a/src/java/org/apache/cassandra/db/virtual/CachesTable.java
+++ b/src/java/org/apache/cassandra/db/virtual/CachesTable.java
@@ -17,6 +17,7 @@
  */
 package org.apache.cassandra.db.virtual;
 
+import org.apache.cassandra.cache.ChunkCache;
 import org.apache.cassandra.db.marshal.*;
 import org.apache.cassandra.dht.LocalPartitioner;
 import org.apache.cassandra.metrics.CacheMetrics;
@@ -69,9 +70,13 @@ final class CachesTable extends AbstractVirtualTable
 public DataSet data()
 {
 SimpleDataSet result = new SimpleDataSet(metadata());
+
+if (null != ChunkCache.instance)
+addRow(result, "chunks", ChunkCache.instance.metrics);
 addRow(result, "counters", 
CacheService.instance.counterCache.getMetrics());
 addRow(result, "keys", CacheService.instance.keyCache.getMetrics());
 addRow(result, "rows", CacheService.instance.rowCache.getMetrics());
+
 return result;
 }
 }





[jira] [Commented] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587318#comment-16587318
 ] 

Benedict commented on CASSANDRA-14626:
--

+1, with no comments on the actual intended patch here (i.e. the part that builds 
on the CASSANDRA-14628 patch)

> Expose buffer cache metrics in caches virtual table
> ---
>
> Key: CASSANDRA-14626
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14626
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Benjamin Lerer
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache 
> metrics in the caches virtual table.






[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics

2018-08-21 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587316#comment-16587316
 ] 

Aleksey Yeschenko commented on CASSANDRA-14628:
---

Cheers, committed as 
[e6b8e7a72f783ed0e1b5a2c04381f89b533229a4|https://github.com/apache/cassandra/commit/e6b8e7a72f783ed0e1b5a2c04381f89b533229a4]
 to 4.0.

> Clean up cache-related metrics
> --
>
> Key: CASSANDRA-14628
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14628
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate 
> of pre-existing {{CacheMetrics}}. I believe it was done initially because the 
> authors thought there was no way to register hits with {{Caffeine}}, only 
> misses, but that's not quite true. All we need is to provide a 
> {{StatsCounter}} object when building the cache and update our metrics from 
> there.
> The patch removes the redundant code and streamlines chunk cache metrics to 
> use more idiomatic tracking.
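The pattern the description refers to, injecting one stats object into the cache so that hits and misses are recorded in the same place, can be sketched stand-alone as below. The names ({{StatsSink}}, {{TinyCache}}) are illustrative assumptions, not Cassandra's classes or Caffeine's actual {{StatsCounter}} API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

// Pluggable stats object, analogous in spirit to Caffeine's StatsCounter.
interface StatsSink {
    void recordHit();
    void recordMiss();
}

final class SimpleCacheMetrics implements StatsSink {
    final LongAdder hits = new LongAdder();
    final LongAdder misses = new LongAdder();

    public void recordHit()  { hits.increment(); }
    public void recordMiss() { misses.increment(); }

    // "requests" is derived rather than tracked as a third counter
    long requests() { return hits.sum() + misses.sum(); }
}

final class TinyCache<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final StatsSink stats;

    TinyCache(StatsSink stats) { this.stats = stats; }

    V get(K key, Function<K, V> loader) {
        V v = map.get(key);
        if (v != null) { stats.recordHit(); return v; }
        stats.recordMiss();              // the cache, not the loader, records the miss
        v = loader.apply(key);
        map.put(key, v);
        return v;
    }
}
```

With a real Caffeine cache, the equivalent hook is supplied at build time via {{recordStats(...)}}, which is what lets one metrics class serve every cache.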






[jira] [Updated] (CASSANDRA-14628) Clean up cache-related metrics

2018-08-21 Thread Aleksey Yeschenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-14628:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Clean up cache-related metrics
> --
>
> Key: CASSANDRA-14628
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14628
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate 
> of pre-existing {{CacheMetrics}}. I believe it was done initially because the 
> authors thought there was no way to register hits with {{Caffeine}}, only 
> misses, but that's not quite true. All we need is to provide a 
> {{StatsCounter}} object when building the cache and update our metrics from 
> there.
> The patch removes the redundant code and streamlines chunk cache metrics to 
> use more idiomatic tracking.






cassandra git commit: Fix up chunk cache handling of metrics

2018-08-21 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 8b9515bd2 -> e6b8e7a72


Fix up chunk cache handling of metrics

patch by Aleksey Yeschenko; reviewed by Benedict Elliott Smith for
CASSANDRA-14628


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e6b8e7a7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e6b8e7a7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e6b8e7a7

Branch: refs/heads/trunk
Commit: e6b8e7a72f783ed0e1b5a2c04381f89b533229a4
Parents: 8b9515b
Author: Aleksey Yeshchenko 
Authored: Wed Aug 8 15:17:26 2018 +0100
Committer: Aleksey Yeshchenko 
Committed: Tue Aug 21 12:33:42 2018 +0100

--
 CHANGES.txt |   1 +
 .../org/apache/cassandra/cache/ChunkCache.java  |  36 +++---
 .../cassandra/cache/InstrumentingCache.java |   3 +-
 .../apache/cassandra/metrics/CacheMetrics.java  | 116 +++
 .../cassandra/metrics/CacheMissMetrics.java | 114 --
 .../cassandra/metrics/ChunkCacheMetrics.java|  92 +++
 6 files changed, 179 insertions(+), 183 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e6b8e7a7/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index d906879..5fa28f5 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Fix up chunk cache handling of metrics (CASSANDRA-14628)
  * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652)
  * Incomplete handling of exceptions when decoding incoming messages 
(CASSANDRA-14574)
  * Add diagnostic events for user audit logging (CASSANDRA-13668)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e6b8e7a7/src/java/org/apache/cassandra/cache/ChunkCache.java
--
diff --git a/src/java/org/apache/cassandra/cache/ChunkCache.java 
b/src/java/org/apache/cassandra/cache/ChunkCache.java
index 9284377..0edb681 100644
--- a/src/java/org/apache/cassandra/cache/ChunkCache.java
+++ b/src/java/org/apache/cassandra/cache/ChunkCache.java
@@ -29,11 +29,10 @@ import com.google.common.collect.Iterables;
 import com.google.common.util.concurrent.MoreExecutors;
 
 import com.github.benmanes.caffeine.cache.*;
-import com.codahale.metrics.Timer;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.io.sstable.CorruptSSTableException;
 import org.apache.cassandra.io.util.*;
-import org.apache.cassandra.metrics.CacheMissMetrics;
+import org.apache.cassandra.metrics.ChunkCacheMetrics;
 import org.apache.cassandra.utils.memory.BufferPool;
 
 public class ChunkCache
@@ -47,7 +46,7 @@ public class ChunkCache
 public static final ChunkCache instance = enabled ? new ChunkCache() : 
null;
 
 private final LoadingCache cache;
-public final CacheMissMetrics metrics;
+public final ChunkCacheMetrics metrics;
 
 static class Key
 {
@@ -135,29 +134,25 @@ public class ChunkCache
 }
 }
 
-public ChunkCache()
+private ChunkCache()
 {
+metrics = new ChunkCacheMetrics(this);
 cache = Caffeine.newBuilder()
-.maximumWeight(cacheSize)
-.executor(MoreExecutors.directExecutor())
-.weigher((key, buffer) -> ((Buffer) buffer).buffer.capacity())
-.removalListener(this)
-.build(this);
-metrics = new CacheMissMetrics("ChunkCache", this);
+.maximumWeight(cacheSize)
+.executor(MoreExecutors.directExecutor())
+.weigher((key, buffer) -> ((Buffer) 
buffer).buffer.capacity())
+.removalListener(this)
+.recordStats(() -> metrics)
+.build(this);
 }
 
 @Override
-public Buffer load(Key key) throws Exception
+public Buffer load(Key key)
 {
-ChunkReader rebufferer = key.file;
-metrics.misses.mark();
-try (Timer.Context ctx = metrics.missLatency.time())
-{
-ByteBuffer buffer = BufferPool.get(key.file.chunkSize(), 
key.file.preferredBufferType());
-assert buffer != null;
-rebufferer.readChunk(key.position, buffer);
-return new Buffer(buffer, key.position);
-}
+ByteBuffer buffer = BufferPool.get(key.file.chunkSize(), 
key.file.preferredBufferType());
+assert buffer != null;
+key.file.readChunk(key.position, buffer);
+return new Buffer(buffer, key.position);
 }
 
 @Override
@@ -229,7 +224,6 @@ public class ChunkCache
 {
 try
 {
-metrics.requests.mark();
 long pageAlignedPos = 

[jira] [Comment Edited] (CASSANDRA-14628) Clean up cache-related metrics

2018-08-21 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587281#comment-16587281
 ] 

Aleksey Yeschenko edited comment on CASSANDRA-14628 at 8/21/18 10:57 AM:
-

bq. Modernise all of the RatioGauge construction as well, via a static method 
that accepts a Supplier

Looks good. Can look better though. Might as well go all the way and make a 
static method that accepts two {{DoubleSupplier}} s

bq. Replace requests with a Metered that proxies its calls to getX onto 
hits.getX() + misses.getX()

Good call.

Cherry-picked, and pushed an updated commit on top that reformats things, fixes 
import order, and makes the {{DoubleSupplier}} change.


was (Author: iamaleksey):
bq. Modernise all of the RatioGauge construction as well, via a static method 
that accepts a Supplier

Looks good. Can look better though. Might as well go all the way and make a 
static method that accepts two {{DoubleSupplier}}s

bq. Replace requests with a Metered that proxies its calls to getX onto 
hits.getX() + misses.getX()

Good call.

Cherry-picked, and pushed an updated commit on top that reformats things, fixes 
import order, and makes the {{DoubleSupplier}} change.

> Clean up cache-related metrics
> --
>
> Key: CASSANDRA-14628
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14628
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate 
> of pre-existing {{CacheMetrics}}. I believe it was done initially because the 
> authors thought there was no way to register hits with {{Caffeine}}, only 
> misses, but that's not quite true. All we need is to provide a 
> {{StatsCounter}} object when building the cache and update our metrics from 
> there.
> The patch removes the redundant code and streamlines chunk cache metrics to 
> use more idiomatic tracking.






[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587286#comment-16587286
 ] 

Benedict commented on CASSANDRA-14628:
--

Yep, even better.  +1

> Clean up cache-related metrics
> --
>
> Key: CASSANDRA-14628
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14628
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate 
> of pre-existing {{CacheMetrics}}. I believe it was done initially because the 
> authors thought there was no way to register hits with {{Caffeine}}, only 
> misses, but that's not quite true. All we need is to provide a 
> {{StatsCounter}} object when building the cache and update our metrics from 
> there.
> The patch removes the redundant code and streamlines chunk cache metrics to 
> use more idiomatic tracking.






[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics

2018-08-21 Thread Aleksey Yeschenko (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587281#comment-16587281
 ] 

Aleksey Yeschenko commented on CASSANDRA-14628:
---

bq. Modernise all of the RatioGauge construction as well, via a static method 
that accepts a Supplier

Looks good. Can look better though. Might as well go all the way and make a 
static method that accepts two {{DoubleSupplier}}s

bq. Replace requests with a Metered that proxies its calls to getX onto 
hits.getX() + misses.getX()

Good call.

Cherry-picked, and pushed an updated commit on top that reformats things, fixes 
import order, and makes the {{DoubleSupplier}} change.

> Clean up cache-related metrics
> --
>
> Key: CASSANDRA-14628
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14628
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate 
> of pre-existing {{CacheMetrics}}. I believe it was done initially because the 
> authors thought there was no way to register hits with {{Caffeine}}, only 
> misses, but that's not quite true. All we need is to provide a 
> {{StatsCounter}} object when building the cache and update our metrics from 
> there.
> The patch removes the redundant code and streamlines chunk cache metrics to 
> use more idiomatic tracking.






[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587231#comment-16587231
 ] 

Benedict commented on CASSANDRA-14628:
--

It looks like [my 
comment|https://issues.apache.org/jira/browse/CASSANDRA-14626?focusedCommentId=16587227=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16587227]
 on CASSANDRA-14626 should have gone here.

> Clean up cache-related metrics
> --
>
> Key: CASSANDRA-14628
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14628
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate 
> of pre-existing {{CacheMetrics}}. I believe it was done initially because the 
> authors thought there was no way to register hits with {{Caffeine}}, only 
> misses, but that's not quite true. All we need is to provide a 
> {{StatsCounter}} object when building the cache and update our metrics from 
> there.
> The patch removes the redundant code and streamlines chunk cache metrics to 
> use more idiomatic tracking.






[jira] [Commented] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587227#comment-16587227
 ] 

Benedict commented on CASSANDRA-14626:
--

Patch looks like a big improvement.  I have a couple of minor suggestions for 
{{CacheMetrics}}:

# Modernise all of the {{RatioGauge}} construction as well, via a static method 
that accepts a {{Supplier}}
# Replace {{requests}} with a {{Metered}} that proxies its calls to {{getX}} 
onto {{hits.getX() + misses.getX()}}

I've pushed a patch with these changes 
[here|https://github.com/belliottsmith/cassandra/tree/14626-suggest].  Feel 
free to take or discard as you please.

+1
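Suggestion 1 above amounts to a static factory taking suppliers instead of a subclass per call site. A hedged sketch of the shape, where {{Gauge}} is a stand-in for the codahale interface rather than the real one:

```java
import java.util.function.DoubleSupplier;

// Stand-in for the codahale Gauge<Double> interface.
interface Gauge {
    double getValue();
}

final class Gauges {
    // Builds a ratio gauge from two DoubleSuppliers, so call sites no longer
    // need an anonymous RatioGauge subclass each.
    static Gauge ratio(DoubleSupplier numerator, DoubleSupplier denominator) {
        return () -> {
            double d = denominator.getAsDouble();
            // RatioGauge yields NaN for a zero denominator; mirror that here
            return d == 0.0 ? Double.NaN : numerator.getAsDouble() / d;
        };
    }
}
```

A hit-rate gauge then becomes a one-liner such as {{Gauges.ratio(hits::sum, requests::sum)}}.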

> Expose buffer cache metrics in caches virtual table
> ---
>
> Key: CASSANDRA-14626
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14626
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Benjamin Lerer
>Assignee: Aleksey Yeschenko
>Priority: Minor
>  Labels: virtual-tables
> Fix For: 4.0
>
>
> As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache 
> metrics in the caches virtual table.






[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary

2018-08-21 Thread Branimir Lambov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587183#comment-16587183
 ] 

Branimir Lambov commented on CASSANDRA-14649:
-

Committed as 
[49adbe7e0f0c8a83f3b843b65612528498b5c9a5|https://github.com/apache/cassandra/commit/49adbe7e0f0c8a83f3b843b65612528498b5c9a5].

> Index summaries fail when their size gets > 2G and use more space than 
> necessary
> 
>
> Key: CASSANDRA-14649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14649
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Major
>
> After building a summary, {{IndexSummaryBuilder}} tries to trim the memory 
> writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of 
> trimming, this ends up allocating at least as much extra space and failing 
> the {{Buffer.position()}} call when the size is greater than 
> {{Integer.MAX_VALUE}}.
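The intended "trim" semantics can be pictured with a toy buffer: shrinking to the written length must never allocate more than that length. This is a hedged sketch of the idea, not the actual {{SafeMemoryWriter}} code.

```java
import java.util.Arrays;

// Toy growable buffer illustrating trim-to-size semantics.
final class GrowableBuffer {
    byte[] data = new byte[16];
    int length = 0;

    void write(byte b) {
        if (length == data.length)
            data = Arrays.copyOf(data, data.length * 2); // grow by doubling
        data[length++] = b;
    }

    void trim() {
        if (data.length > length)        // shrink only; a no-op when already exact
            data = Arrays.copyOf(data, length);
    }
}
```

The reported bug is the opposite behaviour: the "trim" call allocated at least as much extra space again, which also broke the {{Buffer.position()}} call past 2 GiB.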






[jira] [Updated] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary

2018-08-21 Thread Branimir Lambov (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-14649:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Index summaries fail when their size gets > 2G and use more space than 
> necessary
> 
>
> Key: CASSANDRA-14649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14649
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Major
>
> After building a summary, {{IndexSummaryBuilder}} tries to trim the memory 
> writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of 
> trimming, this ends up allocating at least as much extra space and failing 
> the {{Buffer.position()}} call when the size is greater than 
> {{Integer.MAX_VALUE}}.






[07/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2018-08-21 Thread blambov
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/299782cf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/299782cf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/299782cf

Branch: refs/heads/trunk
Commit: 299782cff5a56db8b8fe9ad70a43bea7b729cc99
Parents: 236c47e 49adbe7
Author: Branimir Lambov 
Authored: Tue Aug 21 11:55:38 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:55:38 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--





[09/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-08-21 Thread blambov
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/991e1971
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/991e1971
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/991e1971

Branch: refs/heads/trunk
Commit: 991e19711f8762bbf93d6af588cef0a14668cc59
Parents: 65a4682 299782c
Author: Branimir Lambov 
Authored: Tue Aug 21 11:56:05 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:56:05 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--
diff --cc src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 144edad,7586543..28ca468
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@@ -38,43 -37,11 +38,43 @@@ public class DataOutputBuffer extends B
  /*
   * Threshold at which resizing transitions from doubling to increasing by 
50%
   */
- private static final long DOUBLING_THRESHOLD = 
Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+ static final long DOUBLING_THRESHOLD = 
Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
  
 +/*
 + * Only recycle OutputBuffers up to 1Mb. Larger buffers will be trimmed 
back to this size.
 + */
 +private static final int MAX_RECYCLE_BUFFER_SIZE = 
Integer.getInteger(Config.PROPERTY_PREFIX + "dob_max_recycle_bytes", 1024 * 
1024);
 +
 +private static final int DEFAULT_INITIAL_BUFFER_SIZE = 128;
 +
 +/**
 + * Scratch buffers used mostly for serializing in memory. It's important 
to call #recycle() when finished
 + * to keep the memory overhead from being too large in the system.
 + */
 +public static final FastThreadLocal scratchBuffer = new 
FastThreadLocal()
 +{
 +protected DataOutputBuffer initialValue() throws Exception
 +{
 +return new DataOutputBuffer()
 +{
 +public void close()
 +{
 +if (buffer.capacity() <= MAX_RECYCLE_BUFFER_SIZE)
 +{
 +buffer.clear();
 +}
 +else
 +{
 +buffer = 
ByteBuffer.allocate(DEFAULT_INITIAL_BUFFER_SIZE);
 +}
 +}
 +};
 +}
 +};
 +
  public DataOutputBuffer()
  {
 -this(128);
 +this(DEFAULT_INITIAL_BUFFER_SIZE);
  }
  
  public DataOutputBuffer(int size)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
--





[05/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2018-08-21 Thread blambov
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/299782cf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/299782cf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/299782cf

Branch: refs/heads/cassandra-3.11
Commit: 299782cff5a56db8b8fe9ad70a43bea7b729cc99
Parents: 236c47e 49adbe7
Author: Branimir Lambov 
Authored: Tue Aug 21 11:55:38 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:55:38 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--





[06/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2018-08-21 Thread blambov
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/299782cf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/299782cf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/299782cf

Branch: refs/heads/cassandra-3.0
Commit: 299782cff5a56db8b8fe9ad70a43bea7b729cc99
Parents: 236c47e 49adbe7
Author: Branimir Lambov 
Authored: Tue Aug 21 11:55:38 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:55:38 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--





[02/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G

2018-08-21 Thread blambov
Fix SafeMemoryWriter trimming and behaviour over 2G

patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e

Branch: refs/heads/cassandra-3.0
Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5
Parents: 0e81892
Author: Branimir Lambov 
Authored: Thu Aug 16 16:15:07 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:53:30 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java 
b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
index 6110afe..0f604e0 100644
--- a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
+++ b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
@@ -207,8 +207,8 @@ public class IndexSummaryBuilder implements AutoCloseable
     {
         // this method should only be called when we've finished appending records, so we truncate the
         // memory we're using to the exact amount required to represent it before building our summary
-        entries.setCapacity(entries.length());
-        offsets.setCapacity(offsets.length());
+        entries.trim();
+        offsets.trim();
     }
 
 public IndexSummary build(IPartitioner partitioner)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 6ea6d97..3f1e081 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@ -37,7 +37,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     /*
      * Threshold at which resizing transitions from doubling to increasing by 50%
      */
-    private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+    static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
 
     public DataOutputBuffer()
     {
@@ -83,7 +83,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     @Override
     protected void doFlush(int count) throws IOException
     {
-        reallocate(count);
+        expandToFit(count);
     }
 
     //Hack for test, make it possible to override checking the buffer capacity
@@ -119,7 +119,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
         return validateReallocation(newSize);
     }
 
-    protected void reallocate(long count)
+    protected void expandToFit(long count)
     {
         if (count <= 0)
             return;
@@ -141,7 +141,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     public int write(ByteBuffer src) throws IOException
     {
         int count = src.remaining();
-        reallocate(count);
+        expandToFit(count);
         buffer.put(src);
         return count;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
index c815c9e..c9767fc 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
@@ -58,7 +58,7 @@ public class DataOutputBufferFixed extends DataOutputBuffer
      * @see org.apache.cassandra.io.util.DataOutputBuffer#reallocate(long)
      */
     @Override
-    protected void reallocate(long newSize)
+    protected void expandToFit(long newSize)
     {
         throw new BufferOverflowException();
     }
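Stepping back from the diff: the `DOUBLING_THRESHOLD` constant this commit widens to package visibility (for tests) governs how the buffer grows, as its comment says: double while small, then grow by 50% past the threshold. A hedged sketch of that policy follows; it is an approximation, not Cassandra's exact resizing code, which also routes the result through `validateReallocation` as shown above.

```java
// Approximate growth policy controlled by DOUBLING_THRESHOLD (default 64 MB,
// configurable via a system property in the real code). Names here are
// illustrative.
public class GrowthPolicySketch
{
    static final long DOUBLING_THRESHOLD_BYTES = 64L * 1024 * 1024;

    static long newSize(long currentCapacity, long extraNeeded)
    {
        long required = currentCapacity + extraNeeded;
        // Double while the buffer is small; past the threshold grow by 50%
        // to avoid huge over-allocations on already-large buffers.
        long grown = currentCapacity < DOUBLING_THRESHOLD_BYTES
                     ? currentCapacity * 2
                     : currentCapacity + currentCapacity / 2;
        return Math.max(required, grown);
    }
}
```

Doubling amortizes copy costs while buffers are small; switching to 50% growth keeps a gigabyte-sized buffer from jumping straight to two gigabytes when it only needs a few more bytes.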


[10/10] cassandra git commit: Merge branch 'cassandra-3.11' into trunk

2018-08-21 Thread blambov
Merge branch 'cassandra-3.11' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8b9515bd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8b9515bd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8b9515bd

Branch: refs/heads/trunk
Commit: 8b9515bd2e410c634e4a31fe3e93890f1a1f8f71
Parents: ac1bb75 991e197
Author: Branimir Lambov 
Authored: Tue Aug 21 11:56:30 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:56:30 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--






[01/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G

2018-08-21 Thread blambov
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 0e81892d7 -> 49adbe7e0
  refs/heads/cassandra-3.0 236c47e65 -> 299782cff
  refs/heads/cassandra-3.11 65a46820b -> 991e19711
  refs/heads/trunk ac1bb7586 -> 8b9515bd2


Fix SafeMemoryWriter trimming and behaviour over 2G

patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e

Branch: refs/heads/cassandra-2.2
Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5
Parents: 0e81892
Author: Branimir Lambov 
Authored: Thu Aug 16 16:15:07 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:53:30 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java 
b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
index 6110afe..0f604e0 100644
--- a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
+++ b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
@@ -207,8 +207,8 @@ public class IndexSummaryBuilder implements AutoCloseable
     {
         // this method should only be called when we've finished appending records, so we truncate the
         // memory we're using to the exact amount required to represent it before building our summary
-        entries.setCapacity(entries.length());
-        offsets.setCapacity(offsets.length());
+        entries.trim();
+        offsets.trim();
     }
 
 public IndexSummary build(IPartitioner partitioner)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 6ea6d97..3f1e081 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@ -37,7 +37,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     /*
      * Threshold at which resizing transitions from doubling to increasing by 50%
      */
-    private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+    static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
 
     public DataOutputBuffer()
     {
@@ -83,7 +83,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     @Override
     protected void doFlush(int count) throws IOException
     {
-        reallocate(count);
+        expandToFit(count);
     }
 
     //Hack for test, make it possible to override checking the buffer capacity
@@ -119,7 +119,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
         return validateReallocation(newSize);
     }
 
-    protected void reallocate(long count)
+    protected void expandToFit(long count)
    {
         if (count <= 0)
             return;
@@ -141,7 +141,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     public int write(ByteBuffer src) throws IOException
     {
         int count = src.remaining();
-        reallocate(count);
+        expandToFit(count);
         buffer.put(src);
         return count;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
index c815c9e..c9767fc 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
@@ -58,7 +58,7 @@ public class DataOutputBufferFixed extends DataOutputBuffer
  * @see 

[04/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G

2018-08-21 Thread blambov
Fix SafeMemoryWriter trimming and behaviour over 2G

patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e

Branch: refs/heads/trunk
Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5
Parents: 0e81892
Author: Branimir Lambov 
Authored: Thu Aug 16 16:15:07 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:53:30 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java 
b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
index 6110afe..0f604e0 100644
--- a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
+++ b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
@@ -207,8 +207,8 @@ public class IndexSummaryBuilder implements AutoCloseable
     {
         // this method should only be called when we've finished appending records, so we truncate the
         // memory we're using to the exact amount required to represent it before building our summary
-        entries.setCapacity(entries.length());
-        offsets.setCapacity(offsets.length());
+        entries.trim();
+        offsets.trim();
     }
 
 public IndexSummary build(IPartitioner partitioner)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 6ea6d97..3f1e081 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@ -37,7 +37,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     /*
      * Threshold at which resizing transitions from doubling to increasing by 50%
      */
-    private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+    static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
 
     public DataOutputBuffer()
     {
@@ -83,7 +83,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     @Override
     protected void doFlush(int count) throws IOException
     {
-        reallocate(count);
+        expandToFit(count);
     }
 
     //Hack for test, make it possible to override checking the buffer capacity
@@ -119,7 +119,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
         return validateReallocation(newSize);
     }
 
-    protected void reallocate(long count)
+    protected void expandToFit(long count)
    {
         if (count <= 0)
             return;
@@ -141,7 +141,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     public int write(ByteBuffer src) throws IOException
     {
         int count = src.remaining();
-        reallocate(count);
+        expandToFit(count);
         buffer.put(src);
         return count;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
index c815c9e..c9767fc 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
@@ -58,7 +58,7 @@ public class DataOutputBufferFixed extends DataOutputBuffer
      * @see org.apache.cassandra.io.util.DataOutputBuffer#reallocate(long)
      */
     @Override
-    protected void reallocate(long newSize)
+    protected void expandToFit(long newSize)
     {
         throw new BufferOverflowException();
     }
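The `DataOutputBufferFixed` override above turns any growth request into an error, giving a fixed-capacity variant of the growable buffer. A standalone sketch of that pattern (class and method names here are illustrative, not Cassandra's):

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Growable base: expand the backing buffer whenever a write won't fit.
class GrowableSink
{
    protected ByteBuffer buffer;

    GrowableSink(int size) { buffer = ByteBuffer.allocate(size); }

    protected void expandToFit(long count)
    {
        if (buffer.remaining() < count)
        {
            ByteBuffer bigger = ByteBuffer.allocate(buffer.capacity() * 2 + (int) count);
            buffer.flip();
            bigger.put(buffer);   // carry over bytes already written
            buffer = bigger;
        }
    }

    void write(byte[] bytes)
    {
        expandToFit(bytes.length);
        buffer.put(bytes);
    }
}

// Fixed-capacity variant: refuses to grow, so writes past the end surface
// as BufferOverflowException instead of silently reallocating.
class FixedSink extends GrowableSink
{
    FixedSink(int size) { super(size); }

    @Override
    protected void expandToFit(long count)
    {
        if (buffer.remaining() < count)
            throw new BufferOverflowException();
    }
}
```

Throwing from the growth hook, rather than overriding `write` itself, keeps all the serialization paths of the parent class intact while guaranteeing the fixed buffer is never resized.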


[08/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11

2018-08-21 Thread blambov
Merge branch 'cassandra-3.0' into cassandra-3.11


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/991e1971
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/991e1971
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/991e1971

Branch: refs/heads/cassandra-3.11
Commit: 991e19711f8762bbf93d6af588cef0a14668cc59
Parents: 65a4682 299782c
Author: Branimir Lambov 
Authored: Tue Aug 21 11:56:05 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:56:05 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--
diff --cc src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 144edad,7586543..28ca468
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@@ -38,43 -37,11 +38,43 @@@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
  /*
   * Threshold at which resizing transitions from doubling to increasing by 50%
   */
- private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+ static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
 
 +/*
 + * Only recycle OutputBuffers up to 1Mb. Larger buffers will be trimmed back to this size.
 + */
 +private static final int MAX_RECYCLE_BUFFER_SIZE = Integer.getInteger(Config.PROPERTY_PREFIX + "dob_max_recycle_bytes", 1024 * 1024);
 +
 +private static final int DEFAULT_INITIAL_BUFFER_SIZE = 128;
 +
 +/**
 + * Scratch buffers used mostly for serializing in memory. It's important to call #recycle() when finished
 + * to keep the memory overhead from being too large in the system.
 + */
 +public static final FastThreadLocal<DataOutputBuffer> scratchBuffer = new FastThreadLocal<DataOutputBuffer>()
 +{
 +    protected DataOutputBuffer initialValue() throws Exception
 +    {
 +        return new DataOutputBuffer()
 +        {
 +            public void close()
 +            {
 +                if (buffer.capacity() <= MAX_RECYCLE_BUFFER_SIZE)
 +                {
 +                    buffer.clear();
 +                }
 +                else
 +                {
 +                    buffer = ByteBuffer.allocate(DEFAULT_INITIAL_BUFFER_SIZE);
 +                }
 +            }
 +        };
 +    }
 +};
 +
  public DataOutputBuffer()
  {
 -    this(128);
 +    this(DEFAULT_INITIAL_BUFFER_SIZE);
  }
 
  public DataOutputBuffer(int size)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
--





[03/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G

2018-08-21 Thread blambov
Fix SafeMemoryWriter trimming and behaviour over 2G

patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e

Branch: refs/heads/cassandra-3.11
Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5
Parents: 0e81892
Author: Branimir Lambov 
Authored: Thu Aug 16 16:15:07 2018 +0300
Committer: Branimir Lambov 
Committed: Tue Aug 21 11:53:30 2018 +0300

--
 .../io/sstable/IndexSummaryBuilder.java |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java |  8 +-
 .../io/util/DataOutputBufferFixed.java  |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java | 16 ++--
 .../cassandra/io/util/DataOutputTest.java   |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90 
 6 files changed, 110 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
--
diff --git a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java 
b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
index 6110afe..0f604e0 100644
--- a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
+++ b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
@@ -207,8 +207,8 @@ public class IndexSummaryBuilder implements AutoCloseable
     {
         // this method should only be called when we've finished appending records, so we truncate the
         // memory we're using to the exact amount required to represent it before building our summary
-        entries.setCapacity(entries.length());
-        offsets.setCapacity(offsets.length());
+        entries.trim();
+        offsets.trim();
     }
 
 public IndexSummary build(IPartitioner partitioner)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 6ea6d97..3f1e081 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@ -37,7 +37,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     /*
      * Threshold at which resizing transitions from doubling to increasing by 50%
      */
-    private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+    static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
 
     public DataOutputBuffer()
     {
@@ -83,7 +83,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     @Override
     protected void doFlush(int count) throws IOException
     {
-        reallocate(count);
+        expandToFit(count);
     }
 
     //Hack for test, make it possible to override checking the buffer capacity
@@ -119,7 +119,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
         return validateReallocation(newSize);
     }
 
-    protected void reallocate(long count)
+    protected void expandToFit(long count)
    {
         if (count <= 0)
             return;
@@ -141,7 +141,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     public int write(ByteBuffer src) throws IOException
     {
         int count = src.remaining();
-        reallocate(count);
+        expandToFit(count);
         buffer.put(src);
         return count;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
--
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java 
b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
index c815c9e..c9767fc 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
@@ -58,7 +58,7 @@ public class DataOutputBufferFixed extends DataOutputBuffer
      * @see org.apache.cassandra.io.util.DataOutputBuffer#reallocate(long)
      */
     @Override
-    protected void reallocate(long newSize)
+    protected void expandToFit(long newSize)
     {
         throw new BufferOverflowException();
     }


[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary

2018-08-21 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587178#comment-16587178
 ] 

Benedict commented on CASSANDRA-14649:
--

+1

I do wonder if we should revisit requiring linear memory for all of this, but
really we should probably instead revisit whether such huge sstables are a good idea.

> Index summaries fail when their size gets > 2G and use more space than 
> necessary
> 
>
> Key: CASSANDRA-14649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14649
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Major
>
> After building a summary, {{IndexSummaryBuilder}} tries to trim the memory 
> writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of 
> trimming, this ends up allocating at least as much extra space and failing 
> the {{Buffer.position()}} call when the size is greater than 
> {{Integer.MAX_VALUE}}.
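
The failure described above comes down to `java.nio.Buffer` tracking positions as `int`. A minimal sketch of the narrowing overflow (the class name is invented for illustration):

```java
// Why sizes past 2 GiB break int-based ByteBuffer bookkeeping: narrowing a
// long just over 2^31 to int wraps negative, and Buffer.position(int)
// rejects negative values with IllegalArgumentException.
public class IntPositionOverflowSketch
{
    // Narrow a logical (long) size the way int-based buffer APIs must.
    static int narrowedPosition(long logicalSize)
    {
        return (int) logicalSize; // wraps negative past Integer.MAX_VALUE
    }

    public static void main(String[] args)
    {
        long overTwoGiB = (1L << 31) + 42;                // 2 GiB + 42 bytes
        System.out.println(narrowedPosition(overTwoGiB)); // a negative value
        // Passing that to ByteBuffer.position() throws, which is why a
        // writer spanning multiple buffers has to track its length as a long.
    }
}
```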



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary

2018-08-21 Thread Branimir Lambov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587147#comment-16587147
 ] 

Branimir Lambov commented on CASSANDRA-14649:
-

I realized we still have a problem if the size grows by over 2G, i.e. if it becomes >4G and needs to grow. Pushed another commit to fix and test this and limit the test size if there isn't enough memory: [new commit|https://github.com/blambov/cassandra/commit/65798672eff79bd1c97b960ed965f0e908f6c23e] [branch|https://github.com/blambov/cassandra/tree/14649-trunk] [test|https://circleci.com/gh/blambov/workflows/cassandra/tree/14649-trunk]

> Index summaries fail when their size gets > 2G and use more space than 
> necessary
> 
>
> Key: CASSANDRA-14649
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14649
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Major
>
> After building a summary, {{IndexSummaryBuilder}} tries to trim the memory 
> writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of 
> trimming, this ends up allocating at least as much extra space and failing 
> the {{Buffer.position()}} call when the size is greater than 
> {{Integer.MAX_VALUE}}.






[jira] [Created] (CASSANDRA-14658) Cassandra hangs at startup

2018-08-21 Thread Jing Weng (JIRA)
Jing Weng created CASSANDRA-14658:
-

 Summary: Cassandra hangs at startup
 Key: CASSANDRA-14658
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14658
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jing Weng
 Fix For: 3.11.1


Some time ago our Cassandra cluster failed to start. We checked various logs and
found no error message. Later we captured the thread stacks at startup and, by
cross-referencing the source code, found that if a commitlog initialization
error occurs at startup, Cassandra deadlocks and hangs.
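
The deadlock pattern reported here, a waiter parked on a signal that a dead thread can never deliver, can be sketched with a `CountDownLatch` standing in for Cassandra's `WaitQueue`; all names and timeouts below are illustrative, not Cassandra's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hedged sketch of the reported hang: the main thread waits for a segment
// that the allocator thread, having died during initialization, never
// publishes.
public class StartupHangSketch
{
    // Returns true iff the "allocator" published a segment before the timeout.
    static boolean startAndWait(boolean allocatorInitFails, long timeoutMs) throws InterruptedException
    {
        CountDownLatch segmentReady = new CountDownLatch(1);
        Thread allocator = new Thread(() -> {
            if (!allocatorInitFails)
                segmentReady.countDown();   // healthy path: signal the waiter
            // failure path: the thread exits without signalling, so a waiter
            // with no timeout (like awaitUninterruptibly) blocks forever.
        });
        allocator.start();
        return segmentReady.await(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```

The real waiter in the stack below, `WaitQueue.awaitUninterruptibly`, takes no timeout, which is why the failure shows up as a silent hang rather than an error.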

The thread stack of COMMIT-LOG-ALLOCATOR:

 
{noformat}
{
  "waitedCount": 0,
  "lockOwnerId": -1,
  "lockedMonitors": [],
  "waitedTime": -1,
  "blockedCount": 0,
  "threadState": "RUNNABLE",
  "inNative": false,
  "suspended": false,
  "threadName": "COMMIT-LOG-ALLOCATOR",
  "lockInfo": null,
  "threadId": 36,
  "stackTrace": [
{
  "fileName": "AbstractCommitLogSegmentManager.java",
  "nativeMethod": false,
  "methodName": "runMayThrow",
  "className": 
"org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager$1",
  "lineNumber": 133
},
{
  "fileName": "WrappedRunnable.java",
  "nativeMethod": false,
  "methodName": "run",
  "className": "org.apache.cassandra.utils.WrappedRunnable",
  "lineNumber": 28
},
{
  "fileName": "NamedThreadFactory.java",
  "nativeMethod": false,
  "methodName": "lambda$threadLocalDeallocator$0",
  "className": "org.apache.cassandra.concurrent.NamedThreadFactory",
  "lineNumber": 81
},
{
  "fileName": null,
  "nativeMethod": false,
  "methodName": "run",
  "className": 
"org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$4\/1392794732",
  "lineNumber": -1
},
{
  "fileName": "Thread.java",
  "nativeMethod": false,
  "methodName": "run",
  "className": "java.lang.Thread",
  "lineNumber": 748
}
  ],
  "blockedTime": -1,
  "lockName": null,
  "lockOwnerName": null,
  "lockedSynchronizers": []
}{noformat}
 

The thread stack of main thread:

 
{noformat}
{
  "waitedCount": 2,
  "lockOwnerId": -1,
  "lockedMonitors": [
{
  "identityHashCode": 600118828,
  "lockedStackDepth": 10,
  "className": "java.lang.Class",
  "lockedStackFrame": {
"fileName": "ColumnFamilyStore.java",
"nativeMethod": false,
"methodName": "createColumnFamilyStore",
"className": "org.apache.cassandra.db.ColumnFamilyStore",
"lineNumber": 620
  }
},
{
  "identityHashCode": 600118828,
  "lockedStackDepth": 11,
  "className": "java.lang.Class",
  "lockedStackFrame": {
"fileName": "ColumnFamilyStore.java",
"nativeMethod": false,
"methodName": "createColumnFamilyStore",
"className": "org.apache.cassandra.db.ColumnFamilyStore",
"lineNumber": 594
  }
},
{
  "identityHashCode": 1087037934,
  "lockedStackDepth": 15,
  "className": "java.lang.Class",
  "lockedStackFrame": {
"fileName": "Keyspace.java",
"nativeMethod": false,
"methodName": "open",
"className": "org.apache.cassandra.db.Keyspace",
"lineNumber": 127
  }
}
  ],
  "waitedTime": -1,
  "blockedCount": 0,
  "threadState": "WAITING",
  "inNative": false,
  "suspended": false,
  "threadName": "main",
  "lockInfo": null,
  "threadId": 1,
  "stackTrace": [
{
  "fileName": "Unsafe.java",
  "nativeMethod": true,
  "methodName": "park",
  "className": "sun.misc.Unsafe",
  "lineNumber": -2
},
{
  "fileName": "LockSupport.java",
  "nativeMethod": false,
  "methodName": "park",
  "className": "java.util.concurrent.locks.LockSupport",
  "lineNumber": 304
},
{
  "fileName": "WaitQueue.java",
  "nativeMethod": false,
  "methodName": "awaitUninterruptibly",
  "className": 
"org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal",
  "lineNumber": 280
},
{
  "fileName": "AbstractCommitLogSegmentManager.java",
  "nativeMethod": false,
  "methodName": "awaitAvailableSegment",
  "className": 
"org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager",
  "lineNumber": 262
},
{
  "fileName": "AbstractCommitLogSegmentManager.java",
  "nativeMethod": false,
  "methodName": "advanceAllocatingFrom",
  "className": 
"org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager",
  "lineNumber": 236
},
{
  "fileName": "AbstractCommitLogSegmentManager.java",
  "nativeMethod": false,
  "methodName": "start",
  "className": 
"org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager",
  "lineNumber": 153
},

[jira] [Resolved] (CASSANDRA-14643) Performance overhead with COPY command while bulk data loading

2018-08-21 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict resolved CASSANDRA-14643.
--
Resolution: Invalid

This bug tracker is not for support using Cassandra; you will probably find the 
user mailing lists more helpful, or perhaps stack overflow.

If you find a specific inefficiency in the COPY command that can be improved, 
then please feel free to file a ticket describing the desired improvement, 
after confirming no such ticket already exists.

> Performance overhead with COPY command while bulk data loading
> --
>
> Key: CASSANDRA-14643
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14643
> Project: Cassandra
>  Issue Type: Task
>  Components: CQL
>Reporter: NIKHIL THALLAPALLI
>Priority: Major
>
> Hello Team,
> We are seeing a performance overhead with the cqlsh COPY utility while bulk 
> loading data.
> It took approximately 6 hours to load around 39 lakh (3.9 million) records 
> through the COPY command.
> Please note that the table structure uses a few frozen types, each 
> containing around 15 sets.
> Please let us know what techniques we can use to achieve faster data 
> loading, and also suggest any parameters that can be tuned to increase 
> efficiency.
> Your prompt response would be highly appreciated.
> Thank you,
> Nikhil
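
The quoted question asks which parameters can be tuned; for reference, cqlsh's 
COPY FROM accepts several throughput-related options. A hedged sketch follows 
(the keyspace, table, column, and file names are hypothetical, and the values 
are illustrative starting points to experiment with, not recommendations):

```cql
-- Hypothetical table and file names; option values are starting points to tune.
COPY ks1.big_table (key, col1, col2, value)
FROM '/tmp/data.csv'
WITH HEADER = TRUE
 AND CHUNKSIZE = 5000        -- rows handed to each worker process at a time
 AND NUMPROCESSES = 8        -- number of parallel worker processes
 AND MAXBATCHSIZE = 20       -- maximum rows per insert batch
 AND INGESTRATE = 100000;    -- soft cap on rows ingested per second
```

A larger CHUNKSIZE and more worker processes often help most on multi-core 
client machines; rows containing frozen collections with many sets are 
heavier, so batch sizes may need to stay small.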



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14643) Performance overhead with COPY command while bulk data loading

2018-08-21 Thread NIKHIL THALLAPALLI (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587095#comment-16587095
 ] 

NIKHIL THALLAPALLI commented on CASSANDRA-14643:


Hi Team,

Can I expect an update on the reported issue?

Please let me know if you require any additional details.

Thanks,

Nikhil

> Performance overhead with COPY command while bulk data loading
> --
>
> Key: CASSANDRA-14643
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14643
> Project: Cassandra
>  Issue Type: Task
>  Components: CQL
>Reporter: NIKHIL THALLAPALLI
>Priority: Major
>
> Hello Team,
> We are seeing a performance overhead with the cqlsh COPY utility while bulk 
> loading data.
> It took approximately 6 hours to load around 39 lakh (3.9 million) records 
> through the COPY command.
> Please note that the table structure uses a few frozen types, each 
> containing around 15 sets.
> Please let us know what techniques we can use to achieve faster data 
> loading, and also suggest any parameters that can be tuned to increase 
> efficiency.
> Your prompt response would be highly appreciated.
> Thank you,
> Nikhil



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions

2018-08-21 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587026#comment-16587026
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


I am still working on it and will provide the updated patch in a couple of days.

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter queries like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"
> cqlsh:ks1>
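
Until IN filtering is supported, the quoted error can be worked around by 
issuing one equality-filtered query per would-be IN value and merging the 
results client-side; a sketch against the schema quoted above:

```cql
-- Workaround sketch: one ALLOW FILTERING query per would-be IN value;
-- the application merges the result sets.
SELECT * FROM ks1.t1 WHERE key = 1 AND col2 = 1 ALLOW FILTERING;
SELECT * FROM ks1.t1 WHERE key = 1 AND col2 = 2 ALLOW FILTERING;
```

This trades one round trip for N, so it is only reasonable for small value 
lists; the attached patches aim to push the IN restriction into the server's 
filtering path instead.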



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org