[jira] [Updated] (CASSANDRA-14631) Add RSS support for Cassandra blog
[ https://issues.apache.org/jira/browse/CASSANDRA-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Hanna updated CASSANDRA-14631:
-------------------------------------
    Labels: blog  (was: )

> Add RSS support for Cassandra blog
> ----------------------------------
>
>                 Key: CASSANDRA-14631
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14631
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Documentation and Website
>            Reporter: Jacques-Henri Berthemet
>            Assignee: Jeff Beck
>            Priority: Major
>              Labels: blog
>         Attachments: 14631-site.txt, Screen Shot 2018-08-17 at 5.32.08 PM.png, Screen Shot 2018-08-17 at 5.32.25 PM.png
>
> It would be convenient to add RSS support to the Cassandra blog:
> [http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html]
> And maybe also for other resources like new versions, but this ticket is about the blog.
>
> {quote}From: Scott Andreas
> Sent: Wednesday, August 08, 2018 6:53 PM
> To: [d...@cassandra.apache.org|mailto:d...@cassandra.apache.org]
> Subject: Re: Apache Cassandra Blog is now live
>
> Please feel free to file a ticket (label: Documentation and Website).
>
> It looks like Jekyll, the static site generator used to build the website, has a plugin that generates Atom feeds if someone would like to work on adding one: [https://github.com/jekyll/jekyll-feed]
> {quote}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"
[ https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Hanna updated CASSANDRA-14661:
-------------------------------------
    Labels: blog  (was: )

> Blog Post: "Testing Apache Cassandra 4.0"
> -----------------------------------------
>
>                 Key: CASSANDRA-14661
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Documentation and Website
>            Reporter: C. Scott Andreas
>            Assignee: C. Scott Andreas
>            Priority: Minor
>              Labels: blog
>         Attachments: CASSANDRA-14661.diff, rendered.png
>
> This is a blog post highlighting some of the approaches being used to test Apache Cassandra 4.0. The patch attached applies as an SVN diff to the website repo (outside the project's primary Git repo).
> SVN patch containing the post and rendered screenshot attached.
[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-14660:
-----------------------------------
    Fix Version/s:     (was: 4.0.x)
                       4.x
                       3.11.x
                       3.0.x

> Improve TokenMetaData cache populating performance for large cluster
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-14660
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>         Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>            Reporter: Pengchao Wang
>            Priority: Critical
>              Labels: Performance
>             Fix For: 3.0.x, 3.11.x, 4.x
>
>         Attachments: 14660-trunk.txt, TokenMetaDataBenchmark.java
>
> TokenMetaData#cachedOnlyTokenMap is the method C* uses to give coordinators a consistent token and topology view without paying a read-lock cost. On first read the method acquires a synchronized lock, generates a copy of the major token metadata structures, and caches it; on every token metadata change (due to gossip), the cache is cleared and the next read takes care of repopulating it.
> For small to medium size clusters this strategy works pretty well, but large clusters can suffer from the locking since cache population is much slower. On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15) each cache population takes about 500~700ms, and during that time no requests can go through since the synchronized lock is held. This caused waves of timeout errors when large amounts of gossip messages were propagating across the cluster, such as during a cluster restart.
> Based on profiling, we found that the cost mostly comes from copying tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map using TreeMap and a reverse map using Guava's TreeMultimap. There is an optimization in TreeMap that helps reduce copying complexity from O(N*log(N)) to O(N) when copying from already-ordered data. But Guava's TreeMultimap copying misses that optimization, making it ~10 times slower than it needs to be at our cluster size.
> The patch attached to the issue replaces the reverse TreeMultimap with a vanilla TreeMap> in SortedBiMultiValueMap to make sure we can copy it in O(N) time.
> I also attached a benchmark script (TokenMetaDataBenchmark.java), which simulates a large cluster and then measures the average latency of TokenMetaData cache population.
> Benchmark result before and after the patch:
> {code:java}
> trunk: before 100ms, after 13ms
> 3.0.x: before 199ms, after 15ms
> {code}
> (On 3.0.x even the forward TreeMap copy is slow; the O(N*log(N)) to O(N) optimization is not applied because the key comparator is created dynamically and TreeMap cannot determine that source and destination are in the same order.)
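For readers unfamiliar with the TreeMap optimization referenced in the ticket: java.util.TreeMap's SortedMap copy constructor rebuilds the tree linearly from the source's already-sorted iteration order, while inserting entries one at a time pays O(log N) per entry. A minimal standalone sketch of the difference (illustrative only, not Cassandra's actual SortedBiMultiValueMap code):

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrates the TreeMap copy optimization discussed in CASSANDRA-14660.
public class SortedCopyDemo {

    // Build a sorted forward map: token -> endpoint (hypothetical data).
    static SortedMap<Integer, String> buildTokenMap(int tokens) {
        SortedMap<Integer, String> m = new TreeMap<>();
        for (int i = 0; i < tokens; i++) {
            m.put(i, "node" + (i % 100));
        }
        return m;
    }

    // Fast path: TreeMap's SortedMap constructor uses its internal
    // buildFromSorted, an O(N) linear rebuild that trusts the source's order.
    static SortedMap<Integer, String> linearCopy(SortedMap<Integer, String> src) {
        return new TreeMap<>(src);
    }

    // Slow path: one put() per entry costs O(log N) each, O(N*log(N)) overall,
    // analogous to the Guava TreeMultimap copy the patch replaces.
    static SortedMap<Integer, String> perEntryCopy(SortedMap<Integer, String> src) {
        SortedMap<Integer, String> dst = new TreeMap<>();
        for (Map.Entry<Integer, String> e : src.entrySet()) {
            dst.put(e.getKey(), e.getValue());
        }
        return dst;
    }

    public static void main(String[] args) {
        SortedMap<Integer, String> tokenToEndpoint = buildTokenMap(100_000);
        // Both copies hold identical contents; only construction cost differs.
        System.out.println(linearCopy(tokenToEndpoint).equals(perEntryCopy(tokenToEndpoint))); // prints "true"
    }
}
```

The 3.0.x caveat at the end of the description is the flip side of the same mechanism: the linear path is only taken when TreeMap can trust that the source uses an equivalent comparator, which a dynamically created comparator defeats.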
[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-14660:
-----------------------------------
    Labels: Performance  (was: )

> Improve TokenMetaData cache populating performance for large cluster
> --------------------------------------------------------------------
[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"
[ https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14661:
-----------------------------------------
    Reviewer: Nate McCall
    Attachment: CASSANDRA-14661.diff
    Status: Patch Available  (was: Open)

> Blog Post: "Testing Apache Cassandra 4.0"
> -----------------------------------------
[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"
[ https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14661:
-----------------------------------------
    Attachment: (was: CASSANDRA-14661.diff)

> Blog Post: "Testing Apache Cassandra 4.0"
> -----------------------------------------
[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"
[ https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14661:
-----------------------------------------
    Flags: Patch

> Blog Post: "Testing Apache Cassandra 4.0"
> -----------------------------------------
[jira] [Updated] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"
[ https://issues.apache.org/jira/browse/CASSANDRA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14661:
-----------------------------------------
    Attachment: CASSANDRA-14661.diff

> Blog Post: "Testing Apache Cassandra 4.0"
> -----------------------------------------
[jira] [Created] (CASSANDRA-14661) Blog Post: "Testing Apache Cassandra 4.0"
C. Scott Andreas created CASSANDRA-14661:
--------------------------------------------

             Summary: Blog Post: "Testing Apache Cassandra 4.0"
                 Key: CASSANDRA-14661
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14661
             Project: Cassandra
          Issue Type: Improvement
          Components: Documentation and Website
            Reporter: C. Scott Andreas
            Assignee: C. Scott Andreas
         Attachments: CASSANDRA-14661.diff, rendered.png

This is a blog post highlighting some of the approaches being used to test Apache Cassandra 4.0. The patch attached applies as an SVN diff to the website repo (outside the project's primary Git repo).

SVN patch containing the post and rendered screenshot attached.
[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengchao Wang updated CASSANDRA-14660:
--------------------------------------
    Attachment: 14660-trunk.txt
    Status: Patch Available  (was: Open)

> Improve TokenMetaData cache populating performance for large cluster
> --------------------------------------------------------------------
[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengchao Wang updated CASSANDRA-14660:
--------------------------------------
    Description: 
TokenMetaData#cachedOnlyTokenMap is the method C* uses to give coordinators a consistent token and topology view without paying a read-lock cost. On first read the method acquires a synchronized lock, generates a copy of the major token metadata structures, and caches it; on every token metadata change (due to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium size clusters this strategy works pretty well, but large clusters can suffer from the locking since cache population is much slower. On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15) each cache population takes about 500~700ms, and during that time no requests can go through since the synchronized lock is held. This caused waves of timeout errors when large amounts of gossip messages were propagating across the cluster, such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map using TreeMap and a reverse map using Guava's TreeMultimap. There is an optimization in TreeMap that helps reduce copying complexity from O(N*log(N)) to O(N) when copying from already-ordered data. But Guava's TreeMultimap copying misses that optimization, making it ~10 times slower than it needs to be at our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a vanilla TreeMap> in SortedBiMultiValueMap to make sure we can copy it in O(N) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which simulates a large cluster and then measures the average latency of TokenMetaData cache population.

Benchmark result before and after the patch:
{code:java}
trunk: before 100ms, after 13ms
3.0.x: before 199ms, after 15ms
{code}
(On 3.0.x even the forward TreeMap copy is slow; the O(N*log(N)) to O(N) optimization is not applied because the key comparator is created dynamically and TreeMap cannot determine that source and destination are in the same order.)

  was:
TokenMetaData#cachedOnlyTokenMap is the method C* uses to give coordinators a consistent token and topology view without paying a read-lock cost. On first read the method acquires a synchronized lock, generates a copy of the major token metadata structures, and caches it; on every token metadata change (due to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium size clusters this strategy works pretty well, but large clusters can suffer from the locking since cache population is much slower. On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15) each cache population takes about 500~700ms, and during that time no requests can go through since the synchronized lock is held. This caused waves of timeout errors when large amounts of gossip messages were propagating across the cluster, such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map using TreeMap and a reverse map using Guava's TreeMultimap. There is an optimization in TreeMap that helps reduce copying complexity from O(n*log(n)) to O(n) when copying from already-ordered data. But Guava's TreeMultimap copying misses that optimization, making it ~10 times slower than it needs to be at our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a vanilla TreeMap> in SortedBiMultiValueMap to make sure we can copy it in O(n) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which simulates a large cluster and then measures the average latency of TokenMetaData cache population.

Benchmark result before and after the patch:
{code:java}
trunk: before 100ms, after 13ms
3.0.x: before 199ms, after 15ms
{code}
(On 3.0.x even the forward TreeMap copy is slow; the O(n*log(n)) to O(n) optimization is not applied because the key comparator is created dynamically and TreeMap cannot determine that source and destination are in the same order.)

> Improve TokenMetaData cache populating performance for large cluster
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-14660
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>         Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>            Reporter: Pengchao Wang
>            Priority: Critical
>             Fix For: 4.0.x
>
>         Attachments:
[jira] [Updated] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengchao Wang updated CASSANDRA-14660:
--------------------------------------
    Description: 
TokenMetaData#cachedOnlyTokenMap is the method C* uses to give coordinators a consistent token and topology view without paying a read-lock cost. On first read the method acquires a synchronized lock, generates a copy of the major token metadata structures, and caches it; on every token metadata change (due to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium size clusters this strategy works pretty well, but large clusters can suffer from the locking since cache population is much slower. On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15) each cache population takes about 500~700ms, and during that time no requests can go through since the synchronized lock is held. This caused waves of timeout errors when large amounts of gossip messages were propagating across the cluster, such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map using TreeMap and a reverse map using Guava's TreeMultimap. There is an optimization in TreeMap that helps reduce copying complexity from O(n*log(n)) to O(n) when copying from already-ordered data. But Guava's TreeMultimap copying misses that optimization, making it ~10 times slower than it needs to be at our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a vanilla TreeMap> in SortedBiMultiValueMap to make sure we can copy it in O(n) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which simulates a large cluster and then measures the average latency of TokenMetaData cache population.

Benchmark result before and after the patch:
{code:java}
trunk: before 100ms, after 13ms
3.0.x: before 199ms, after 15ms
{code}
(On 3.0.x even the forward TreeMap copy is slow; the O(n*log(n)) to O(n) optimization is not applied because the key comparator is created dynamically and TreeMap cannot determine that source and destination are in the same order.)

  was:
TokenMetaData#cachedOnlyTokenMap is the method C* uses to give coordinators a consistent token and topology view without paying a read-lock cost. On first read the method acquires a synchronized lock, generates a copy of the major token metadata structures, and caches it; on every token metadata change (due to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium size clusters this strategy works pretty well, but large clusters can suffer from the locking since cache population is much slower. On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15) each cache population takes about 500~700ms, and during that time no requests can go through since the synchronized lock is held. This caused waves of timeout errors when large amounts of gossip messages were propagating across the cluster, such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map using TreeMap and a reverse map using Guava's TreeMultimap. There is an optimization in TreeMap that helps reduce copying complexity from O(n*log(n)) to O(n) when copying from already-ordered data. But Guava's TreeMultimap copying misses that optimization, making it ~10 times slower than it needs to be at our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a vanilla TreeMap> in SortedBiMultiValueMap to make sure we can copy it in O(n) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which simulates a large cluster and then measures the average latency of TokenMetaData cache population.

Benchmark result before and after the patch:
{code:java}
trunk: before 100ms, after 13ms
3.0.x: before 199ms, after 15ms
{code}
(On 3.0.x even the forward TreeMap copy is slow; the O(n*log(n)) to O(n) optimization is not applied because the key comparator is created dynamically and TreeMap cannot determine that source and destination are in the same order.)

> Improve TokenMetaData cache populating performance for large cluster
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-14660
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>         Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
>            Reporter: Pengchao Wang
>            Priority: Critical
>             Fix For: 4.0.x
>
[jira] [Created] (CASSANDRA-14660) Improve TokenMetaData cache populating performance for large cluster
Pengchao Wang created CASSANDRA-14660:
-----------------------------------------

             Summary: Improve TokenMetaData cache populating performance for large cluster
                 Key: CASSANDRA-14660
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14660
             Project: Cassandra
          Issue Type: Improvement
          Components: Coordination
         Environment: Benchmark is on MacOSX 10.13.5, 2017 MBP
            Reporter: Pengchao Wang
             Fix For: 4.0.x
         Attachments: TokenMetaDataBenchmark.java

TokenMetaData#cachedOnlyTokenMap is the method C* uses to give coordinators a consistent token and topology view without paying a read-lock cost. On first read the method acquires a synchronized lock, generates a copy of the major token metadata structures, and caches it; on every token metadata change (due to gossip), the cache is cleared and the next read takes care of repopulating it.

For small to medium size clusters this strategy works pretty well, but large clusters can suffer from the locking since cache population is much slower. On one of our largest clusters (~1000 nodes, 125k tokens, C* 3.0.15) each cache population takes about 500~700ms, and during that time no requests can go through since the synchronized lock is held. This caused waves of timeout errors when large amounts of gossip messages were propagating across the cluster, such as during a cluster restart.

Based on profiling, we found that the cost mostly comes from copying tokenToEndpointMap. It is a SortedBiMultiValueMap made from a forward map using TreeMap and a reverse map using Guava's TreeMultimap. There is an optimization in TreeMap that helps reduce copying complexity from O(n*log(n)) to O(n) when copying from already-ordered data. But Guava's TreeMultimap copying misses that optimization, making it ~10 times slower than it needs to be at our cluster size.

The patch attached to the issue replaces the reverse TreeMultimap with a vanilla TreeMap> in SortedBiMultiValueMap to make sure we can copy it in O(n) time.

I also attached a benchmark script (TokenMetaDataBenchmark.java), which simulates a large cluster and then measures the average latency of TokenMetaData cache population.

Benchmark result before and after the patch:
{code:java}
trunk: before 100ms, after 13ms
3.0.x: before 199ms, after 15ms
{code}
(On 3.0.x even the forward TreeMap copy is slow; the O(n*log(n)) to O(n) optimization is not applied because the key comparator is created dynamically and TreeMap cannot determine that source and destination are in the same order.)
[jira] [Commented] (CASSANDRA-14648) CircleCI dtest runs should (by default) depend upon successful unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-14648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588112#comment-16588112 ]

Dinesh Joshi commented on CASSANDRA-14648:
------------------------------------------

Automatically running CircleCI jobs due to a git push is a waste of resources. [~aweisberg] fyi I have experienced 40-90 min wait times. I'm waiting on a build that has been queued for 46 minutes as I type.

> CircleCI dtest runs should (by default) depend upon successful unit tests
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14648
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14648
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Major
>
> Unit tests are very quick to run, and if they fail to pass there’s probably no value in running dtests - particularly if we are honouring our expectations of never committing code that breaks either unit or dtests.
> When sharing CircleCI resources between multiple branches (or multiple users), it is wasteful to have two dtest runs kicked off for every incomplete branch that is pushed to GitHub for safe keeping. So I think a better default CircleCI config file would only run the dtests after a successful unit test run, and those who want to modify this behaviour can do so consciously by editing the config file for themselves.
[jira] [Commented] (CASSANDRA-14659) Disable old protocol versions on demand
[ https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588063#comment-16588063 ]

Dinesh Joshi commented on CASSANDRA-14659:
------------------------------------------

DTest PR is here: https://github.com/apache/cassandra-dtest/compare/master...dineshjoshi:14659-master?expand=1

> Disable old protocol versions on demand
> ---------------------------------------
>
>                 Key: CASSANDRA-14659
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14659
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Dinesh Joshi
>            Assignee: Dinesh Joshi
>            Priority: Major
>              Labels: usability
>
> This patch allows operators to disable older protocol versions on demand. To use it, you can set {{native_transport_allow_older_protocols}} to false or use nodetool disableolderprotocolversions. Cassandra will reject requests from clients coming in on any version except the current version. This will help operators selectively reject connections from clients that do not support the latest protocol.
[jira] [Updated] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-10726: -- Fix Version/s: (was: 4.x) 4.0 > Read repair inserts should not be blocking > -- > > Key: CASSANDRA-10726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10726 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Richard Low >Assignee: Blake Eggleston >Priority: Major > Fix For: 4.0 > > > Today, if there’s a digest mismatch in a foreground read repair, the insert > to update out of date replicas is blocking. This means, if it fails, the read > fails with a timeout. If a node is dropping writes (maybe it is overloaded or > the mutation stage is backed up for some other reason), all reads to a > replica set could fail. Further, replicas dropping writes get more out of > sync so will require more read repair. > The comment on the code for why the writes are blocking is: > {code} > // wait for the repair writes to be acknowledged, to minimize impact on any > replica that's > // behind on writes in case the out-of-sync row is read multiple times in > quick succession > {code} > but the bad side effect is that reads timeout. Either the writes should not > be blocking or we should return success for the read even if the write times > out. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
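The failure mode in the ticket can be modeled compactly: the coordinator blocks on repair-write acks, so a dropped or slow repair write becomes a read timeout. This hedged sketch (hypothetical names, not Cassandra code) contrasts the old blocking behavior with the proposed one:

```python
def read_with_repair(repair_write_acked, blocking=True):
    """Model of a foreground read repair after a digest mismatch.

    With the blocking behavior, a repair write that times out fails the
    whole read; the proposed fix stops failing the read in that case.
    """
    if blocking and not repair_write_acked:
        raise TimeoutError("read failed because the repair write timed out")
    return "read result"

# Old behavior: a replica dropping the repair write fails the read.
failed = False
try:
    read_with_repair(repair_write_acked=False, blocking=True)
except TimeoutError:
    failed = True
assert failed

# Proposed behavior: the read still succeeds even if the repair write lags.
assert read_with_repair(repair_write_acked=False, blocking=False) == "read result"
```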
[jira] [Updated] (CASSANDRA-14659) Disable old protocol versions on demand
[ https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-14659: - Description: This patch allows the operators to disable older protocol versions on demand. To use it, you can set {{native_transport_allow_older_protocols}} to false or use nodetool disableolderprotocolversions. Cassandra will reject requests from client coming in on any version except the current version. This will help operators selectively reject connections from clients that do not support the latest protoocol. (was: This patch allows the operators to disable older protocol versions on demand. To use it, you can set native_transport_honor_older_protocols to false or use nodetool disableolderprotocolversions. Cassandra will reject requests from client coming in on any version except the current version. This will help operators selectively reject connections from clients that do not support the latest protoocol.) > Disable old protocol versions on demand > --- > > Key: CASSANDRA-14659 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14659 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Dinesh Joshi >Assignee: Dinesh Joshi >Priority: Major > Labels: usability > > This patch allows the operators to disable older protocol versions on demand. > To use it, you can set {{native_transport_allow_older_protocols}} to false or > use nodetool disableolderprotocolversions. Cassandra will reject requests > from client coming in on any version except the current version. This will > help operators selectively reject connections from clients that do not > support the latest protoocol. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13630) support large internode messages with netty
[ https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-13630: - Reviewer: Dinesh Joshi (was: Ariel Weisberg) > support large internode messages with netty > --- > > Key: CASSANDRA-13630 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13630 > Project: Cassandra > Issue Type: Task > Components: Streaming and Messaging >Reporter: Jason Brown >Assignee: Jason Brown >Priority: Major > Fix For: 4.0 > > > As part of CASSANDRA-8457, we decided to punt on large mesages to reduce the > scope of that ticket. However, we still need that functionality to ship a > correctly operating internode messaging subsystem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13630) support large internode messages with netty
[ https://issues.apache.org/jira/browse/CASSANDRA-13630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587988#comment-16587988 ] Jason Brown commented on CASSANDRA-13630: - Picking this back up after a year, I realize that my previous solution only solved part of the problem. I solved the "don't allocate an enormous buffer" problem, but I was still allocating an "enormous buffer"'s worth of memory at the same time, albeit across multiple buffers. I believe this is ultimately what [~aweisberg]'s concerns with the previous solution encompassed, and I fully agree. Further, the previous patch attempted to do more than just solve the large buffer problem; it also optimized allocating small buffers. With this new insight, that optimization is best left to a separate ticket. Thus, the new solution focuses only on the large buffer problem. The high-level overview of this patch is: - use the existing {{ByteBufDataOutputStreamPlus}} to chunk up the large message into small buffers, and use {{ByteBufDataOutputStreamPlus}}'s existing rate-limiting mechanism to make sure we don't keep too much outstanding data in the channel - rework the inbound side to allow blocking-style message deserialization - refactor to make serialization/deserialization code reusable, along with some cleanup. In order to support both serialization and deserialization of arbitrarily large messages and our blocking-style {{IVersionedSerializers}}, I need to perform those activities on a separate (background) thread. On the outbound side this is achieved with a new {{ChannelWriter}} subclass. On the inbound side, there is a fair bit of refactoring, but the thread for deserialization is in {{MessageInHandler.BlockingBufferHandler}}. Both of these "background threads" are implemented as {{ThreadExecutorServices}} so that if no large messages are being sent or received, the thread can be shut down (saving system resources). 
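The chunking idea above — never materialize one enormous buffer; stream the message through small, bounded buffers — can be sketched as follows (the chunk size and names are illustrative, not the patch's actual values):

```python
def chunk_message(payload: bytes, chunk_size: int = 64 * 1024):
    """Yield a large message as bounded-size chunks, so at most one
    chunk_size buffer is materialized at a time instead of one giant
    allocation (rate limiting would pace how fast chunks are handed
    to the channel)."""
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset:offset + chunk_size]

message = bytes(300 * 1024)  # a 300 KiB "large" message
chunks = list(chunk_message(message))
assert max(len(c) for c in chunks) <= 64 * 1024  # no single huge buffer
assert b"".join(chunks) == message               # reassembles losslessly
```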
On the outbound side, it is easy to know if a specific {{OutboundMessagingConnection}} will be sending large messages, as we can look at its {{OutboundConnectionIdentifier}}. The inbound side does not have that luxury, and my previous patch attempted to do some overly clever things. The simpler solution is to add a flag to the internode messaging protocol that advises the receiving side that the connection will be used for large messages, so the receiver can set up appropriately. We already have a flags section in the internode messaging protocol's header, with many unused bits. Further, peers that are unaware of the new bit flag (i.e. any Cassandra version less than 4.0) will completely ignore the flag as they do not attempt to interpret those bits. Thus, this change is rather safe from a protocol/handshake perspective. In fact, I'd like to backport this protocol change to 3.0 and 3.11 so the flag is sent out on new connections. The flag will be completely ignored on those versions, except that, during a cluster upgrade, when a 3.0 node connects to a 4.0 node, the 4.0 node will know that the connection will contain large messages and can set up the receive side appropriately. Operators would in no way be required to first upgrade to a 3.x minor release containing the new flag before upgrading to 4.0, but it would help make the upgrade to 4.0 smoother from a performance/memory management perspective. The other major aspect of this ticket was a refactoring, mostly to move the serialization/deserialization out of {{MessageOutHandler}}/{{MessageInHandler}} so that logic could be invoked outside of the context of a netty handler. This also allowed me to clean up {{MessageIn}} and {{MessageOut}}. Note: I've eliminated {{BaseMessageInHandler}} and moved the version-specific message parsing into classes derived from the new {{MessageIn.MessageInProcessor}}. 
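Advertising "large messages" via a spare header flag is protocol-safe precisely because peers mask off bits they do not interpret. A minimal sketch of that kind of flag handling (the bit position here is made up for illustration):

```python
LARGE_MESSAGE_FLAG = 1 << 2  # hypothetical spare bit in the header's flags section

def build_flags(large_messages: bool) -> int:
    """Sender side: set the advisory bit when the connection carries large messages."""
    flags = 0
    if large_messages:
        flags |= LARGE_MESSAGE_FLAG
    return flags

def interpret_flags(flags: int, known_mask: int) -> int:
    """Receiver side: a peer only interprets the bits it knows about.
    Unknown bits fall outside known_mask and are silently ignored,
    which is what makes adding the flag safe for older peers."""
    return flags & known_mask

flags = build_flags(large_messages=True)
# A 4.0 peer that knows the bit sees it and can set up its receive side:
assert interpret_flags(flags, known_mask=LARGE_MESSAGE_FLAG) == LARGE_MESSAGE_FLAG
# A pre-4.0 peer whose known mask excludes the bit ignores it entirely:
assert interpret_flags(flags, known_mask=0b11) == 0
```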
{{MessageInHandler}} now determines if it needs to do non-blocking or blocking deserialization, and handles the buffers appropriately. {{MessageInHandler}} now derives from {{ChannelInboundHandlerAdapter}}, so the error handling changed slightly. The refactorings also affected where the unit tests are laid out (corresponding to where the logic/code under test now lives), so I moved things around there, as well. ||13630|| |[branch|https://github.com/jasobrown/cassandra/tree/13630]| |[utests & dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/13630]| I also needed to make a trivial [change to one dtest|https://github.com/jasobrown/cassandra-dtest/tree/13630]. To make the review easier, I've [opened a PR here|https://github.com/apache/cassandra/pull/253]. > support large internode messages with netty > --- > > Key: CASSANDRA-13630 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13630 >
[jira] [Updated] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-10726: Resolution: Fixed Status: Resolved (was: Patch Available) committed to trunk as [644676b088be5177ef1d0cdaf450306ea28d8a12|https://github.com/apache/cassandra/commit/644676b088be5177ef1d0cdaf450306ea28d8a12], thanks > Read repair inserts should not be blocking > -- > > Key: CASSANDRA-10726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10726 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Richard Low >Assignee: Blake Eggleston >Priority: Major > Fix For: 4.x > > > Today, if there’s a digest mismatch in a foreground read repair, the insert > to update out of date replicas is blocking. This means, if it fails, the read > fails with a timeout. If a node is dropping writes (maybe it is overloaded or > the mutation stage is backed up for some other reason), all reads to a > replica set could fail. Further, replicas dropping writes get more out of > sync so will require more read repair. > The comment on the code for why the writes are blocking is: > {code} > // wait for the repair writes to be acknowledged, to minimize impact on any > replica that's > // behind on writes in case the out-of-sync row is read multiple times in > quick succession > {code} > but the bad side effect is that reads timeout. Either the writes should not > be blocking or we should return success for the read even if the write times > out. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: Improve read repair blocking behavior
Repository: cassandra Updated Branches: refs/heads/trunk 994da255c -> 644676b08 Improve read repair blocking behavior Patch by Blake Eggleston; reviewed by Marcus Eriksson and Alex Petrov for CASSANDRA-10726 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/644676b0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/644676b0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/644676b0 Branch: refs/heads/trunk Commit: 644676b088be5177ef1d0cdaf450306ea28d8a12 Parents: 994da25 Author: Blake Eggleston Authored: Mon May 14 14:24:03 2018 -0700 Committer: Blake Eggleston Committed: Tue Aug 21 11:01:10 2018 -0700 -- CHANGES.txt | 1 + .../apache/cassandra/db/ConsistencyLevel.java | 10 +- .../cassandra/db/PartitionRangeReadCommand.java | 1 + .../org/apache/cassandra/db/ReadCommand.java| 1 + .../db/SinglePartitionReadCommand.java | 1 + .../cassandra/metrics/ReadRepairMetrics.java| 5 + .../apache/cassandra/net/MessagingService.java | 1 + .../apache/cassandra/service/StorageProxy.java | 65 +++- .../service/reads/AbstractReadExecutor.java | 40 +- .../reads/repair/BlockingPartitionRepair.java | 243 + .../reads/repair/BlockingReadRepair.java| 229 ++-- .../reads/repair/BlockingReadRepairs.java | 114 ++ .../service/reads/repair/NoopReadRepair.java| 29 +- .../repair/PartitionIteratorMergeListener.java | 13 +- .../service/reads/repair/ReadRepair.java| 44 ++- .../service/reads/repair/RepairListener.java| 34 -- .../reads/repair/RowIteratorMergeListener.java | 31 +- .../service/reads/DataResolverTest.java | 18 +- .../service/reads/repair/ReadRepairTest.java| 361 +++ .../reads/repair/TestableReadRepair.java| 41 ++- 20 files changed, 1046 insertions(+), 236 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9fbaf25..b34979a 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Improve read repair 
blocking behavior (CASSANDRA-10726) * Add a virtual table to expose settings (CASSANDRA-14573) * Fix up chunk cache handling of metrics (CASSANDRA-14628) * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652) http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/src/java/org/apache/cassandra/db/ConsistencyLevel.java -- diff --git a/src/java/org/apache/cassandra/db/ConsistencyLevel.java b/src/java/org/apache/cassandra/db/ConsistencyLevel.java index 8f3a51c..d37da0a 100644 --- a/src/java/org/apache/cassandra/db/ConsistencyLevel.java +++ b/src/java/org/apache/cassandra/db/ConsistencyLevel.java @@ -138,12 +138,20 @@ public enum ConsistencyLevel } } +/** + * Determine if this consistency level meets or exceeds the consistency requirements of the given cl for the given keyspace + */ +public boolean satisfies(ConsistencyLevel other, Keyspace keyspace) +{ +return blockFor(keyspace) >= other.blockFor(keyspace); +} + public boolean isDatacenterLocal() { return isDCLocal; } -public boolean isLocal(InetAddressAndPort endpoint) +public static boolean isLocal(InetAddressAndPort endpoint) { return DatabaseDescriptor.getLocalDataCenter().equals(DatabaseDescriptor.getEndpointSnitch().getDatacenter(endpoint)); } http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java -- diff --git a/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java b/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java index a6641d4..c312acc 100644 --- a/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java +++ b/src/java/org/apache/cassandra/db/PartitionRangeReadCommand.java @@ -24,6 +24,7 @@ import java.util.List; import com.google.common.annotations.VisibleForTesting; import com.google.common.collect.Iterables; +import org.apache.cassandra.dht.Token; import org.apache.cassandra.schema.TableMetadata; import org.apache.cassandra.config.DatabaseDescriptor; import 
org.apache.cassandra.db.filter.*; http://git-wip-us.apache.org/repos/asf/cassandra/blob/644676b0/src/java/org/apache/cassandra/db/ReadCommand.java -- diff --git a/src/java/org/apache/cassandra/db/ReadCommand.java b/src/java/org/apache/cassandra/db/ReadCommand.java index
[jira] [Commented] (CASSANDRA-14592) Reconcile should not be dependent on nowInSec
[ https://issues.apache.org/jira/browse/CASSANDRA-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587746#comment-16587746 ] Aleksey Yeschenko commented on CASSANDRA-14592: --- +1 > Reconcile should not be dependent on nowInSec > - > > Key: CASSANDRA-14592 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14592 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > To have the arrival time of a mutation on a replica determine the > reconciliation priority seems to provide for unintuitive database behaviour. > It seems we should formalise our reconciliation logic in a manner that does > not depend on this, and modify our internal APIs to prevent this dependency. > > Take the following example, where both writes have the same timestamp: > > Write X with a value A, TTL of 1s > Write Y with a value B, no TTL > > If X and Y arrive on replicas in < 1s, X and Y are both live, so record Y > wins the reconciliation. The value B appears in the database. > However, if X and Y arrive on replicas in > 1s, X is now (effectively) a > tombstone. This wins the reconciliation race, and NO value is the result. > > Note that the weirdness of this is more pronounced than it might first > appear. If write X gets stuck in hints for a period on the coordinator to > one replica, the value B appears in the database until the hint is replayed. > So now we’re in a very uncertain state - will hints get replayed or not? If > they do, the value B will disappear; if they don’t it won’t. This is despite > a QUORUM of replicas ACKing both writes, and a QUORUM of readers being > engaged on read; the database still changes state to the user suddenly at > some arbitrary future point in time. 
> > It seems to me that a simple solution to this is to permit TTL’d data to > always win a reconciliation against non-TTL’d data (of the same timestamp), so > that we are consistent across TTLs being transformed into tombstones. > > 4.0 seems like a good opportunity to fix this behaviour, and mention it in > CHANGES.txt. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
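The proposed rule — at equal timestamps, TTL'd data beats non-TTL'd data — makes the reconciliation outcome independent of whether the TTL'd cell has already turned into a tombstone. A hedged sketch of such a reconcile function (a simplified model, not Cassandra's actual cell representation):

```python
def reconcile(a, b):
    """Pick the winning cell; each cell is a (timestamp, value, has_ttl) tuple.

    Proposed rule: higher timestamp wins; at equal timestamps a TTL'd cell
    beats a non-TTL'd one, so the winner doesn't flip when the TTL expires
    (or when a stuck hint is eventually replayed).
    """
    if a[0] != b[0]:
        return a if a[0] > b[0] else b
    if a[2] != b[2]:
        return a if a[2] else b
    return max(a, b)  # arbitrary but deterministic tie-break

x = (100, "A", True)   # write X: TTL'd
y = (100, "B", False)  # write Y: same timestamp, no TTL
# X wins regardless of how long after the writes the reconciliation runs:
assert reconcile(x, y) == x
assert reconcile(y, x) == x
```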
[jira] [Commented] (CASSANDRA-10726) Read repair inserts should not be blocking
[ https://issues.apache.org/jira/browse/CASSANDRA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587730#comment-16587730 ] Marcus Eriksson commented on CASSANDRA-10726: - this lgtm, +1 > Read repair inserts should not be blocking > -- > > Key: CASSANDRA-10726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10726 > Project: Cassandra > Issue Type: Improvement > Components: Coordination >Reporter: Richard Low >Assignee: Blake Eggleston >Priority: Major > Fix For: 4.x > > > Today, if there’s a digest mismatch in a foreground read repair, the insert > to update out of date replicas is blocking. This means, if it fails, the read > fails with a timeout. If a node is dropping writes (maybe it is overloaded or > the mutation stage is backed up for some other reason), all reads to a > replica set could fail. Further, replicas dropping writes get more out of > sync so will require more read repair. > The comment on the code for why the writes are blocking is: > {code} > // wait for the repair writes to be acknowledged, to minimize impact on any > replica that's > // behind on writes in case the out-of-sync row is read multiple times in > quick succession > {code} > but the bad side effect is that reads timeout. Either the writes should not > be blocking or we should return success for the read even if the write times > out. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14659) Disable old protocol versions on demand
[ https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587684#comment-16587684 ] Dinesh Joshi commented on CASSANDRA-14659: -- ||version|| |[branch|https://github.com/dineshjoshi/cassandra/tree/restrict-protocol-version]| |[utests dtests|https://circleci.com/gh/dineshjoshi/workflows/cassandra/tree/restrict-protocol-version]| || > Disable old protocol versions on demand > --- > > Key: CASSANDRA-14659 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14659 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Dinesh Joshi >Assignee: Dinesh Joshi >Priority: Major > Labels: usability > > This patch allows the operators to disable older protocol versions on demand. > To use it, you can set native_transport_honor_older_protocols to false or use > nodetool disableolderprotocolversions. Cassandra will reject requests from > client coming in on any version except the current version. This will help > operators selectively reject connections from clients that do not support the > latest protoocol. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14659) Disable old protocol versions on demand
[ https://issues.apache.org/jira/browse/CASSANDRA-14659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Joshi updated CASSANDRA-14659: - Labels: usability (was: ) Status: Patch Available (was: Open) > Disable old protocol versions on demand > --- > > Key: CASSANDRA-14659 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14659 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Dinesh Joshi >Assignee: Dinesh Joshi >Priority: Major > Labels: usability > > This patch allows the operators to disable older protocol versions on demand. > To use it, you can set native_transport_honor_older_protocols to false or use > nodetool disableolderprotocolversions. Cassandra will reject requests from > client coming in on any version except the current version. This will help > operators selectively reject connections from clients that do not support the > latest protoocol. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14659) Disable old protocol versions on demand
Dinesh Joshi created CASSANDRA-14659: Summary: Disable old protocol versions on demand Key: CASSANDRA-14659 URL: https://issues.apache.org/jira/browse/CASSANDRA-14659 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Dinesh Joshi Assignee: Dinesh Joshi This patch allows operators to disable older protocol versions on demand. To use it, you can set native_transport_honor_older_protocols to false or use nodetool disableolderprotocolversions. Cassandra will reject requests from clients coming in on any version except the current version. This will help operators selectively reject connections from clients that do not support the latest protocol. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)
[ https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587656#comment-16587656 ] Benedict commented on CASSANDRA-13651: -- The latest, i.e. [~burmanm]'s. It simplifies the code, and as I say, I'm not sure the original patch was as well justified as we thought - our benchmarking methodology was flawed at the time (using a single connection). I suppose arguably there's value in maintaining the current behaviour for those users with a single connection and without epoll, but since epoll is now the default it's probably better to improve code clarity. I'm open to dispute, of course, in which case we can revisit Corentin's patch (or try to reproduce the old benchmarks and see what we might be losing in modern C* in the worst case). In this case, I would probably prefer to have a LegacyFlusher and a Flusher - the latter the cleaned code contributed by [~burmanm], the former the old unadulterated code, and to select between them based on the config property (with a default being determined by epoll usage status) > Large amount of CPU used by epoll_wait(.., .., .., 0) > - > > Key: CASSANDRA-13651 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13651 > Project: Cassandra > Issue Type: Bug >Reporter: Corentin Chary >Assignee: Corentin Chary >Priority: Major > Fix For: 4.x > > Attachments: cpu-usage.png > > > I was trying to profile Cassandra under my workload and I kept seeing this > backtrace: > {code} > epollEventLoopGroup-2-3 State: RUNNABLE CPU usage on sample: 240ms > io.netty.channel.epoll.Native.epollWait0(int, long, int, int) Native.java > (native) > io.netty.channel.epoll.Native.epollWait(int, EpollEventArray, int) > Native.java:111 > io.netty.channel.epoll.EpollEventLoop.epollWait(boolean) > EpollEventLoop.java:230 > io.netty.channel.epoll.EpollEventLoop.run() EpollEventLoop.java:254 > io.netty.util.concurrent.SingleThreadEventExecutor$5.run() > SingleThreadEventExecutor.java:858 > 
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run() > DefaultThreadFactory.java:138 > java.lang.Thread.run() Thread.java:745 > {code} > At fist I though that the profiler might not be able to profile native code > properly, but I wen't further and I realized that most of the CPU was used by > {{epoll_wait()}} calls with a timeout of zero. > Here is the output of perf on this system, which confirms that most of the > overhead was with timeout == 0. > {code} > Samples: 11M of event 'syscalls:sys_enter_epoll_wait', Event count (approx.): > 11594448 > Overhead Trace output > > ◆ > 90.06% epfd: 0x0047, events: 0x7f5588c0c000, maxevents: 0x2000, > timeout: 0x > ▒ >5.77% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x > ▒ >1.98% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x03e8 > ▒ >0.04% epfd: 0x0003, events: 0x2f6af77b9c00, maxevents: 0x0020, > timeout: 0x > ▒ >0.04% epfd: 0x002b, events: 0x121ebf63ac00, maxevents: 0x0040, > timeout: 0x > ▒ >0.03% epfd: 0x0026, events: 0x7f51f80019c0, maxevents: 0x0020, > timeout: 0x > ▒ >0.02% epfd: 0x0003, events: 0x7fe4d80019d0, maxevents: 0x0020, > timeout: 0x > {code} > Running this time with perf record -ag for call traces: > {code} > # Children Self sys usr Trace output > > # > > # > 8.61% 8.61% 0.00% 8.61% epfd: 0x00a7, events: > 0x7fca452d6000, maxevents: 0x1000, timeout: 0x > | > ---0x1000200af313 >| >
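The busy-wait the perf output shows follows from a unit mismatch: flusher tasks were scheduled with microsecond-scale deadlines, but epoll timeouts have millisecond resolution, so any sub-millisecond deadline truncates to a zero timeout, i.e. a non-blocking poll. A toy illustration of the truncation (not the actual Netty code):

```python
def epoll_timeout_ms(deadline_ns: int) -> int:
    """epoll_wait takes its timeout in whole milliseconds; any deadline
    under 1 ms truncates to 0, which makes epoll_wait return immediately
    instead of blocking -- repeated in a loop, that is a CPU spin."""
    return deadline_ns // 1_000_000

# A flusher rescheduled with a ~10 microsecond deadline:
assert epoll_timeout_ms(10_000) == 0       # 10 us -> epoll_wait(..., 0): spin
# A millisecond-scale deadline actually blocks:
assert epoll_timeout_ms(2_000_000) == 2    # 2 ms -> a real blocking wait
```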
[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)
[ https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587633#comment-16587633 ] Jeff Jirsa commented on CASSANDRA-13651: There are two patches here, [~iksaif]'s patch and [~burmanm]'s follow-patch, which did you each +1? > Large amount of CPU used by epoll_wait(.., .., .., 0) > - > > Key: CASSANDRA-13651 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13651 > Project: Cassandra > Issue Type: Bug >Reporter: Corentin Chary >Assignee: Corentin Chary >Priority: Major > Fix For: 4.x > > Attachments: cpu-usage.png > > > I was trying to profile Cassandra under my workload and I kept seeing this > backtrace: > {code} > epollEventLoopGroup-2-3 State: RUNNABLE CPU usage on sample: 240ms > io.netty.channel.epoll.Native.epollWait0(int, long, int, int) Native.java > (native) > io.netty.channel.epoll.Native.epollWait(int, EpollEventArray, int) > Native.java:111 > io.netty.channel.epoll.EpollEventLoop.epollWait(boolean) > EpollEventLoop.java:230 > io.netty.channel.epoll.EpollEventLoop.run() EpollEventLoop.java:254 > io.netty.util.concurrent.SingleThreadEventExecutor$5.run() > SingleThreadEventExecutor.java:858 > io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run() > DefaultThreadFactory.java:138 > java.lang.Thread.run() Thread.java:745 > {code} > At fist I though that the profiler might not be able to profile native code > properly, but I wen't further and I realized that most of the CPU was used by > {{epoll_wait()}} calls with a timeout of zero. > Here is the output of perf on this system, which confirms that most of the > overhead was with timeout == 0. 
> {code} > Samples: 11M of event 'syscalls:sys_enter_epoll_wait', Event count (approx.): > 11594448 > Overhead Trace output > > ◆ > 90.06% epfd: 0x0047, events: 0x7f5588c0c000, maxevents: 0x2000, > timeout: 0x > ▒ >5.77% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x > ▒ >1.98% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x03e8 > ▒ >0.04% epfd: 0x0003, events: 0x2f6af77b9c00, maxevents: 0x0020, > timeout: 0x > ▒ >0.04% epfd: 0x002b, events: 0x121ebf63ac00, maxevents: 0x0040, > timeout: 0x > ▒ >0.03% epfd: 0x0026, events: 0x7f51f80019c0, maxevents: 0x0020, > timeout: 0x > ▒ >0.02% epfd: 0x0003, events: 0x7fe4d80019d0, maxevents: 0x0020, > timeout: 0x > {code} > Running this time with perf record -ag for call traces: > {code} > # Children Self sys usr Trace output > > # > > # > 8.61% 8.61% 0.00% 8.61% epfd: 0x00a7, events: > 0x7fca452d6000, maxevents: 0x1000, timeout: 0x > | > ---0x1000200af313 >| > --8.61%--0x7fca6117bdac > 0x7fca60459804 > epoll_wait > 2.98% 2.98% 0.00% 2.98% epfd: 0x00a7, events: > 0x7fca452d6000, maxevents: 0x1000, timeout: 0x03e8 > | > ---0x1000200af313 >0x7fca6117b830 >0x7fca60459804 >epoll_wait > {code} > That looks like a lot of CPU used to wait for nothing. I'm not sure if pref > reports a per-CPU percentage or a per-system percentage, but that would be > still be 10% of the total CPU usage of Cassandra at the minimum. > I went further and found the code of all that: We schedule a lot of > {{Message::Flusher}} with a deadline of 10 usec (5 per messages I think) but > netty+epoll only support
[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)
[ https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587629#comment-16587629 ] Norman Maurer commented on CASSANDRA-13651: --- I am no committer but the netty project lead so from the point of view of netty usage I am also +1 on this. > Large amount of CPU used by epoll_wait(.., .., .., 0) > - > > Key: CASSANDRA-13651 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13651 > Project: Cassandra > Issue Type: Bug >Reporter: Corentin Chary >Assignee: Corentin Chary >Priority: Major > Fix For: 4.x > > Attachments: cpu-usage.png > > > I was trying to profile Cassandra under my workload and I kept seeing this > backtrace: > {code} > epollEventLoopGroup-2-3 State: RUNNABLE CPU usage on sample: 240ms > io.netty.channel.epoll.Native.epollWait0(int, long, int, int) Native.java > (native) > io.netty.channel.epoll.Native.epollWait(int, EpollEventArray, int) > Native.java:111 > io.netty.channel.epoll.EpollEventLoop.epollWait(boolean) > EpollEventLoop.java:230 > io.netty.channel.epoll.EpollEventLoop.run() EpollEventLoop.java:254 > io.netty.util.concurrent.SingleThreadEventExecutor$5.run() > SingleThreadEventExecutor.java:858 > io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run() > DefaultThreadFactory.java:138 > java.lang.Thread.run() Thread.java:745 > {code} > At fist I though that the profiler might not be able to profile native code > properly, but I wen't further and I realized that most of the CPU was used by > {{epoll_wait()}} calls with a timeout of zero. > Here is the output of perf on this system, which confirms that most of the > overhead was with timeout == 0. 
> {code} > Samples: 11M of event 'syscalls:sys_enter_epoll_wait', Event count (approx.): > 11594448 > Overhead Trace output > > ◆ > 90.06% epfd: 0x0047, events: 0x7f5588c0c000, maxevents: 0x2000, > timeout: 0x > ▒ >5.77% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x > ▒ >1.98% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x03e8 > ▒ >0.04% epfd: 0x0003, events: 0x2f6af77b9c00, maxevents: 0x0020, > timeout: 0x > ▒ >0.04% epfd: 0x002b, events: 0x121ebf63ac00, maxevents: 0x0040, > timeout: 0x > ▒ >0.03% epfd: 0x0026, events: 0x7f51f80019c0, maxevents: 0x0020, > timeout: 0x > ▒ >0.02% epfd: 0x0003, events: 0x7fe4d80019d0, maxevents: 0x0020, > timeout: 0x > {code} > Running this time with perf record -ag for call traces: > {code} > # Children Self sys usr Trace output > > # > > # > 8.61% 8.61% 0.00% 8.61% epfd: 0x00a7, events: > 0x7fca452d6000, maxevents: 0x1000, timeout: 0x > | > ---0x1000200af313 >| > --8.61%--0x7fca6117bdac > 0x7fca60459804 > epoll_wait > 2.98% 2.98% 0.00% 2.98% epfd: 0x00a7, events: > 0x7fca452d6000, maxevents: 0x1000, timeout: 0x03e8 > | > ---0x1000200af313 >0x7fca6117b830 >0x7fca60459804 >epoll_wait > {code} > That looks like a lot of CPU used to wait for nothing. I'm not sure if pref > reports a per-CPU percentage or a per-system percentage, but that would be > still be 10% of the total CPU usage of Cassandra at the minimum. > I went further and found the code of all that: We schedule a lot of > {{Message::Flusher}} with a deadline of 10 usec (5 per messages I think) but > netty+epoll only
[jira] [Commented] (CASSANDRA-13651) Large amount of CPU used by epoll_wait(.., .., .., 0)
[ https://issues.apache.org/jira/browse/CASSANDRA-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587624#comment-16587624 ] Benedict commented on CASSANDRA-13651: -- So, I assume we're now defaulting to epoll in most cases, and this behaviour comes from a period when it wasn't the default (and was probably poorly justified at the time - AFAICR we used to benchmark with only a single connection, where this behaviour would be more beneficial). It's a shame we no longer have any standard benchmarking tools for the project, but it seems we have multiple data points demonstrating a win (or no loss), and the code is simpler after the patch. So, I'm +1 on the patch. I will get a circleci run going shortly. > Large amount of CPU used by epoll_wait(.., .., .., 0) > - > > Key: CASSANDRA-13651 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13651 > Project: Cassandra > Issue Type: Bug >Reporter: Corentin Chary >Assignee: Corentin Chary >Priority: Major > Fix For: 4.x > > Attachments: cpu-usage.png > > > I was trying to profile Cassandra under my workload and I kept seeing this > backtrace: > {code} > epollEventLoopGroup-2-3 State: RUNNABLE CPU usage on sample: 240ms > io.netty.channel.epoll.Native.epollWait0(int, long, int, int) Native.java > (native) > io.netty.channel.epoll.Native.epollWait(int, EpollEventArray, int) > Native.java:111 > io.netty.channel.epoll.EpollEventLoop.epollWait(boolean) > EpollEventLoop.java:230 > io.netty.channel.epoll.EpollEventLoop.run() EpollEventLoop.java:254 > io.netty.util.concurrent.SingleThreadEventExecutor$5.run() > SingleThreadEventExecutor.java:858 > io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run() > DefaultThreadFactory.java:138 > java.lang.Thread.run() Thread.java:745 > {code} > At first I thought that the profiler might not be able to profile native code > properly, but I went further and realized that most of the CPU was used by > {{epoll_wait()}} calls with a timeout 
of zero. > Here is the output of perf on this system, which confirms that most of the > overhead was with timeout == 0. > {code} > Samples: 11M of event 'syscalls:sys_enter_epoll_wait', Event count (approx.): > 11594448 > Overhead Trace output > > ◆ > 90.06% epfd: 0x0047, events: 0x7f5588c0c000, maxevents: 0x2000, > timeout: 0x > ▒ >5.77% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x > ▒ >1.98% epfd: 0x00b5, events: 0x7fca419ef000, maxevents: 0x1000, > timeout: 0x03e8 > ▒ >0.04% epfd: 0x0003, events: 0x2f6af77b9c00, maxevents: 0x0020, > timeout: 0x > ▒ >0.04% epfd: 0x002b, events: 0x121ebf63ac00, maxevents: 0x0040, > timeout: 0x > ▒ >0.03% epfd: 0x0026, events: 0x7f51f80019c0, maxevents: 0x0020, > timeout: 0x > ▒ >0.02% epfd: 0x0003, events: 0x7fe4d80019d0, maxevents: 0x0020, > timeout: 0x > {code} > Running this time with perf record -ag for call traces: > {code} > # Children Self sys usr Trace output > > # > > # > 8.61% 8.61% 0.00% 8.61% epfd: 0x00a7, events: > 0x7fca452d6000, maxevents: 0x1000, timeout: 0x > | > ---0x1000200af313 >| > --8.61%--0x7fca6117bdac > 0x7fca60459804 > epoll_wait > 2.98% 2.98% 0.00% 2.98% epfd: 0x00a7, events: > 0x7fca452d6000, maxevents: 0x1000, timeout: 0x03e8 > | > ---0x1000200af313 >0x7fca6117b830 >
[jira] [Updated] (CASSANDRA-14592) Reconcile should not be dependent on nowInSec
[ https://issues.apache.org/jira/browse/CASSANDRA-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-14592: -- Reviewer: Aleksey Yeschenko > Reconcile should not be dependent on nowInSec > - > > Key: CASSANDRA-14592 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14592 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > To have the arrival time of a mutation on a replica determine the > reconciliation priority seems to provide for unintuitive database behaviour. > It seems we should formalise our reconciliation logic in a manner that does > not depend on this, and modify our internal APIs to prevent this dependency. > > Take the following example, where both writes have the same timestamp: > > Write X with a value A, TTL of 1s > Write Y with a value B, no TTL > > If X and Y arrive on replicas in < 1s, X and Y are both live, so record Y > wins the reconciliation. The value B appears in the database. > However, if X and Y arrive on replicas in > 1s, X is now (effectively) a > tombstone. This wins the reconciliation race, and NO value is the result. > > Note that the weirdness of this is more pronounced than it might first > appear. If write X gets stuck in hints for a period on the coordinator to > one replica, the value B appears in the database until the hint is replayed. > So now we’re in a very uncertain state - will hints get replayed or not? If > they do, the value B will disappear; if they don’t it won’t. This is despite > a QUORUM of replicas ACKing both writes, and a QUORUM of readers being > engaged on read; the database still changes state to the user suddenly at > some arbitrary future point in time. > > It seems to me that a simple solution to this, is to permit TTL’d data to > always win a reconciliation against non-TTL’d data (of same timestamp), so > that we are consistent across TTLs being transformed into tombstones. 
> > 4.0 seems like a good opportunity to fix this behaviour, and mention in > CHANGES.txt. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14592) Reconcile should not be dependent on nowInSec
[ https://issues.apache.org/jira/browse/CASSANDRA-14592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587593#comment-16587593 ] Benedict commented on CASSANDRA-14592: -- Pushed an update that addresses (I think, it's been a while) Aleksey's offline review comments. We collaborated to modify the reconcile semantics a little further, so that reconciliation is as consistent as possible. Now the only situations that might arise with inconsistent reconciliation occur when one cell is expiring, another is a tombstone, and only at the point where both are logically a tombstone. Specifically, we now prefer: # The most recent timestamp # If either are a tombstone or expiring ## If one is regular, select the tombstone or expiring ## If one is expiring, select the tombstone ## The most recent deletion time # The highest value (by raw ByteBuffer comparison) > Reconcile should not be dependent on nowInSec > - > > Key: CASSANDRA-14592 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14592 > Project: Cassandra > Issue Type: Bug >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 4.0 > > > To have the arrival time of a mutation on a replica determine the > reconciliation priority seems to provide for unintuitive database behaviour. > It seems we should formalise our reconciliation logic in a manner that does > not depend on this, and modify our internal APIs to prevent this dependency. > > Take the following example, where both writes have the same timestamp: > > Write X with a value A, TTL of 1s > Write Y with a value B, no TTL > > If X and Y arrive on replicas in < 1s, X and Y are both live, so record Y > wins the reconciliation. The value B appears in the database. > However, if X and Y arrive on replicas in > 1s, X is now (effectively) a > tombstone. This wins the reconciliation race, and NO value is the result. > > Note that the weirdness of this is more pronounced than it might first > appear. 
If write X gets stuck in hints for a period on the coordinator to > one replica, the value B appears in the database until the hint is replayed. > So now we’re in a very uncertain state - will hints get replayed or not? If > they do, the value B will disappear; if they don’t it won’t. This is despite > a QUORUM of replicas ACKing both writes, and a QUORUM of readers being > engaged on read; the database still changes state to the user suddenly at > some arbitrary future point in time. > > It seems to me that a simple solution to this, is to permit TTL’d data to > always win a reconciliation against non-TTL’d data (of same timestamp), so > that we are consistent across TTLs being transformed into tombstones. > > 4.0 seems like a good opportunity to fix this behaviour, and mention in > CHANGES.txt. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
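The final tie-break order described in the comment above can be sketched as plain code. This is a hedged illustration only — the cell representation and names below are invented for the example, not Cassandra's actual {{Cell}} reconcile API.

```python
# Rank cell kinds for rule 2: tombstone beats expiring beats regular.
KIND_ORDER = {"regular": 0, "expiring": 1, "tombstone": 2}

def reconcile(a, b):
    """Pick the winning cell. Cells are (timestamp, kind, deletion_time, value)."""
    # 1. The most recent write timestamp wins outright.
    if a[0] != b[0]:
        return a if a[0] > b[0] else b
    # 2. If the kinds differ, prefer tombstone over expiring over regular.
    if a[1] != b[1]:
        return a if KIND_ORDER[a[1]] > KIND_ORDER[b[1]] else b
    # 2b. Among tombstones/expiring cells, prefer the most recent deletion time.
    if a[1] != "regular" and a[2] != b[2]:
        return a if a[2] > b[2] else b
    # 3. Fall back to the highest value by raw byte comparison.
    return a if a[3] >= b[3] else b

# The ticket's example: same timestamp, one TTL'd write and one plain write.
ttl_write = (100, "expiring", 160, b"A")
plain_write = (100, "regular", 0, b"B")
winner = reconcile(ttl_write, plain_write)
```

With this ordering the TTL'd write wins regardless of when the cells arrive on a replica, so the outcome no longer flips depending on whether the expiring cell has already become a tombstone.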
[jira] [Commented] (CASSANDRA-14573) Expose settings in virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-14573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587566#comment-16587566 ] Aleksey Yeschenko commented on CASSANDRA-14573: --- Committed the end version with some very minor tweaks on top as [994da255cec95982f52d20c91cb18eb7f9e45fc3|https://github.com/apache/cassandra/commit/994da255cec95982f52d20c91cb18eb7f9e45fc3] to 4.0. Cheers. > Expose settings in virtual table > > > Key: CASSANDRA-14573 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14573 > Project: Cassandra > Issue Type: New Feature >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Minor > Labels: pull-request-available, virtual-tables > Fix For: 4.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Allow both viewing what the settings are (currently impossible for some) and > allow changing some settings. > Example: > {code:java} > UPDATE system_info.settings SET value = 'false' WHERE setting = > 'hinted_handoff_enabled'; > SELECT * FROM system_info.settings WHERE writable = True; > setting | value | writable > --++-- > batch_size_fail_threshold_in_kb | 50 | True > batch_size_warn_threshold_in_kb | 5 | True > cas_contention_timeout_in_ms | 1000 | True > compaction_throughput_mb_per_sec | 16 | True > concurrent_compactors | 2 | True >concurrent_validations | 2147483647 | True > counter_write_request_timeout_in_ms | 5000 | True >hinted_handoff_enabled | false | True > hinted_handoff_throttle_in_kb | 1024 | True > incremental_backups | false | True > inter_dc_stream_throughput_outbound_megabits_per_sec |200 | True > phi_convict_threshold |8.0 | True > range_request_timeout_in_ms | 1 | True >read_request_timeout_in_ms | 5000 | True > request_timeout_in_ms | 1 | True > stream_throughput_outbound_megabits_per_sec |200 | True > tombstone_failure_threshold | 10 | True > tombstone_warn_threshold | 1000 | True >truncate_request_timeout_in_ms | 6 | True > write_request_timeout_in_ms | 2000 | > True{code} -- This 
message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14573) Expose settings in virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-14573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-14573: -- Resolution: Fixed Fix Version/s: (was: 4.x) 4.0 Status: Resolved (was: Patch Available) > Expose settings in virtual table > > > Key: CASSANDRA-14573 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14573 > Project: Cassandra > Issue Type: New Feature >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Minor > Labels: pull-request-available, virtual-tables > Fix For: 4.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Allow both viewing what the settings are (currently impossible for some) and > allow changing some settings. > Example: > {code:java} > UPDATE system_info.settings SET value = 'false' WHERE setting = > 'hinted_handoff_enabled'; > SELECT * FROM system_info.settings WHERE writable = True; > setting | value | writable > --++-- > batch_size_fail_threshold_in_kb | 50 | True > batch_size_warn_threshold_in_kb | 5 | True > cas_contention_timeout_in_ms | 1000 | True > compaction_throughput_mb_per_sec | 16 | True > concurrent_compactors | 2 | True >concurrent_validations | 2147483647 | True > counter_write_request_timeout_in_ms | 5000 | True >hinted_handoff_enabled | false | True > hinted_handoff_throttle_in_kb | 1024 | True > incremental_backups | false | True > inter_dc_stream_throughput_outbound_megabits_per_sec |200 | True > phi_convict_threshold |8.0 | True > range_request_timeout_in_ms | 1 | True >read_request_timeout_in_ms | 5000 | True > request_timeout_in_ms | 1 | True > stream_throughput_outbound_megabits_per_sec |200 | True > tombstone_failure_threshold | 10 | True > tombstone_warn_threshold | 1000 | True >truncate_request_timeout_in_ms | 6 | True > write_request_timeout_in_ms | 2000 | > True{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: 
commits-h...@cassandra.apache.org
cassandra git commit: Add a virtual table to expose settings [Forced Update!]
Repository: cassandra Updated Branches: refs/heads/trunk e7e0e0b23 -> 994da255c (forced update) Add a virtual table to expose settings patch by Chris Lohfink; reviewed by Aleksey Yeschenko for CASSANDRA-14573 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/994da255 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/994da255 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/994da255 Branch: refs/heads/trunk Commit: 994da255cec95982f52d20c91cb18eb7f9e45fc3 Parents: 6d1446f Author: Chris Lohfink Authored: Tue Aug 14 14:05:12 2018 -0500 Committer: Aleksey Yeshchenko Committed: Tue Aug 21 16:08:47 2018 +0100 -- CHANGES.txt | 1 + .../cassandra/db/virtual/SettingsTable.java | 189 ++ .../db/virtual/SystemViewsKeyspace.java | 1 + .../cassandra/db/virtual/SettingsTableTest.java | 245 +++ 4 files changed, 436 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/994da255/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index aeaf8ce..9fbaf25 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Add a virtual table to expose settings (CASSANDRA-14573) * Fix up chunk cache handling of metrics (CASSANDRA-14628) * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652) * Incomplete handling of exceptions when decoding incoming messages (CASSANDRA-14574) http://git-wip-us.apache.org/repos/asf/cassandra/blob/994da255/src/java/org/apache/cassandra/db/virtual/SettingsTable.java -- diff --git a/src/java/org/apache/cassandra/db/virtual/SettingsTable.java b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java new file mode 100644 index 000..34debc6 --- /dev/null +++ b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java @@ -0,0 +1,189 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.cassandra.db.virtual; + +import java.lang.reflect.Field; +import java.lang.reflect.Modifier; +import java.util.Arrays; +import java.util.Map; +import java.util.function.BiConsumer; +import java.util.stream.Collectors; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Functions; +import com.google.common.base.Preconditions; +import com.google.common.collect.ImmutableMap; + +import org.apache.cassandra.audit.AuditLogOptions; +import org.apache.cassandra.config.*; +import org.apache.cassandra.db.DecoratedKey; +import org.apache.cassandra.db.marshal.UTF8Type; +import org.apache.cassandra.dht.LocalPartitioner; +import org.apache.cassandra.schema.TableMetadata; +import org.apache.cassandra.transport.ServerError; + +final class SettingsTable extends AbstractVirtualTable +{ +private static final String NAME = "name"; +private static final String VALUE = "value"; + +@VisibleForTesting +static final Map<String, Field> FIELDS = +Arrays.stream(Config.class.getFields()) + .filter(f -> !Modifier.isStatic(f.getModifiers())) + .collect(Collectors.toMap(Field::getName, Functions.identity())); + +@VisibleForTesting +final Map<String, BiConsumer<SimpleDataSet, Field>> overrides = +ImmutableMap.<String, BiConsumer<SimpleDataSet, Field>>builder() +.put("audit_logging_options", this::addAuditLoggingOptions) 
+.put("client_encryption_options", this::addEncryptionOptions) +.put("server_encryption_options", this::addEncryptionOptions) +.put("transparent_data_encryption_options", this::addTransparentEncryptionOptions) +.build(); + +private final Config config; + +SettingsTable(String keyspace) +{ +this(keyspace, DatabaseDescriptor.getRawConfig()); +} + +SettingsTable(String keyspace, Config config) +{ +super(TableMetadata.builder(keyspace, "settings") + .comment("current settings")
cassandra git commit: Add a virtual table to expose settings (CASSANDRA-14573)
Repository: cassandra Updated Branches: refs/heads/trunk 6d1446ff0 -> e7e0e0b23 Add a virtual table to expose settings (CASSANDRA-14573) patch by Chris Lohfink; reviewed by Aleksey Yeschenko for CASSANDRA-14573 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e7e0e0b2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e7e0e0b2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e7e0e0b2 Branch: refs/heads/trunk Commit: e7e0e0b233acf3584147435d0989a9e2474c09e4 Parents: 6d1446f Author: Chris Lohfink Authored: Tue Aug 14 14:05:12 2018 -0500 Committer: Aleksey Yeshchenko Committed: Tue Aug 21 15:57:40 2018 +0100 -- CHANGES.txt | 1 + .../cassandra/db/virtual/SettingsTable.java | 189 ++ .../db/virtual/SystemViewsKeyspace.java | 1 + .../cassandra/db/virtual/SettingsTableTest.java | 245 +++ 4 files changed, 436 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e7e0e0b2/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index aeaf8ce..9fbaf25 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Add a virtual table to expose settings (CASSANDRA-14573) * Fix up chunk cache handling of metrics (CASSANDRA-14628) * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652) * Incomplete handling of exceptions when decoding incoming messages (CASSANDRA-14574) http://git-wip-us.apache.org/repos/asf/cassandra/blob/e7e0e0b2/src/java/org/apache/cassandra/db/virtual/SettingsTable.java -- diff --git a/src/java/org/apache/cassandra/db/virtual/SettingsTable.java b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java new file mode 100644 index 000..34debc6 --- /dev/null +++ b/src/java/org/apache/cassandra/db/virtual/SettingsTable.java @@ -0,0 +1,189 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.cassandra.db.virtual; + +import java.lang.reflect.Field; +import java.lang.reflect.Modifier; +import java.util.Arrays; +import java.util.Map; +import java.util.function.BiConsumer; +import java.util.stream.Collectors; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Functions; +import com.google.common.base.Preconditions; +import com.google.common.collect.ImmutableMap; + +import org.apache.cassandra.audit.AuditLogOptions; +import org.apache.cassandra.config.*; +import org.apache.cassandra.db.DecoratedKey; +import org.apache.cassandra.db.marshal.UTF8Type; +import org.apache.cassandra.dht.LocalPartitioner; +import org.apache.cassandra.schema.TableMetadata; +import org.apache.cassandra.transport.ServerError; + +final class SettingsTable extends AbstractVirtualTable +{ +private static final String NAME = "name"; +private static final String VALUE = "value"; + +@VisibleForTesting +static final Map<String, Field> FIELDS = +Arrays.stream(Config.class.getFields()) + .filter(f -> !Modifier.isStatic(f.getModifiers())) + .collect(Collectors.toMap(Field::getName, Functions.identity())); + +@VisibleForTesting +final Map<String, BiConsumer<SimpleDataSet, Field>> overrides = +ImmutableMap.<String, BiConsumer<SimpleDataSet, Field>>builder() +.put("audit_logging_options", this::addAuditLoggingOptions) 
+.put("client_encryption_options", this::addEncryptionOptions) +.put("server_encryption_options", this::addEncryptionOptions) +.put("transparent_data_encryption_options", this::addTransparentEncryptionOptions) +.build(); + +private final Config config; + +SettingsTable(String keyspace) +{ +this(keyspace, DatabaseDescriptor.getRawConfig()); +} + +SettingsTable(String keyspace, Config config) +{ +super(TableMetadata.builder(keyspace, "settings") + .comment("current
[jira] [Commented] (CASSANDRA-9989) Optimise BTree.Builder
[ https://issues.apache.org/jira/browse/CASSANDRA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587507#comment-16587507 ] Benedict commented on CASSANDRA-9989: - Thanks Jay. The patch looks good overall, but I have a few improvement comments, and some questions: # How did you arrive at your {{childrenNum}} calculation, and are we certain it is correct? This is pretty critical for correctness, and hard to test fully, so it would be nice to have some comments justifying it. # Why decrement {{left}} instead of just counting up the number of values written? # Why is TREE_SIZE indexed from 1, not 0? # It would be nice if we removed MAX_DEPTH, and just truncated TREE_SIZE to the correct maximum in our static block. I'm also torn on the splitting of the last two nodes - this is consistent with the current {{NodeBuilder}} logic, but it does complicate the code a little versus evenly splitting the remaining size amongst all the children. I've pushed a patch [here|https://github.com/belliottsmith/cassandra/tree/9989-suggest] with some tweaks to the {{LongBTreeTest}} to stress the new code paths more, and it would be great if we could run this against the final patch for a while, with a tweak to the parameters to further increase the ratio of tests that use this code path. > Optimise BTree.Builder > - > > Key: CASSANDRA-9989 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9989 > Project: Cassandra > Issue Type: Sub-task >Reporter: Benedict >Assignee: Jay Zhuang >Priority: Minor > Fix For: 4.x > > Attachments: 9989-trunk.txt > > > BTree.Builder could reduce its copying, and exploit toArray more efficiently, > with some work. It's not very important right now because we don't make as > much use of its bulk-add methods as we otherwise might; however, over time > this work will become more useful. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
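The alternative Benedict mentions — evenly splitting the remaining size amongst all the children rather than special-casing the last two nodes — amounts to a simple distribution calculation. A hypothetical sketch (not the actual {{NodeBuilder}} logic; names are invented for illustration):

```python
def even_split(n, k):
    """Distribute n remaining values across k children as evenly as possible.

    Each child gets floor(n / k) values, and the first (n mod k) children
    get one extra, so child sizes never differ by more than one.
    """
    base, extra = divmod(n, k)
    return [base + (1 if i < extra else 0) for i in range(k)]

# e.g. 10 remaining values over 3 children -> sizes [4, 3, 3]
sizes = even_split(10, 3)
```

Splitting this way keeps all leaf sizes within one of each other, at the cost of diverging from the existing last-two-nodes behaviour.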
[jira] [Commented] (CASSANDRA-14409) Transient Replication: Support ring changes when transient replication is in use (add/remove node, change RF, add/remove DC)
[ https://issues.apache.org/jira/browse/CASSANDRA-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587397#comment-16587397 ] Alex Petrov commented on CASSANDRA-14409: - [Link for compare|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4], may be useful (at least until rebase). Some preliminary comments (from looking at the code and running unit tests, without writing additional dtests so far): * [here|https://github.com/aweisberg/cassandra/commit/62578c85865c474774633f5337affa8c2ce0eb07#diff-350352a1ac9b039efbee2eeb8978a9c9R149] we can do just one iteration and add to one of the sets depending on whether the range is full or transient; a similar change can be done in {{RangeStreamer#fetchAsync}} with {{remaining}} and {{skipped}}. Generally, {{fetchAsync}} uses two-step logic that is quite hard to follow, since it becomes hard to track which ranges are actually making it to the end, and it does quite a bit of unnecessary iteration. * [here|https://github.com/aweisberg/cassandra/commit/62578c85865c474774633f5337affa8c2ce0eb07#diff-d4e3b82e9bebfd2cb466b4a30af07fa4R1132], we can simplify the statement to {{... !needsCleanupTransient || !sstable.isRepaired()}} * [this|https://github.com/apache/cassandra/compare/trunk...aweisberg:14409-4#diff-fad052638059f53b1a6d479dbd05f2f2R323] might just use {{Replica#isLocal}}. Also, naming is inconsistent with {{isNotAlive}}, "is" is missing. Same inconsistency with {{endpointNotReplicatedAnymore}}. * maybe use a single debug statement [here|https://github.com/apache/cassandra/compare/trunk...aweisberg:14409-4#diff-fad052638059f53b1a6d479dbd05f2f2R299] * Might be worth removing the {{printlns}} from {{MoveTransientTest}} * Should we stick to {{workMap}} or {{rangeMap}} in {{RangeStreamer}}? 
* [here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-fad052638059f53b1a6d479dbd05f2f2R219] we might want to avoid iteration if tracing is not enabled * in RangeStreamer, we keep calling {{Keyspace#open}} many times even though we could do it fewer times and pass a keyspace instance where applicable * duplicate import [here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-a4c2ce49cb3a3429a8d6376929a90f7fR33] * [this|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-433b489a9a55c01dc4b021b012af3af6R59] might need an extra comment, since this task is used in repair as well as in streaming. Same goes for [here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-eb7f85ba9cf3e87d842aad8b82557d19R82]. It's not completely transparent why we strip node information and transientness. I'd rather fix the code that prevents us from not doing it, or just use ranges if this information is not important. * [here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-ce3f6856b405c96859d9a50d9977e0b9R1285] (and in some other cases where pair is used), I'd just use a specialised Pair class that would give a more precise definition of what left and right are. Namely: transient and full. * do we need to update legacy transferred ranges [here|https://github.com/aweisberg/cassandra/compare/2a97227550af593e63583fb1336cb94d746d1838...14409-4#diff-ce3f6856b405c96859d9a50d9977e0b9R1322]? However, this is there after [59b5b6bef0fa76bf5740b688fcd4d9cf525760d0], so not directly related to this commit. * the optimised range fetch map calculation feels not fully done. We need to change the {{RangeFetchMapCalculator}} itself instead of unwrapping and then re-wrapping items in it. What it currently does looks very fragile. 
* {{RangeStreamer#calculateRangesToFetchWithPreferredEndpoints}} also looks unfinished, since it returns {{ReplicaMultimap}}, but we always convert it to {{Multimap>}} in {{StorageService#calculateRangesToFetchWithPreferredEndpoints}} (same name, different return type, which makes it additionally difficult to follow), and then back to {{RangeFetchMapCalculator}} and {{convertPreferredEndpointsToWorkMap}}. * it seems that we keep splitting full and transient replicas in a bunch of places, maybe we should add some auxiliary method to perform this job efficiently with a clear interface and use it everywhere? * in {{StorageService}}, use of {{Map.Entry}} simultaneously with {{Pair}} really complicates the logic. Once again hints we might need a specialised class for replica source/destination pairs. * in multiple places ({{StorageService}} and {{PendingRepairManager}}), we're still calling {{getAddressReplicas}} that materialises map only to get {{ReplicaSet}} by address * it looks like the distinction between transient and full
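The first review point above — building the full and transient sets in a single iteration rather than two filter passes — can be sketched generically. The names and dict-based replica model here are hypothetical, not the actual {{RangeStreamer}}/{{Replica}} types:

```python
def partition_replicas(replicas):
    """Single pass over replicas, routing each into the full or transient set
    depending on its replication mode (instead of filtering the list twice)."""
    full, transient = [], []
    for r in replicas:
        (full if r["full"] else transient).append(r)
    return full, transient

full, transient = partition_replicas([
    {"endpoint": "10.0.0.1", "full": True},
    {"endpoint": "10.0.0.2", "full": False},
    {"endpoint": "10.0.0.3", "full": True},
])
```

One pass keeps the routing decision in a single place, which also makes it easier to see which replicas end up in which set.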
[jira] [Updated] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-14626: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Expose buffer cache metrics in caches virtual table > --- > > Key: CASSANDRA-14626 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14626 > Project: Cassandra > Issue Type: New Feature >Reporter: Benjamin Lerer >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache > metrics in the caches virtual table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587360#comment-16587360 ] Aleksey Yeschenko commented on CASSANDRA-14626: --- Committed as [6d1446ff062ac322b203e16ea0bf0ed8fd1fa5ca|https://github.com/apache/cassandra/commit/6d1446ff062ac322b203e16ea0bf0ed8fd1fa5ca] to 4.0, thanks. > Expose buffer cache metrics in caches virtual table > --- > > Key: CASSANDRA-14626 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14626 > Project: Cassandra > Issue Type: New Feature >Reporter: Benjamin Lerer >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache > metrics in the caches virtual table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: Add chunks cache metrics to caches virtual table
Repository: cassandra Updated Branches: refs/heads/trunk e6b8e7a72 -> 6d1446ff0 Add chunks cache metrics to caches virtual table patch by Aleksey Yeschenko; reviewed by Benedict Elliott Smith for CASSANDRA-14626 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6d1446ff Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6d1446ff Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6d1446ff Branch: refs/heads/trunk Commit: 6d1446ff062ac322b203e16ea0bf0ed8fd1fa5ca Parents: e6b8e7a Author: Aleksey Yeshchenko Authored: Wed Aug 8 16:51:11 2018 +0100 Committer: Aleksey Yeshchenko Committed: Tue Aug 21 13:35:41 2018 +0100 -- CHANGES.txt | 2 +- src/java/org/apache/cassandra/db/virtual/CachesTable.java | 5 + 2 files changed, 6 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6d1446ff/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 5fa28f5..aeaf8ce 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -12,7 +12,7 @@ * Remove hardcoded java11 jvm args in idea workspace files (CASSANDRA-14627) * Update netty to 4.1.128 (CASSANDRA-14633) * Add a virtual table to expose thread pools (CASSANDRA-14523) - * Add a virtual table to expose caches (CASSANDRA-14538) + * Add a virtual table to expose caches (CASSANDRA-14538, CASSANDRA-14626) * Fix toDate function for timestamp arguments (CASSANDRA-14502) * Revert running dtests by default in circleci (CASSANDRA-14614) * Stream entire SSTables when possible (CASSANDRA-14556) http://git-wip-us.apache.org/repos/asf/cassandra/blob/6d1446ff/src/java/org/apache/cassandra/db/virtual/CachesTable.java -- diff --git a/src/java/org/apache/cassandra/db/virtual/CachesTable.java b/src/java/org/apache/cassandra/db/virtual/CachesTable.java index e5f80f7..5a265e6 100644 --- a/src/java/org/apache/cassandra/db/virtual/CachesTable.java +++ b/src/java/org/apache/cassandra/db/virtual/CachesTable.java @@ -17,6 +17,7 @@ */ 
package org.apache.cassandra.db.virtual;

+import org.apache.cassandra.cache.ChunkCache;
 import org.apache.cassandra.db.marshal.*;
 import org.apache.cassandra.dht.LocalPartitioner;
 import org.apache.cassandra.metrics.CacheMetrics;
@@ -69,9 +70,13 @@ final class CachesTable extends AbstractVirtualTable
     public DataSet data()
     {
         SimpleDataSet result = new SimpleDataSet(metadata());
+
+        if (null != ChunkCache.instance)
+            addRow(result, "chunks", ChunkCache.instance.metrics);
         addRow(result, "counters", CacheService.instance.counterCache.getMetrics());
         addRow(result, "keys", CacheService.instance.keyCache.getMetrics());
         addRow(result, "rows", CacheService.instance.rowCache.getMetrics());
+
         return result;
     }
 }

- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587318#comment-16587318 ] Benedict commented on CASSANDRA-14626: -- +1 with no comments to the actual intended patch here (i.e. that builds on the CASSANDRA-14628 patch) > Expose buffer cache metrics in caches virtual table > --- > > Key: CASSANDRA-14626 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14626 > Project: Cassandra > Issue Type: New Feature >Reporter: Benjamin Lerer >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache > metrics in the caches virtual table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587316#comment-16587316 ] Aleksey Yeschenko commented on CASSANDRA-14628: --- Cheers, committed as [e6b8e7a72f783ed0e1b5a2c04381f89b533229a4|https://github.com/apache/cassandra/commit/e6b8e7a72f783ed0e1b5a2c04381f89b533229a4] to 4.0. > Clean up cache-related metrics > -- > > Key: CASSANDRA-14628 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14628 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate > of pre-existing {{CacheMetrics}}. I believe it was done initially because the > authors thought there was no way to register hits with {{Caffeine}}, only > misses, but that's not quite true. All we need is to provide a > {{StatsCounter}} object when building the cache and update our metrics from > there. > The patch removes the redundant code and streamlines chunk cache metrics to > use more idiomatic tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
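The idea in the CASSANDRA-14628 description — have the cache report both hits and misses into one recorder, so a separate miss-only metrics class becomes redundant, and derive "requests" as hits + misses — can be sketched without any external dependencies. The interface and class names below are illustrative stand-ins, not the real Caffeine or Cassandra APIs.

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative stand-in for a Caffeine-style stats recorder; the real
// StatsCounter interface has more methods (load times, evictions, etc.).
interface StatsCounter
{
    void recordHits(int count);
    void recordMisses(int count);
}

public class ChunkCacheMetricsSketch implements StatsCounter
{
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    public void recordHits(int count)   { hits.add(count); }
    public void recordMisses(int count) { misses.add(count); }

    // "requests" is derived from hits + misses instead of being marked
    // separately, mirroring the Metered-proxy suggestion in the review.
    public long requests() { return hits.sum() + misses.sum(); }

    public double hitRate()
    {
        long r = requests();
        return r == 0 ? Double.NaN : (double) hits.sum() / r;
    }

    public static void main(String[] args)
    {
        ChunkCacheMetricsSketch m = new ChunkCacheMetricsSketch();
        m.recordHits(3);
        m.recordMisses(1);
        System.out.println(m.requests() + " " + m.hitRate()); // prints "4 0.75"
    }
}
```

In the actual patch the recorder is handed to the cache builder (`recordStats(() -> metrics)` in the commit below), so the cache itself drives both counters.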
[jira] [Updated] (CASSANDRA-14628) Clean up cache-related metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-14628: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Clean up cache-related metrics > -- > > Key: CASSANDRA-14628 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14628 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate > of pre-existing {{CacheMetrics}}. I believe it was done initially because the > authors thought there was no way to register hits with {{Caffeine}}, only > misses, but that's not quite true. All we need is to provide a > {{StatsCounter}} object when building the cache and update our metrics from > there. > The patch removes the redundant code and streamlines chunk cache metrics to > use more idiomatic tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: Fix up chunk cache handling of metrics
Repository: cassandra
Updated Branches: refs/heads/trunk 8b9515bd2 -> e6b8e7a72

Fix up chunk cache handling of metrics

patch by Aleksey Yeschenko; reviewed by Benedict Elliott Smith for CASSANDRA-14628

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e6b8e7a7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e6b8e7a7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e6b8e7a7

Branch: refs/heads/trunk
Commit: e6b8e7a72f783ed0e1b5a2c04381f89b533229a4
Parents: 8b9515b
Author: Aleksey Yeshchenko
Authored: Wed Aug 8 15:17:26 2018 +0100
Committer: Aleksey Yeshchenko
Committed: Tue Aug 21 12:33:42 2018 +0100

--
 CHANGES.txt                                     |   1 +
 .../org/apache/cassandra/cache/ChunkCache.java  |  36 +++---
 .../cassandra/cache/InstrumentingCache.java     |   3 +-
 .../apache/cassandra/metrics/CacheMetrics.java  | 116 +++
 .../cassandra/metrics/CacheMissMetrics.java     | 114 --
 .../cassandra/metrics/ChunkCacheMetrics.java    |  92 +++
 6 files changed, 179 insertions(+), 183 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e6b8e7a7/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index d906879..5fa28f5 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Fix up chunk cache handling of metrics (CASSANDRA-14628)
 * Extend IAuthenticator to accept peer SSL certificates (CASSANDRA-14652)
 * Incomplete handling of exceptions when decoding incoming messages (CASSANDRA-14574)
 * Add diagnostic events for user audit logging (CASSANDRA-13668)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e6b8e7a7/src/java/org/apache/cassandra/cache/ChunkCache.java
--
diff --git a/src/java/org/apache/cassandra/cache/ChunkCache.java b/src/java/org/apache/cassandra/cache/ChunkCache.java
index 9284377..0edb681 100644
--- a/src/java/org/apache/cassandra/cache/ChunkCache.java
+++ b/src/java/org/apache/cassandra/cache/ChunkCache.java
@@ -29,11 +29,10 @@ import com.google.common.collect.Iterables;
 import com.google.common.util.concurrent.MoreExecutors;
 import com.github.benmanes.caffeine.cache.*;
-import com.codahale.metrics.Timer;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.io.sstable.CorruptSSTableException;
 import org.apache.cassandra.io.util.*;
-import org.apache.cassandra.metrics.CacheMissMetrics;
+import org.apache.cassandra.metrics.ChunkCacheMetrics;
 import org.apache.cassandra.utils.memory.BufferPool;

 public class ChunkCache
@@ -47,7 +46,7 @@ public class ChunkCache
     public static final ChunkCache instance = enabled ? new ChunkCache() : null;

     private final LoadingCache cache;
-    public final CacheMissMetrics metrics;
+    public final ChunkCacheMetrics metrics;

     static class Key
     {
@@ -135,29 +134,25 @@ public class ChunkCache
         }
     }

-    public ChunkCache()
+    private ChunkCache()
     {
+        metrics = new ChunkCacheMetrics(this);
         cache = Caffeine.newBuilder()
-                .maximumWeight(cacheSize)
-                .executor(MoreExecutors.directExecutor())
-                .weigher((key, buffer) -> ((Buffer) buffer).buffer.capacity())
-                .removalListener(this)
-                .build(this);
-        metrics = new CacheMissMetrics("ChunkCache", this);
+                .maximumWeight(cacheSize)
+                .executor(MoreExecutors.directExecutor())
+                .weigher((key, buffer) -> ((Buffer) buffer).buffer.capacity())
+                .removalListener(this)
+                .recordStats(() -> metrics)
+                .build(this);
     }

     @Override
-    public Buffer load(Key key) throws Exception
+    public Buffer load(Key key)
     {
-        ChunkReader rebufferer = key.file;
-        metrics.misses.mark();
-        try (Timer.Context ctx = metrics.missLatency.time())
-        {
-            ByteBuffer buffer = BufferPool.get(key.file.chunkSize(), key.file.preferredBufferType());
-            assert buffer != null;
-            rebufferer.readChunk(key.position, buffer);
-            return new Buffer(buffer, key.position);
-        }
+        ByteBuffer buffer = BufferPool.get(key.file.chunkSize(), key.file.preferredBufferType());
+        assert buffer != null;
+        key.file.readChunk(key.position, buffer);
+        return new Buffer(buffer, key.position);
     }

     @Override
@@ -229,7 +224,6 @@ public class ChunkCache
     {
         try
         {
-            metrics.requests.mark();
             long pageAlignedPos =
[jira] [Comment Edited] (CASSANDRA-14628) Clean up cache-related metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587281#comment-16587281 ] Aleksey Yeschenko edited comment on CASSANDRA-14628 at 8/21/18 10:57 AM: - bq. Modernise all of the RatioGauge construction as well, via a static method that accepts a Supplier Looks good. Can look better though. Might as well go all the way and make a static method that accepts two {{DoubleSupplier}} s bq. Replace requests with a Metered that proxies its calls to getX onto hits.getX() + misses.getX() Good call. Cherry-picked, and pushed an updated commit on top that reformats things, fixes import order, and makes the {{DoubleSupplier}} change. was (Author: iamaleksey): bq. Modernise all of the RatioGauge construction as well, via a static method that accepts a Supplier Looks good. Can look better though. Might as well go all the way and make a static method that accepts two {{DoubleSupplier}}s bq. Replace requests with a Metered that proxies its calls to getX onto hits.getX() + misses.getX() Good call. Cherry-picked, and pushed an updated commit on top that reformats things, fixes import order, and makes the {{DoubleSupplier}} change. > Clean up cache-related metrics > -- > > Key: CASSANDRA-14628 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14628 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate > of pre-existing {{CacheMetrics}}. I believe it was done initially because the > authors thought there was no way to register hits with {{Caffeine}}, only > misses, but that's not quite true. All we need is to provide a > {{StatsCounter}} object when building the cache and update our metrics from > there. > The patch removes the redundant code and streamlines chunk cache metrics to > use more idiomatic tracking. 
[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587286#comment-16587286 ] Benedict commented on CASSANDRA-14628: -- Yep, even better. +1 > Clean up cache-related metrics > -- > > Key: CASSANDRA-14628 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14628 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate > of pre-existing {{CacheMetrics}}. I believe it was done initially because the > authors thought there was no way to register hits with {{Caffeine}}, only > misses, but that's not quite true. All we need is to provide a > {{StatsCounter}} object when building the cache and update our metrics from > there. > The patch removes the redundant code and streamlines chunk cache metrics to > use more idiomatic tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587281#comment-16587281 ] Aleksey Yeschenko commented on CASSANDRA-14628: --- bq. Modernise all of the RatioGauge construction as well, via a static method that accepts a Supplier Looks good. Can look better though. Might as well go all the way and make a static method that accepts two {{DoubleSupplier}}s bq. Replace requests with a Metered that proxies its calls to getX onto hits.getX() + misses.getX() Good call. Cherry-picked, and pushed an updated commit on top that reformats things, fixes import order, and makes the {{DoubleSupplier}} change. > Clean up cache-related metrics > -- > > Key: CASSANDRA-14628 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14628 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate > of pre-existing {{CacheMetrics}}. I believe it was done initially because the > authors thought there was no way to register hits with {{Caffeine}}, only > misses, but that's not quite true. All we need is to provide a > {{StatsCounter}} object when building the cache and update our metrics from > there. > The patch removes the redundant code and streamlines chunk cache metrics to > use more idiomatic tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14628) Clean up cache-related metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587231#comment-16587231 ] Benedict commented on CASSANDRA-14628: -- It looks like [my comment|https://issues.apache.org/jira/browse/CASSANDRA-14626?focusedCommentId=16587227=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16587227] on CASSANDRA-14626 should have gone here. > Clean up cache-related metrics > -- > > Key: CASSANDRA-14628 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14628 > Project: Cassandra > Issue Type: Improvement >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > {{ChunkCache}} added {{CacheMissMetrics}} which is an almost exact duplicate > of pre-existing {{CacheMetrics}}. I believe it was done initially because the > authors thought there was no way to register hits with {{Caffeine}}, only > misses, but that's not quite true. All we need is to provide a > {{StatsCounter}} object when building the cache and update our metrics from > there. > The patch removes the redundant code and streamlines chunk cache metrics to > use more idiomatic tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14626) Expose buffer cache metrics in caches virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587227#comment-16587227 ] Benedict commented on CASSANDRA-14626: -- Patch looks like a big improvement. I have a couple of minor suggestions for {{CacheMetrics}}: # Modernise all of the {{RatioGauge}} construction as well, via a static method that accepts a {{Supplier}} # Replace {{requests}} with a {{Metered}} that proxies its calls to {{getX}} onto {{hits.getX() + misses.getX()}} I've pushed a patch with these changes [here|https://github.com/belliottsmith/cassandra/tree/14626-suggest]. Feel free to take or discard as you please. +1 > Expose buffer cache metrics in caches virtual table > --- > > Key: CASSANDRA-14626 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14626 > Project: Cassandra > Issue Type: New Feature >Reporter: Benjamin Lerer >Assignee: Aleksey Yeschenko >Priority: Minor > Labels: virtual-tables > Fix For: 4.0 > > > As noted by [~blerer] in CASSANDRA-14538, we should expose buffer cache > metrics in the caches virtual table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
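The second review suggestion above — derive the ratio from two suppliers via a single static factory instead of hand-rolling each gauge — can be sketched with plain `DoubleSupplier`s. This is a hypothetical stdlib-only stand-in for the Dropwizard `RatioGauge` construction being discussed, not the actual patch; following `RatioGauge` semantics, an empty denominator yields NaN here.

```java
import java.util.function.DoubleSupplier;

public class RatioGaugeSketch
{
    // One factory covers every ratio gauge: pass the numerator and
    // denominator as suppliers and get a lazily-evaluated ratio back.
    static DoubleSupplier ratio(DoubleSupplier numerator, DoubleSupplier denominator)
    {
        return () ->
        {
            double d = denominator.getAsDouble();
            return d == 0 ? Double.NaN : numerator.getAsDouble() / d;
        };
    }

    public static void main(String[] args)
    {
        // e.g. hit rate = hits / requests, both read from live counters
        DoubleSupplier hitRate = ratio(() -> 30.0, () -> 40.0);
        System.out.println(hitRate.getAsDouble()); // prints "0.75"
    }
}
```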
[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary
[ https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587183#comment-16587183 ] Branimir Lambov commented on CASSANDRA-14649: - Committed as [49adbe7e0f0c8a83f3b843b65612528498b5c9a5|https://github.com/apache/cassandra/commit/49adbe7e0f0c8a83f3b843b65612528498b5c9a5]. > Index summaries fail when their size gets > 2G and use more space than > necessary > > > Key: CASSANDRA-14649 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14649 > Project: Cassandra > Issue Type: Bug >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Major > > After building a summary, {{IndexSummaryBuilder}} tries to trim the memory > writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of > trimming, this ends up allocating at least as much extra space and failing > the {{Buffer.position()}} call when the size is greater than > {{Integer.MAX_VALUE}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary
[ https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-14649: Resolution: Fixed Status: Resolved (was: Patch Available) > Index summaries fail when their size gets > 2G and use more space than > necessary > > > Key: CASSANDRA-14649 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14649 > Project: Cassandra > Issue Type: Bug >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Major > > After building a summary, {{IndexSummaryBuilder}} tries to trim the memory > writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of > trimming, this ends up allocating at least as much extra space and failing > the {{Buffer.position()}} call when the size is greater than > {{Integer.MAX_VALUE}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
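For context on why the > 2G case in CASSANDRA-14649 fails: java.nio.Buffer addresses its contents with an int, so `Buffer.position(int)` cannot represent offsets past `Integer.MAX_VALUE`, and a long offset narrowed to int wraps negative — which `position()` then rejects. A minimal illustration (the class name is ours, not from the patch):

```java
public class TwoGigLimit
{
    // Narrowing a long offset to int, as any single-buffer-backed writer
    // must do: values past Integer.MAX_VALUE (2 GiB - 1) wrap negative.
    static int narrow(long offset)
    {
        return (int) offset;
    }

    public static void main(String[] args)
    {
        long big = Integer.MAX_VALUE + 1L; // 2 GiB, one past the int range
        System.out.println(narrow(big));   // prints "-2147483648"
    }
}
```

This is why the summary writers must avoid growing a single buffer past the int range rather than relying on `setCapacity` to trim in place.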
[07/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/299782cf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/299782cf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/299782cf Branch: refs/heads/trunk Commit: 299782cff5a56db8b8fe9ad70a43bea7b729cc99 Parents: 236c47e 49adbe7 Author: Branimir Lambov Authored: Tue Aug 21 11:55:38 2018 +0300 Committer: Branimir Lambov Committed: Tue Aug 21 11:55:38 2018 +0300 -- .../io/sstable/IndexSummaryBuilder.java | 4 +- .../cassandra/io/util/DataOutputBuffer.java | 8 +- .../io/util/DataOutputBufferFixed.java | 2 +- .../cassandra/io/util/SafeMemoryWriter.java | 16 ++-- .../cassandra/io/util/DataOutputTest.java | 4 +- .../cassandra/io/util/SafeMemoryWriterTest.java | 90 6 files changed, 110 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[09/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/991e1971 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/991e1971 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/991e1971 Branch: refs/heads/trunk Commit: 991e19711f8762bbf93d6af588cef0a14668cc59 Parents: 65a4682 299782c Author: Branimir Lambov Authored: Tue Aug 21 11:56:05 2018 +0300 Committer: Branimir Lambov Committed: Tue Aug 21 11:56:05 2018 +0300 -- .../io/sstable/IndexSummaryBuilder.java | 4 +- .../cassandra/io/util/DataOutputBuffer.java | 8 +- .../io/util/DataOutputBufferFixed.java | 2 +- .../cassandra/io/util/SafeMemoryWriter.java | 16 ++-- .../cassandra/io/util/DataOutputTest.java | 4 +- .../cassandra/io/util/SafeMemoryWriterTest.java | 90 6 files changed, 110 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java -- diff --cc src/java/org/apache/cassandra/io/util/DataOutputBuffer.java index 144edad,7586543..28ca468 --- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java +++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java @@@ -38,43 -37,11 +38,43 @@@ public class DataOutputBuffer extends B /* * Threshold at which resizing transitions from doubling to increasing by 50% */ - private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64); + static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64); +/* + * Only recycle OutputBuffers up to 1Mb. Larger buffers will be trimmed back to this size. 
+ */ +private static final int MAX_RECYCLE_BUFFER_SIZE = Integer.getInteger(Config.PROPERTY_PREFIX + "dob_max_recycle_bytes", 1024 * 1024); + +private static final int DEFAULT_INITIAL_BUFFER_SIZE = 128; + +/** + * Scratch buffers used mostly for serializing in memory. It's important to call #recycle() when finished + * to keep the memory overhead from being too large in the system. + */ +public static final FastThreadLocal scratchBuffer = new FastThreadLocal() +{ +protected DataOutputBuffer initialValue() throws Exception +{ +return new DataOutputBuffer() +{ +public void close() +{ +if (buffer.capacity() <= MAX_RECYCLE_BUFFER_SIZE) +{ +buffer.clear(); +} +else +{ +buffer = ByteBuffer.allocate(DEFAULT_INITIAL_BUFFER_SIZE); +} +} +}; +} +}; + public DataOutputBuffer() { -this(128); +this(DEFAULT_INITIAL_BUFFER_SIZE); } public DataOutputBuffer(int size) http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/test/unit/org/apache/cassandra/io/util/DataOutputTest.java -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
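The resizing rule stated in the hunk above — double the buffer until the `DOB_DOUBLING_THRESHOLD_MB` threshold, then grow by 50% — can be sketched in isolation. This is an illustrative model of the stated policy only; the real `DataOutputBuffer` also validates the new size (`validateReallocation` in the diff), which is omitted here.

```java
public class GrowthPolicy
{
    // Default threshold from the diff: 64 MB, overridable via system property.
    static final long DOUBLING_THRESHOLD_BYTES = 64L * 1024 * 1024;

    // Double while small (fast amortized growth), then switch to +50%
    // to limit over-allocation on already-large buffers. No overflow
    // clamping here; the real class validates the result.
    static long nextCapacity(long current)
    {
        return current <= DOUBLING_THRESHOLD_BYTES
             ? current * 2
             : current + current / 2;
    }

    public static void main(String[] args)
    {
        System.out.println(nextCapacity(1024));               // prints "2048"
        System.out.println(nextCapacity(128L * 1024 * 1024)); // prints "201326592" (192 MiB)
    }
}
```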
[05/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/299782cf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/299782cf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/299782cf Branch: refs/heads/cassandra-3.11 Commit: 299782cff5a56db8b8fe9ad70a43bea7b729cc99 Parents: 236c47e 49adbe7 Author: Branimir Lambov Authored: Tue Aug 21 11:55:38 2018 +0300 Committer: Branimir Lambov Committed: Tue Aug 21 11:55:38 2018 +0300 -- .../io/sstable/IndexSummaryBuilder.java | 4 +- .../cassandra/io/util/DataOutputBuffer.java | 8 +- .../io/util/DataOutputBufferFixed.java | 2 +- .../cassandra/io/util/SafeMemoryWriter.java | 16 ++-- .../cassandra/io/util/DataOutputTest.java | 4 +- .../cassandra/io/util/SafeMemoryWriterTest.java | 90 6 files changed, 110 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[06/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/299782cf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/299782cf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/299782cf Branch: refs/heads/cassandra-3.0 Commit: 299782cff5a56db8b8fe9ad70a43bea7b729cc99 Parents: 236c47e 49adbe7 Author: Branimir Lambov Authored: Tue Aug 21 11:55:38 2018 +0300 Committer: Branimir Lambov Committed: Tue Aug 21 11:55:38 2018 +0300 -- .../io/sstable/IndexSummaryBuilder.java | 4 +- .../cassandra/io/util/DataOutputBuffer.java | 8 +- .../io/util/DataOutputBufferFixed.java | 2 +- .../cassandra/io/util/SafeMemoryWriter.java | 16 ++-- .../cassandra/io/util/DataOutputTest.java | 4 +- .../cassandra/io/util/SafeMemoryWriterTest.java | 90 6 files changed, 110 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/299782cf/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[02/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G
Fix SafeMemoryWriter trimming and behaviour over 2G patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e Branch: refs/heads/cassandra-3.0 Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5 Parents: 0e81892 Author: Branimir Lambov Authored: Thu Aug 16 16:15:07 2018 +0300 Committer: Branimir Lambov Committed: Tue Aug 21 11:53:30 2018 +0300 -- .../io/sstable/IndexSummaryBuilder.java | 4 +- .../cassandra/io/util/DataOutputBuffer.java | 8 +- .../io/util/DataOutputBufferFixed.java | 2 +- .../cassandra/io/util/SafeMemoryWriter.java | 16 ++-- .../cassandra/io/util/DataOutputTest.java | 4 +- .../cassandra/io/util/SafeMemoryWriterTest.java | 90 6 files changed, 110 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java -- diff --git a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java index 6110afe..0f604e0 100644 --- a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java +++ b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java @@ -207,8 +207,8 @@ public class IndexSummaryBuilder implements AutoCloseable { // this method should only be called when we've finished appending records, so we truncate the // memory we're using to the exact amount required to represent it before building our summary -entries.setCapacity(entries.length()); -offsets.setCapacity(offsets.length()); +entries.trim(); +offsets.trim(); } public IndexSummary build(IPartitioner partitioner) 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java -- diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java index 6ea6d97..3f1e081 100644 --- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java +++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java @@ -37,7 +37,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus /* * Threshold at which resizing transitions from doubling to increasing by 50% */ -private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64); +static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64); public DataOutputBuffer() { @@ -83,7 +83,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus @Override protected void doFlush(int count) throws IOException { -reallocate(count); +expandToFit(count); } //Hack for test, make it possible to override checking the buffer capacity @@ -119,7 +119,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus return validateReallocation(newSize); } -protected void reallocate(long count) +protected void expandToFit(long count) { if (count <= 0) return; @@ -141,7 +141,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus public int write(ByteBuffer src) throws IOException { int count = src.remaining(); -reallocate(count); +expandToFit(count); buffer.put(src); return count; } http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java -- diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java index c815c9e..c9767fc 100644 --- a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java +++ 
b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java @@ -58,7 +58,7 @@ public class DataOutputBufferFixed extends DataOutputBuffer * @see org.apache.cassandra.io.util.DataOutputBuffer#reallocate(long) */ @Override -protected void reallocate(long newSize) +protected void expandToFit(long newSize) { throw new BufferOverflowException(); }
[10/10] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8b9515bd Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8b9515bd Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8b9515bd Branch: refs/heads/trunk Commit: 8b9515bd2e410c634e4a31fe3e93890f1a1f8f71 Parents: ac1bb75 991e197 Author: Branimir Lambov Authored: Tue Aug 21 11:56:30 2018 +0300 Committer: Branimir Lambov Committed: Tue Aug 21 11:56:30 2018 +0300 -- .../io/sstable/IndexSummaryBuilder.java | 4 +- .../cassandra/io/util/DataOutputBuffer.java | 8 +- .../io/util/DataOutputBufferFixed.java | 2 +- .../cassandra/io/util/SafeMemoryWriter.java | 16 ++-- .../cassandra/io/util/DataOutputTest.java | 4 +- .../cassandra/io/util/SafeMemoryWriterTest.java | 90 6 files changed, 110 insertions(+), 14 deletions(-) -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[01/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 0e81892d7 -> 49adbe7e0
  refs/heads/cassandra-3.0 236c47e65 -> 299782cff
  refs/heads/cassandra-3.11 65a46820b -> 991e19711
  refs/heads/trunk ac1bb7586 -> 8b9515bd2

Fix SafeMemoryWriter trimming and behaviour over 2G

patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e

Branch: refs/heads/cassandra-2.2
Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5
Parents: 0e81892
Author: Branimir Lambov
Authored: Thu Aug 16 16:15:07 2018 +0300
Committer: Branimir Lambov
Committed: Tue Aug 21 11:53:30 2018 +0300
----------------------------------------------------------------------
 .../io/sstable/IndexSummaryBuilder.java         |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java     |  8 +-
 .../io/util/DataOutputBufferFixed.java          |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java     | 16 ++--
 .../cassandra/io/util/DataOutputTest.java       |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90
 6 files changed, 110 insertions(+), 14 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
index 6110afe..0f604e0 100644
--- a/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
+++ b/src/java/org/apache/cassandra/io/sstable/IndexSummaryBuilder.java
@@ -207,8 +207,8 @@ public class IndexSummaryBuilder implements AutoCloseable
     {
         // this method should only be called when we've finished appending records, so we truncate the
         // memory we're using to the exact amount required to represent it before building our summary
-        entries.setCapacity(entries.length());
-        offsets.setCapacity(offsets.length());
+        entries.trim();
+        offsets.trim();
     }

     public IndexSummary build(IPartitioner partitioner)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 6ea6d97..3f1e081 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@ -37,7 +37,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     /*
      * Threshold at which resizing transitions from doubling to increasing by 50%
      */
-    private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+    static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);

     public DataOutputBuffer()
     {
@@ -83,7 +83,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     @Override
     protected void doFlush(int count) throws IOException
     {
-        reallocate(count);
+        expandToFit(count);
     }

     //Hack for test, make it possible to override checking the buffer capacity
@@ -119,7 +119,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
         return validateReallocation(newSize);
     }

-    protected void reallocate(long count)
+    protected void expandToFit(long count)
     {
         if (count <= 0)
             return;
@@ -141,7 +141,7 @@ public class DataOutputBuffer extends BufferedDataOutputStreamPlus
     public int write(ByteBuffer src) throws IOException
     {
         int count = src.remaining();
-        reallocate(count);
+        expandToFit(count);
         buffer.put(src);
         return count;
     }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/49adbe7e/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
index c815c9e..c9767fc 100644
--- a/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
@@ -58,7 +58,7 @@ public class DataOutputBufferFixed extends DataOutputBuffer
     * @see
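The commit above also mentions DataOutputBuffer's growth policy: capacities double until a threshold (64 MB by default), then grow by 50%. A minimal standalone sketch of that policy follows; the class and method names here are chosen for illustration and are not the project's actual signatures.

```java
// Illustrative sketch of the growth policy described in the diff comment:
// double the capacity while it is below the threshold, then grow by 50%,
// always allocating at least enough to fit the requested extra bytes.
// All arithmetic is done in long to avoid int overflow near 2 GB.
public class GrowthPolicy {
    static final long DOUBLING_THRESHOLD_BYTES = 64L * 1024 * 1024; // 64 MB default, as in the diff

    // Returns the new capacity needed to fit `required` more bytes on top of `capacity`.
    static long newCapacity(long capacity, long required) {
        long grown = capacity < DOUBLING_THRESHOLD_BYTES
                   ? capacity * 2            // double while small
                   : capacity + capacity / 2; // then grow by 50%
        return Math.max(capacity + required, grown);
    }
}
```

The two-phase policy keeps amortized O(1) appends while capping the worst-case wasted space at 50% once buffers get large.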
[04/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G
Fix SafeMemoryWriter trimming and behaviour over 2G

patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e

Branch: refs/heads/trunk
Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5
Parents: 0e81892
Author: Branimir Lambov
Authored: Thu Aug 16 16:15:07 2018 +0300
Committer: Branimir Lambov
Committed: Tue Aug 21 11:53:30 2018 +0300

 6 files changed, 110 insertions(+), 14 deletions(-)
[08/10] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/991e1971
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/991e1971
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/991e1971

Branch: refs/heads/cassandra-3.11
Commit: 991e19711f8762bbf93d6af588cef0a14668cc59
Parents: 65a4682 299782c
Author: Branimir Lambov
Authored: Tue Aug 21 11:56:05 2018 +0300
Committer: Branimir Lambov
Committed: Tue Aug 21 11:56:05 2018 +0300
----------------------------------------------------------------------
 .../io/sstable/IndexSummaryBuilder.java         |  4 +-
 .../cassandra/io/util/DataOutputBuffer.java     |  8 +-
 .../io/util/DataOutputBufferFixed.java          |  2 +-
 .../cassandra/io/util/SafeMemoryWriter.java     | 16 ++--
 .../cassandra/io/util/DataOutputTest.java       |  4 +-
 .../cassandra/io/util/SafeMemoryWriterTest.java | 90
 6 files changed, 110 insertions(+), 14 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
----------------------------------------------------------------------
diff --cc src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
index 144edad,7586543..28ca468
--- a/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
+++ b/src/java/org/apache/cassandra/io/util/DataOutputBuffer.java
@@@ -38,43 -37,11 +38,43 @@@ public class DataOutputBuffer extends B
      /*
       * Threshold at which resizing transitions from doubling to increasing by 50%
       */
-     private static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
+     static final long DOUBLING_THRESHOLD = Long.getLong(Config.PROPERTY_PREFIX + "DOB_DOUBLING_THRESHOLD_MB", 64);
 +    /*
 +     * Only recycle OutputBuffers up to 1Mb. Larger buffers will be trimmed back to this size.
 +     */
 +    private static final int MAX_RECYCLE_BUFFER_SIZE = Integer.getInteger(Config.PROPERTY_PREFIX + "dob_max_recycle_bytes", 1024 * 1024);
 +
 +    private static final int DEFAULT_INITIAL_BUFFER_SIZE = 128;
 +
 +    /**
 +     * Scratch buffers used mostly for serializing in memory. It's important to call #recycle() when finished
 +     * to keep the memory overhead from being too large in the system.
 +     */
 +    public static final FastThreadLocal<DataOutputBuffer> scratchBuffer = new FastThreadLocal<DataOutputBuffer>()
 +    {
 +        protected DataOutputBuffer initialValue() throws Exception
 +        {
 +            return new DataOutputBuffer()
 +            {
 +                public void close()
 +                {
 +                    if (buffer.capacity() <= MAX_RECYCLE_BUFFER_SIZE)
 +                    {
 +                        buffer.clear();
 +                    }
 +                    else
 +                    {
 +                        buffer = ByteBuffer.allocate(DEFAULT_INITIAL_BUFFER_SIZE);
 +                    }
 +                }
 +            };
 +        }
 +    };
 +
      public DataOutputBuffer()
      {
 -        this(128);
 +        this(DEFAULT_INITIAL_BUFFER_SIZE);
      }

      public DataOutputBuffer(int size)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/src/java/org/apache/cassandra/io/util/DataOutputBufferFixed.java
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/991e1971/test/unit/org/apache/cassandra/io/util/DataOutputTest.java
----------------------------------------------------------------------

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
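The merged hunk above caps recycled scratch buffers at MAX_RECYCLE_BUFFER_SIZE so that one oversized serialization cannot pin a large buffer for the lifetime of a thread. A simplified, self-contained sketch of that recycle-on-close rule (illustrative class shape; the real code wraps this in a Netty FastThreadLocal and a DataOutputBuffer subclass):

```java
import java.nio.ByteBuffer;

// Sketch of the recycle-on-close rule from the merged hunk: buffers at or
// below the cap are cleared and kept for reuse; oversized buffers are
// replaced with a fresh small one so a single large write cannot pin memory.
public class RecyclingScratchBuffer {
    static final int MAX_RECYCLE_BUFFER_SIZE = 1024 * 1024; // 1 MB cap, as in the diff
    static final int DEFAULT_INITIAL_BUFFER_SIZE = 128;

    ByteBuffer buffer = ByteBuffer.allocate(DEFAULT_INITIAL_BUFFER_SIZE);

    // Called when the caller is done with the scratch buffer.
    void close() {
        if (buffer.capacity() <= MAX_RECYCLE_BUFFER_SIZE)
            buffer.clear();                                           // keep for reuse
        else
            buffer = ByteBuffer.allocate(DEFAULT_INITIAL_BUFFER_SIZE); // shrink back
    }
}
```

Making the threshold a system property (as the diff does with `dob_max_recycle_bytes`) lets operators trade steady-state memory for fewer reallocations.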
[03/10] cassandra git commit: Fix SafeMemoryWriter trimming and behaviour over 2G
Fix SafeMemoryWriter trimming and behaviour over 2G

patch by Branimir Lambov; reviewed by Benedict Elliott Smith for CASSANDRA-14649

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/49adbe7e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/49adbe7e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/49adbe7e

Branch: refs/heads/cassandra-3.11
Commit: 49adbe7e0f0c8a83f3b843b65612528498b5c9a5
Parents: 0e81892
Author: Branimir Lambov
Authored: Thu Aug 16 16:15:07 2018 +0300
Committer: Branimir Lambov
Committed: Tue Aug 21 11:53:30 2018 +0300

 6 files changed, 110 insertions(+), 14 deletions(-)
[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary
[ https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587178#comment-16587178 ] Benedict commented on CASSANDRA-14649: -- +1 I do wonder if we should revisit requiring linear memory for all of this, but really we should probably instead revisit if such huge sstables are a good idea. > Index summaries fail when their size gets > 2G and use more space than > necessary > > > Key: CASSANDRA-14649 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14649 > Project: Cassandra > Issue Type: Bug >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Major > > After building a summary, {{IndexSummaryBuilder}} tries to trim the memory > writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of > trimming, this ends up allocating at least as much extra space and failing > the {{Buffer.position()}} call when the size is greater than > {{Integer.MAX_VALUE}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
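The description above pinpoints the bug: calling SafeMemoryWriter.setCapacity(capacity()) to shrink-to-fit instead allocated at least as much extra space, and the inflated size then broke the int-based Buffer.position() call once it passed Integer.MAX_VALUE. The sketch below illustrates the trim-versus-grow distinction only; the field and method shapes are hypothetical (the real writer manages native memory, not a plain long field), and the "buggy" variant is a simplified stand-in for the misbehaving resize path, not the actual committed code.

```java
// Illustrates the trim-vs-resize distinction behind CASSANDRA-14649.
// A resize path that treats its argument as *additional* room grows the
// allocation when asked to shrink-to-fit; a dedicated trim() only reduces.
public class TrimVsResize {
    long capacity; // bytes allocated
    long length;   // bytes actually written

    TrimVsResize(long capacity, long length) { this.capacity = capacity; this.length = length; }

    // Buggy shape (simplified): interprets the argument as extra space,
    // so setCapacity(length) at least doubles instead of trimming.
    void buggySetCapacity(long requested) { capacity = capacity + requested; }

    // Fixed shape: trim to exactly the written length, never grow.
    void trim() { capacity = Math.min(capacity, length); }
}
```

A dedicated trim() also makes the caller's intent in IndexSummaryBuilder explicit, which is exactly what the rename in the patch does.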
[jira] [Commented] (CASSANDRA-14649) Index summaries fail when their size gets > 2G and use more space than necessary
[ https://issues.apache.org/jira/browse/CASSANDRA-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587147#comment-16587147 ] Branimir Lambov commented on CASSANDRA-14649: - I realized we still have a problem if the size grows by over 2G, i.e. if it becomes >4G and needs to grow. Pushed another commit to fix and test this and limit the test size if there isn't enough memory: [new commit|https://github.com/blambov/cassandra/commit/65798672eff79bd1c97b960ed965f0e908f6c23e] [branch|https://github.com/blambov/cassandra/tree/14649-trunk] [test|https://circleci.com/gh/blambov/workflows/cassandra/tree/14649-trunk] > Index summaries fail when their size gets > 2G and use more space than > necessary > > > Key: CASSANDRA-14649 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14649 > Project: Cassandra > Issue Type: Bug >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Major > > After building a summary, {{IndexSummaryBuilder}} tries to trim the memory > writers by calling {{SafeMemoryWriter.setCapacity(capacity())}}. Instead of > trimming, this ends up allocating at least as much extra space and failing > the {{Buffer.position()}} call when the size is greater than > {{Integer.MAX_VALUE}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
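The follow-up above concerns growth arithmetic once the buffer has already crossed 2 GB. The core hazard is easy to state in isolation: doubling an int capacity past Integer.MAX_VALUE wraps negative, so any size computation for buffers that can exceed 2 GB must be done in long. This sketch is illustrative, not the committed fix:

```java
// Illustrates the int-overflow hazard behind the follow-up fix: doubling an
// int capacity past Integer.MAX_VALUE wraps to a negative value, while the
// same computation in long has headroom up to 2^63 - 1.
public class OverflowCheck {
    static int buggyDouble(int capacity)  { return capacity * 2; } // wraps past 2^31 - 1
    static long safeDouble(long capacity) { return capacity * 2; } // safe far beyond 2 GB
}
```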
[jira] [Created] (CASSANDRA-14658) Cassandra hangs at startup
Jing Weng created CASSANDRA-14658:
-------------------------------------

Summary: Cassandra hangs at startup
Key: CASSANDRA-14658
URL: https://issues.apache.org/jira/browse/CASSANDRA-14658
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Jing Weng
Fix For: 3.11.1

Some time ago our Cassandra cluster failed to start; we checked various logs and found no error message. We later captured the thread stacks during startup and, after consulting the source code, found that if a commitlog initialization error occurs at startup, Cassandra deadlocks and hangs.

The thread stack of COMMIT-LOG-ALLOCATOR:
{noformat}
{ "waitedCount": 0, "lockOwnerId": -1, "lockedMonitors": [], "waitedTime": -1, "blockedCount": 0, "threadState": "RUNNABLE", "inNative": false, "suspended": false, "threadName": "COMMIT-LOG-ALLOCATOR", "lockInfo": null, "threadId": 36, "stackTrace": [ { "fileName": "AbstractCommitLogSegmentManager.java", "nativeMethod": false, "methodName": "runMayThrow", "className": "org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager$1", "lineNumber": 133 }, { "fileName": "WrappedRunnable.java", "nativeMethod": false, "methodName": "run", "className": "org.apache.cassandra.utils.WrappedRunnable", "lineNumber": 28 }, { "fileName": "NamedThreadFactory.java", "nativeMethod": false, "methodName": "lambda$threadLocalDeallocator$0", "className": "org.apache.cassandra.concurrent.NamedThreadFactory", "lineNumber": 81 }, { "fileName": null, "nativeMethod": false, "methodName": "run", "className": "org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$4\/1392794732", "lineNumber": -1 }, { "fileName": "Thread.java", "nativeMethod": false, "methodName": "run", "className": "java.lang.Thread", "lineNumber": 748 } ], "blockedTime": -1, "lockName": null, "lockOwnerName": null, "lockedSynchronizers": [] }{noformat}

The thread stack of the main thread:
{noformat}
{ "waitedCount": 2, "lockOwnerId": -1, "lockedMonitors": [ {
"identityHashCode": 600118828, "lockedStackDepth": 10, "className": "java.lang.Class", "lockedStackFrame": { "fileName": "ColumnFamilyStore.java", "nativeMethod": false, "methodName": "createColumnFamilyStore", "className": "org.apache.cassandra.db.ColumnFamilyStore", "lineNumber": 620 } }, { "identityHashCode": 600118828, "lockedStackDepth": 11, "className": "java.lang.Class", "lockedStackFrame": { "fileName": "ColumnFamilyStore.java", "nativeMethod": false, "methodName": "createColumnFamilyStore", "className": "org.apache.cassandra.db.ColumnFamilyStore", "lineNumber": 594 } }, { "identityHashCode": 1087037934, "lockedStackDepth": 15, "className": "java.lang.Class", "lockedStackFrame": { "fileName": "Keyspace.java", "nativeMethod": false, "methodName": "open", "className": "org.apache.cassandra.db.Keyspace", "lineNumber": 127 } } ], "waitedTime": -1, "blockedCount": 0, "threadState": "WAITING", "inNative": false, "suspended": false, "threadName": "main", "lockInfo": null, "threadId": 1, "stackTrace": [ { "fileName": "Unsafe.java", "nativeMethod": true, "methodName": "park", "className": "sun.misc.Unsafe", "lineNumber": -2 }, { "fileName": "LockSupport.java", "nativeMethod": false, "methodName": "park", "className": "java.util.concurrent.locks.LockSupport", "lineNumber": 304 }, { "fileName": "WaitQueue.java", "nativeMethod": false, "methodName": "awaitUninterruptibly", "className": "org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal", "lineNumber": 280 }, { "fileName": "AbstractCommitLogSegmentManager.java", "nativeMethod": false, "methodName": "awaitAvailableSegment", "className": "org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager", "lineNumber": 262 }, { "fileName": "AbstractCommitLogSegmentManager.java", "nativeMethod": false, "methodName": "advanceAllocatingFrom", "className": "org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager", "lineNumber": 236 }, { "fileName": "AbstractCommitLogSegmentManager.java", "nativeMethod": 
false, "methodName": "start", "className": "org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager", "lineNumber": 153 },
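Taken together, the two stacks show the shape of the hang: the main thread parks in awaitAvailableSegment() waiting for a signal that only the COMMIT-LOG-ALLOCATOR thread can deliver, so if that thread dies during commitlog initialization the signal never arrives. A generic sketch of this failure mode follows, using plain java.util.concurrent types rather than Cassandra's WaitQueue; the names and the timed-wait mitigation are illustrative, not Cassandra's actual fix.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Generic sketch of the reported hang: a starter thread blocks until a worker
// signals that a resource (here, a "segment") is ready. If the worker dies
// before signalling -- e.g. an initialization error -- an untimed wait such as
// awaitUninterruptibly() blocks forever; a timed wait at least surfaces it.
public class DeadWorkerHang {
    static boolean awaitSegment(Runnable workerBody, long timeoutMs) {
        CountDownLatch segmentReady = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            workerBody.run();        // may throw and die before reaching countDown()
            segmentReady.countDown();
        });
        worker.start();
        try {
            // Returns false on timeout instead of hanging like an untimed wait.
            return segmentReady.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The same observation explains why no error appears in the logs: the worker's failure happens on its own thread, and nothing propagates it back to the parked starter.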
[jira] [Resolved] (CASSANDRA-14643) Performance overhead with COPY command while bulk data loading
[ https://issues.apache.org/jira/browse/CASSANDRA-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-14643. -- Resolution: Invalid This bug tracker is not for support using Cassandra; you will probably find the user mailing lists more helpful, or perhaps stack overflow. If you find a specific inefficiency in the COPY command that can be improved, then please feel free to file a ticket describing the desired improvement, after confirming no such ticket already exists. > Performance overhead with COPY command while bulk data loading > -- > > Key: CASSANDRA-14643 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14643 > Project: Cassandra > Issue Type: Task > Components: CQL >Reporter: NIKHIL THALLAPALLI >Priority: Major > > Hello Team, > We are facing performance overhead with COPY utility of cql while doing bulk > loading of data. > It took approximately 6 hours while loading around 39L records through COPY > command. > Please be noted that there are few frozen types used in the table structure > which contains around 15 sets each. > so, please let us know the techniques of optimization to achieve faster > loading of data, And also suggest if any parameters can be reset to increase > the efficiency. > Your prompt response on this would be highly appreciated > Thank you, > Nikhil -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14643) Performance overhead with COPY command while bulk data loading
[ https://issues.apache.org/jira/browse/CASSANDRA-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587095#comment-16587095 ] NIKHIL THALLAPALLI commented on CASSANDRA-14643: Hi Team, Can I expect an update on the reported issue. Please let me know in case if you would require any additional details. Thanks, Nikhil > Performance overhead with COPY command while bulk data loading > -- > > Key: CASSANDRA-14643 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14643 > Project: Cassandra > Issue Type: Task > Components: CQL >Reporter: NIKHIL THALLAPALLI >Priority: Major > > Hello Team, > We are facing performance overhead with COPY utility of cql while doing bulk > loading of data. > It took approximately 6 hours while loading around 39L records through COPY > command. > Please be noted that there are few frozen types used in the table structure > which contains around 15 sets each. > so, please let us know the techniques of optimization to achieve faster > loading of data, And also suggest if any parameters can be reset to increase > the efficiency. > Your prompt response on this would be highly appreciated > Thank you, > Nikhil -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions
[ https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587026#comment-16587026 ] Venkata Harikrishna Nukala commented on CASSANDRA-14344: Still working on it. Will give the updated patch in a couple of days. > Support filtering using IN restrictions > --- > > Key: CASSANDRA-14344 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14344 > Project: Cassandra > Issue Type: New Feature >Reporter: Dikang Gu >Assignee: Venkata Harikrishna Nukala >Priority: Major > Attachments: 14344-trunk-2.txt, > 14344-trunk-inexpression-approach.txt, 14344-trunk.txt > > > Support IN filter query like this: > > CREATE TABLE ks1.t1 ( > key int, > col1 int, > col2 int, > value int, > PRIMARY KEY (key, col1, col2) > ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC) > > cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering; > > key | col1 | col2 | value > -+--+--+--- > 1 | 1 | 1 | 1 > 1 | 2 | 1 | 3 > > (2 rows) > cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering; > *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] > message="IN restrictions are not supported on indexed columns"{color}* > cqlsh:ks1> -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org