[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168200#comment-17168200 ] David Capwell edited comment on CASSANDRA-15907 at 7/30/20, 8:35 PM: - FYI org.apache.cassandra.config.DatabaseDescriptorRefTest is failing https://app.circleci.com/pipelines/github/dcapwell/cassandra/379/workflows/83c3e1f6-3279-4426-8af8-a02926b10774/jobs/1975 {code} git checkout trunk git pull --rebase upstream trunk ant realclean && ant && ant generate-idea-files ant testclasslist -Dtest.classlistfile=<(echo org/apache/cassandra/config/DatabaseDescriptorRefTest.java) -Dtest.classlistprefix=unit ... [junit-timeout] Testcase: testDatabaseDescriptorRef(org.apache.cassandra.config.DatabaseDescriptorRefTest): FAILED [junit-timeout] null [junit-timeout] junit.framework.AssertionFailedError [junit-timeout] at org.apache.cassandra.config.DatabaseDescriptorRefTest.checkViolations(DatabaseDescriptorRefTest.java:303) [junit-timeout] at org.apache.cassandra.config.DatabaseDescriptorRefTest.testDatabaseDescriptorRef(DatabaseDescriptorRefTest.java:287) [junit-timeout] [junit-timeout] [junit-timeout] Test org.apache.cassandra.config.DatabaseDescriptorRefTest FAILED [junitreport] Processing /Users/davidcapwell/src/github/apache/cassandra-trunk/build/test/TESTS-TestSuites.xml to /var/folders/cm/08cddl2s25j7fq3jdb76gh4rgn/T/null2013776906 [junitreport] Loading stylesheet jar:file:/usr/local/Cellar/ant/1.10.7/libexec/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl [junitreport] Transform time: 277ms [junitreport] Deleting: /var/folders/cm/08cddl2s25j7fq3jdb76gh4rgn/T/null2013776906 BUILD FAILED /Users/davidcapwell/src/github/apache/cassandra-trunk/build.xml:1981: The following error occurred while executing this line: /Users/davidcapwell/src/github/apache/cassandra-trunk/build.xml:1871: Some test(s) failed. {code} looks like clear snapshot as well, not sure why it passed in circle ci {code} ant testclasslist -Dtest.classlistfile=<(echo org/apache/cassandra/tools/ClearSnapshotTest.java) -Dtest.classlistprefix=unit ... [junit-timeout] Testsuite: org.apache.cassandra.tools.ClearSnapshotTest Tests run: 4, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 8.175 sec [junit-timeout] [junit-timeout] Testcase: testClearSnapshot_RemoveMultiple(org.apache.cassandra.tools.ClearSnapshotTest): FAILED [junit-timeout] null [junit-timeout] junit.framework.AssertionFailedError [junit-timeout] at org.apache.cassandra.tools.ToolRunner.assertEmptyStdErr(ToolRunner.java:338) [junit-timeout] at org.apache.cassandra.tools.ToolRunner.waitAndAssertOnCleanExit(ToolRunner.java:333) [junit-timeout] at org.apache.cassandra.tools.ClearSnapshotTest.testClearSnapshot_RemoveMultiple(ClearSnapshotTest.java:91) [junit-timeout] [junit-timeout] [junit-timeout] Testcase: testClearSnapshot_NoArgs(org.apache.cassandra.tools.ClearSnapshotTest): FAILED [junit-timeout] null [junit-timeout] junit.framework.AssertionFailedError [junit-timeout] at org.apache.cassandra.tools.ToolRunner.assertEmptyStdErr(ToolRunner.java:338) [junit-timeout] at org.apache.cassandra.tools.ToolRunner.waitAndAssertOnCleanExit(ToolRunner.java:333) [junit-timeout] at org.apache.cassandra.tools.ClearSnapshotTest.testClearSnapshot_NoArgs(ClearSnapshotTest.java:61) [junit-timeout] [junit-timeout] [junit-timeout] Testcase: testClearSnapshot_RemoveByName(org.apache.cassandra.tools.ClearSnapshotTest): FAILED [junit-timeout] null [junit-timeout] junit.framework.AssertionFailedError [junit-timeout] at org.apache.cassandra.tools.ToolRunner.assertEmptyStdErr(ToolRunner.java:338) [junit-timeout] at org.apache.cassandra.tools.ToolRunner.waitAndAssertOnCleanExit(ToolRunner.java:333) [junit-timeout] at org.apache.cassandra.tools.ClearSnapshotTest.testClearSnapshot_RemoveByName(ClearSnapshotTest.java:75) [junit-timeout] [junit-timeout] [junit-timeout] Test org.apache.cassandra.tools.ClearSnapshotTest FAILED {code} was (Author: dcapwell): FYI org.apache.cassandra.config.DatabaseDescriptorRefTest is failing https://app.circleci.com/pipelines/github/dcapwell/cassandra/379/workflows/83c3e1f6-3279-4426-8af8-a02926b10774/jobs/1975 {code} git checkout trunk git pull --rebase upstream trunk ant realclean && ant && ant generate-idea-files ant testclasslist -Dtest.classlistfile=<(echo org/apache/cassandra/config/DatabaseDescriptorRefTest.java) -Dtest.classlistprefix=unit ... [junit-timeout] Testcase: testDatabaseDescriptorRef(org.apache.cassandra.config.DatabaseDescriptorRefTest): FAILED [junit-timeout] null [junit-timeout] junit.framework.AssertionFailedError [junit-timeout] at org.apache.cassandra.config.DatabaseDescriptorRefTest.checkViolations(DatabaseDescriptorRefTest.java:303)
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166770#comment-17166770 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/29/20, 5:46 AM: --- [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/4af8d5a3-280d-449a-a650-d3ff2db5b0f4] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/c59be4f8-329e-4d76-9c59-d49c38e58dd2], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/56130350-62b0-48c3-b675-bdc45e3cebf2] The [one failure|https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/c59be4f8-329e-4d76-9c59-d49c38e58dd2/jobs/448] in {{TestBootstrap}} is an existing problem, but it doesn't appear to have a flakey test Jira. Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! was (Author: maedhroz): [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/4af8d5a3-280d-449a-a650-d3ff2db5b0f4] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/c59be4f8-329e-4d76-9c59-d49c38e58dd2], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/56130350-62b0-48c3-b675-bdc45e3cebf2] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 8h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter)
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166770#comment-17166770 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/29/20, 4:27 AM: --- [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/4af8d5a3-280d-449a-a650-d3ff2db5b0f4] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/c59be4f8-329e-4d76-9c59-d49c38e58dd2], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/92/workflows/56130350-62b0-48c3-b675-bdc45e3cebf2] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! was (Author: maedhroz): [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/4af8d5a3-280d-449a-a650-d3ff2db5b0f4] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/b1bd958d-2abf-48ff-a39c-4b43c3fc6008], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/32a03dbd-2062-4a8a-914f-7dacf2ed1a59] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 8h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results). It's a minor fix, but would avoid confusion. > * The method call chain
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166770#comment-17166770 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/29/20, 4:18 AM: --- [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/4af8d5a3-280d-449a-a650-d3ff2db5b0f4] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/b1bd958d-2abf-48ff-a39c-4b43c3fc6008], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/32a03dbd-2062-4a8a-914f-7dacf2ed1a59] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! was (Author: maedhroz): [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/6c0cd966-f55a-4c3c-b78a-81724134cce3] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) {{TestCompression::test_compression_cql_options}} looks like it might be an unreported failure. trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/b1bd958d-2abf-48ff-a39c-4b43c3fc6008], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/32a03dbd-2062-4a8a-914f-7dacf2ed1a59] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 8h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166770#comment-17166770 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/29/20, 4:03 AM: --- [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/6c0cd966-f55a-4c3c-b78a-81724134cce3] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) {{TestCompression::test_compression_cql_options}} looks like it might be an unreported failure. trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/b1bd958d-2abf-48ff-a39c-4b43c3fc6008], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/91/workflows/32a03dbd-2062-4a8a-914f-7dacf2ed1a59] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! was (Author: maedhroz): [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/6c0cd966-f55a-4c3c-b78a-81724134cce3] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) {{TestCompression::test_compression_cql_options}} looks like it might be an unreported failure. trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/2804223b-0b3b-4da4-a180-733f67d179a5], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/51c944a3-ac7e-4e15-acc2-a92844967d9e] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 8h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection"
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166770#comment-17166770 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/29/20, 3:51 AM: --- [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/6c0cd966-f55a-4c3c-b78a-81724134cce3] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) {{TestCompression::test_compression_cql_options}} looks like it might be an unreported failure. trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/2804223b-0b3b-4da4-a180-733f67d179a5], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/51c944a3-ac7e-4e15-acc2-a92844967d9e] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! was (Author: maedhroz): [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/6c0cd966-f55a-4c3c-b78a-81724134cce3] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/2804223b-0b3b-4da4-a180-733f67d179a5], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/51c944a3-ac7e-4e15-acc2-a92844967d9e] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 8h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166770#comment-17166770 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/29/20, 12:38 AM: [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/6c0cd966-f55a-4c3c-b78a-81724134cce3] ({{SASIIndexTest#testIndexMemtableSwitching}} is a known failure) trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/2804223b-0b3b-4da4-a180-733f67d179a5], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/51c944a3-ac7e-4e15-acc2-a92844967d9e] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! was (Author: maedhroz): [~adelapena] Here are the final PRs and test runs (with not ALL commits squashed just yet, to make reviewing the small bits you might not have reviewed yet easier): 3.0: [patch|https://github.com/apache/cassandra/pull/659/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/89/workflows/b6192a9d-774a-4453-95c1-34d46acedcf4] 3.11: [patch|https://github.com/apache/cassandra/pull/665/commits], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/88/workflows/6c0cd966-f55a-4c3c-b78a-81724134cce3] trunk: [patch|https://github.com/apache/cassandra/pull/666/commits], [CircleCI J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/2804223b-0b3b-4da4-a180-733f67d179a5], [CircleCI J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/90/workflows/51c944a3-ac7e-4e15-acc2-a92844967d9e] Let me know if you'd like me to do a final squash at some point. Thanks for all your help on this one! > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 8h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results). It's a minor fix, but would avoid confusion. > * The method call chain in {{DataResolver}} might be a bit simpler if we put > the
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/24/20, 10:19 PM: [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/74/workflows/633624df-d6b2-4904-b766-25d9684a9f6d] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/72/workflows/4ca3d4d2-a9d3-4e08-ace2-e7bb073d54e6] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 7h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/21/20, 8:33 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/72/workflows/4ca3d4d2-a9d3-4e08-ace2-e7bb073d54e6] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/71/workflows/4dfbc4d2-19c2-452d-b2f3-7b6b3fdcbe9a] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 6h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/21/20, 8:02 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/71/workflows/4dfbc4d2-19c2-452d-b2f3-7b6b3fdcbe9a] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/70/workflows/312ef7ca-823f-4d38-8fd7-f3a7fc9de15c] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 6h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/21/20, 7:55 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/70/workflows/312ef7ca-823f-4d38-8fd7-f3a7fc9de15c] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/66/workflows/c3e60efe-a50e-47ff-a8a7-51de36deb17b] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 6h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/21/20, 3:09 AM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/66/workflows/c3e60efe-a50e-47ff-a8a7-51de36deb17b] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/60/workflows/37dcf29a-effb-425a-95de-2362949942c5] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 5h 10m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161550#comment-17161550 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/20/20, 9:31 PM: --- bq. As for tracking the cache size per replica o per query, I think it would be nice if we used the same criteria that is used in the guardrail, so they measure the same thing. That would mean either tracking the metric per query and leaving the guardrail as it is, or changing the guardrail to be per-replica instead of per-query. Tracking the metric per query rather than changing the existing guardrail sounds like the right move, although there isn't really a legitimate multi-partition use-case yet, so most of the time a per-partition metric should be equivalent to a per-query one. On the other hand, the whole point of the metric is to help provide guidance for operators looking to set appropriate warn/fail thresholds. I'll push something up today, along with slightly modified inline docs, etc. was (Author: maedhroz): bq. As for tracking the cache size per replica o per query, I think it would be nice if we used the same criteria that is used in the guardrail, so they measure the same thing. That would mean either tracking the metric per query and leaving the guardrail as it is, or changing the guardrail to be per-replica instead of per-query. Tracking the metric per query rather than changing the existing guardrail sounds like the right move, although there isn't really a legitimate multi-partition use-case yet, so most of the time a per-partition metric should be equivalent to a per-query one. I'll push something up today, along with slightly modified inline docs, etc. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 4h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results). It's a minor fix, but would avoid confusion. > * The method call chain in {{DataResolver}} might be a bit simpler if we put > the {{repairedDataTracker}} in {{ResolveContext}}. > *Testing* > * I want to bite the bullet and get some basic tests for RFP (including any > guardrails we might add here) onto the in-JVM dtest framework. > *Guardrails* > * As it stands, we don't have a way to enforce an upper bound on the memory > usage of {{ReplicaFilteringProtection}} which caches row responses from the > first round of requests. (Remember, these are later used to merged with the > second round of results to complete the data for filtering.) Operators will >
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/17/20, 8:44 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/60/workflows/37dcf29a-effb-425a-95de-2362949942c5] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/57/workflows/b408ce8a-ee68-47bf-b9f1-eb9541e9827e] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 4h 20m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159482#comment-17159482 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/17/20, 5:38 AM: --- bq. the main advantage of this approach is that in the absence of conflicts nothing needs to be cached Even though partition-restricted queries without digest mismatches will skip {{DataResolver}} entirely, this still helps in the case where we have a mismatch, but start with a large number of conflict-free rows, correct? If so, I think this is a benefit we should consider, given how likely we are to be dealing with partition-restricted queries. bq. Also we could also just let it unbounded to minimize the number of RFP queries The one thing that makes me a little uneasy about this is the extra logic we need to enforce the "target cache size". I propose we avoid that, simply leave the guardrails we've already got in place to avoid catastrophe (and excessive RFP queries), and see if that means we can simplify what remains (like having to clear the {{contents}} array list between batches). I'll try this and see how it looks and pull it into the main 3.0 branch if it works. Aside from that, I think the only remaining question would be verifying the safety of [aggressively clearing|https://github.com/apache/cassandra/pull/659/commits/30b8f4bebd95b3520b637d6d25d6bc16cb4d81a2] {{responses}}. was (Author: maedhroz): bq. the main advantage of this approach is that in the absence of conflicts nothing needs to be cached Even though partition-restricted queries without digest mismatches will skip {{DataResolver}} entirely, this still helps in the case where we have a mismatch, but start with a large number of conflict-free rows, correct? If so, I think this is a benefit we can't ignore, given how likely we are to be dealing with partition-restricted queries. bq. Also we could also just let it unbounded to minimize the number of RFP queries The one thing that makes me a little uneasy about this is the extra logic we need to enforce the "target cache size". I propose we avoid that, simply leave the guardrails we've already got in place to avoid catastrophe (and excessive RFP queries), and see if that means we can simplify what remains (like having to clear the {{contents}} array list between batches). I'll try this and see how it looks and pull it into the main 3.0 branch if it works. Aside from that, I think the only remaining question would be verifying the safety of [aggressively clearing|https://github.com/apache/cassandra/pull/659/commits/30b8f4bebd95b3520b637d6d25d6bc16cb4d81a2] {{responses}}. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 4h 20m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159370#comment-17159370 ] Andres de la Peña edited comment on CASSANDRA-15907 at 7/16/20, 5:50 PM: - [~maedhroz] I couldn't resist giving a try to the per-row lazy pre-fetch, I have left it [here|https://github.com/adelapena/cassandra/commit/accf2a47c341875942b0d8b06c016cc0d66d62cb]. Instead of consuming all the contents of each merged partition, it consumes them row-per-row until there are replica contents. That way, if there are no conflicts, it only caches one row per replica, instead of an entire partition per replica. Also, iif it finds that there are rows to fetch from the replica, it advances the first phase merged row iterator a bit more until reaching a certain cache size, trying to find a balance between the cache size and the number of RFP queries. Right now that desired cache size is hardcoded to 100, but we could use a config property, or the query limit, for example. Also we could also just let it unbounded to minimize the number of RFP queries, the main advantage of this approach is that in the absence of conflicts nothing needs to be cached. That should benefit the most common case, which is when there are no conflicts. To illustrate how the per-row approach behaves, let's see this example: {code:python} self._prepare_cluster( create_table="CREATE TABLE t (k int, c int, v text, PRIMARY KEY(k, c))", create_index="CREATE INDEX ON t(v)", both_nodes=["INSERT INTO t (k, c, v) VALUES (0, 0, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 1, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 2, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 3, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 4, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 5, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 6, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 7, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 8, 'old')", "INSERT INTO t (k, c, v) VALUES (0, 9, 'old')"], only_node1=["INSERT INTO t (k, c, v) VALUES (0, 4, 'new')", "INSERT INTO t (k, c, v) VALUES (0, 6, 'new')"]) self._assert_all("SELECT c FROM t WHERE v = 'old'", rows=[[0], [1], [2], [3], [5], [7], [8], [9]]) {code} Without the per-row approach we cached 20 rows (10 per replica) and issued one single RFP query. In contrast, with the per-row approach the behaviour should be: * If the target cache size is very high or unbounded, we will cache a max of 12 rows and we will need 1 RFP queries. There are less cached row because we don't cache rows until the first conflict is found in the fourth row. * If the target cache size is 2, we will cache a max of 8 rows and we will need 1 RFP queries. This is because the two conflicts fit into the same window of cached rows, and once they are fetched we don't need to cache more rows. * If we use a target cache size of 1, we will cache a max of 6 rows but, differently from before, we will need 2 separate RFP queries. * If there are no conflicts we will only have one cached row per replica; the current one. Note that consuming rows from the first phase iterator to populate the cache can still produce an unlimited growth of the cache, so we still need the guardrail. The configurable target cache size that I mention is only used to try to find a balance between cache size and grouping of primary keys to fetch. was (Author: adelapena): [~maedhroz] I couldn't resist giving a try to the making the per-row lazy pre-fetch, I have left it [here|https://github.com/adelapena/cassandra/commit/accf2a47c341875942b0d8b06c016cc0d66d62cb]. Instead of consuming all the contents of each merged partition, it consumes them row-per-row until there are replica contents. That way, if there are no conflicts, it only caches one row per replica, instead of an entire partition per replica. Also, iif it finds that there are rows to fetch from the replica, it advances the first phase merged row iterator a bit more until reaching a certain cache size, trying to find a balance between the cache size and the number of RFP queries. Right now that desired cache size is hardcoded to 100, but we could use a config property, or the query limit, for example. Also we could also just let it unbounded to minimize the number of RFP queries, the main advantage of this approach is that in the absence of conflicts nothing needs to be cached. That should benefit the most common case, which is when there are no conflicts. To illustrate how the per-row approach behaves, let's see this example: {code:python} self._prepare_cluster( create_table="CREATE TABLE t (k int, c int, v text, PRIMARY KEY(k, c))", create_index="CREATE INDEX ON t(v)",
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/16/20, 4:19 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/57/workflows/b408ce8a-ee68-47bf-b9f1-eb9541e9827e] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/51/workflows/4fd639f6-6523-4520-961e-5b5c384a13b3] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 4h 10m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17158490#comment-17158490 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/15/20, 4:37 PM: --- Minor note: Failures in the [latest test run|https://app.circleci.com/pipelines/github/maedhroz/cassandra/51/workflows/4fd639f6-6523-4520-961e-5b5c384a13b3/jobs/281] around {{cqlsh_tests.test_cqlsh.TestCqlsh}} are going to push me into a flaky test Jira. (I've seen these on other Jiras in the past couple weeks, but nobody's reported yet.) EDIT: Created CASSANDRA-15948 was (Author: maedhroz): Minor note: Failures in the [latest test run|https://app.circleci.com/pipelines/github/maedhroz/cassandra/51/workflows/4fd639f6-6523-4520-961e-5b5c384a13b3/jobs/281] around {{cqlsh_tests.test_cqlsh.TestCqlsh}} are going to push me into a flaky test Jira. (I've seen these on other Jiras in the past couple weeks, but nobody's reported yet.) > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 3h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results). It's a minor fix, but would avoid confusion. > * The method call chain in {{DataResolver}} might be a bit simpler if we put > the {{repairedDataTracker}} in {{ResolveContext}}. > *Testing* > * I want to bite the bullet and get some basic tests for RFP (including any > guardrails we might add here) onto the in-JVM dtest framework. > *Guardrails* > * As it stands, we don't have a way to enforce an upper bound on the memory > usage of {{ReplicaFilteringProtection}} which caches row responses from the > first round of requests. (Remember, these are later used to merged with the > second round of results to complete the data for filtering.) Operators will > likely need a way to protect themselves, i.e. simply fail queries if they hit > a particular threshold rather than GC nodes into oblivion. (Having control > over limits and page sizes doesn't quite get us there, because stale results > _expand_ the number of incomplete results we must cache.) The fun question is > how we do this, with the primary axes being scope (per-query, global, etc.) > and granularity (per-partition, per-row, per-cell, actual heap usage, etc.). > My starting disposition on the right trade-off between > performance/complexity and accuracy is having something along the lines of > cached rows per query. Prior art suggests this probably makes sense alongside > things like {{tombstone_failure_threshold}} in {{cassandra.yaml}}. --
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/14/20, 9:13 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/51/workflows/4fd639f6-6523-4520-961e-5b5c384a13b3] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/48/workflows/576994bd-0e65-4786-816f-6dfa58c9b0af] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 3.5h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157366#comment-17157366 ] Andres de la Peña edited comment on CASSANDRA-15907 at 7/14/20, 1:54 PM: - I have left [here|https://github.com/adelapena/cassandra/commit/afb3aafbca48d9ec7b5810d8dee2b1b6d6e4dc50] some changes to {{ReplicaFilteringProtection}} to reduce the number of cached rows in some cases. The caching happens when we fully consume pessimistically SRP-protected results of the first iteration. This is done to collect the primary keys of the silent replicas that have to be queried. The suggested change avoids that full consumption of the first iteration results. Instead, it only consumes a partition each time the second phase iterators (those returned by {{queryProtectedPartitions}}) don't have a cached partition to offer. At that moment the first phase iterator is advanced one partition, so it caches and fetches all the partitions between the last SRP-protected partition and the next one. Note that consuming one more merged partition might involve reading an undefined number of partitions from the replica responses if they invalidate each other, so it doesn't eliminate the problem of caching, it just alleviates it in some cases. Caching is still done at the partition level, so it can ask the replicas with a single query per partition. Thus, this change would be beneficial mainly for multi-partition queries, for example the query in the following test will go down from 20 cached rows to just 4: {code:python} self._prepare_cluster( create_table="CREATE TABLE t (k int PRIMARY KEY, v text)", create_index="CREATE INDEX ON t(v)", both_nodes=["INSERT INTO t (k, v) VALUES (5, 'old')", "INSERT INTO t (k, v) VALUES (1, 'old')", "INSERT INTO t (k, v) VALUES (8, 'old')", "INSERT INTO t (k, v) VALUES (0, 'old')", "INSERT INTO t (k, v) VALUES (2, 'old')", "INSERT INTO t (k, v) VALUES (4, 'old')", "INSERT INTO t (k, v) VALUES (7, 'old')", "INSERT INTO t (k, v) VALUES (6, 'old')", "INSERT INTO t (k, v) VALUES (9, 'old')", "INSERT INTO t (k, v) VALUES (3, 'old')"], only_node1=["INSERT INTO t (k, v) VALUES (5, 'new')", "INSERT INTO t (k, v) VALUES (3, 'new')"]) self._assert_all("SELECT * FROM t WHERE v = 'old'", rows=[[1, 'old'], [8, 'old'], [0, 'old'], [2, 'old'], [4, 'old'], [7, 'old'], [6, 'old'], [9, 'old']]) {code} The number of both SRP and RFP requests remains the same in most cases, although in some particular cases the proposed approach can save us a few queries. That's for example the case of [{{test_complementary_deletion_with_limit_and_rows_after}}|https://github.com/apache/cassandra-dtest/blob/68f05b02842ccf4b2859d35a057d3be77d3313ab/replica_side_filtering_test.py#L284-L294], which goes down from 8 to 5 total SRP requests. We could also do something similar at the row level, so instead of advancing partition per partition we would advance row per row until we have a significant amount of either cached data and/or primary keys to fetch. However, even in that case it would still be possible to design (unlikely) scenarios where to advance the first phase iterator just a single row we would need to read (and cache) the entire db. Thus, we would still need the guardrail, and I'm not sure advancing row per row worths the additional complexity, given that the problematic cases are supposed to be unlikely and the implementation doesn't look as straightforward as the proposed partition-based change. Perhaps we could give it a go after 4.0, as an improvement. I haven't updated the guardrail tests, they would need minor changes for the proposed approach because some queries cache less rows than before. [~maedhroz] WDYT? was (Author: adelapena): I have left [here|https://github.com/adelapena/cassandra/commit/afb3aafbca48d9ec7b5810d8dee2b1b6d6e4dc50] some changes to {{ReplicaFilteringProtection}} to reduce the number of cached rows in some cases. That's for example the case of [{{test_complementary_deletion_with_limit_and_rows_after}}|https://github.com/apache/cassandra-dtest/blob/68f05b02842ccf4b2859d35a057d3be77d3313ab/replica_side_filtering_test.py#L284-L294], which goes down from 8 to 5 total SRP requests. The caching happens when we fully consume pessimistically SRP-protected results of the first iteration. This is done to collect the primary keys of the silent replicas that have to be queried. The suggested change avoids that full consumption of the first iteration results. Instead, it only consumes a partition each time the second phase iterators (those returned by {{queryProtectedPartitions}}) don't have a cached partition to offer. At that moment the first phase iterator
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157366#comment-17157366 ] Andres de la Peña edited comment on CASSANDRA-15907 at 7/14/20, 1:53 PM: - I have left [here|https://github.com/adelapena/cassandra/commit/afb3aafbca48d9ec7b5810d8dee2b1b6d6e4dc50] some changes to {{ReplicaFilteringProtection}} to reduce the number of cached rows in some cases. That's for example the case of [{{test_complementary_deletion_with_limit_and_rows_after}}|https://github.com/apache/cassandra-dtest/blob/68f05b02842ccf4b2859d35a057d3be77d3313ab/replica_side_filtering_test.py#L284-L294], which goes down from 8 to 5 total SRP requests. The caching happens when we fully consume pessimistically SRP-protected results of the first iteration. This is done to collect the primary keys of the silent replicas that have to be queried. The suggested change avoids that full consumption of the first iteration results. Instead, it only consumes a partition each time the second phase iterators (those returned by {{queryProtectedPartitions}}) don't have a cached partition to offer. At that moment the first phase iterator is advanced one partition, so it caches and fetches all the partitions between the last SRP-protected partition and the next one. Note that consuming one more merged partition might involve reading an undefined number of partitions from the replica responses if they invalidate each other, so it doesn't eliminate the problem of caching, it just alleviates it in some cases. Caching is still done at the partition level, so it can ask the replicas with a single query per partition. Thus, this change would be beneficial mainly for multi-partition queries, for example the query in the following test will go down from 20 cached rows to just 4: {code:python} self._prepare_cluster( create_table="CREATE TABLE t (k int PRIMARY KEY, v text)", create_index="CREATE INDEX ON t(v)", both_nodes=["INSERT INTO t (k, v) VALUES (5, 'old')", "INSERT INTO t (k, v) VALUES (1, 'old')", "INSERT INTO t (k, v) VALUES (8, 'old')", "INSERT INTO t (k, v) VALUES (0, 'old')", "INSERT INTO t (k, v) VALUES (2, 'old')", "INSERT INTO t (k, v) VALUES (4, 'old')", "INSERT INTO t (k, v) VALUES (7, 'old')", "INSERT INTO t (k, v) VALUES (6, 'old')", "INSERT INTO t (k, v) VALUES (9, 'old')", "INSERT INTO t (k, v) VALUES (3, 'old')"], only_node1=["INSERT INTO t (k, v) VALUES (5, 'new')", "INSERT INTO t (k, v) VALUES (3, 'new')"]) self._assert_all("SELECT * FROM t WHERE v = 'old'", rows=[[1, 'old'], [8, 'old'], [0, 'old'], [2, 'old'], [4, 'old'], [7, 'old'], [6, 'old'], [9, 'old']]) {code} The number of both SRP and RFP requests remains the same in most cases, although in some particular cases the proposed approach can save us a few queries. We could also do something similar at the row level, so instead of advancing partition per partition we would advance row per row until we have a significant amount of either cached data and/or primary keys to fetch. However, even in that case it would still be possible to design (unlikely) scenarios where to advance the first phase iterator just a single row we would need to read (and cache) the entire db. Thus, we would still need the guardrail, and I'm not sure advancing row per row worths the additional complexity, given that the problematic cases are supposed to be unlikely and the implementation doesn't look as straightforward as the proposed partition-based change. Perhaps we could give it a go after 4.0, as an improvement. I haven't updated the guardrail tests, they would need minor changes for the proposed approach because some queries cache less rows than before. [~maedhroz] WDYT? was (Author: adelapena): I have left [here|https://github.com/adelapena/cassandra/commit/afb3aafbca48d9ec7b5810d8dee2b1b6d6e4dc50] some changes to {{ReplicaFilteringProtection}} to reduce the number of cached rows in some cases. The caching happens when we fully consume pessimistically SRP-protected results of the first iteration. This is done to collect the primary keys of the silent replicas that have to be queried. The suggested change avoids that full consumption of the first iteration results. Instead, it only consumes a partition each time the second phase iterators (those returned by {{queryProtectedPartitions}}) don't have a cached partition to offer. At that moment the first phase iterator is advanced one partition, so it caches and fetches all the partitions between the last SRP-protected partition and the next one. Note that consuming one more merged partition might involve reading an undefined number of partitions from the replica responses if they
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157366#comment-17157366 ] Andres de la Peña edited comment on CASSANDRA-15907 at 7/14/20, 1:36 PM: - I have left [here|https://github.com/adelapena/cassandra/commit/afb3aafbca48d9ec7b5810d8dee2b1b6d6e4dc50] some changes to {{ReplicaFilteringProtection}} to reduce the number of cached rows in some cases. The caching happens when we fully consume pessimistically SRP-protected results of the first iteration. This is done to collect the primary keys of the silent replicas that have to be queried. The suggested change avoids that full consumption of the first iteration results. Instead, it only consumes a partition each time the second phase iterators (those returned by {{queryProtectedPartitions}}) don't have a cached partition to offer. At that moment the first phase iterator is advanced one partition, so it caches and fetches all the partitions between the last SRP-protected partition and the next one. Note that consuming one more merged partition might involve reading an undefined number of partitions from the replica responses if they invalidate each other, so it doesn't eliminate the problem of caching, it just alleviates it in some cases. Caching is still done at the partition level, so it can ask the replicas with a single query per partition. Thus, this change would be beneficial mainly for multi-partition queries, for example the query in the following test will go down from 20 cached rows to just 4: {code:python} self._prepare_cluster( create_table="CREATE TABLE t (k int PRIMARY KEY, v text)", create_index="CREATE INDEX ON t(v)", both_nodes=["INSERT INTO t (k, v) VALUES (5, 'old')", "INSERT INTO t (k, v) VALUES (1, 'old')", "INSERT INTO t (k, v) VALUES (8, 'old')", "INSERT INTO t (k, v) VALUES (0, 'old')", "INSERT INTO t (k, v) VALUES (2, 'old')", "INSERT INTO t (k, v) VALUES (4, 'old')", "INSERT INTO t (k, v) VALUES (7, 'old')", "INSERT INTO t (k, v) VALUES (6, 'old')", "INSERT INTO t (k, v) VALUES (9, 'old')", "INSERT INTO t (k, v) VALUES (3, 'old')"], only_node1=["INSERT INTO t (k, v) VALUES (5, 'new')", "INSERT INTO t (k, v) VALUES (3, 'new')"]) self._assert_all("SELECT * FROM t WHERE v = 'old'", rows=[[1, 'old'], [8, 'old'], [0, 'old'], [2, 'old'], [4, 'old'], [7, 'old'], [6, 'old'], [9, 'old']]) {code} The number of both SRP and RFP requests remains the same in most cases, although in some particular cases the proposed approach can save us a few queries. We could also do something similar at the row level, so instead of advancing partition per partition we would advance row per row until we have a significant amount of either cached data and/or primary keys to fetch. However, even in that case it would still be possible to design (unlikely) scenarios where to advance the first phase iterator just a single row we would need to read (and cache) the entire db. Thus, we would still need the guardrail, and I'm not sure advancing row per row worths the additional complexity, given that the problematic cases are supposed to be unlikely. Perhaps we could give it a go after 4.0, as an improvement. I haven't updated the guardrail tests, they would need minor changes for the proposed approach because some queries cache less rows than before. [~maedhroz] WDYT? was (Author: adelapena): I have left [here|https://github.com/adelapena/cassandra/commit/afb3aafbca48d9ec7b5810d8dee2b1b6d6e4dc50] some changes to {{ReplicaFilteringProtection}} to reduce the number of cached rows in some cases. The caching happens when we fully consume pessimistically SRP-protected results of the first iteration. This is done to collect the primary keys of the silent replicas that have to be queried. The suggested change avoids that full consumption of the first iteration results. Instead, it only consumes a partition each time the second phase iterators (those returned by {{queryProtectedPartitions}}) don't have a cached partition to offer. At that moment the first phase iterator is advanced one partition, so it caches and fetches all the partitions between the last SRP-protected partition and the next one. Note that consuming one more merged partition might involve reading an undefined number of partitions from the replica responses if they invalidate each other, so it doesn't eliminate the problem of caching, it just alleviates it in some cases. Caching is still done at the partition level, so it can ask the replicas with a single query per partition. Thus, this change would be beneficial mainly for multi-partition queries, for example the query in the following test will go down from 20 cached
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/13/20, 6:11 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/48/workflows/576994bd-0e65-4786-816f-6dfa58c9b0af] WIP (avoid review ATM) 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/48/workflows/576994bd-0e65-4786-816f-6dfa58c9b0af] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 3.5h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/13/20, 6:10 PM: --- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/48/workflows/576994bd-0e65-4786-816f-6dfa58c9b0af] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/39/workflows/a8187516-e0cd-4e06-b017-6e2e00fcfe3f] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 3.5h > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/9/20, 4:56 AM: -- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/39/workflows/a8187516-e0cd-4e06-b017-6e2e00fcfe3f] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/38/workflows/c3a3b51b-d105-49d9-91f8-2a149cf211b6] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 1h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/9/20, 4:40 AM: -- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/84e48d9e-f3dd-45ff-b70a-b69a86f6eb96] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/37/workflows/79b581ab-94a1-4920-a894-7f0f91ef466b] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/20388274-baa6-4a81-bf8a-5590325b2e7f] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/b463f9e0-10c6-458a-b1e8-cfb635703d07] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 1h 50m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/8/20, 6:15 PM: -- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666], [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/20388274-baa6-4a81-bf8a-5590325b2e7f] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/b463f9e0-10c6-458a-b1e8-cfb635703d07] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665] [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666] [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/20388274-baa6-4a81-bf8a-5590325b2e7f] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/b463f9e0-10c6-458a-b1e8-cfb635703d07] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/8/20, 6:14 PM: -- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665] [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666] [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/20388274-baa6-4a81-bf8a-5590325b2e7f] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/b463f9e0-10c6-458a-b1e8-cfb635703d07] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: CASSANDRA-15881 is visible in the unit tests results. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results). It's a minor fix, but would avoid confusion. > * The method call chain in {{DataResolver}} might be a bit simpler if we put > the {{repairedDataTracker}} in {{ResolveContext}}. > *Testing* > * I want to bite the bullet and get some basic tests for RFP (including any > guardrails we might add here) onto the in-JVM dtest framework. > *Guardrails* > * As it stands, we don't have a way to enforce an upper bound on the memory > usage of {{ReplicaFilteringProtection}} which caches row responses from the > first round of requests. (Remember, these are later used to merged with the > second
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/8/20, 6:14 PM: -- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665] [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666] [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/20388274-baa6-4a81-bf8a-5590325b2e7f] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/b463f9e0-10c6-458a-b1e8-cfb635703d07] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] 3.11: [patch|https://github.com/apache/cassandra/pull/665] [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/31/workflows/2c8a5738-13c0-447d-8ebc-53ade399f94c] trunk: [patch|https://github.com/apache/cassandra/pull/666] [j8 CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/20388274-baa6-4a81-bf8a-5590325b2e7f] [j11 Circle CI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/33/workflows/b463f9e0-10c6-458a-b1e8-cfb635703d07] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: Existing issues described by CASSANDRA-14595 (Thrift dtest) and CASSANDRA-15881 (SASI memtable switching) are visible in the test results so far. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results).
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/7/20, 6:10 PM: -- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] If we're happy with the implementation, the next step will be to do some basic stress testing. Note: CASSANDRA-15881 is visible in the unit tests results. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] If we're happy with the implementation, the next step will be to do some basic stress testing. > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > Time Spent: 1h 10m > Remaining Estimate: 0h > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results). It's a minor fix, but would avoid confusion. > * The method call chain in {{DataResolver}} might be a bit simpler if we put > the {{repairedDataTracker}} in {{ResolveContext}}. > *Testing* > * I want to bite the bullet and get some basic tests for RFP (including any > guardrails we might add here) onto the in-JVM dtest framework. > *Guardrails* > * As it stands, we don't have a way to enforce an upper bound on the memory > usage of {{ReplicaFilteringProtection}} which caches row responses from the > first round of requests. (Remember, these are later used to merged with the > second round of results to complete the data for filtering.) Operators will > likely need a way to protect themselves, i.e. simply fail queries if they hit > a particular threshold rather than GC nodes into oblivion. (Having control > over limits and page sizes doesn't quite get us there, because stale results > _expand_ the number of incomplete results we must cache.) The fun question is > how we do this, with the primary axes being scope (per-query, global, etc.) > and granularity (per-partition, per-row, per-cell, actual heap usage, etc.). > My starting disposition on the right trade-off between > performance/complexity and accuracy is
[jira] [Comment Edited] (CASSANDRA-15907) Operational Improvements & Hardening for Replica Filtering Protection
[ https://issues.apache.org/jira/browse/CASSANDRA-15907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152248#comment-17152248 ] Caleb Rackliffe edited comment on CASSANDRA-15907 at 7/6/20, 6:58 PM: -- [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] If we're happy with the implementation, the next step will be to do some basic stress testing. was (Author: maedhroz): [~jwest] I've hopefully addressed the points from [~adelapena]'s first round of review, so I think this is officially ready for a second reviewer. 3.0: [patch|https://github.com/apache/cassandra/pull/659], [CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra/22/workflows/d272c9e6-1db6-472f-93d9-f2715a25ef97] > Operational Improvements & Hardening for Replica Filtering Protection > - > > Key: CASSANDRA-15907 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15907 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: 2i, memory > Fix For: 3.0.x, 3.11.x, 4.0-beta > > > CASSANDRA-8272 uses additional space on the heap to ensure correctness for 2i > and filtering queries at consistency levels above ONE/LOCAL_ONE. There are a > few things we should follow up on, however, to make life a bit easier for > operators and generally de-risk usage: > (Note: Line numbers are based on {{trunk}} as of > {{3cfe3c9f0dcf8ca8b25ad111800a21725bf152cb}}.) > *Minor Optimizations* > * {{ReplicaFilteringProtection:114}} - Given we size them up-front, we may be > able to use simple arrays instead of lists for {{rowsToFetch}} and > {{originalPartitions}}. Alternatively (or also), we may be able to null out > references in these two collections more aggressively. (ex. Using > {{ArrayList#set()}} instead of {{get()}} in {{queryProtectedPartitions()}}, > assuming we pass {{toFetch}} as an argument to {{querySourceOnKey()}}.) > * {{ReplicaFilteringProtection:323}} - We may be able to use > {{EncodingStats.merge()}} and remove the custom {{stats()}} method. > * {{DataResolver:111 & 228}} - Cache an instance of > {{UnaryOperator#identity()}} instead of creating one on the fly. > * {{ReplicaFilteringProtection:217}} - We may be able to scatter/gather > rather than serially querying every row that needs to be completed. This > isn't a clear win perhaps, given it targets the latency of single queries and > adds some complexity. (Certainly a decent candidate to kick even out of this > issue.) > *Documentation and Intelligibility* > * There are a few places (CHANGES.txt, tracing output in > {{ReplicaFilteringProtection}}, etc.) where we mention "replica-side > filtering protection" (which makes it seem like the coordinator doesn't > filter) rather than "replica filtering protection" (which sounds more like > what we actually do, which is protect ourselves against incorrect replica > filtering results). It's a minor fix, but would avoid confusion. > * The method call chain in {{DataResolver}} might be a bit simpler if we put > the {{repairedDataTracker}} in {{ResolveContext}}. > *Testing* > * I want to bite the bullet and get some basic tests for RFP (including any > guardrails we might add here) onto the in-JVM dtest framework. > *Guardrails* > * As it stands, we don't have a way to enforce an upper bound on the memory > usage of {{ReplicaFilteringProtection}} which caches row responses from the > first round of requests. (Remember, these are later used to merged with the > second round of results to complete the data for filtering.) Operators will > likely need a way to protect themselves, i.e. simply fail queries if they hit > a particular threshold rather than GC nodes into oblivion. (Having control > over limits and page sizes doesn't quite get us there, because stale results > _expand_ the number of incomplete results we must cache.) The fun question is > how we do this, with the primary axes being scope (per-query, global, etc.) > and granularity (per-partition, per-row, per-cell, actual heap usage, etc.). > My starting disposition on the right trade-off between > performance/complexity and accuracy is having something along the lines of > cached rows per query. Prior art suggests this probably makes sense alongside > things like {{tombstone_failure_threshold}} in {{cassandra.yaml}}. -- This message was sent