[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169628#comment-17169628 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

Re: Tests.
This is purely an internal optimization of another optimization (no kidding) with no functional impact.
For the SEEK optimization we have tests that assert the number of SEEKs vs SKIPs during scanning. I cannot think of any useful additional tests.

Lemme perhaps check if there are SEEK vs SKIP tests with ROWCOL BFs enabled. Or [~bharathv], could you also perhaps have a look, as I'm off the next few weeks.

> Improve performance of SKIP vs SEEK logic
> -----------------------------------------
>
>                 Key: HBASE-24742
>                 URL: https://issues.apache.org/jira/browse/HBASE-24742
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance, regionserver
>    Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.4.0, 2.1.10, 2.2.6
>
>         Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162795#comment-17162795 ]

Hudson commented on HBASE-24742:
--------------------------------

Results for branch branch-2.3
[build #189 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/189/]:
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/189/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/189/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/189/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/189/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162382#comment-17162382 ]

Nick Dimiduk commented on HBASE-24742:
--------------------------------------

bq. Looks like HBASE-19863 added some coverage. Do we need more than that?

It's hard to say; those tests pass with and without this change. We pushed a change to a critical section of code to all maintenance branches that was not explicitly accompanied by updated test coverage. It will go out in patch releases on three release lines. I'm just asking.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162268#comment-17162268 ]

Bharath Vissapragada commented on HBASE-24742:
----------------------------------------------

Looks like HBASE-19863 added some coverage. Do we need more than that?
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162235#comment-17162235 ]

Nick Dimiduk commented on HBASE-24742:
--------------------------------------

I'm still retrying toward a passing precommit run for branch-2.3. In the meantime, would it be possible to get a correctness test case to go along with this change? This seems like too fundamental an area to be tweaking without test coverage.

Thanks.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17160250#comment-17160250 ]

Hudson commented on HBASE-24742:
--------------------------------

Results for branch branch-2.2
[build #915 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/915/]:
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/915//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/915//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/915//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159872#comment-17159872 ]

Hudson commented on HBASE-24742:
--------------------------------

Results for branch branch-1
[build #1327 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1327/]:
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1327//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1327//JDK7_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1327//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 source release artifact{color}
-- See build output for details.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159861#comment-17159861 ]

Hudson commented on HBASE-24742:
--------------------------------

Results for branch master
[build #1787 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1787/]:
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/1787/General_20Nightly_20Build_20Report/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console output|https://builds.apache.org/job/HBase%20Nightly/job/master/1787//console].

(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console output|https://builds.apache.org/job/HBase%20Nightly/job/master/1787//console].

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159706#comment-17159706 ]

Hudson commented on HBASE-24742:
--------------------------------

Results for branch branch-2
[build #2748 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2748/]:
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2748/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2748/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2748/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2748/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159456#comment-17159456 ]

Andrew Kyle Purtell commented on HBASE-24742:
---------------------------------------------

Sounds good. The other issue is available for branch-2 findings.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159444#comment-17159444 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

Master (and branch-2) patch. Will just apply as they're the same as the branch-1 patch.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159385#comment-17159385 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

Merged into branch-1.
I'll look into master/branch-2, but my feeling is that things are quite different there.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158497#comment-17158497 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

Created a PR for observation #1 above.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17158167#comment-17158167 ]

Duo Zhang commented on HBASE-24742:
-----------------------------------

{quote}
I'm not sure how we can avoid passing the fake keys up, since it is designed to handle things at the "upper" heap. At least not without a lot of refactoring.
{quote}

We need a big refactoring for this change, I believe. I will open a brainstorm issue later, as I'm a bit busy these days: writing the summary for the first half and making the plan for the second half...
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157895#comment-17157895 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

Upon second thought: perhaps a seek (not a reseek, but a seek that could actually go backwards) could make it so that previousIndexedKey and nextIndexedKey are accidentally the same, and we *still* would have to do the compare.

In my test the majority of the improvement came from the first change.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157890#comment-17157890 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

[~zhangduo] I'm with you there. The second part was harder to reason about, and I feel a bit less easy about it. In the end it's an optimization to save a comparison; previousIndexedKey and nextIndexedKey will never accidentally be the same (as in identical), so I *think* it should be OK.

I'm not sure how we can avoid passing the fake keys up, since it is designed to handle things at the "upper" heap. At least not without a lot of refactoring.

[~apurtell] I'll take a look at HBASE-24637 (I'm a bit thinly spread, though).
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157815#comment-17157815 ]

Duo Zhang commented on HBASE-24742:
-----------------------------------

And on the fake cell, maybe we should not pass it to the upper layer? I think the intention here is to avoid doing a real seek for a store scanner if it just needs to know its order in the KeyValueScanner, and to delay the actual seek. Now I believe the code logic is to peek the kv and compare them directly. Maybe we could introduce something like getKeyForComparison? Then, when we actually call peek or next, we will have done the real seek and get a 'real' kv.

In general, I think we expose too many internal things to the upper layer in KeyValueScanner: we have seek, reseek, requestSeek, realSeekDone, enforceSeek... Maybe when we introduced them in the first place they worked well and improved performance, but later, as we added more code and fixed bugs, people misused them and caused performance regressions...

Looking at the POC, #1 is good, but I'm a bit nervous about #2, as in the discussion in HBASE-17958 we claimed that the index key could change during the next call. Will learn more when the PR is ready.

Thanks.
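A minimal sketch of the getKeyForComparison idea floated in the comment above, under heavy assumptions: the class, its fields, and the use of a sorted set of strings as a stand-in for a store file are all hypothetical and are not part of the HBase KeyValueScanner interface. The point it tries to show is only that ordering can be decided from the requested key, while the real seek is paid for once, when a caller asks for an actual cell.

```java
import java.util.Collection;
import java.util.TreeSet;

/** Hypothetical lazy-seek scanner; an illustration only, not HBase code. */
final class LazySeekScanner {
    private final TreeSet<String> keys;   // stands in for the sorted cells of one store file
    private String requested;             // key asked for, not yet materialized
    private String current;               // real key, valid once the seek has been done
    private boolean realSeekDone = true;
    int realSeeks = 0;                    // counts the expensive operations

    LazySeekScanner(Collection<String> sortedKeys) {
        this.keys = new TreeSet<>(sortedKeys);
        this.current = keys.isEmpty() ? null : keys.first();
    }

    /** Remember where we want to go, but do no I/O yet. */
    void requestSeek(String target) {
        requested = target;
        realSeekDone = false;
    }

    /** Cheap key used only for ordering scanners against each other. */
    String keyForComparison() {
        return realSeekDone ? current : requested;
    }

    /** Returns a real key, performing the deferred seek if necessary. */
    String peek() {
        if (!realSeekDone) {
            current = keys.ceiling(requested); // first real key at or after the target
            realSeeks++;
            realSeekDone = true;
        }
        return current;
    }
}
```

With this shape, the placeholder key never leaves the scanner through peek(); only the comparison path sees it.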
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157807#comment-17157807 ]

Bharath Vissapragada commented on HBASE-24742:
----------------------------------------------

> I think the discussion in HBASE-17958 is enough to show that the logic is necessary

Yep, not suggesting that we undo that patch. Instead we should comprehensively fix the codepaths to not do extra byte compares. So I agree with you.

> I was/am planning a review of any commit that touches SQM and friends. This was a bit daunting because (I am guessing) the number of commits from circa 1.3 to 2.2 is more than a handful.

[~apurtell] look at the flame graph I attached. I also noticed the bump in the number of re-seeks. Based on my analysis I think HBASE-17958 is related; there are more cases where the skip hinting fails in the code I pasted earlier. Overall I think both issues are related. We just tested a part of the fix (which is to reduce the number of byte comparisons), but we need to analyze the code properly to see where the hinting fails and then re-seeks, which is essentially your jira.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157791#comment-17157791 ]

Andrew Kyle Purtell commented on HBASE-24742:
---------------------------------------------

There is an actual regression in SKIP hint handling in branch-2, see HBASE-24637. Is this related? In the HBASE-24637 case the difference is not so much more comparisons on a hot path but an actual serious regression with respect to reseeking (I/O). I went out on vacation (and am still out) before tracking this down.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157787#comment-17157787 ]

Duo Zhang commented on HBASE-24742:
-----------------------------------

I think the discussion in HBASE-17958 is enough to show that the logic is necessary? Let's not go back to letting the filters deal with strange cells.

Will take a look at the patch.

Anyway, back to the problem: I agree we do too many byte comparisons. We should find a general way to deal with it. I was imagining that we add some methods to ScannerContext to record whether we have changed the row, family, column, or version, so that in most cases we do not need to do the byte comparison again?

Thanks.
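One way to picture the suggestion in the comment above, as a sketch only: compare the key parts once when the scan advances, record what changed, and let later code read the recorded answer. Every name below (CellKey, ChangeTracker, Change) is hypothetical; this is not the HBase ScannerContext API, and whether the real code paths could share such a record is exactly what the comment leaves open.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical cell coordinates; not an HBase type. */
final class CellKey {
    final byte[] row, family, qualifier;

    CellKey(byte[] row, byte[] family, byte[] qualifier) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
    }

    static CellKey of(String row, String family, String qualifier) {
        return new CellKey(row.getBytes(StandardCharsets.UTF_8),
                family.getBytes(StandardCharsets.UTF_8),
                qualifier.getBytes(StandardCharsets.UTF_8));
    }
}

/** Records, once per advance, which part of the key changed. */
final class ChangeTracker {
    enum Change { ROW, FAMILY, COLUMN, VERSION }

    private CellKey last;
    private Change lastChange = Change.ROW;
    int byteCompares = 0; // number of byte-array comparisons performed

    private boolean same(byte[] a, byte[] b) {
        byteCompares++;
        return Arrays.equals(a, b);
    }

    /** Called once when the scan moves to the next cell. */
    Change advance(CellKey next) {
        if (last == null || !same(last.row, next.row)) {
            lastChange = Change.ROW;
        } else if (!same(last.family, next.family)) {
            lastChange = Change.FAMILY;
        } else if (!same(last.qualifier, next.qualifier)) {
            lastChange = Change.COLUMN;
        } else {
            lastChange = Change.VERSION; // same row, family and column
        }
        last = next;
        return lastChange;
    }

    /** Later readers consult the recorded result instead of comparing bytes again. */
    boolean rowChanged() { return lastChange == Change.ROW; }
    boolean columnChanged() { return lastChange != Change.VERSION; }
}
```

After one advance(), any number of rowChanged() or columnChanged() calls cost no further byte comparisons.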
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157779#comment-17157779 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

Passes the test added in HBASE-19863 and brings the runtime of a test Phoenix query from 5.8s to 4.2s.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157776#comment-17157776 ] Bharath Vissapragada commented on HBASE-24742: -- To add more color, following is the tight loop that Lars is talking about {noformat} protected boolean trySkipToNextColumn(Cell cell) throws IOException { Cell nextCell = null; // used to guard against a changed next indexed key by doing a identity comparison // when the identity changes we need to compare the bytes again Cell previousIndexedKey = null; do { Cell nextIndexedKey = getNextIndexedKey(); if (nextIndexedKey != null && nextIndexedKey != KeyValueScanner.NO_NEXT_INDEXED_KEY && (nextIndexedKey == previousIndexedKey || matcher.compareKeyForNextColumn(nextIndexedKey, cell) >= 0)) { <= this.heap.next(); ++kvsScanned; previousIndexedKey = nextIndexedKey; } else { return false; } } while ((nextCell = this.heap.peek()) != null && CellUtil.matchingRowColumn(cell, nextCell)); // We need this check because it may happen that the new scanner that we get // during heap.next() is requiring reseek due of fake KV previously generated for // ROWCOL bloom filter optimization. See HBASE-19863 for more details if (nextCell != null && matcher.compareKeyForNextColumn(nextCell, cell) < 0) {. <=== return false; } return true; } {noformat} Specifically that was added to prevent SQM from matching the skipped rows but it turns out that it does may more compare checks than what it was before. To test our theory we've undone the loop and let the SQM match the rows and we gained almost ~30% back in scans with explicit column filters. But again as discussed in HBASE-17958, that comes at an expense of correctness that filters shouldn't see skipped rows. [~zghao] [~zhangduo] FYI since you were involved in the original jira fix and implementation. 
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157775#comment-17157775 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

Here's a patch. Please have a careful look, especially at the part that turns previousIndexedKey into a member.
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157774#comment-17157774 ]

Lars Hofhansl commented on HBASE-24742:
---------------------------------------

There are two observations:
1. We do not need to check for "fake" keys inserted by the ROWCOL BF logic if there are no ROWCOL BFs (or if they are not used).
2. We can extend the identity compare of the nextIndexedKey across multiple calls. The identity compare is just an optimization and is not needed for correctness.
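
The second observation can be illustrated with a small standalone Java sketch. It is not the patch attached to this issue: SkipDecider, shouldSkip, key, compare, and the byteComparisons counter are hypothetical names, and plain byte arrays stand in for HBase cells. The sketch assumes only what the comment states, namely that the previously seen index key can be kept as a member so the byte-wise comparison runs only when a different index key object shows up.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the scanner state; this is not the HBase StoreScanner. */
class SkipDecider {
    // Kept as a member (not a local), so the identity check also works across calls.
    private byte[] previousIndexedKey;
    // Counts the expensive byte-wise comparisons, for demonstration only.
    int byteComparisons;

    /** Helper that turns a string into a key; illustration only. */
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison of two keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns true if skipping forward looks cheaper than seeking, i.e. the
     * index key of the next block is at or after the target key.
     */
    boolean shouldSkip(byte[] nextIndexedKey, byte[] targetKey) {
        if (nextIndexedKey == null) {
            return false;
        }
        if (nextIndexedKey == previousIndexedKey) {
            return true; // same object as last time: reuse the earlier answer
        }
        byteComparisons++;
        if (compare(nextIndexedKey, targetKey) >= 0) {
            previousIndexedKey = nextIndexedKey;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SkipDecider decider = new SkipDecider();
        byte[] blockBoundary = key("row9");
        // Three consecutive calls against the same index key object.
        System.out.println(decider.shouldSkip(blockBoundary, key("row1")));
        System.out.println(decider.shouldSkip(blockBoundary, key("row2")));
        System.out.println(decider.shouldSkip(blockBoundary, key("row3")));
        System.out.println(decider.byteComparisons);
    }
}
```

Running main prints true three times followed by 1: only the first call pays for a byte-wise comparison, and the next two are answered by the identity check. In this sketch a reused answer can only influence the skip-or-seek choice, which matches the point that the identity compare is an optimization and not a correctness mechanism.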