[jira] [Commented] (HBASE-29103) Avoid excessive allocations during reverse scanning when seeking to next row

Hudson (Jira) Sat, 22 Mar 2025 18:09:14 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-29103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937657#comment-17937657
 ]


Hudson commented on HBASE-29103:
--------------------------------

Results for branch branch-3
        [build #396 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/396/]: 
(x) *{color:red}-1 overall{color}*
----
details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/396/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk17 hadoop3 checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/396/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk17 hadoop 3.3.5 backward compatibility checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/396/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk17 hadoop 3.3.6 backward compatibility checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/396/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk17 hadoop 3.4.0 backward compatibility checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/396/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test for 3.3.5 {color}
(/) {color:green}+1 client integration test for 3.3.6 {color}
(/) {color:green}+1 client integration test for 3.4.0 {color}
(/) {color:green}+1 client integration test for 3.4.1 {color}


> Avoid excessive allocations during reverse scanning when seeking to next row
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-29103
>                 URL: https://issues.apache.org/jira/browse/HBASE-29103
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 3.0.0-beta-1, 2.6.1, 2.5.11
>            Reporter: Becker Ewing
>            Assignee: Becker Ewing
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: high-block-cache-key-to-string-alloc-profile.html
>
>
> Currently, when we're reverse scanning in a storefile, the general path is to:
>  # Seek to before the current row to find the prior row
>  # Seek to the beginning of the prior row
> (this can get a bit more complex depending on how fast a single "seek" 
> operation is; see HBASE-28043 for additional details).
>  
> At step 1, we call HFileScanner#getCell and then always call 
> PrivateCellUtil.createFirstOnRow() on this Cell instance 
> ([code|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java#L611-L614]).
>  PrivateCellUtil.createFirstOnRow() creates a [copy of only the row portion 
> of this 
> Cell|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-common/src/main/java/org/apache/hadoop/hbase/PrivateCellUtil.java#L2768-L2775].
>  
>  
> Since we only use the key portion of the cell returned by 
> HFileScanner#getCell, I propose that we instead call 
> [HFileScanner#getKey|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java#L91-L96]
>  in this scenario, so that we avoid deep-copying extra components of the Cell 
> such as the value and tags. This should be a safe change, as this Cell instance 
> never escapes StoreFileScanner and we only call HFileScanner#getCell when the 
> scanner is already seeked.
>  
> Attached is the same allocation profile that guided the optimizations in 
> HBASE-29099. It shows that about 3% of allocations occur in 
> [BufferedEncodedSeeker.getCell in the body of 
> seekBeforeAndSaveKeyToPreviousRow|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java#L284-L348].
>  The region server in question here was pinned at 100% CPU utilization for a 
> while and was running a reverse-scan heavy workload.
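
The allocation difference described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyCell, ToyScanner, and firstOnRow are hypothetical stand-ins invented for illustration and are not HBase classes or the actual patch. The sketch assumes a getCell-style call deep-copies the key and the value, a getKey-style call copies only the key parts, and the caller reads nothing but the row.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Toy model only: ToyCell and ToyScanner are hypothetical stand-ins, not HBase classes.
class ReverseSeekAllocationSketch {

    /** A cell whose components live in separate heap arrays. */
    static final class ToyCell {
        final byte[] row;
        final byte[] qualifier;
        final byte[] value;

        ToyCell(byte[] row, byte[] qualifier, byte[] value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** A scanner positioned on one entry; it counts the bytes it copies out. */
    static final class ToyScanner {
        private static final byte[] EMPTY = new byte[0];
        private final ToyCell current;
        long bytesCopied = 0;

        ToyScanner(ToyCell current) {
            this.current = current;
        }

        /** Models a getCell-style call: deep-copies the key parts and the value. */
        ToyCell getCell() {
            return new ToyCell(copy(current.row), copy(current.qualifier), copy(current.value));
        }

        /** Models a getKey-style call: copies the key parts and skips the value. */
        ToyCell getKey() {
            return new ToyCell(copy(current.row), copy(current.qualifier), EMPTY);
        }

        private byte[] copy(byte[] src) {
            bytesCopied += src.length;
            return Arrays.copyOf(src, src.length);
        }
    }

    /** Models a createFirstOnRow-style consumer: only the row is read. */
    static byte[] firstOnRow(ToyCell cell) {
        return Arrays.copyOf(cell.row, cell.row.length);
    }

    public static void main(String[] args) {
        ToyCell stored = new ToyCell(
            "row-42".getBytes(StandardCharsets.UTF_8),
            "q".getBytes(StandardCharsets.UTF_8),
            new byte[4096]);

        ToyScanner viaCell = new ToyScanner(stored);
        byte[] seekKeyA = firstOnRow(viaCell.getCell());

        ToyScanner viaKey = new ToyScanner(stored);
        byte[] seekKeyB = firstOnRow(viaKey.getKey());

        System.out.println("same seek key: " + Arrays.equals(seekKeyA, seekKeyB));
        System.out.println("bytes copied via getCell: " + viaCell.bytesCopied);
        System.out.println("bytes copied via getKey:  " + viaKey.bytesCopied);
    }
}
```

In this toy, both paths produce the same seek key, while the getKey-style path copies only the key bytes. How much the real change saves depends on value and tag sizes and on the block encoding in use.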



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
