[ https://issues.apache.org/jira/browse/HBASE-28256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799546#comment-17799546 ]
Becker Ewing commented on HBASE-28256:
--------------------------------------

At the higher end (longs that encode to 6 or more vLong bytes), readVLongTimestamp appears to outperform readVLongHBase14186 when padding exists (which I'm pretty sure will be the common case) and when we're not using the "none" recycler (i.e. the performance region servers will see until HBASE-27730 lands). Since I plan on getting to https://issues.apache.org/jira/browse/HBASE-27730 soon, this likely isn't a huge issue, as readVLongHBase14186 generally performs much better on the "none" recycler.

> Enhance ByteBufferUtils.readVLong to read 8 bytes at a time
> -----------------------------------------------------------
>
>                 Key: HBASE-28256
>                 URL: https://issues.apache.org/jira/browse/HBASE-28256
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: Becker Ewing
>            Assignee: Becker Ewing
>            Priority: Major
>         Attachments: ReadVLongBenchmark.zip, async-prof-rs-cpu.html
>
> Currently, ByteBufferUtils.readVLong is used to decode rows in all data block encodings in order to read the memstoreTs field. For a data block encoding like prefix, ByteBufferUtils.readVLong can surprisingly occupy over 50% of the CPU time in BufferedEncodedSeeker.decodeNext (which can be quite a hot method in seek operations).
>
> Since memstoreTs will typically require at least 6 bytes to store, we could look to vectorize the read path for readVLong to read 8 bytes at a time instead of a single byte at a time (like in https://issues.apache.org/jira/browse/HBASE-28025) in order to increase performance.
>
> Attached is a CPU flamegraph of a region server process, which shows that we spend a surprising amount of time decoding rows from the DBE in ByteBufferUtils.readVLong.
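To make the 8-bytes-at-a-time idea concrete, here is a minimal sketch of decoding a WritableUtils-style vLong with a single wide read in place of the byte-at-a-time loop. This is not HBase's actual readVLongHBase14186 or readVLongTimestamp implementation; the class and method names are hypothetical, and it assumes a big-endian ByteBuffer with at least 8 readable bytes after the length byte, i.e. the "padding exists" case discussed in the comment above.

{code:java}
import java.nio.ByteBuffer;

public final class VLongWideReadSketch {

  /**
   * Decode a WritableUtils-style vLong, fetching the payload with one
   * 8-byte read instead of a byte-at-a-time loop. Assumes the buffer
   * uses big-endian order (the ByteBuffer default) and that at least
   * 8 bytes are readable at the payload offset ("padding exists").
   */
  static long readVLongWide(ByteBuffer buf) {
    byte firstByte = buf.get();
    if (firstByte >= -112) {
      return firstByte; // single-byte encoding: the value is the length byte itself
    }
    // First byte encodes sign and payload length (1..8 bytes).
    boolean negative = firstByte < -120;
    int len = negative ? -120 - firstByte : -112 - firstByte;

    // One wide read covering every possible payload byte; anything past the
    // actual payload is discarded by the shift below.
    long wide = buf.getLong(buf.position());
    buf.position(buf.position() + len);

    long result = wide >>> (8 * (8 - len));
    return negative ? ~result : result;
  }
}
{code}

The byte-at-a-time baseline performs the same shift-and-mask work once per payload byte, so the wide read trades up to eight buffer accesses for a single load plus one shift, which is consistent with the benchmark gap above opening up at the 6-byte-and-larger encodings.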