[ 
https://issues.apache.org/jira/browse/HBASE-28256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801863#comment-17801863
 ] 

Becker Ewing commented on HBASE-28256:
--------------------------------------

I tested whether ByteBufferUtils.readVLong is currently inlined by using the 
VM args above. When running TestDataBlockEncoders#testSeekingOnSample on the 
5th parameter set (i.e. use memstoreTs, no tags, and off-heap backed cells), I 
found that readVLong is inlined only some of the time ([full stdout logs 
available here|https://gist.github.com/jbewing/a9fc9d6f3a58d211c78bd6f4e2e97449]). 
If I'm interpreting the logs correctly, readVLong is inlined for the prefix 
and diff seekers. The current method size is ~161 bytes, which puts it 
on the larger side as far as inlining is concerned. 

> Enhance ByteBufferUtils.readVLong to read 8 bytes at a time
> -----------------------------------------------------------
>
>                 Key: HBASE-28256
>                 URL: https://issues.apache.org/jira/browse/HBASE-28256
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: Becker Ewing
>            Assignee: Becker Ewing
>            Priority: Major
>         Attachments: ReadVLongBenchmark.zip, async-prof-rs-cpu.html
>
>
> Currently, ByteBufferUtils.readVLong is used to decode rows in all data block 
> encodings in order to read the memstoreTs field. For a data block encoding 
> like prefix, ByteBufferUtils.readVLong can surprisingly occupy over 50% of 
> the CPU time in BufferedEncodedSeeker.decodeNext (which can be quite a hot 
> method in seek operations).
>  
> Since memstoreTs will typically require at least 6 bytes to store, we could 
> look to vectorize the read path for readVLong to read 8 bytes at a time 
> instead of a single byte at a time (like in 
> https://issues.apache.org/jira/browse/HBASE-28025) in order to increase 
> performance.
>  
> Attached is a CPU flamegraph of a region server process which shows that we 
> spend a surprising amount of time in decoding rows from the DBE in 
> ByteBufferUtils.readVLong.
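
The idea in the description above can be illustrated with a minimal sketch. 
This is not the HBase patch: the class and method names (VLongSketch, 
readVLongBytewise, readVLongWide) are hypothetical, the encoding is assumed to 
be the Hadoop WritableUtils-style vlong format, and the buffer is assumed to 
be a big-endian java.nio.ByteBuffer (the default order). The wide variant 
loads the payload with one 8-byte read and shifts away the surplus bytes, 
falling back to the byte loop near the end of the buffer.

```java
import java.nio.ByteBuffer;

final class VLongSketch {
    /** Total encoded size (1..9 bytes) implied by the first byte. */
    static int decodeVIntSize(byte first) {
        if (first >= -112) return 1;
        if (first < -120) return -119 - first;
        return -111 - first;
    }

    /** True when the first byte marks a negative value. */
    static boolean isNegativeVInt(byte first) {
        return first < -120 || (first >= -112 && first < 0);
    }

    /** Encoder, included only so the sketch can be round-trip tested. */
    static void writeVLong(ByteBuffer buf, long v) {
        if (v >= -112 && v <= 127) {
            buf.put((byte) v);
            return;
        }
        int len = -112;
        if (v < 0) {
            v = ~v; // store the one's complement of negative values
            len = -120;
        }
        for (long tmp = v; tmp != 0; tmp >>= 8) {
            len--;
        }
        buf.put((byte) len);
        int payload = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = payload; idx != 0; idx--) {
            buf.put((byte) (v >>> ((idx - 1) * 8)));
        }
    }

    /** Baseline: one buffer read per payload byte. */
    static long readVLongBytewise(ByteBuffer buf) {
        byte first = buf.get();
        int size = decodeVIntSize(first);
        if (size == 1) return first;
        long value = 0;
        for (int i = 0; i < size - 1; i++) {
            value = (value << 8) | (buf.get() & 0xFF);
        }
        return isNegativeVInt(first) ? ~value : value;
    }

    /** Wide variant: one 8-byte load when enough bytes remain. */
    static long readVLongWide(ByteBuffer buf) {
        byte first = buf.get();
        int size = decodeVIntSize(first);
        if (size == 1) return first;
        int payload = size - 1; // 1..8 payload bytes
        long value;
        if (buf.remaining() >= Long.BYTES) {
            // Absolute big-endian read; the payload sits in the top bytes.
            long word = buf.getLong(buf.position());
            value = word >>> (8 * (Long.BYTES - payload));
            buf.position(buf.position() + payload);
        } else {
            // Fewer than 8 bytes left: fall back to the byte loop.
            value = 0;
            for (int i = 0; i < payload; i++) {
                value = (value << 8) | (buf.get() & 0xFF);
            }
        }
        return isNegativeVInt(first) ? ~value : value;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(32);
        writeVLong(buf, 1704067200000L); // arbitrary value with a 6-byte payload
        buf.putLong(0L);                 // trailing bytes so the wide path is taken
        buf.flip();
        long wide = readVLongWide(buf);
        buf.rewind();
        long bytewise = readVLongBytewise(buf);
        System.out.println(wide + " " + bytewise);
    }
}
```

Whether the single wide load is actually faster depends on the JIT and on how 
often the fallback branch is hit, so a sketch like this would need to be 
benchmarked before drawing conclusions.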



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
