[ 
https://issues.apache.org/jira/browse/HBASE-29135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Connell updated HBASE-29135:
------------------------------------
    Attachment: create-decompression-stream-zstd.html

> ZStandard decompression can operate directly on ByteBuffs
> ---------------------------------------------------------
>
>                 Key: HBASE-29135
>                 URL: https://issues.apache.org/jira/browse/HBASE-29135
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Charles Connell
>            Assignee: Charles Connell
>            Priority: Minor
>         Attachments: create-decompression-stream-zstd.html
>
>
> I've been thinking about ways to improve HBase's performance when reading 
> HFiles, and I believe there is significant opportunity. I look at many 
> RegionServer profile flamegraphs from my company's servers. A recurring 
> pattern is that object allocation in a very hot code path is a performance 
> killer. The HFile decoding code makes some effort to avoid this, but it isn't 
> totally successful.
> Each time a block is decoded in HFileBlockDefaultDecodingContext, a new 
> DecompressorStream is allocated and used. This is a lot of allocation, and 
> the use of the streaming pattern requires copying every byte to be 
> decompressed more times than necessary. Each byte is copied from a ByteBuff 
> into a byte[], then decompressed, then copied back to a ByteBuff. For 
> decompressors like org.apache.hadoop.hbase.io.compress.zstd.ZstdDecompressor 
> that only operate on direct memory, two additional copies are introduced to 
> move from a byte[] to a direct NIO ByteBuffer, then back to a byte[].
> Aside from the one copy inherent in decompression itself, from a compressed 
> buffer to an uncompressed buffer, all of these other copies can be avoided 
> without sacrificing functionality. Along the way, we'll also avoid 
> allocating objects.
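
To make the buffer-to-buffer idea in the description concrete, here is a minimal sketch of decompressing straight from one direct NIO ByteBuffer into another, with no intermediate byte[] and no stream wrapper. It is not the HBase patch and does not use HBase's ByteBuff or ZstdDecompressor: java.util.zip.Inflater (whose ByteBuffer overloads require Java 11 or later) stands in for zstd, and the class and method names are hypothetical, chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative only: java.util.zip stands in for zstd, and none of these
// names come from HBase.
final class DirectBufferInflateSketch {

  /** Demo setup: compresses a byte[] into a direct buffer. */
  static ByteBuffer compress(byte[] input) {
    Deflater deflater = new Deflater();
    try {
      deflater.setInput(input);
      deflater.finish();
      // Generous size for small demo inputs; not a general-purpose bound.
      ByteBuffer out = ByteBuffer.allocateDirect(input.length * 2 + 64);
      while (!deflater.finished()) {
        if (deflater.deflate(out) == 0 && !deflater.finished()) {
          throw new IllegalStateException("output buffer too small");
        }
      }
      out.flip();
      return out;
    } finally {
      deflater.end();
    }
  }

  /**
   * Decompresses from one buffer directly into another. The only copy is
   * the one the decompressor itself performs from source to destination.
   */
  static ByteBuffer decompress(ByteBuffer compressed, int uncompressedSize)
      throws DataFormatException {
    Inflater inflater = new Inflater();
    try {
      ByteBuffer out = ByteBuffer.allocateDirect(uncompressedSize);
      // duplicate() leaves the caller's position and limit untouched.
      inflater.setInput(compressed.duplicate());
      while (!inflater.finished()) {
        if (inflater.inflate(out) == 0 && !inflater.finished()) {
          throw new DataFormatException("truncated input or wrong uncompressed size");
        }
      }
      out.flip();
      return out;
    } finally {
      inflater.end();
    }
  }

  public static void main(String[] args) throws DataFormatException {
    byte[] original = "row1/cf:q=value row2/cf:q=value row3/cf:q=value"
        .getBytes(StandardCharsets.UTF_8);
    ByteBuffer compressed = compress(original);
    ByteBuffer restored = decompress(compressed, original.length);
    byte[] check = new byte[restored.remaining()];
    restored.get(check);
    System.out.println(java.util.Arrays.equals(original, check));
  }
}
```

In this sketch a new Inflater and output buffer are still allocated on every call. A version aimed at the allocation concern raised above would presumably reuse both across blocks, but that is left out to keep the example short.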



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
