[ 
https://issues.apache.org/jira/browse/HBASE-15545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804066#comment-16804066
 ] 

Eiichi Sato commented on HBASE-15545:
-------------------------------------

We had the same issue and found that it gets more serious when Lz4Codec or 
SnappyCodec is used (both of which by default allocate 256 KiB for every 
decompress(), far more than GzipCodec's default of 4 KiB), or when the 
compressed BlockCache is enabled (where every read needs to decompress() 
cached blocks). In our case, DecompressorStream accounts for about 48% of 
memory allocations in a compaction thread, and more than half of the 
RegionServer's memory allocations in total. The allocation rate was too high, 
and we suffered from occasional allocation stalls with ZGC.

https://github.com/eiiches/hbase/commit/ad1ec4081b0ec9af5e20befaa1d09d0852e60d02
https://github.com/eiiches/hadoop/commit/e3337840b6e34236342c039b8a0b9fb9fcccfa40

We applied these patches to our cluster and saw a 60-70% reduction in 
allocation rate. My approach is to cache DecompressorStream instances "weakly" 
in a ThreadLocal and reuse them. WeakReference is used so that the cache is 
not retained for too long, because I thought many people (especially those who 
don't use compressed BlockCache) would prefer to keep heap usage minimal at 
the cost of slightly more frequent re-allocations.
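
To illustrate the idea only (this is not the code from the patches linked 
above), here is a minimal Java sketch of a per-thread, weakly referenced 
cache. It caches a plain byte[] rather than a DecompressorStream to stay 
self-contained; the class name WeakThreadLocalBuffer and the method get() are 
hypothetical.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper, not part of Hadoop or HBase.
final class WeakThreadLocalBuffer {
    // One weakly referenced buffer per thread. The GC may clear the
    // reference once no caller holds the buffer strongly anymore.
    private static final ThreadLocal<WeakReference<byte[]>> CACHE =
            new ThreadLocal<>();

    private WeakThreadLocalBuffer() {}

    // Returns this thread's cached buffer if it is still reachable and
    // at least minSize bytes long; otherwise allocates and caches a new one.
    static byte[] get(int minSize) {
        WeakReference<byte[]> ref = CACHE.get();
        byte[] buf = (ref != null) ? ref.get() : null;
        if (buf == null || buf.length < minSize) {
            buf = new byte[minSize];
            CACHE.set(new WeakReference<>(buf));
        }
        return buf;
    }
}
```

The sketch does not track whether the buffer is already in use, so it is only 
safe if a thread uses one buffer at a time. A real implementation would have 
to handle nested or overlapping streams on the same thread.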

What do you think? As this requires a change to hadoop-common, if you like 
this fix I will propose the Hadoop part of the change to the Hadoop community 
as a first step.

> org.apache.hadoop.io.compress.DecompressorStream allocates too much memory
> --------------------------------------------------------------------------
>
>                 Key: HBASE-15545
>                 URL: https://issues.apache.org/jira/browse/HBASE-15545
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Major
>         Attachments: image-2019-03-29-01-20-56-863.png
>
>
> It accounts for ~ 11% of overall memory allocation during compaction when 
> compression (GZ) is enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
