[
https://issues.apache.org/jira/browse/HBASE-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923784#comment-15923784
]
Duo Zhang commented on HBASE-17747:
-----------------------------------
OK, after reviewing the code of BucketCache, now I'm +1 on changing the
reference type of IdReadWriteLock which is used in BucketCache from weak to
soft.
The offsetLock here locks offsets in the bucket cache, not in the hfile, so for a
stable workload (the typical scenario in the real world), the buckets in
BucketCache will also become stable after some time, which means we will likely
use the same set of offsets to acquire locks from offsetLock even if we keep
evicting entries. Using strong references may be better, but it would introduce
more complex logic as we would need to clear the unused offsets ourselves, so I
think soft references are good here.
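To make the trade-off concrete, here is a minimal sketch of an offset-keyed lock pool whose reference type can be switched between weak and soft. It is not the HBase implementation: the names OffsetLockPool, RefType, and getLock are hypothetical, and the purge strategy is an assumption chosen for brevity.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch only; not the actual IdReadWriteLock or ObjectPool code. */
class OffsetLockPool {
  enum RefType { WEAK, SOFT }

  /** Lets a cleared reference tell us which map entry it belonged to. */
  private interface Keyed { long key(); }

  private static final class SoftEntry extends SoftReference<ReentrantReadWriteLock>
      implements Keyed {
    private final long key;
    SoftEntry(long key, ReentrantReadWriteLock lock, ReferenceQueue<ReentrantReadWriteLock> q) {
      super(lock, q);
      this.key = key;
    }
    public long key() { return key; }
  }

  private static final class WeakEntry extends WeakReference<ReentrantReadWriteLock>
      implements Keyed {
    private final long key;
    WeakEntry(long key, ReentrantReadWriteLock lock, ReferenceQueue<ReentrantReadWriteLock> q) {
      super(lock, q);
      this.key = key;
    }
    public long key() { return key; }
  }

  private final ConcurrentHashMap<Long, Reference<ReentrantReadWriteLock>> pool =
      new ConcurrentHashMap<>();
  private final ReferenceQueue<ReentrantReadWriteLock> queue = new ReferenceQueue<>();
  private final RefType type;

  OffsetLockPool(RefType type) { this.type = type; }

  /** Returns the lock for the given cache offset, creating it if absent or already cleared. */
  ReentrantReadWriteLock getLock(long offset) {
    purge();
    while (true) {
      Reference<ReentrantReadWriteLock> ref = pool.get(offset);
      ReentrantReadWriteLock lock = (ref == null) ? null : ref.get();
      if (lock != null) {
        return lock; // the pooled lock is still reachable, so reuse it
      }
      ReentrantReadWriteLock fresh = new ReentrantReadWriteLock();
      Reference<ReentrantReadWriteLock> freshRef = (type == RefType.SOFT)
          ? new SoftEntry(offset, fresh, queue)
          : new WeakEntry(offset, fresh, queue);
      // Install atomically; if another thread got there first, loop and reread.
      boolean installed = (ref == null)
          ? pool.putIfAbsent(offset, freshRef) == null
          : pool.replace(offset, ref, freshRef);
      if (installed) {
        return fresh;
      }
    }
  }

  /** Drops map entries whose locks the garbage collector has already cleared. */
  private void purge() {
    Reference<? extends ReentrantReadWriteLock> cleared;
    while ((cleared = queue.poll()) != null) {
      pool.remove(((Keyed) cleared).key(), cleared);
    }
  }

  int size() { return pool.size(); }
}
```

With RefType.WEAK, a lock that no thread currently holds can be cleared at the next GC cycle and must be recreated on the next access to that offset. With RefType.SOFT, the JVM normally keeps it until memory runs short, so a stable set of offsets keeps reusing the same lock objects.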
[~carp84] Mind adding the comments above to the implementations?
Thanks.
> Support both weak and soft object pool
> --------------------------------------
>
> Key: HBASE-17747
> URL: https://issues.apache.org/jira/browse/HBASE-17747
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0
> Reporter: Yu Li
> Assignee: Yu Li
> Fix For: 2.0
>
> Attachments: HBASE-17747.patch, HBASE-17747.v2.patch,
> HBASE-17747.v3.patch
>
>
> During YCSB testing in embedded mode after HBASE-17744, we found that under
> high read load GC is quite severe even with the offheap L2 cache. After some
> investigation, we found it's caused by the use of weak references in
> {{IdReadWriteLock}}. In embedded mode reads are so quick that the lock
> might already have been promoted to the old generation when the weak
> reference is cleared, which dirties the card table (the old reference gets
> removed and a new lock object is set into {{referenceCache}}, see
> {{WeakObjectPool#get}}) and thus slows YGC. In distributed mode, too, more
> lock objects are created with weak references than with soft references,
> which slows down processing.
> So we propose to use soft references for the {{IdReadWriteLock}} used in the
> cache, which won't get cleared until JVM memory runs short, and could
> resolve the issue mentioned above. What's more, we propose to extend
> {{WeakObjectPool}} to be more generic so that it supports both weak and soft
> references.
> Note that the GC issue only emerges in embedded mode with DirectOperator,
> in which case all costs on the wire are removed, thus producing extremely
> high concurrency.