[ https://issues.apache.org/jira/browse/HBASE-19435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279504#comment-16279504 ]
Zach York commented on HBASE-19435:
-----------------------------------
[~tedyu] I think that scenario would be a bad error... What would be causing
these connections to get interrupted/closed frequently? If there is an HBase
thread that keeps interrupting that connection, we should fix the error there.
Currently, I believe we are seeing this in some cases where a compaction
interrupts the connection, but we haven't isolated the specific process.
What do you propose? I guess I could implement some sort of maximum number of
retries before disabling the cache again, with the count reset on a successful
access, but this would be fairly fragile (what is the right number?).
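
For illustration only, the max-retries idea could look roughly like the sketch below. FailureBudget and its methods are hypothetical names, not existing HBase classes, and the threshold is an assumed tunable (choosing it is exactly the open question above).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of HBase): tracks consecutive failed cache accesses. */
final class FailureBudget {
    // Assumed tunable: how many consecutive failures are tolerated before disabling.
    private final int maxConsecutiveFailures;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    FailureBudget(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Call after a successful access; resets the failure count to zero. */
    void recordSuccess() {
        consecutiveFailures.set(0);
    }

    /**
     * Call after a failed access.
     * Returns true once the number of consecutive failures exceeds the budget,
     * i.e. when the caller should give up and disable the cache.
     */
    boolean recordFailure() {
        return consecutiveFailures.incrementAndGet() > maxConsecutiveFailures;
    }
}
```

With a budget of 2, the first two consecutive failures would be tolerated and the third would signal that the cache should be disabled; any success in between starts the count over.
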
I think in addition to the proposed fix, we should look at trying to re-enable
disabled caches after a period (if disabled due to an error). However, that
wouldn't invalidate this change since even if the cache is re-enabled, unless
the connections are refreshed, it will just get disabled again.
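
To make "refreshing the connection" concrete, here is a minimal sketch of reopening a file channel when a read hits a ClosedChannelException. It is not the attached patch and not the actual FileIOEngine code: ReopeningFileReader and closeChannelForDemo are hypothetical names, and it assumes a single read-only file and one retry per read.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch (not HBase code): a positional reader that reopens its channel once. */
final class ReopeningFileReader {
    private final Path path;
    private FileChannel channel;

    ReopeningFileReader(Path path) throws IOException {
        this.path = path;
        this.channel = open();
    }

    private FileChannel open() throws IOException {
        return FileChannel.open(path, StandardOpenOption.READ);
    }

    /**
     * Reads into dst starting at the given file offset. If the channel was closed
     * (ClosedByInterruptException is a subclass of ClosedChannelException), the file
     * is reopened and the read is retried once; a second failure propagates.
     */
    synchronized int read(ByteBuffer dst, long offset) throws IOException {
        try {
            return channel.read(dst, offset);
        } catch (ClosedChannelException e) {
            channel = open();
            return channel.read(dst, offset);
        }
    }

    /** Demo-only hook that simulates the channel being closed underneath the reader. */
    synchronized void closeChannelForDemo() throws IOException {
        channel.close();
    }
}
```

One caveat with this approach: if the calling thread's interrupt flag is still set, the retried read will close the new channel again, so the source of the interrupts still needs to be tracked down separately.
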
> Reopen Files for ClosedChannelException in BucketCache
> ------------------------------------------------------
>
> Key: HBASE-19435
> URL: https://issues.apache.org/jira/browse/HBASE-19435
> Project: HBase
> Issue Type: Bug
> Components: BucketCache
> Affects Versions: 2.0.0, 1.3.1
> Reporter: Zach York
> Assignee: Zach York
> Attachments: HBASE-19435.master.001.patch
>
>
> When using the FileIOEngine for BucketCache, the cache will be disabled if
> the connection is interrupted or closed. HBase will then get
> ClosedChannelExceptions trying to access the file. After 60s, the RS will
> disable the cache. This causes severe read performance degradation for
> workloads that rely on this cache. FileIOEngine never tries to reopen the
> connection. This JIRA is to reopen files when the BucketCache encounters a
> ClosedChannelException.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)