[ https://issues.apache.org/jira/browse/HBASE-22460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914480#comment-16914480 ]
stack commented on HBASE-22460:
-------------------------------
Nice writeup [~vjasani]
This one is a little odd because the problem seems fully local to the
RegionServer, so it would be sensible for the RS to take care of the issues it
has uncovered.
Technically #1 is the most expedient, but I favor #2. The Master has the
general cluster state, so it is best positioned to make remediation decisions.
It does region open and close in all cases (except in the cluster shutdown
scenario). Doing #1 violates the Master being in charge of assignment. Doing it
in the Master is also the more frugal choice: #1 requires every RS to run a
monitoring thread. Instead, we can run one task in the Master for the whole
cluster that looks at refcounts and then does the reopen, with no chance of the
Master being surprised by a self-ordained RS reopen.
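To make option #2 concrete, here is a rough sketch of the selection step such a Master-side task might perform. It is only an illustration under assumptions: the class name, the method name, the threshold value, and the map of region name to store reader refcount are all hypothetical, and nothing here uses the actual HBase API. A real task would take refcounts from whatever the RegionServers report and hand the selected regions to the Master's normal reopen path.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not HBase code: one cluster-wide task in the Master
// decides which regions look like they have leaked store reader references.
final class RefCountReopenSketch {

    // Assumed threshold; the issue only says "some ridiculous threshold".
    static final int ASSUMED_MAX_REF_COUNT = 1000;

    // Returns the regions whose reported refcount is strictly above the
    // threshold, logging a WARN line for each one.
    static List<String> regionsToReopen(Map<String, Integer> refCountByRegion,
                                        int threshold) {
        List<String> toReopen = new ArrayList<>();
        for (Map.Entry<String, Integer> e : refCountByRegion.entrySet()) {
            if (e.getValue() > threshold) {
                System.err.println("WARN: region " + e.getKey()
                    + " has store reader refcount " + e.getValue()
                    + " > " + threshold + ", scheduling reopen");
                toReopen.add(e.getKey());
            }
        }
        return toReopen;
    }

    public static void main(String[] args) {
        // Made-up refcounts standing in for what RegionServers would report.
        Map<String, Integer> reported = new LinkedHashMap<>();
        reported.put("regionA", 12);
        reported.put("regionB", 250000);
        List<String> picked = regionsToReopen(reported, ASSUMED_MAX_REF_COUNT);
        // A real task would now ask the Master's assignment code to reopen these.
        System.out.println("Regions to reopen: " + picked);
    }
}
```

Because the decision is made in one place, the Master is the only actor that initiates the reopen, which is the point of preferring #2 over a per-RS monitoring thread.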
> Reopen a region if store reader references may have leaked
> ----------------------------------------------------------
>
> Key: HBASE-22460
> URL: https://issues.apache.org/jira/browse/HBASE-22460
> Project: HBase
> Issue Type: Sub-task
> Reporter: Andrew Purtell
> Assignee: Viraj Jasani
> Priority: Minor
>
> We can leak store reader references if a coprocessor or core function somehow
> opens a scanner, or wraps one, and then does not take care to call close on
> the scanner or the wrapped instance. A reasonable mitigation for a reader
> reference leak would be a fast reopen of the region on the same server
> (initiated by the RS). This will release all resources, like the refcount,
> leases, etc. The clients should gracefully ride over this like any other
> region transition. This reopen would be like what is done during schema
> change application and ideally would reuse the relevant code. If the refcount
> is over some ridiculous threshold this mitigation could be triggered along
> with a fat WARN in the logs.