[ https://issues.apache.org/jira/browse/HBASE-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609876#comment-13609876 ]
ramkrishna.s.vasudevan commented on HBASE-8144:
-----------------------------------------------
Should we synchronize on failedOpenTracker where we update and remove entries
from this ConcurrentHashMap? Overall, the patch looks very good.
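To make the synchronization question concrete, here is a minimal sketch of a per-region failure counter. It is not the code from trunk-8144.patch: the class name, the method names, and the String key are hypothetical, and only the field name failedOpenTracker comes from the comment above. It assumes Java 8+, where ConcurrentHashMap.merge performs the read-modify-write atomically per key, so the increment needs no external lock. A separate get() followed by put() would not be atomic and would need synchronization or a different structure.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the tracker discussed above; not taken from the patch.
final class FailedOpenTrackerSketch {
    // Encoded region name -> number of failed open attempts seen so far.
    private final ConcurrentMap<String, Integer> failedOpenTracker =
        new ConcurrentHashMap<>();
    private final int maximumAttempts;

    FailedOpenTrackerSketch(int maximumAttempts) {
        this.maximumAttempts = maximumAttempts;
    }

    /** Records one failed open and returns true once the limit is reached. */
    boolean recordFailedOpen(String encodedRegionName) {
        // merge() is atomic for a given key on ConcurrentHashMap, so two
        // threads reporting a failure at the same time cannot lose an update.
        int count = failedOpenTracker.merge(encodedRegionName, 1, Integer::sum);
        return count >= maximumAttempts;
    }

    /** Clears the counter, for example after the region opens successfully. */
    void clear(String encodedRegionName) {
        failedOpenTracker.remove(encodedRegionName);
    }

    /** Returns the current failure count, or 0 if none is recorded. */
    int failures(String encodedRegionName) {
        return failedOpenTracker.getOrDefault(encodedRegionName, 0);
    }
}
```

Each call is atomic on its own, but a sequence of calls (for example failures() followed by clear()) is not, which is the case where explicit synchronization would still matter.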
[~jxiang]
Actually, I saw this scenario recently in 0.94: Lars had shared some logs with
me where region opening kept failing because the compression codec could not
be found when the RS tried to open the region.
So this change will at least avoid the continuous bouncing of the assignment
between master and RS.
HBASE-8049 is to do that. After this patch, I think we can make that issue
work like this:
In case of FAILED_OPEN, can we add the exception message or the reason why the
open failed to the znode, so that once retrying is complete we can use that
info to tell the user about the problem?
Let me take that up further in that JIRA.
Coming back to this JIRA:
Once these retries are exhausted, how do we reassign the region again? Just
in case.
> Limit number of attempts to assign a region
> -------------------------------------------
>
> Key: HBASE-8144
> URL: https://issues.apache.org/jira/browse/HBASE-8144
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Fix For: 0.95.0, 0.98.0
>
> Attachments: trunk-8144.patch
>
>
> When sending a region open request to a region server, we make sure we try at
> most a configured number of times. However, once the request is accepted by the
> region server, the region could go through this transition forever:
> failed_open (in ZK) => closed => opening => failed_open (in ZK), assuming no
> RPC/network issue.
> It would be good to break the loop, limit the number of tries, and move the
> region to the failed_open state (to be introduced in HBASE-8137).
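The loop described in the issue can be illustrated with a small, self-contained sketch. It only models the state transitions and is not how the AssignmentManager is implemented. The State enum, the assign method, and the tryOpen callback are hypothetical names introduced for this example, and the ZK step is folded into the CLOSED transition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of the transition loop; not HBase code.
final class BoundedAssignSketch {
    enum State { CLOSED, OPENING, OPEN, FAILED_OPEN }

    /**
     * Tries to open a region at most maximumAttempts times and returns the
     * sequence of states it went through. tryOpen stands in for the region
     * server's open attempt and returns true on success.
     */
    static List<State> assign(BooleanSupplier tryOpen, int maximumAttempts) {
        List<State> transitions = new ArrayList<>();
        for (int attempt = 1; attempt <= maximumAttempts; attempt++) {
            transitions.add(State.OPENING);
            if (tryOpen.getAsBoolean()) {
                transitions.add(State.OPEN);
                return transitions;
            }
            if (attempt < maximumAttempts) {
                // Without a limit, this closed => opening cycle repeats forever.
                transitions.add(State.CLOSED);
            }
        }
        // Limit reached: stop retrying and park the region in a terminal state.
        transitions.add(State.FAILED_OPEN);
        return transitions;
    }
}
```

With a callback that always fails and a limit of 2, the returned sequence is OPENING, CLOSED, OPENING, FAILED_OPEN. Without the limit, the region would keep cycling between CLOSED and OPENING.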