[
https://issues.apache.org/jira/browse/HBASE-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Kyle Purtell resolved HBASE-8281.
----------------------------------------
Resolution: Incomplete
> Unassigned regions: dropped messages from Master to RS
> ------------------------------------------------------
>
> Key: HBASE-8281
> URL: https://issues.apache.org/jira/browse/HBASE-8281
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.89-fb
> Reporter: Amitanand Aiyer
> Priority: Major
>
> We have seen a couple of scenarios where a transient network issue between the
> RS and Master results in regions being unassigned (and staying unassigned)
> until someone intervenes manually with hbck -fix.
> The events occur as follows:
> 1. The RS checks in with a regionServerReport.
> 2. The Master wants to assign a region to the RS, so it adds a MSG_REGION_OPEN
>    msg to the return results and marks the region as PENDING_OPEN.
> 3. The messages from the Master to the RS are not delivered due to a network
>    error. The Master does nothing to revert the state change.
> 4. The network heals, and the RS is able to do regionServerReports again; it is
>    in good standing with the Master. But the RS does not know that it has to
>    open the region, while the Master thinks the RS is going to open it.
> 5. The region remains unassigned until we intervene with hbck.
> Possible fix:
> I think it may be a mistake to unilaterally change the RegionState to
> PENDING_OPEN as soon as the master decides that it wants to send the message.
> Perhaps we should create an intermediate state, in which the master keeps
> sending the OPEN message to the RS until it acks, and update the RegionState
> to PENDING_OPEN only after the RS has acked.
> While this would fix the particular scenario that caused the unassigned
> regions, we might want to update all Master-RS communication (even
> region closes?) to expect message failures, and wait for an ack before
> updating the state in the master.
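
The failure sequence and the proposed fix described above can be sketched as a
small simulation. This is a toy model, not HBase code: the Master class, the
OPEN_REQUESTED intermediate state, the region_server_report() method, and the
idea that acks piggyback on the next report are all assumptions made for
illustration. Only MSG_REGION_OPEN and PENDING_OPEN come from the description.

```python
from enum import Enum, auto


class RegionState(Enum):
    OFFLINE = auto()
    OPEN_REQUESTED = auto()  # hypothetical intermediate state from the proposal
    PENDING_OPEN = auto()


class Master:
    """Toy master; wait_for_ack=False models the behavior reported in the issue."""

    def __init__(self, wait_for_ack):
        self.wait_for_ack = wait_for_ack
        self.state = {}   # region name -> RegionState
        self.target = {}  # region name -> region server name

    def plan_assignment(self, region, rs):
        self.state[region] = RegionState.OFFLINE
        self.target[region] = rs

    def region_server_report(self, rs, acked=()):
        # Assumption: the RS acks OPEN messages in its next report.
        for region in acked:
            if (self.target.get(region) == rs
                    and self.state.get(region) is RegionState.OPEN_REQUESTED):
                self.state[region] = RegionState.PENDING_OPEN
        reply = []
        for region, owner in self.target.items():
            if owner != rs:
                continue
            if self.state[region] is RegionState.OFFLINE:
                reply.append(("MSG_REGION_OPEN", region))
                # Without acks the state changes as soon as the message is queued.
                self.state[region] = (RegionState.OPEN_REQUESTED
                                      if self.wait_for_ack
                                      else RegionState.PENDING_OPEN)
            elif self.state[region] is RegionState.OPEN_REQUESTED:
                # Not acked yet: resend. The RS must tolerate duplicate opens.
                reply.append(("MSG_REGION_OPEN", region))
        return reply


def simulate(wait_for_ack, drop_first_reply=True, reports=3):
    """Return (master's view of the region, whether the RS ever saw the OPEN)."""
    master = Master(wait_for_ack)
    master.plan_assignment("region-1", "rs-1")
    opened_on_rs = set()
    acks = []
    for i in range(reports):
        reply = master.region_server_report("rs-1", acks)
        acks = []
        if drop_first_reply and i == 0:
            continue  # transient network error: the reply never reaches the RS
        for _msg, region in reply:
            opened_on_rs.add(region)  # a set keeps duplicate opens harmless
            acks.append(region)
    return master.state["region-1"], "region-1" in opened_on_rs


if __name__ == "__main__":
    print(simulate(wait_for_ack=False))
    print(simulate(wait_for_ack=True))
```

With wait_for_ack=False and the first reply dropped, the master ends in
PENDING_OPEN while the RS never receives the OPEN, which matches the stuck
region described in the issue. With wait_for_ack=True the master resends on the
next report and moves to PENDING_OPEN only after the ack arrives. The sketch
models loss of the master's reply only; lost reports and RS restarts would need
additional handling.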
--
This message was sent by Atlassian Jira
(v8.20.7#820007)