[ https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-4015:
-------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
Applied to TRUNK. We should consider doing a version of this on the branch.
J-D points out that it changes the HRegionInterface. Maybe if we put the
change at the end in the branch, we'll be able to do rolling restarts up to
0.90.5. I'll open a new issue to look into this.
Thanks for persevering with the patch, Ram.
> Refactor the TimeoutMonitor to make it less racy
> ------------------------------------------------
>
> Key: HBASE-4015
> URL: https://issues.apache.org/jira/browse/HBASE-4015
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 0.90.3
> Reporter: Jean-Daniel Cryans
> Assignee: ramkrishna.s.vasudevan
> Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: HBASE-4015_1_trunk.patch, HBASE-4015_2_trunk.patch,
> HBASE-4015_reprepared_trunk_2.patch, Timeoutmonitor with state diagrams.pdf
>
>
> The current implementation of the TimeoutMonitor acts like a race condition
> generator, mostly making things worse rather than better. It does its own
> thing for a while without caring about what's happening in the rest of the
> master.
> The first thing that needs to happen is that the regions should not be
> processed in one big batch, because that can sometimes take minutes
> (meanwhile, a region that timed out opening might have opened, and the
> TimeoutMonitor will then reassign it, producing the never-ending
> PENDING_OPEN situation).
> Those operations should also be done more atomically, although I'm not sure
> how to do it in a scalable way in this case.
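
For illustration only, the race described above can be sketched outside of HBase. This is not the committed patch and not HBase code: it is written in Python for brevity, and every name in it (RegionStates, reassign_if_still_timed_out, timeout_chore) is hypothetical. The idea it shows is to visit regions one at a time and to re-check each region's state under the same lock that guards the state change, so that a region which has opened in the meantime is never reassigned.

```python
import threading
from enum import Enum


class State(Enum):
    PENDING_OPEN = "PENDING_OPEN"
    OPEN = "OPEN"


class RegionStates:
    """Hypothetical in-memory map: region name -> (state, time state was set)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._regions = {}

    def set_state(self, region, state, now):
        with self._lock:
            self._regions[region] = (state, now)

    def get_state(self, region):
        with self._lock:
            return self._regions[region][0]

    def names(self):
        with self._lock:
            return list(self._regions)

    def reassign_if_still_timed_out(self, region, timeout, now, reassign):
        """Check and act under one lock, so a concurrent OPEN cannot slip in
        between the timeout check and the reassignment."""
        with self._lock:
            state, since = self._regions[region]
            if state is State.PENDING_OPEN and now - since >= timeout:
                # Restart the clock so the next pass does not reassign again at once.
                self._regions[region] = (State.PENDING_OPEN, now)
                # Assumption: reassign is cheap. A real master would likely
                # queue the work rather than run it while holding the lock.
                reassign(region)
                return True
            return False


def timeout_chore(states, timeout, now, reassign):
    """Visit regions one at a time instead of acting on a stale batch snapshot."""
    return [name for name in states.names()
            if states.reassign_if_still_timed_out(name, timeout, now, reassign)]
```

A short usage example under the same assumptions: r2 opens before the chore runs, so only r1 is handed to the reassign callback.

```python
states = RegionStates()
states.set_state("r1", State.PENDING_OPEN, now=0.0)
states.set_state("r2", State.PENDING_OPEN, now=0.0)
states.set_state("r2", State.OPEN, now=5.0)  # r2 opened in the meantime

reassigned = []
timeout_chore(states, timeout=10.0, now=20.0, reassign=reassigned.append)
print(reassigned)
```

A single lock is the simplest way to make the check and the action atomic, but it does not address the scalability concern raised in the description; per-region locking would be one possible direction.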
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira