[
https://issues.apache.org/jira/browse/HBASE-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723861#action_12723861
]
Jim Kellerman commented on HBASE-1583:
--------------------------------------
@stack:
> startup is a mess with our assigning out regions and rebalancing at the same
> time. By the time the compactions on open run, it can be near an hour before
> the whole thing settles down and becomes usable
Safe mode was meant to prevent rebalancing during startup. Are we not using
safe mode anymore?
@ryan:
> The master needs to blast region assignments out there in a
> threaded/performant manner
What I was planning to do for 0.21, when I put region assignments in ZK, was to
make a znode whose children are the unassigned regions. A region server can
then decide whether it is too busy. If it is not, it removes the unassigned
region from the list and, once the region is available, adds it to its own
list of regions being served.
This removes the master from region assignment. It would then only need to
detect unassigned regions.
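A rough sketch of that pull model, for illustration only. It uses an in-memory
set as a stand-in for the znode's children rather than the real ZooKeeper
client, and every name in it (UnassignedRegionPool, RegionServerSketch,
maxRegions as the "too busy" threshold) is hypothetical, not existing HBase
code. The atomic remove plays the role of a znode delete that can succeed for
only one region server.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for the znode whose children are unassigned regions. */
class UnassignedRegionPool {
    private final Set<String> unassigned = ConcurrentHashMap.newKeySet();

    /** The master's only job: publish a region it has detected as unassigned. */
    void publish(String regionName) {
        unassigned.add(regionName);
    }

    /** Atomic remove; returns true for exactly one caller per region. */
    boolean tryClaim(String regionName) {
        return unassigned.remove(regionName);
    }

    /** Point-in-time copy of the unassigned list. */
    Set<String> snapshot() {
        return new HashSet<>(unassigned);
    }
}

/** Hypothetical region server that pulls regions instead of being assigned them. */
class RegionServerSketch {
    private final int maxRegions; // assumed "too busy" threshold
    private final Set<String> serving = ConcurrentHashMap.newKeySet();

    RegionServerSketch(int maxRegions) {
        this.maxRegions = maxRegions;
    }

    boolean isTooBusy() {
        return serving.size() >= maxRegions;
    }

    /** Claims regions until too busy or none are left; returns how many were claimed. */
    int claimFrom(UnassignedRegionPool pool) {
        int claimed = 0;
        for (String region : pool.snapshot()) {
            if (isTooBusy()) {
                break;
            }
            // Only the server whose remove succeeds owns the region.
            if (pool.tryClaim(region)) {
                // A real server would open the region here and add it to the
                // served list only once it is available.
                serving.add(region);
                claimed++;
            }
        }
        return claimed;
    }

    Set<String> serving() {
        return new HashSet<>(serving);
    }
}
```

With three published regions, a server capped at two regions would take two of
them and a second server would pick up the remaining one, with no master
involvement in the hand-off.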
> Start/Stop of large cluster untenable
> -------------------------------------
>
> Key: HBASE-1583
> URL: https://issues.apache.org/jira/browse/HBASE-1583
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.20.0
>
>
> Starting and stopping a loaded large cluster is way too flaky and takes too
> long. This is 0.19.x but the same issues apply to TRUNK I'd say.
> At pset with our > 100 nodes carrying 6k regions:
> + shutdown takes way too long.... maybe ten minutes or so. We compact
> regions inline with shutdown. We should just go down. It doesn't seem like
> all regionservers go down every time either.
> + startup is a mess with our assigning out regions and rebalancing at the same
> time. By the time the compactions on open run, it can be near an hour
> before the whole thing settles down and becomes usable