[ https://issues.apache.org/jira/browse/HBASE-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891010#action_12891010 ]

Jonathan Gray commented on HBASE-2692:
--------------------------------------

bq. I'd be interested to see your thoughts about determining the minimum number 
of RSs.
bq. How long are you going to wait? What's the trigger that has you start 
assigning?
This is a configuration parameter that already exists.  We could also add some 
kind of timeout (once one RS checks in, we wait N seconds before starting up).
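
Roughly, the startup wait could look like the sketch below.  The ServerManager 
interface and the loop shape are hypothetical, just to illustrate the two 
knobs (a minimum RS count plus a grace period after the first check-in):

{code:java}
// Minimal sketch, not the actual implementation.  ServerManager here
// is a stand-in for whatever tracks checked-in regionservers.
public class StartupWait {
  interface ServerManager { int numServers(); }

  static void waitForRegionServers(ServerManager sm, int minToStart,
      long graceMs) throws InterruptedException {
    // Block until the first RS checks in.
    while (sm.numServers() < 1) Thread.sleep(100);
    // Then give stragglers up to graceMs more, proceeding early
    // once the configured minimum has checked in.
    long deadline = System.currentTimeMillis() + graceMs;
    while (sm.numServers() < minToStart
        && System.currentTimeMillis() < deadline) {
      Thread.sleep(100);
    }
  }
}
{code}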

bq. Also how do you handle the assignment of ROOT and META? What's the chance 
that there's always enough RSs that reported in once both catalog regions are 
assigned?
As I said, this doc leaves out catalog tables for now.  One thing to remember: 
startup no longer needs to wait around for the meta scanner to kick in, and 
region assignment should be significantly faster, so assigning them out could 
potentially be sub-second.

bq. So once the load balancer is started, do we block regions going in 
transition?
Well, sometimes you can't.  The idea is that load balancing runs when the 
cluster is at steady state.
bq. [if regions in transition] Doesn't it just go back to sleep?
It could.  That might be the simpler approach, except that with a long period 
you'd potentially skip a load balance because of a single region in transition.

The load balancer will probably be the last thing I add; there are still some 
details to flesh out.  I think it makes sense not to kick off a load balance 
when there are already regions in transition (it works best as a background 
operation when the cluster is mostly at steady state from a region-movement 
standpoint).  Whether it waits or skips the run is an open question; I'd opt 
for waiting.  The next run of the load balancer would happen at the fixed 
period, maybe measured from the end of the previous run (rather than strictly 
periodic).  Something like the sketch below.
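
For illustration only (Cluster and regionsInTransition() are stand-ins, not 
real API; the point is the wait-on-RIT behavior and end-to-end scheduling):

{code:java}
// Sketch of the balancer trigger, not the actual implementation.
public class BalancerChore implements Runnable {
  interface Cluster {
    int regionsInTransition();
    void balance();
  }

  private final Cluster cluster;
  private final long periodMs;

  BalancerChore(Cluster cluster, long periodMs) {
    this.cluster = cluster;
    this.periodMs = periodMs;
  }

  public void run() {
    try {
      while (true) {
        // Wait rather than skip: a single region in transition
        // shouldn't cost us an entire balancing period.
        while (cluster.regionsInTransition() > 0) {
          Thread.sleep(1000);
        }
        cluster.balance();
        // The period is measured from the END of this run, so runs
        // never overlap and never fire mid-rebalance.
        Thread.sleep(periodMs);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}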

bq. What happens if a region splits while load balancing is running?
Good question.  I will try to address splits in the next revision.  They are 
largely ignored in this one, as is meta editing in general.  Let me take a 
crack at that.

bq. "We assume that the Master will not fail until after the OFFLINE nodes have 
been created in ZK"   And if it does? Is that for later?
Later, perhaps.  It doesn't seem unreasonable to assume the Master won't fail 
during the first minute of the initial cluster startup, but I suppose it could 
eventually be handled.  In any case, I think this is okay for now.

bq. Master sends RPCs to each RS, telling them to OPEN their regions.  This is 
a new message?
Yup.  There will be new OPEN and CLOSE RPCs to the RS.  In one of my earlier 
branches for this I moved the cluster admin methods to be direct RPCs to the 
RS as well, but that's outside the scope of this for now.
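
For a rough idea of the shape (the interface name and exact signatures here 
are illustrative, not the final API):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.HRegionInfo;

// Illustrative only.  The master would invoke these directly on
// the regionserver rather than going through .META. writes.
public interface RegionServerAdmin {
  /** Ask the RS to open the given region.  The RS moves the region's
   *  ZK node through OPENING to OPENED as it goes. */
  void openRegion(HRegionInfo region) throws IOException;

  /** Ask the RS to close the given region. */
  void closeRegion(HRegionInfo region) throws IOException;
}
{code}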

bq. Who updated .META.? The RS, right?
That is the plan.  I will address META editing and splits in the next revision 
of the doc.

bq. [load balancing] This has to be better than what we currently have?
Yes, this should be better (it would be hard to be worse).  It's also feasible 
to use ZK to make a synchronous client call (and we could look at the status 
of nodes in ZK to see the state of regions in transition).
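
To sketch what that synchronous call might look like (the znode path 
convention and the assumption that the transition node is deleted once the 
region is fully opened are both illustrative):

{code:java}
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Sketch: block until a region's transition znode goes away.
public class RegionTransitionWait {
  static void waitUntilDeleted(ZooKeeper zk, String path)
      throws Exception {
    while (true) {
      final CountDownLatch changed = new CountDownLatch(1);
      Watcher w = new Watcher() {
        public void process(WatchedEvent e) { changed.countDown(); }
      };
      // exists() both checks the node and sets a one-shot watch.
      if (zk.exists(path, w) == null) return;  // node gone; done
      changed.await();  // woken on any change to the node
    }
  }
}
{code}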

bq. "Master determines which regions need to be handled."  How?
In the doc I describe assignment information sitting in-memory in the master, 
but I'm not positive we should rely on it, since a failed-over master may not 
have it yet.  I also left out how a failed-over master rebuilds the 
assignments in-memory.  The latter would be done with the meta scanner; the 
former could be as well.

bq. "Master sends RPCs to RSs to open all the regions."  Whats this message 
look like? Its the new one mentioned above?  Will it take time figuring what 
was on the dead RS? Or will it be near-instantaneous?
If we have up-to-date in-memory state to work with, it's near-instantaneous.  
Meta scanning, not so much.  It'd be nice to use the in-memory info when it is 
available, and if it's not complete (we're a failed-over master that hasn't 
fully recovered yet) we'd fall back to meta scanning.
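
In sketch form (the RegionAssignments interface and both lookups are 
hypothetical stand-ins for master-side state):

{code:java}
import java.util.List;

// Hypothetical sketch of the fast-path / slow-path choice on RS death.
public class DeadServerHandler {
  interface RegionAssignments {
    boolean isFullyRecovered();               // false right after failover
    List<String> regionsOn(String server);    // in-memory, fast
    List<String> scanMetaFor(String server);  // .META. scan, slow
  }

  static List<String> regionsToReassign(RegionAssignments a, String deadRS) {
    if (a.isFullyRecovered()) {
      return a.regionsOn(deadRS);   // near-instantaneous
    }
    // Failed-over master that hasn't rebuilt state yet: fall back
    // to scanning .META. for rows pointing at the dead server.
    return a.scanMetaFor(deadRS);
  }
}
{code}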

bq. I don't see mention of waiting on RS WAL log splitting in here. Should it 
be mentioned?
Yup.  Good call.


Thanks for reviewing it, guys.  Not sure I'll have time this week/weekend to 
do a refresh, but I should have a revised one up sometime Monday.

> Master rewrite and cleanup for 0.90
> -----------------------------------
>
>                 Key: HBASE-2692
>                 URL: https://issues.apache.org/jira/browse/HBASE-2692
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.90.0
>
>         Attachments: BRANCH_NOTES.txt
>
>
> This is the parent issue for master changes targeted at 0.90 release.
> Changes done as part of this issue grew out of work done over in HBASE-2485 
> to move region transitions into ZK.
> In addition to that work, this issue will include general HMaster and 
> ZooKeeper refactorings and cleanups.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
