[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263140#comment-13263140
 ] 

nkeywal commented on HBASE-5877:
--------------------------------

bq. This patch will benefit any move, not just rolling restart, right?
Yes, but since the client waits between two tries, I think the benefit for a 
single client will be minimal compared with the wait time. I could add a 
heuristic: if the region was closed more than 2 seconds ago, consider it now 
available on the new server and don't sleep before the next retry. That could 
lead to more network messages if the rule is wrong (and the rule will be wrong 
when the system is overloaded), and it would add some complexity to the client 
code. Having the real status of the region would solve this. 
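
To make the heuristic above concrete, here is a minimal sketch. It is not part 
of the patch: the class name, the method name and the idea that the client 
knows the close time are all assumptions made for illustration. Only the 
2-second threshold comes from the text above.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not in the patch): decides how long the client pauses
 * before retrying a query on the server the region was moved to.
 */
final class MovedRegionRetryPolicy {
    // Assumed threshold, taken from the "2 seconds" figure in the comment.
    static final long ASSUMED_REOPEN_MILLIS = TimeUnit.SECONDS.toMillis(2);

    // The pause the client would normally apply between two tries.
    private final long normalPauseMillis;

    MovedRegionRetryPolicy(long normalPauseMillis) {
        if (normalPauseMillis < 0) {
            throw new IllegalArgumentException("pause must be >= 0");
        }
        this.normalPauseMillis = normalPauseMillis;
    }

    /**
     * @param closedAtMillis time at which the old server closed the region
     * @param nowMillis      current time, read from the same clock
     * @return 0 if the region was closed more than 2 seconds ago,
     *         otherwise the normal pause
     */
    long pauseBeforeRetryMillis(long closedAtMillis, long nowMillis) {
        long elapsed = nowMillis - closedAtMillis;
        // This is only a guess that the region is open on the destination.
        // On an overloaded cluster the guess is wrong and costs an extra
        // network round trip.
        return elapsed > ASSUMED_REOPEN_MILLIS ? 0L : normalPauseMillis;
    }
}
```

With a normal pause of 1000 ms, a region closed 2001 ms ago is retried at 
once, while one closed 500 ms ago still waits the full pause.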

Anyway, given the work already done to cut the link between master and 
clients, this patch can help save a reconnect to the master. And during a 
rolling restart, with regions moving everywhere, I think it will make a real 
difference.


bq. I don't see changes to make use of this new functionality? I'd expect the 
balancer in master to make use of it?
Yes, that is the change in AssignmentManager. It is in the patch, but it ends 
up being quite small. Basically:
{noformat}
-    unassign(plan.getRegionInfo());
+    unassign(plan.getRegionInfo(), false, plan.getDestination());
{noformat}

I still need to handle the case where the destination is not specified at the 
start.

                
> When a query fails because the region has moved, let the regionserver return 
> the new address to the client
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5877
>                 URL: https://issues.apache.org/jira/browse/HBASE-5877
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, master, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: 5877.v1.patch
>
>
> This is mainly useful when we do a rolling restart. This will decrease the 
> load on the master and the network load.
> Note that a region is not immediately opened after a close. So:
> - it seems preferable to wait before retrying on the other server. An 
> optimisation would be to have a heuristic depending on when the region was 
> closed.
> - during a rolling restart, the server moves the regions then stops. So we 
> may have failures when the server is stopped, and this patch won't help.
> The implementation in the first patch does:
> - on the region move, there is an added parameter on the regionserver#close 
> to say where we are sending the region
> - the regionserver keeps a list of what was moved. Each entry is kept 100 
> seconds.
> - the regionserver sends a specific exception when it receives a query on a 
> moved region. This exception contains the new address.
> - the client analyses the exceptions and updates its cache accordingly...
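
The bookkeeping listed in the description above could be sketched roughly as 
follows. This is an illustration under assumptions, not the code in 
5877.v1.patch: every class, field and method name here is hypothetical, 
regions and servers are reduced to plain strings, and dropping the cached 
location on any other failure is an assumed policy. Only the 100-second entry 
lifetime and the exception carrying the new address come from the description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical regionserver-side list of regions that were moved away. */
final class MovedRegions {
    // Entry lifetime stated in the description: 100 seconds.
    static final long ENTRY_TTL_MILLIS = 100_000L;

    /** Where a region went and when it left this server. */
    static final class Entry {
        final String destination;   // assumed "host:port" form
        final long movedAtMillis;

        Entry(String destination, long movedAtMillis) {
            this.destination = destination;
            this.movedAtMillis = movedAtMillis;
        }
    }

    private final Map<String, Entry> moved = new ConcurrentHashMap<>();

    /** Called on close, when the destination server is known. */
    void recordMove(String regionName, String destination, long nowMillis) {
        moved.put(regionName, new Entry(destination, nowMillis));
    }

    /** Returns the destination, or null if there is no live entry. */
    String destinationOf(String regionName, long nowMillis) {
        Entry e = moved.get(regionName);
        if (e == null) {
            return null;
        }
        if (nowMillis - e.movedAtMillis > ENTRY_TTL_MILLIS) {
            moved.remove(regionName, e);   // drop the expired entry
            return null;
        }
        return e.destination;
    }
}

/** Hypothetical exception that carries the new address to the client. */
final class RegionMovedSketchException extends Exception {
    final String newAddress;

    RegionMovedSketchException(String regionName, String newAddress) {
        super("Region " + regionName + " moved to " + newAddress);
        this.newAddress = newAddress;
    }
}

/** Hypothetical client-side location cache updated from failures. */
final class ClientLocationCache {
    private final Map<String, String> locations = new ConcurrentHashMap<>();

    void onFailure(String regionName, Exception failure) {
        if (failure instanceof RegionMovedSketchException) {
            // The server told us where the region went: update in place.
            locations.put(regionName,
                ((RegionMovedSketchException) failure).newAddress);
        } else {
            // Assumed policy: any other failure drops the cached location.
            locations.remove(regionName);
        }
    }

    String locationOf(String regionName) {
        return locations.get(regionName);
    }
}
```

Passing the current time as a parameter keeps the expiry logic easy to test. 
A real server would more likely read its own clock.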

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
