[
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267751#comment-13267751
]
nkeywal commented on HBASE-5877:
--------------------------------
v12, should be final.
1) ServerName is used everywhere in the interface, thanks to protobuf
2) hadoop.ipc serializes exceptions based on #getMessage, so we have to parse
the message internally. This is not visible to the user of the exception.
3) The error-handling code in the client package is quite complex. We have
the exception at the very beginning, and then it's checked again, but by then
we don't have the real exception anymore. I used a new "historyList" to make
it work. There is another JIRA for other improvements (HBASE-5924), in which
I could get rid of this.
4) Generated with protobuf 2.4.1
5) The destination in the closeRegion interface is a kind of interface
hijacking. Other options would be:
- sharing the region state in zookeeper
- letting the regionserver call the master to get the new server. On paper
this would be more efficient than a client -> master call. In both cases we
could consider that the client should not connect to the master except for
cluster administration (create table, split region, ...). That would increase
global reliability. That's for another discussion as well, I think.
6) RegionServerServices has been modified to set a destination when removing a
region from the online regions.
7) In another JIRA I will handle the case where the destination is not
specified when calling the move function.
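To illustrate point 2, here is a minimal sketch of carrying the destination
through an exception message and parsing it back on the client side. The class
name MovedRegionMessage, the Destination holder and the "host=... port=..."
message layout are all hypothetical, chosen for the example; they are not the
format used in the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: not the class or the message
// format used in the attached patch.
final class MovedRegionMessage {
    // Assumed layout: "Region moved to: host=<hostname> port=<port>"
    private static final Pattern DEST =
        Pattern.compile("host=(\\S+) port=(\\d{1,5})\\b");

    /** Simple holder for the parsed destination. */
    static final class Destination {
        final String host;
        final int port;

        Destination(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Server side: build the text that getMessage() would return. */
    static String format(String host, int port) {
        return "Region moved to: host=" + host + " port=" + port;
    }

    /** Client side: recover the destination, or null if the text does not match. */
    static Destination parse(String message) {
        if (message == null) {
            return null;
        }
        Matcher m = DEST.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new Destination(m.group(1), Integer.parseInt(m.group(2)));
    }
}
```

Returning null for a message that does not match lets the caller fall back to
the usual path when the exception is not about a moved region.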
> When a query fails because the region has moved, let the regionserver return
> the new address to the client
> ----------------------------------------------------------------------------------------------------------
>
> Key: HBASE-5877
> URL: https://issues.apache.org/jira/browse/HBASE-5877
> Project: HBase
> Issue Type: Improvement
> Components: client, master, regionserver
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch
>
>
> This is mainly useful when we do a rolling restart. This will decrease the
> load on the master and the network load.
> Note that a region is not immediately opened after a close. So:
> - it seems preferable to wait before retrying on the other server. An
> optimisation would be to have a heuristic depending on when the region was
> closed.
> - during a rolling restart, the server moves the regions then stops. So we
> may have failures when the server is stopped, and this patch won't help.
> The implementation in the first patch does:
> - on the region move, there is an added parameter on the regionserver#close
> to say where we are sending the region
> - the regionserver keeps a list of what was moved. Each entry is kept 100
> seconds.
> - the regionserver sends a specific exception when it receives a query on a
> moved region. This exception contains the new address.
> - the client analyses the exceptions and updates its cache accordingly...
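The "list of what was moved" from the description could be sketched roughly as
follows. This is an assumption-laden illustration, not the code in the patch:
the class name MovedRegions, its method names and the use of plain strings for
region and server names are hypothetical. Only the 100 second retention comes
from the description. The current time is passed in explicitly so the expiry
logic is easy to check.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the regionserver-side bookkeeping; names are
// illustrative and do not come from the patch.
final class MovedRegions {
    // "Each entry is kept 100 seconds."
    static final long TTL_MILLIS = 100_000L;

    private static final class Entry {
        final String destination;
        final long movedAtMillis;

        Entry(String destination, long movedAtMillis) {
            this.destination = destination;
            this.movedAtMillis = movedAtMillis;
        }
    }

    private final ConcurrentMap<String, Entry> moved =
        new ConcurrentHashMap<String, Entry>();

    /** Called when a region is closed with a known destination server. */
    void recordMove(String regionName, String destination, long nowMillis) {
        moved.put(regionName, new Entry(destination, nowMillis));
    }

    /**
     * Returns the destination if the region moved less than TTL_MILLIS ago,
     * or null if the region is unknown or the entry has expired.
     */
    String destinationOf(String regionName, long nowMillis) {
        Entry e = moved.get(regionName);
        if (e == null) {
            return null;
        }
        if (nowMillis - e.movedAtMillis >= TTL_MILLIS) {
            // Remove only if the entry was not replaced concurrently.
            moved.remove(regionName, e);
            return null;
        }
        return e.destination;
    }
}
```

When a query arrives for a region that is no longer online, the regionserver
would look it up here and, on a hit, put the destination in the exception it
sends back; on a miss the client falls back to its normal lookup.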