[
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576641#comment-13576641
]
nkeywal commented on HBASE-7590:
--------------------------------
The current patch shows the work in progress. All tests pass, with or without
multicast activated. It also works on a real cluster.
I still have some work to do:
- I've hijacked the current ClusterStatus protobuf; I'm going to create a
specific one.
- I need to do some cleanup around ServerName & ServerCallable.
- Plus various other things.
> Add a costless notifications mechanism from master to regionservers & clients
> -----------------------------------------------------------------------------
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
> Issue Type: Bug
> Components: Client, master, regionserver
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Attachments: 7590.inprogress.patch
>
>
> It would be very useful to add a mechanism to distribute some information to
> the clients and regionservers. In particular, it would be useful to know
> globally (regionservers + client apps) that some regionservers are dead. This
> would allow:
> - to lower the load on the system, by keeping clients from using stale
> information and contacting dead machines
> - to make the recovery faster from a client point of view. It's common to use
> large timeouts on the client side, so the client may need a lot of time
> before declaring a region server dead and trying another one. If the client
> receives information about a region server's state separately, it can make
> the right decision and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down'
> to instruct the client to increase the retries delay and so on.
> Technically, the master could send this information. To lower the load on
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to
> all servers by TCP), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available,
> great. If not, it should not break anything.
> - it should be optional.
> So at the end we would have a thread in the master sending a protobuf message
> about the dead servers on a multicast socket. If the socket is not
> configured, it does not do anything. On the client side, when we receive
> information that a node is dead, we refresh the cache about it.
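
The mechanism described above could be sketched roughly as follows. This is
only an illustration, not the code in the attached patch: the class name
DeadServerNotifier, its method names, and the newline-separated payload are all
hypothetical (the issue proposes a protobuf message), and only standard
java.net and java.util.concurrent classes are used. The publisher does nothing
when no multicast group is configured, and send failures are ignored, since
receivers must not depend on the notifications.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class DeadServerNotifier {
    // Hypothetical wire format: server names joined by '\n', encoded as UTF-8.
    // It stands in for the protobuf message mentioned in the issue.
    public static byte[] encode(List<String> deadServers) {
        return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
    }

    public static List<String> decode(byte[] data, int length) {
        List<String> names = new ArrayList<>();
        String text = new String(data, 0, length, StandardCharsets.UTF_8);
        for (String name : text.split("\n")) {
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Master side: one daemon thread sends one packet every 10 seconds.
    // Returns null and starts nothing when no multicast group is configured.
    public static ScheduledExecutorService startPublisher(
            String groupAddress, int port, Supplier<List<String>> deadServers) {
        if (groupAddress == null) {
            return null;
        }
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "dead-server-publisher");
                t.setDaemon(true);
                return t;
            });
        scheduler.scheduleAtFixedRate(() -> {
            try (DatagramSocket socket = new DatagramSocket()) {
                byte[] payload = encode(deadServers.get());
                socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName(groupAddress), port));
            } catch (IOException | RuntimeException e) {
                // Best effort: a lost or failed packet must not break anything,
                // and swallowing the error keeps the periodic task scheduled.
            }
        }, 0, 10, TimeUnit.SECONDS);
        return scheduler;
    }

    // Client or regionserver side: join the multicast group on the given port.
    public static MulticastSocket openReceiver(String groupAddress, int port)
            throws IOException {
        MulticastSocket socket = new MulticastSocket(port);
        socket.joinGroup(
            new InetSocketAddress(InetAddress.getByName(groupAddress), port), null);
        return socket;
    }

    // Blocks until one packet arrives, then records the names it carries.
    // The caller would then refresh its cache for those servers.
    public static void receiveOnce(MulticastSocket socket, Set<String> knownDead)
            throws IOException {
        byte[] buffer = new byte[1500];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        knownDead.addAll(decode(packet.getData(), packet.getLength()));
    }
}
```

A master would call startPublisher with the configured group (for example
"239.0.0.1" and some port, both made up here) and a supplier of the current
dead-server list. A client would call openReceiver once and then receiveOnce in
a background loop. A real implementation would also need to bound the payload
to what fits in a single datagram.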
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira