Re: Client/Server Multicast Discovery and Failover

Dain Sundstrom Fri, 12 Sep 2008 13:15:08 -0700

On Sep 12, 2008, at 12:45 PM, David Blevins wrote:

I originally had the version be a simple "increment by one" strategy, but eventually went with the value of System.currentTimeMillis(). It's possible more than one server is reachable via the ServerMetaData (i.e. multicast://) and each server has its own list and version number. Secondly, if a server is restarted, the version number will go back to zero and the client could be stuck thinking it has a more current list than the server.

Time sometimes moves backwards on servers connected to a time server. How about something slightly more unique, like a 16-bit rand + the most significant 48 bits of the system time? 48 bits of milliseconds is like 9000 years.
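
To make the suggestion concrete, here is a rough sketch of how such a version number could be packed into a single long. It is only one reading of the idea: it assumes the "48 bits of the system time" means the 48 bits of the millisecond clock that cover the roughly 9000 year span, placed above 16 random bits. The class and method names (ListVersion, compose, next) are made up for illustration and are not part of the server code.

```java
import java.security.SecureRandom;

// Hypothetical helper, not an existing class in the code base.
final class ListVersion {
    // Assumption: keep the 48 bits of the millisecond clock that span ~9000 years.
    private static final long TIME_MASK = (1L << 48) - 1;
    private static final SecureRandom RANDOM = new SecureRandom();

    // Packs 48 bits of time into the high bits and 16 random bits into the low bits.
    static long compose(long millis, int rand16) {
        return ((millis & TIME_MASK) << 16) | (rand16 & 0xFFFFL);
    }

    // Builds a fresh version from the current clock and a random 16-bit value.
    static long next() {
        return compose(System.currentTimeMillis(), RANDOM.nextInt(1 << 16));
    }

    // Recovers the time component of a version.
    static long timePart(long version) {
        return version >>> 16;
    }

    // Recovers the random component of a version.
    static int randomPart(long version) {
        return (int) (version & 0xFFFFL);
    }
}
```

With this layout, two servers that build their lists in the same millisecond would still usually end up with different versions, and a restarted server would not fall back to zero. It does not make versions strictly increasing if the clock steps backwards, so a client would probably want to treat any differing version as a changed list.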

When a server shuts down, further connections are refused, existing connections not in mid-request are closed, any remaining connections are closed immediately after completion of the request in progress, and clients can fail over gracefully to the next server in the list. If a server crashes, requests are retried on the next server in the list. This failover pattern is followed until there are no more servers in the list, at which point the client attempts a final multicast search (if it was created with a multicast PROVIDER_URL) before abandoning the request and throwing an exception to the caller. Currently, the failover is ordered but could very easily be made random. The multicast discovery aspect of the client adds a nice randomness to the selection of the first server that is perhaps somewhat "just". Theoretically, servers that are under more load will send out fewer heartbeats than servers with no load. This may not happen as theory dictates, but certainly as we get more EJB statistic data wired into the server functionality we can pursue deliberate heartbeat throttling techniques that might make that theory really sing in practice.
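
The ordered failover described above could be sketched roughly as follows. This is not the actual client code: FailoverSketch, Request and the finalSearch supplier are hypothetical names, and the multicast lookup is reduced to a callback that returns whatever servers a last search finds (null standing in for a client that was not created with a multicast PROVIDER_URL).

```java
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of the ordered failover loop; names are illustrative only.
final class FailoverSketch {

    // One request that can be attempted against a given server.
    interface Request<T> {
        T sendTo(URI server) throws IOException;
    }

    static <T> T invoke(List<URI> servers,
                        Supplier<List<URI>> finalSearch,
                        Request<T> request) throws IOException {
        List<URI> candidates = new ArrayList<>(servers);
        boolean searched = false;
        IOException last = null;
        while (true) {
            // Try each server in list order; move on when one refuses or fails.
            for (URI server : candidates) {
                try {
                    return request.sendTo(server);
                } catch (IOException e) {
                    last = e;
                }
            }
            // List exhausted: allow one final search, if one is available.
            if (searched || finalSearch == null) {
                break;
            }
            candidates = finalSearch.get();
            searched = true;
        }
        // Nothing answered, so give up and report the failure to the caller.
        throw new IOException("No server could handle the request", last);
    }
}
```

Making the failover random instead of ordered would, in a sketch like this, come down to shuffling the candidates list before the loop.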


Very cool.

-dain
