Hi,

In the Rio project there is the concept of a fault detection handler (FDH), which is used to determine the reachability of a service. The service provides the FDH; clients use it to determine whether a service is indeed at the far end of the connection. In the case of an event producer, this aligns with option (1) below.

In general, the FDH approach has served quite well: it provides a pluggable mechanism that can be built to use any technique (ping, heartbeat multicast, lease ...), specific to each service as needed.
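To illustrate the pluggable idea, here is a minimal sketch of what such a handler could look like. The interface and class names below are hypothetical placeholders, not the actual Rio API; the point is only that the reachability technique is swappable per service.

```java
// Hypothetical sketch of a pluggable fault detection handler (FDH).
// Names are illustrative only, NOT the actual Rio API.
interface FaultDetectionHandler {
    // Returns true if the service appears to be at the far end of the
    // connection; the technique (ping, heartbeat, lease check) is up to
    // the implementation the service supplies.
    boolean isReachable();
}

// A trivial ping-style implementation for demonstration: delegates to a
// supplied probe so any transport-specific check can be plugged in.
class PingFaultDetectionHandler implements FaultDetectionHandler {
    private final java.util.function.BooleanSupplier ping;

    PingFaultDetectionHandler(java.util.function.BooleanSupplier ping) {
        this.ping = ping;
    }

    public boolean isReachable() {
        return ping.getAsBoolean();
    }
}

public class FdhDemo {
    public static void main(String[] args) {
        // A probe that always succeeds, standing in for a real ping.
        FaultDetectionHandler fdh = new PingFaultDetectionHandler(() -> true);
        System.out.println(fdh.isReachable() ? "reachable" : "unreachable");
    }
}
```

The service ships the implementation; the client just calls `isReachable()`, which is what keeps the technique service-specific.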

Regards

Dennis

On May 29, 2007, at 3:49 PM, Dan Creswell wrote:

Hi all,

This started with a discussion in the "Javaspaces.notify() not reliable" thread, and I've now had a bit more time to formulate my thoughts.

Without this extra feature we do something like the following in the client:

(1)     Set up a watchdog timer with a suitable expiry.
(2)     On receiving a remote event, reset the watchdog timer.
(3)     If the timer expires, check whether the source is still alive and
whether we might have missed an event.
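The steps above can be sketched as a small client-side watchdog. This is a minimal illustration, not anything from the JavaSpaces/Jini APIs: the class name, the `onExpiry` callback, and the timeout value are all assumptions; in practice the expiry action would be the ping/missed-event check from step (3).

```java
import java.util.concurrent.*;

// Illustrative client-side watchdog (names hypothetical): every remote
// event resets the timer; if it expires, the expiry action runs, which
// would typically ping the source and check for missed events.
class EventWatchdog {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long timeoutMs;
    private final Runnable onExpiry; // e.g. ping source, check sequence nos.
    private ScheduledFuture<?> pending;

    EventWatchdog(long timeoutMs, Runnable onExpiry) {
        this.timeoutMs = timeoutMs;
        this.onExpiry = onExpiry;
        reset(); // (1) arm the timer with a suitable expiry
    }

    // (2) Called whenever a remote event arrives: cancel the pending
    // expiry and re-arm the timer.
    synchronized void reset() {
        if (pending != null) {
            pending.cancel(false);
        }
        pending = scheduler.schedule(onExpiry, timeoutMs, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}

public class WatchdogDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch expiredLatch = new CountDownLatch(1);
        EventWatchdog watchdog = new EventWatchdog(100, expiredLatch::countDown);

        watchdog.reset(); // simulate a remote event arriving

        // No further events arrive, so (3) the timer eventually expires.
        boolean expired = expiredLatch.await(2, TimeUnit.SECONDS);
        System.out.println(expired ? "expired" : "no expiry");
        watchdog.shutdown();
    }
}
```

Note the load profile: all the timer state lives in the client, and the server is only touched when an expiry actually triggers a ping.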

What's being proposed, if I understand correctly, is that the source, if
it's alive but hasn't generated events in a particular time period,
confirms that by posting a SourceAliveRemoteEvent to the client.

This would potentially change the above client code so that a
SourceAliveRemoteEvent (SARE) on its own also resets the timer.
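The client-side change is small, which is point (3) below. Here is a runnable sketch of that dispatch logic; the event classes are placeholders standing in for the Jini types, not the real API, and the "watchdog reset" and "delivery" are modeled as counters for clarity.

```java
// Placeholder event types standing in for the Jini classes; NOT the
// real net.jini API, just enough to show the dispatch change.
class RemoteEvent {}
class SourceAliveRemoteEvent extends RemoteEvent {}

public class SareClientDemo {
    static int resets = 0;    // stands in for watchdog.reset()
    static int delivered = 0; // stands in for delivery to the application

    static void onEvent(RemoteEvent ev) {
        // Under the proposal, EVERY event resets the watchdog, because a
        // SARE and a real event both prove the source is alive...
        resets++;
        // ...but only real events carry data for the application.
        if (!(ev instanceof SourceAliveRemoteEvent)) {
            delivered++;
        }
    }

    public static void main(String[] args) {
        onEvent(new RemoteEvent());            // a real event
        onEvent(new SourceAliveRemoteEvent()); // a keep-alive from the source
        System.out.println(resets + " resets, " + delivered + " delivered");
    }
}
```

So the only behavioral difference from the original scheme is the extra `instanceof` branch: the watchdog reset condition widens, nothing else changes.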

Things of note:

(1)     The original solution places the responsibility and load on the
client (bar the pinging of the server). This naturally scales out quite
well, as the server only has to respond to pings, and chances are a
client only maintains timers for a few services. If client timeouts are
tuned appropriately to event frequency/typical pause, pings will be rare.

(2)     The new solution places much of the responsibility with the server.
I believe there may be a scaling problem here.  In contrast to the
client-side approach, a server might have a large number of clients to
cope with.  This potentially means the server carries significant load
tracking a large number of timer events for all its clients and posting
SAREs in addition to what it already does.

(3)     The only difference between the old and new approaches from a
client coding perspective is what causes a reset of the watchdog timer.

(4)     SAREs, like any other event, can be lost. If one is lost, the
client watchdog will trigger just as it would in the old approach, given
sufficient time between RemoteEvents.

(5)     If the source has sent events but they've been lost, it won't send
an SARE and, again, the client watchdog will time out and ping.

Based on the above, it seems to me that whilst an SARE might save a few
pings, there's additional complexity and greater server load.  If I've
missed some subtleties, please shout, because right now I don't see
enough benefit in this to justify the "pain".

Thoughts?

Dan.
