Here is what I gathered so far from this email thread:

1. It is quite possible for the Registrar to cancel the lease even if the 
disconnection would have been temporary.

2. There are three mitigations currently proposed, each of which can solve the 
temporary network disconnection problem:
               2A. Fix the LeaseRenewalManager to apply renewDuration starting 
from the first lease request, and not just from the second lease request (the 
original email in this thread)

               2B. Create a new mechanism in which the Registrar sends periodic 
events to the client, and have the client reconnect if it has not received any 
event for a while.

               2C. Create a new mechanism in which the Listener sends periodic 
listener attribute updates.



Here is the proposed patch that uses the existing renewDuration mechanism:


Index: src/net/jini/lease/LeaseRenewalManager.java
===================================================================
--- src/net/jini/lease/LeaseRenewalManager.java  (revision 1359529)
+++ src/net/jini/lease/LeaseRenewalManager.java  (working copy)
@@ -577,7 +577,15 @@
        } else {
          delta = 1000 * 60 * 60 * 24 * 3;
        }
-        renew = endTime - delta;
+
+        long renewTime = endTime - delta;
+        if (renewDuration != Lease.ANY &&
+            renewTime - now > renewDuration) {
+
+            // shorten the time between two consecutive lease renewals
+            renewTime = now + renewDuration;
+        }
+        this.renew = renewTime;
    }
     /** Calculate a new renew time due to an indefinite exception */
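
To make the effect of the patch easier to see, here is a minimal standalone sketch of the capped renew-time calculation. It is not the LeaseRenewalManager code itself: the class name, the method signature, and the local ANY constant (a stand-in for Lease.ANY, assumed here to be -1) are illustrative assumptions, and delta is taken as an input rather than derived as in the real method.

```java
final class RenewTimeSketch {
    /** Stand-in for net.jini.core.lease.Lease.ANY (assumed to be -1). */
    static final long ANY = -1L;

    /**
     * Returns the absolute time at which the next renewal should be attempted.
     *
     * @param now           current time in milliseconds
     * @param endTime       lease expiration time in milliseconds
     * @param delta         safety margin before expiration, already computed
     * @param renewDuration requested renewal duration, or ANY
     */
    static long calcRenew(long now, long endTime, long delta, long renewDuration) {
        long renewTime = endTime - delta;
        if (renewDuration != ANY && renewTime - now > renewDuration) {
            // Shorten the time between two consecutive lease renewals.
            renewTime = now + renewDuration;
        }
        return renewTime;
    }

    public static void main(String[] args) {
        long now = 0L;
        long endTime = 300_000L; // lease expires in 5 minutes
        long delta = 150_000L;   // renew halfway through by default

        // Without a renewDuration the default schedule is used.
        System.out.println(calcRenew(now, endTime, delta, ANY));     // 150000
        // A 30 second renewDuration pulls the first renewal forward.
        System.out.println(calcRenew(now, endTime, delta, 30_000L)); // 30000
    }
}
```

With this change the first renewal is scheduled no later than renewDuration after registration, so a lease that was dropped by the Registrar is noticed within roughly one renewDuration.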



Regards,

Itai





-----Original Message-----
From: Greg Trasuk [mailto:[email protected]]
Sent: Tuesday, July 10, 2012 6:50 PM
To: [email protected]
Subject: RE: Question about LeaseRenewalManager and renewDuration





On Tue, 2012-07-10 at 10:14, Itai Frenkel wrote:

> >> Are you sure about that?

> Looking at RegistrarImpl, when ThrowableConstants.retryable(e) returns 
> BAD_OBJECT, it rethrows only if (e instanceof Error); otherwise it cancels 
> the lease. Since ConnectException is not an Error, the lease would be canceled.

> Why is the Error check being performed?

>

ThrowableConstants.retryable(e) only returns BAD_OBJECT if it receives a 
definite response from the remote endpoint.  For a comm failure, it should 
return INDEFINITE.  Having said that, the logic seems to favour declaring an 
exception "Definite" where it might be arguable.  For instance, it will declare 
BAD_OBJECT in the case of a "No route to host" exception, which arguably could 
be temporary, for instance if a router goes offline.
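
As a rough illustration of the decision described in this exchange, the sketch below models only what happens after a failure has been categorized. It does not call or reproduce RegistrarImpl or ThrowableConstants: the Category and Action enums and the method name are hypothetical, and any categories other than the two discussed in this thread are deliberately left out.

```java
final class NotifyFailureSketch {
    /** Hypothetical stand-ins for the two categories discussed in the thread. */
    enum Category { BAD_OBJECT, INDEFINITE }

    /** Hypothetical outcomes for the event lease after a failed notification. */
    enum Action { RETHROW, CANCEL_LEASE, KEEP_LEASE }

    /**
     * Decides what to do with the event lease once the failure is categorized.
     * A definite failure (BAD_OBJECT) cancels the lease unless the throwable is
     * an Error, which is rethrown; an indefinite failure keeps the lease.
     */
    static Action onNotifyFailure(Category category, Throwable e) {
        if (category == Category.INDEFINITE) {
            return Action.KEEP_LEASE;
        }
        return (e instanceof Error) ? Action.RETHROW : Action.CANCEL_LEASE;
    }

    public static void main(String[] args) {
        Throwable noRoute = new java.net.NoRouteToHostException("No route to host");
        // If "No route to host" is categorized as BAD_OBJECT, the lease is lost.
        System.out.println(onNotifyFailure(Category.BAD_OBJECT, noRoute));
        // The same exception categorized as INDEFINITE would keep the lease.
        System.out.println(onNotifyFailure(Category.INDEFINITE, noRoute));
    }
}
```

The point of the sketch is that the outcome for the client depends entirely on which category a given network exception lands in.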



> >> Personally, I'd use an internal timer on the client side that says "if I 
> >> don't receive any events for a given time, I'll cancel the current lease 
> >> and re-register".

> That requires the Registrar to periodically send probe notifications. The 
> number of real world notifications could fluctuate from zero to high load and 
> cannot be trusted without probe notifications.

>

Might be an interesting improvement if a client could request a heartbeat or 
supervisory message from the registrar.  But my point above was that if the 
events are not coming fast enough to satisfy a reasonable "liveness" timeout, 
then it's probably not a big problem if the client simply cancels the lease and 
re-registers.  So you could effectively implement your own heartbeat.
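
A client-side liveness timer along these lines could be sketched as follows. This is only an outline under stated assumptions: the class name, the injected clock, and the reRegister callback are hypothetical, and the callback is where the client would cancel its current event lease and register again with the registrar (those Jini calls are intentionally not shown).

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

final class EventLivenessWatchdog {
    private final long timeoutMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis
    private final Runnable reRegister;  // hypothetical: cancel lease, re-register
    private long lastEventMillis;

    EventLivenessWatchdog(long timeoutMillis, LongSupplier clock, Runnable reRegister) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.reRegister = reRegister;
        this.lastEventMillis = clock.getAsLong();
    }

    /** Call this from the remote event listener whenever an event arrives. */
    synchronized void eventReceived() {
        lastEventMillis = clock.getAsLong();
    }

    /** Runs the callback if no event arrived within the timeout; returns true if it did. */
    boolean checkNow() {
        synchronized (this) {
            long now = clock.getAsLong();
            if (now - lastEventMillis < timeoutMillis) {
                return false;
            }
            lastEventMillis = now; // restart the timeout after re-registering
        }
        reRegister.run();
        return true;
    }

    /** Checks periodically on the given executor. */
    ScheduledFuture<?> schedule(ScheduledExecutorService executor, long periodMillis) {
        return executor.scheduleWithFixedDelay(
                this::checkNow, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }
}
```

The clock is injected so the timeout logic can be exercised without waiting in real time. Whether the timeout is "reasonable" depends on how often events are expected under normal load.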



Alternatively (subject to exploring the loading and the number of clients), you 
could create a service that does nothing but register and then update its 
service attributes periodically, which would have the effect of generating 
registrar messages.  Starting to get a little complicated and indirect, though.



In the end, however, it seems like you're trying to have the client find out 
that it's not receiving registrar notifications.  I can't think of any better 
evidence than "you're not receiving registrar notifications".



Cheers,



Greg.



> Thanks,

> Itai

>

> -----Original Message-----

> From: Greg Trasuk [mailto:[email protected]]

> Sent: Tuesday, July 10, 2012 4:36 PM

> To: [email protected]

> Subject: Re: Question about LeaseRenewalManager and renewDuration

>

>

> On Tue, 2012-07-10 at 06:41, Itai Frenkel wrote:

> <snip...>

> > Background Information:

> > The motivation for this is the way the Registrar handles event 
> > notifications.

> > When the Registrar fails to send a notification to a listener due to a 
> > temporary network glitch, it assumes the listener is no longer available 
> > and cancels the event lease.

>

> Are you sure about that?  Looking through com.sun.jini.reggie.RegistrarImpl, 
> it appears that when an exception occurs during event notification, the code 
> tries to categorize the exception as either "definite" (no such event, no 
> such object, etc) or "indefinite" (communications failure).  Then it only 
> cancels the lease on a definite exception.

>

> In other words, the lease is maintained in the case of a temporary network 
> failure.  After all, that's the whole point of the lease: it represents an 
> agreement between the client and service that resources are going to be 
> maintained for a definite time period.

>

> Personally, I'd use an internal timer on the client side that says "if I 
> don't receive any events for a given time, I'll cancel the current lease and 
> re-register".  If the events are that quiet, then clearly the registrar is 
> not that heavily loaded, so the overhead of cancelling the lease and creating 
> a new registration should not be too bad.  You'd want to test it under 
> simulated load, of course.

>

> Cheers,

>

> Greg.

> --

> Greg Trasuk, President

> StratusCom Manufacturing Systems Inc. - We use information technology to 
> solve business problems on your plant floor.

> http://stratuscom.com

>

>

>



