Thanks Chris, I'll action your recommendations. It would be nice to try to track down where the problem is, it's a shame DiscoveryEvent isn't immutable.

Does anyone need access to the protected fields in DiscoveryEvent?

What are the ramifications of making DiscoveryEvent immutable? How much breakage of application code?.

Or all Event's for that matter.

Cheers,

Peter.

Chris Dolan (JIRA) wrote:
Attempted discard of unknown registrar kills LookupLocatorDiscovery thread
--------------------------------------------------------------------------

                 Key: RIVER-337
                 URL: https://issues.apache.org/jira/browse/RIVER-337
             Project: River
          Issue Type: Bug
          Components: net_jini_discovery, net_jini_lookup
    Affects Versions: AR1, jtsk_2.1
            Reporter: Chris Dolan


The method
   
net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(DiscoveryEvent)
has the following code that throws a RuntimeException (the code comment 
suggests that it is supposed to be impossible, but it's not).

        ProxyReg reg = findReg(proxys[i]);
        if(reg != null ) { // this check can be removed.
            proxyRegSet.remove(proxyRegSet.indexOf(reg));
            drops.add(reg);
        } else {
            throw new RuntimeException("discard error");
        }//endif

Our QA does failover testing with two servers, each with a Reggie, where we 
deliberately crash and reboot server 1 then server 2 every 30 minutes 
continuously.  In one case, we hit that RuntimeException.  I don't know why we 
got a null reg (that's a problem for another defect, maybe an undiagnosed race 
of two discards put on a task queue?  Maybe related to RIVER-37?).  But it 
caused a catastrophic chain of events because the RuntimeException is not 
caught anywhere up the stack.  In our case, it killed the 
LookupLocatorDiscovery$Notifier thread.

java.lang.RuntimeException: discard error
        at 
net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(2639)
        at net.jini.discovery.LookupDiscoveryManager.notifyListener(1375)
        at net.jini.discovery.LookupDiscoveryManager.notifyListener(1356)
        at net.jini.discovery.LookupDiscoveryManager.access$500(92)
        at 
net.jini.discovery.LookupDiscoveryManager$LocatorDiscoveryListener.discarded(543)
        at net.jini.discovery.LookupLocatorDiscovery$Notifier.run(650)

I propose three changes:

  1) change the discarded() method above to simply warn instead of throwing
2) put a try/catch(Throwable) around the listener invocation in LookupLocatorDiscovery$Notifier.run()
  3) put a similar try/catch around listener invocation in 
LookupDiscoveryManager.notifyListener

The idea behind #2 and #3 is that misbehaving listeners should not be allowed 
to derail the discovery process.



Reply via email to