[
https://issues.apache.org/jira/browse/RIVER-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Firmstone resolved RIVER-337.
-----------------------------------
Resolution: Fixed
Fix Version/s: River_3.0.0
If this occurs, it is logged, however it is very unlikely to occur as a number
of race condions have been fixed in SDM since.
> Attempted discard of unknown registrar kills LookupLocatorDiscovery thread
> --------------------------------------------------------------------------
>
> Key: RIVER-337
> URL: https://issues.apache.org/jira/browse/RIVER-337
> Project: River
> Issue Type: Bug
> Components: net_jini_discovery, net_jini_lookup
> Affects Versions: jtsk_2.1, River_2.1.1
> Reporter: Chris Dolan
> Fix For: River_3.0.0
>
>
> The method
>
> net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(DiscoveryEvent)
> has the following code that throws a RuntimeException (the code comment
> suggests that it is supposed to be impossible, but it's not).
> ProxyReg reg = findReg(proxys[i]);
> if(reg != null ) { // this check can be removed.
> proxyRegSet.remove(proxyRegSet.indexOf(reg));
> drops.add(reg);
> } else {
> throw new RuntimeException("discard error");
> }//endif
> Our QA does failover testing with two servers, each with a Reggie, where we
> deliberately crash and reboot server 1 then server 2 every 30 minutes
> continuously. In one case, we hit that RuntimeException. I don't know why
> we got a null reg (that's a problem for another defect, maybe an undiagnosed
> race of two discards put on a task queue? Maybe related to RIVER-37?). But
> it caused a catastrophic chain of events because the RuntimeException is not
> caught anywhere up the stack. In our case, it killed the
> LookupLocatorDiscovery$Notifier thread.
> java.lang.RuntimeException: discard error
> at
> net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(2639)
> at net.jini.discovery.LookupDiscoveryManager.notifyListener(1375)
> at net.jini.discovery.LookupDiscoveryManager.notifyListener(1356)
> at net.jini.discovery.LookupDiscoveryManager.access$500(92)
> at
> net.jini.discovery.LookupDiscoveryManager$LocatorDiscoveryListener.discarded(543)
> at net.jini.discovery.LookupLocatorDiscovery$Notifier.run(650)
> I propose three changes:
> 1) change the discarded() method above to simply warn instead of throwing
> 2) put a try/catch(Throwable) around the listener invocation in
> LookupLocatorDiscovery$Notifier.run()
> 3) put a similar try/catch around listener invocation in
> LookupDiscoveryManager.notifyListener
> The idea behind #2 and #3 is that misbehaving listeners should not be allowed
> to derail the discovery process.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)