Unless I've misunderstood RIVER-52, I can't see how it could be related to my bug because that's on the Reggie side.
My scenario is that we have a Reggie on Server#1 and a Reggie on Server#2, and when we shut down Server#1, a *client* also running on Server#2 gets this error. That client is still in contact with the Reggie on Server#2, but appears to be executing two discards for Reggie#1.

I just had a new thought: could it be that we're getting one discard from LookupDiscovery and one from LookupLocatorDiscovery that coincidentally fire at the same time when Reggie#1 shuts down? I know for a fact that my two servers are on the same subnet and that they also have locators pointing at each other, so both discovery mechanisms are active in the client.

Thanks for your attention.
Chris

-----Original Message-----
From: Tom Hobbs [mailto:[email protected]]
Sent: Wednesday, April 21, 2010 6:22 AM
To: [email protected]
Subject: Re: [jira] Created: (RIVER-337) Attempted discard of unknown registrar kills LookupLocatorDiscovery thread

There's a related problem on busy networks when reggies disappear; see RIVER-52. Although (probably) not the cause of Chris' trouble, it's in a related area... I think.

I'm sure that the problem described in RIVER-52 exists because I've encountered it in the wild, but I'm still having trouble reproducing it at will.

I can take a look at both of these when time allows.

Peter, I don't know about DiscoveryEvent's fields. I can't think of any reason off the top of my head as to why they can't be made protected.

I remember a little while ago you were talking about MarshalledInstance for some reason. (Or did I just make that up?) What are your thoughts on RIVER-29?

Cheers,

Tom

On Wed, Apr 21, 2010 at 11:52 AM, Peter Firmstone <[email protected]> wrote:
> Thanks Chris, I'll action your recommendations. It would be nice to try to
> track down where the problem is, it's a shame DiscoveryEvent isn't
> immutable.
>
> Does anyone need access to the protected fields in DiscoveryEvent?
>
> What are the ramifications of making DiscoveryEvent immutable? How much
> breakage of application code?
>
> Or all Events for that matter.
>
> Cheers,
>
> Peter.
>
>
> Chris Dolan (JIRA) wrote:
>
>> Attempted discard of unknown registrar kills LookupLocatorDiscovery thread
>> --------------------------------------------------------------------------
>>
>>                  Key: RIVER-337
>>                  URL: https://issues.apache.org/jira/browse/RIVER-337
>>              Project: River
>>           Issue Type: Bug
>>           Components: net_jini_discovery, net_jini_lookup
>>     Affects Versions: AR1, jtsk_2.1
>>             Reporter: Chris Dolan
>>
>>
>> The method
>> net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(DiscoveryEvent)
>> has the following code that throws a RuntimeException (the code comment
>> suggests that it is supposed to be impossible, but it's not).
>>
>>     ProxyReg reg = findReg(proxys[i]);
>>     if(reg != null ) { // this check can be removed.
>>         proxyRegSet.remove(proxyRegSet.indexOf(reg));
>>         drops.add(reg);
>>     } else {
>>         throw new RuntimeException("discard error");
>>     }//endif
>>
>> Our QA does failover testing with two servers, each with a Reggie, where
>> we deliberately crash and reboot server 1 then server 2 every 30 minutes
>> continuously. In one case, we hit that RuntimeException. I don't know why
>> we got a null reg (that's a problem for another defect, maybe an undiagnosed
>> race of two discards put on a task queue? Maybe related to RIVER-37?). But
>> it caused a catastrophic chain of events because the RuntimeException is not
>> caught anywhere up the stack. In our case, it killed the
>> LookupLocatorDiscovery$Notifier thread.
>>
>> java.lang.RuntimeException: discard error
>>     at net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(2639)
>>     at net.jini.discovery.LookupDiscoveryManager.notifyListener(1375)
>>     at net.jini.discovery.LookupDiscoveryManager.notifyListener(1356)
>>     at net.jini.discovery.LookupDiscoveryManager.access$500(92)
>>     at net.jini.discovery.LookupDiscoveryManager$LocatorDiscoveryListener.discarded(543)
>>     at net.jini.discovery.LookupLocatorDiscovery$Notifier.run(650)
>>
>> I propose three changes:
>>
>> 1) change the discarded() method above to simply warn instead of throwing
>> 2) put a try/catch(Throwable) around the listener invocation in
>>    LookupLocatorDiscovery$Notifier.run()
>> 3) put a similar try/catch around listener invocation in
>>    LookupDiscoveryManager.notifyListener
>>
>> The idea behind #2 and #3 is that misbehaving listeners should not be
>> allowed to derail the discovery process.
>>
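
For illustration, the three proposed changes could look roughly like the sketch below. It is self-contained and is not River code: the class SafeNotifierSketch, the interface DiscardListener, and the methods discard() and notifyDiscarded() are hypothetical names, and registrars are represented as plain strings rather than ServiceRegistrar proxies. It only shows the idea of warning on an unknown registrar (proposal 1) and isolating listener failures so the notifying thread survives (proposals 2 and 3).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SafeNotifierSketch {
    private static final Logger LOG =
            Logger.getLogger(SafeNotifierSketch.class.getName());

    /** Hypothetical stand-in for a discovery listener; not the River interface. */
    public interface DiscardListener {
        void discarded(String registrarId);
    }

    /**
     * Proposal 1 (sketch): a discard for a registrar that is not tracked is
     * logged and ignored instead of throwing. Returns true if it was removed.
     */
    public static boolean discard(List<String> knownRegistrars, String registrarId) {
        boolean removed = knownRegistrars.remove(registrarId);
        if (!removed) {
            LOG.warning("Ignoring discard of unknown registrar " + registrarId);
        }
        return removed;
    }

    /**
     * Proposals 2 and 3 (sketch): invoke each listener inside try/catch so a
     * failing listener cannot terminate the loop or the calling thread.
     * Returns the number of listeners that threw.
     */
    public static int notifyDiscarded(List<DiscardListener> listeners, String registrarId) {
        int failures = 0;
        for (DiscardListener listener : listeners) {
            try {
                listener.discarded(registrarId);
            } catch (Throwable t) {
                // Catching Throwable follows the proposal; it also swallows Errors.
                failures++;
                LOG.log(Level.WARNING, "Listener failed on discard of " + registrarId, t);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> known = new ArrayList<>(Arrays.asList("reggie1", "reggie2"));
        // Two discards for the same registrar, as in the suspected race.
        discard(known, "reggie1");
        discard(known, "reggie1");

        List<DiscardListener> listeners = new ArrayList<>();
        listeners.add(id -> { throw new RuntimeException("discard error"); });
        listeners.add(id -> System.out.println("second listener still notified: " + id));
        notifyDiscarded(listeners, "reggie1");
    }
}
```

With this shape, a duplicate discard (for example one from each discovery mechanism) produces a warning rather than an exception, and a listener that throws does not prevent later listeners from being called.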
