Hi Pierre,

Much better to find these problems before a release than just after!
I saw an OOM once recently but haven't been able to reproduce it.
I'm looking into the NPE. I think I see the timing hole it is using but need to think about it some more.

many thanks!
david jencks

On Oct 27, 2013, at 2:58 AM, Pierre De Rop wrote:

> Hi David,
>
> Looking at our configurator component we are currently using (but we will fix
> it in order to use the multi-location "?"), I see this:
>
>     void configure(String pid, Dictionary pidConf) {
>         Configuration config = getConfiguration(_pid, null);
>         if (config.getBundleLocation() != null) {
>             config.setBundleLocation(null);
>         }
>         config.update(pidConf);
>     }
>
> So I believe that you are getting a null configuration because there is a
> short window between the setBundleLocation(null) (at this point, the
> configuration is null) and the config.update(pidConf) call ...
>
> So, the good news is that I'm no longer getting any NPE using your latest
> commits :-) and I think our application is now fully operational.
>
> but ... (please don't start to abominate me ) now, in order to do a final
> check, I restarted the integration tests and there are still two problems:
>
> 1) I'm sometimes getting some out of memory errors: this is probably caused
> by the ComponentConcurrencyTest/Felix3680Test tests, which are currently
> configured in DEBUG mode ?
>
> 2) I ran the tests two times, and the second time, I got this exception with
> the failing Felix3680_2Test:
>
> test_concurrent_injection_with_bundleContext(org.apache.felix.scr.integration.Felix3680_2Test)
> Time elapsed: 36.597 sec <<< ERROR!
> java.lang.NullPointerException
>     at org.apache.felix.scr.impl.manager.DependencyManager.invokeUnbindMethod(DependencyManager.java:1710)
>     at org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUnbindMethod(SingleComponentManager.java:387)
>     at org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:355)
>     at org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:290)
>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1503)
>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1398)
>     at org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.untrack(ServiceTracker.java:1258)
>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1437)
>     at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:932)
>     at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
>     at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
>     at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4260)
>     at org.apache.felix.framework.Felix.access$000(Felix.java:74)
>     at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:390)
>     at org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:148)
>     at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:127)
>     at org.apache.felix.scr.integration.components.felix3680_2.Main$RegistrationHelper$2.run(Main.java:136)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:722)
>
> Are you also getting this exception ?
>
> thanks
>
> /Pierre
>
> On Sat, Oct 26, 2013 at 6:34 PM, David Jencks wrote:
> Hi Pierre,
>
> Looking at the CA spec it looks like CA is supposed to send out
> CM_LOCATION_CHANGED events even before any properties are set when
> setBundleLocation is called. I added some code to ignore these events. Note
> that DS is "reserving" the configurations for (one of) the component(s) that
> will be consuming them by calling getConfiguration(pid).
>
> I do wonder how the location gets set to something non-null on your
> configurations before the properties are set.
>
> Waiting for the next bug :-)
>
> thanks
> david jencks
>
> On Oct 26, 2013, at 3:00 AM, Pierre De Rop wrote:
>
> > Hello David,
> >
> > The code we are using to configure our components is old, and at the time we
> > wrote it, configadmin did not support multi-location. But I do agree, we
> > can now use the "?" multi-location.
> >
> > Now, I'm sorry but I'm still seeing another NPE (sometimes, not always):
> >
> > 2013-10-26 11:45:44,209 CM Event Dispatcher (Fire ConfigurationEvent:
> > pid=sipagent) ERROR osgi - [43] Unexpected problem delivering
> > configuration event to [org.osgi.service.cm.ConfigurationListener, id=102,
> > bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
> >
> > java.lang.NullPointerException
> >     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
> >     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
> >     at org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:344)
> >     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
> >     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
> >     at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
> >     at java.lang.Thread.run(Thread.java:722)
> >
> > I'm not sure, but it seems that ConfigAdmin is providing a null dictionary
> > when delivering a CM_LOCATION_CHANGED event ? If correct, then is this
> > normal behavior ?
> >
> > This is strange; perhaps I shall start a new integration test ?
> >
> > /Pierre
> >
> > On Sat, Oct 26, 2013 at 9:54 AM, David Jencks wrote:
> > Hi Pierre,
> >
> > This pointed out a logic error I introduced for Felix 3651. I opened
> > https://issues.apache.org/jira/browse/FELIX-4293 and fixed the error I
> > found which I think explains the NPE. Could you check this?
> >
> > Could I ask what you are trying to do by setting the bundleLocation to
> > null? If you want to allow any bundle to receive the configuration you
> > could use multi-location support and set the location to "?". With the code
> > you have now, if the configuration is already in use by a DS component, the
> > location changed event will result in the bundle location being reset back
> > to what it was.
> >
> > thanks!
> > david jencks
> >
> > On Oct 25, 2013, at 8:32 AM, Pierre De Rop wrote:
> >
> > > Hi David,
> > >
> > > thanks; The fix is fixing the problem :-)
> > >
> > > but ... there's now a new different problem: I'm now sometimes getting
> > > this NPE, after SCR is receiving a CM_LOCATION_CHANGED event:
> > >
> > > 2013-10-25 16:11:44,674 CM Event Dispatcher (Fire ConfigurationEvent:
> > > pid=sipagent) ERROR osgi - [43] Unexpected problem delivering
> > > configuration event to [org.osgi.service.cm.ConfigurationListener, id=102,
> > > bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
> > >
> > > java.lang.NullPointerException
> > >     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
> > >     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
> > >     at org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:390)
> > >     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
> > >     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
> > >     at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
> > >     at java.lang.Thread.run(Thread.java:722)
> > >
> > > Perhaps a new jira issue shall be opened ?
> > >
> > > I think we are getting a CM_LOCATION_CHANGED event because in our
> > > application, we populate configuration admin by doing something like this:
> > >
> > >     Configuration config = cm.getConfiguration(pid, null);
> > >     if (config.getBundleLocation() != null) {
> > >         config.setBundleLocation(null);
> > >     }
> > >
> > > The setBundleLocation(null) is probably useless, but this leads to a
> > > CM_LOCATION_CHANGED event, which then sometimes ends up with the NPE.
> > >
> > > On Friday, October 25, 2013, David Jencks wrote:
> > >> Hi Pierre,
> > >>
> > >> You are so good at writing useful tests!!
> > >>
> > >> I found a place to call setTargets(getProperties()) from inside
> > >> ComponentFactoryImpl that would have fewer side effects. Could you see if
> > >> this makes your actual applications work properly? I'm uploading a
> > >> snapshot.
> > >>
> > >> many thanks
> > >> david jencks
> > >>
> > >> On Oct 24, 2013, at 6:17 AM, Pierre De Rop wrote:
> > >>
> > >>> Hi David,
> > >>>
> > >>> Since this application is complex, I'm not able to provide logs because
> > >>> there are hundreds of components involved which are not mine, and for
> > >>> now, I'm not able to diagnose the problem.
> > >>>
> > >>> But I have created FELIX-4290, and joined to it an integration test
> > >>> which seems to reproduce the kind of problem I think I'm having in my
> > >>> application. I also joined the proposed patch.
> > >>>
> > >>> I did not have time to test the patch you suggested regarding the
> > >>> SingleComponentManager.reconfigure method, so let's continue to
> > >>> investigate using the jira issue and the test I attached to it.
> > >>>
> > >>> Thanks;
> > >>>
> > >>> /Pierre
> > >>>
> > >>> On Thu, Oct 24, 2013 at 12:27 AM, David Jencks wrote:
> > >>>
> > >>>> Hi Pierre,
> > >>>>
> > >>>> I believe you that this code path doesn't work :-)
> > >>>>
> > >>>> I think there should be a less invasive way to fix this. By any chance
> > >>>> can you get a debug-enabled log from when this problem occurs? It
> > >>>> would help confirm my suspicions of what might be missing.
> > >>>>
> > >>>> FWIW I suspect SingleComponentManager.reconfigure is missing a check
> > >>>> for m_factoryProperties here (line 561):
> > >>>>
> > >>>>     // nothing to do if there is no configuration (see FELIX-714)
> > >>>>     if ( configuration == null && m_configurationProperties == null )
> > >>>>     {
> > >>>>         log( LogService.LOG_DEBUG, "No configuration provided (or deleted), nothing to do", null );
> > >>>>         return;
> > >>>>     }
> > >>>>
> > >>>> Unless we can't figure anything out for sure I'd prefer to fix this
> > >>>> before the release.
> > >>>>
> > >>>> thanks
> > >>>> david jencks
> > >>>>
> > >>>> On Oct 23, 2013, at 3:09 PM, Pierre De Rop wrote:
> > >>>>
> > >>>>> Hi David,
> > >>>>>
> > >>>>> (sorry to make all this noise while you are releasing ...)
> > >>>>>
> > >>>>> We are indeed using factory components; and today, I finally found and
> > >>>>> fixed a cycle, using the Apache Service Diagnostic tool; and I'm going
> > >>>>> further on but now I'm facing another problem which I did not have in
> > >>>>> the scr 1.6.2.
> > >>>>>
> > >>>>> So, I would like to discuss this new problem with you before you
> > >>>>> redo a release, in order to decide if this problem (if there is really
> > >>>>> one ?) shall be addressed now or after the upcoming release ?
> > >>>>>
> > >>>>> So, in our application, we are extensively using factory components
> > >>>>> (@Component(factory="XXX")).
> > >>>>> When we instantiate a factory component (using
> > >>>>> ComponentFactory.newInstance()), we pass to the newInstance() method
> > >>>>> some additional component properties which may also contain some target
> > >>>>> filters.
> > >>>>>
> > >>>>> This allows us to dynamically configure the filter of some References
> > >>>>> declared in the factory component.
> > >>>>> In the scr 1.6.2, this mechanism was working fine. But using trunk,
> > >>>>> this does not work all the time. Some target filters seem to be correctly
> > >>>>> configured, and some others are not (I'm not sure, actually, it's late
> > >>>>> ...).
> > >>>>>
> > >>>>> So, it looks like sometimes, some target filters are not updated
> > >>>>> before activating components ? or factory components ?
> > >>>>>
> > >>>>> I'm not sure but this might be related to the old FELIX-3726.
> > >>>>> Now, interestingly, I did the following patch and my application is
> > >>>>> now working fine: In the AbstractComponentManager class, I systematically
> > >>>>> update target filters, like this:
> > >>>>>
> > >>>>> +++ src/main/java/org/apache/felix/scr/impl/manager/AbstractComponentManager.java
> > >>>>>
