FYI: configadmin 1.8 is in central (just checked)

Thanks
Felix
--
Felix Meschberger | Principal Scientist | Adobe

On 29.10.2013 at 00:13, David Jencks <[email protected]> wrote:

> Pierre,
>
> I opened https://issues.apache.org/jira/browse/FELIX-4297 and fixed the
> problems I found (for 2). I don't see the OOM often enough to have any
> confidence that anything I do would actually fix it, so I'm inclined to do
> nothing. Is that OK with you?
>
> Unless you can find some more problems :-) I'm planning to try another
> release when the config admin 1.8 gets to maven central. I'm going to update
> the pom to normally run against the CA 1.8 version supporting R5 and change
> the profile so running against R4 requires specifying profiles explicitly.
>
> thanks again!
> david jencks
>
> On Oct 28, 2013, at 12:24 AM, David Jencks <[email protected]> wrote:
>
>> Hi Pierre,
>>
>> Much better to find these problems before a release than just after!
>>
>> I saw an OOM once recently but haven't been able to reproduce it.
>>
>> I'm looking into the NPE. I think I see the timing hole it is using but
>> need to think about it some more.
>>
>> many thanks!
>> david jencks
>>
>> On Oct 27, 2013, at 2:58 AM, Pierre De Rop <[email protected]> wrote:
>>
>>> Hi David,
>>>
>>> Looking at our configurator component we are currently using (but we will
>>> fix it in order to use the multi-location "?"), I see this:
>>>
>>> void configure(String pid, Dictionary pidConf) {
>>>     Configuration config = getConfiguration(pid, null);
>>>     if (config.getBundleLocation() != null) {
>>>         config.setBundleLocation(null);
>>>     }
>>>     config.update(pidConf);
>>> }
>>>
>>> So I believe that you are getting a null configuration because there is a
>>> short window between the setBundleLocation(null) (at this point, the
>>> configuration is null) and the config.update(pidConf) call ...
>>>
>>> So, the good news is that I'm no longer getting any NPE using your latest
>>> commits :-) and I think our application is now fully operational.
>>>
>>> but ...
>>> (please don't start to abominate me) now, in order to do a final
>>> check, I restarted the integration tests and there are still two problems:
>>>
>>> 1) I'm sometimes getting some out of memory errors: this is probably caused
>>> by the ComponentConcurrencyTest/Felix3680Test tests, which are currently
>>> configured in DEBUG mode?
>>>
>>> 2) I ran the tests two times, and the second time, I got this exception
>>> with the failing Felix3680_2Test:
>>>
>>> test_concurrent_injection_with_bundleContext(org.apache.felix.scr.integration.Felix3680_2Test)
>>> Time elapsed: 36.597 sec <<< ERROR!
>>> java.lang.NullPointerException
>>>     at org.apache.felix.scr.impl.manager.DependencyManager.invokeUnbindMethod(DependencyManager.java:1710)
>>>     at org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUnbindMethod(SingleComponentManager.java:387)
>>>     at org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:355)
>>>     at org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:290)
>>>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1503)
>>>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1398)
>>>     at org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.untrack(ServiceTracker.java:1258)
>>>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1437)
>>>     at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:932)
>>>     at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
>>>     at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
>>>     at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4260)
>>>     at org.apache.felix.framework.Felix.access$000(Felix.java:74)
>>>     at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:390)
>>>     at org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:148)
>>>     at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:127)
>>>     at org.apache.felix.scr.integration.components.felix3680_2.Main$RegistrationHelper$2.run(Main.java:136)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:722)
>>>
>>> Are you also getting this exception?
>>>
>>> thanks
>>>
>>> /Pierre
>>>
>>> On Sat, Oct 26, 2013 at 6:34 PM, David Jencks <[email protected]> wrote:
>>> Hi Pierre,
>>>
>>> Looking at the CA spec it looks like CA is supposed to send out
>>> CM_LOCATION_CHANGED events even before any properties are set when
>>> setBundleLocation is called. I added some code to ignore these events.
>>> Note that DS is "reserving" the configurations for (one of) the
>>> component(s) that will be consuming them by calling getConfiguration(pid).
>>>
>>> I do wonder how the location gets set to something non-null on your
>>> configurations before the properties are set.
>>>
>>> Waiting for the next bug :-)
>>>
>>> thanks
>>> david jencks
>>>
>>> On Oct 26, 2013, at 3:00 AM, Pierre De Rop <[email protected]> wrote:
>>>
>>>> Hello David,
>>>>
>>>> The code we are using to configure our components is old; at the time
>>>> we wrote it, configadmin was not supporting multi-location. But I do
>>>> agree, we can now use the "?" multi-location.
>>>>
>>>> Now, I'm sorry but I'm still seeing another NPE (sometimes, not always):
>>>>
>>>> 2013-10-26 11:45:44,209 CM Event Dispatcher (Fire ConfigurationEvent:
>>>> pid=sipagent) ERROR osgi - [43] Unexpected problem delivering
>>>> configuration event to [org.osgi.service.cm.ConfigurationListener, id=102,
>>>> bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
>>>>
>>>> java.lang.NullPointerException
>>>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
>>>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
>>>>     at org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:344)
>>>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
>>>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
>>>>     at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
>>>>     at java.lang.Thread.run(Thread.java:722)
>>>>
>>>> I'm not sure, but it seems that ConfigAdmin is providing a null
>>>> dictionary when delivering a CM_LOCATION_CHANGED event? If correct, then
>>>> is this normal behavior?
>>>>
>>>> This is strange; perhaps I shall start a new integration test?
>>>>
>>>> /Pierre
>>>>
>>>> On Sat, Oct 26, 2013 at 9:54 AM, David Jencks <[email protected]> wrote:
>>>> Hi Pierre,
>>>>
>>>> This pointed out a logic error I introduced for Felix 3651. I opened
>>>> https://issues.apache.org/jira/browse/FELIX-4293 and fixed the error I
>>>> found which I think explains the NPE. Could you check this?
>>>>
>>>> Could I ask what you are trying to do by setting the bundleLocation to
>>>> null? If you want to allow any bundle to receive the configuration you
>>>> could use multi-location support and set the location to "?".
>>>> With the code you have now, if the configuration is already in use by a DS
>>>> component, the location changed event will result in the bundle location
>>>> being reset back to what it was.
>>>>
>>>> thanks!
>>>> david jencks
>>>>
>>>> On Oct 25, 2013, at 8:32 AM, Pierre De Rop <[email protected]> wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> thanks; The fix is fixing the problem :-)
>>>>>
>>>>> but ... there's now a new different problem: I'm now sometimes getting
>>>>> this NPE, after SCR is receiving a CM_LOCATION_CHANGED event:
>>>>>
>>>>> 2013-10-25 16:11:44,674 CM Event Dispatcher (Fire ConfigurationEvent:
>>>>> pid=sipagent) ERROR osgi - [43] Unexpected problem delivering
>>>>> configuration event to [org.osgi.service.cm.ConfigurationListener, id=102,
>>>>> bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
>>>>>
>>>>> java.lang.NullPointerException
>>>>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
>>>>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
>>>>>     at org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:390)
>>>>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
>>>>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
>>>>>     at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
>>>>>     at java.lang.Thread.run(Thread.java:722)
>>>>>
>>>>> Perhaps a new jira issue shall be opened?
>>>>>
>>>>> I think we are getting a CM_LOCATION_CHANGED event because in our
>>>>> application, we populate configuration admin by doing something like this:
>>>>>
>>>>> Configuration config = cm.getConfiguration(pid, null);
>>>>> if (config.getBundleLocation() != null) {
>>>>>     config.setBundleLocation(null);
>>>>> }
>>>>>
>>>>> The setBundleLocation(null) is probably useless, but this leads to a
>>>>> CM_LOCATION_CHANGED event, which then sometimes ends up with the NPE.
>>>>>
>>>>> On Friday, October 25, 2013, David Jencks <[email protected]> wrote:
>>>>>> Hi Pierre,
>>>>>>
>>>>>> You are so good at writing useful tests!!
>>>>>>
>>>>>> I found a place to call setTargets(getProperties()) from inside
>>>>>> ComponentFactoryImpl that would have fewer side effects. Could you see if
>>>>>> this makes your actual applications work properly? I'm uploading a
>>>>>> snapshot.
>>>>>>
>>>>>> many thanks
>>>>>> david jencks
>>>>>>
>>>>>> On Oct 24, 2013, at 6:17 AM, Pierre De Rop <[email protected]> wrote:
>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Since this application is complex, I'm not able to provide logs because
>>>>>>> there are hundreds of components involved which are not mine, and for now,
>>>>>>> I'm not able to diagnose the problem.
>>>>>>>
>>>>>>> But I have created FELIX-4290, and attached to it an integration test
>>>>>>> which seems to reproduce the kind of problem I think I'm having in my
>>>>>>> application. I also attached the proposed patch.
>>>>>>>
>>>>>>> I did not have time to test the patch you suggested regarding the
>>>>>>> SingleComponentManager.reconfigure method, so let's continue to investigate
>>>>>>> using the jira issue and the test I attached to it.
>>>>>>>
>>>>>>> Thanks;
>>>>>>>
>>>>>>> /Pierre
>>>>>>>
>>>>>>> On Thu, Oct 24, 2013 at 12:27 AM, David Jencks <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Pierre,
>>>>>>>>
>>>>>>>> I believe you that this code path doesn't work :-)
>>>>>>>>
>>>>>>>> I think there should be a less invasive way to fix this. By any chance
>>>>>>>> can you get a debug-enabled log from when this problem occurs? It would
>>>>>>>> help confirm my suspicions of what might be missing.
>>>>>>>>
>>>>>>>> FWIW I suspect SingleComponentManager.reconfigure is missing a check for
>>>>>>>> m_factoryProperties here (line 561):
>>>>>>>>
>>>>>>>> // nothing to do if there is no configuration (see FELIX-714)
>>>>>>>> if ( configuration == null && m_configurationProperties == null )
>>>>>>>> {
>>>>>>>>     log( LogService.LOG_DEBUG, "No configuration provided (or deleted), nothing to do", null );
>>>>>>>>     return;
>>>>>>>> }
>>>>>>>>
>>>>>>>> Unless we can't figure anything out for sure I'd prefer to fix this before
>>>>>>>> the release.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> david jencks
>>>>>>>>
>>>>>>>> On Oct 23, 2013, at 3:09 PM, Pierre De Rop <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> (sorry to make all this noise while you are releasing ...)
>>>>>>>>>
>>>>>>>>> We are indeed using factory components; and today, I finally found and
>>>>>>>>> fixed a cycle, using the Apache Service Diagnostic tool; and I'm going
>>>>>>>>> further on but now I'm facing another problem which I did not have in the
>>>>>>>>> scr 1.6.2.
>>>>>>>>>
>>>>>>>>> So, I would like to discuss this new problem with you before you redo
>>>>>>>>> a release, in order to decide if this problem (if there is really one?)
>>>>>>>>> shall be addressed now or after the upcoming release?
>>>>>>>>>
>>>>>>>>> So, in our application, we are extensively using factory components
>>>>>>>>> (@Component(factory="XXX")).
>>>>>>>>> When we instantiate a factory component (using
>>>>>>>>> ComponentFactory.newInstance()), we pass to the newInstance() method some
>>>>>>>>> additional component properties which may also contain some target filters.
>>>>>>>>>
>>>>>>>>> This allows us to dynamically configure the filter of some References
>>>>>>>>> declared in the factory component.
>>>>>>>>> In the scr 1.6.2, this mechanism was working fine. But using trunk, this
>>>>>>>>> does not work all the time. Some target filters seem to be correctly
>>>>>>>>> configured, and some others are not (I'm not sure, actually, it's late ...).
>>>>>>>>>
>>>>>>>>> So, it looks like sometimes, some target filters are not updated before
>>>>>>>>> activating components? or factory components?
>>>>>>>>>
>>>>>>>>> I'm not sure but this might be related to the old FELIX-3726.
>>>>>>>>> Now, interestingly, I did the following patch and my application is now
>>>>>>>>> working fine: In the AbstractComponentManager class, I systematically
>>>>>>>>> update target filters, like this:
>>>>>>>>>
>>>>>>>>> +++ src/main/java/org/apache/felix/scr/impl/manager/AbstractComponentManager.java
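
For anyone reading this thread later: the difference between the "reset the location to null, then update" pattern and the multi-location "?" approach discussed above can be illustrated with a small self-contained sketch. This is only a simulation under stated assumptions. The Config interface and RecordingConfig class are hypothetical stand-ins that mirror three methods of org.osgi.service.cm.Configuration (getBundleLocation, setBundleLocation, update), and the recorded event names follow the behaviour described in the thread rather than any real Configuration Admin implementation. The location "file:some-bundle.jar" is made up.

```java
import java.util.ArrayList;
import java.util.Dictionary;
import java.util.Hashtable;
import java.util.List;

final class MultiLocationSketch {

    /** Hypothetical stand-in for the three Configuration methods used in the thread. */
    interface Config {
        String getBundleLocation();
        void setBundleLocation(String location);
        void update(Dictionary<String, Object> properties);
    }

    /** In-memory fake that records the events listeners would be expected to see. */
    static final class RecordingConfig implements Config {
        final List<String> events = new ArrayList<>();
        private String location;
        private Dictionary<String, Object> properties; // stays null until update() is called

        RecordingConfig(String initialLocation) {
            this.location = initialLocation;
        }

        @Override
        public String getBundleLocation() {
            return location;
        }

        @Override
        public void setBundleLocation(String newLocation) {
            location = newLocation;
            // Assumption from the thread: the location change is announced
            // even when no properties have been set yet.
            events.add("CM_LOCATION_CHANGED(properties=" + (properties == null ? "null" : "set") + ")");
        }

        @Override
        public void update(Dictionary<String, Object> newProperties) {
            properties = newProperties;
            events.add("CM_UPDATED");
        }
    }

    /** Pattern from the thread: clear a bound location, then push the properties. */
    static List<String> resetLocationThenUpdate() {
        RecordingConfig config = new RecordingConfig("file:some-bundle.jar"); // made-up bound location
        if (config.getBundleLocation() != null) {
            config.setBundleLocation(null);
        }
        config.update(sampleProperties());
        return config.events;
    }

    /** Alternative: the configuration is created with the "?" multi-location and only updated. */
    static List<String> multiLocationThenUpdate() {
        RecordingConfig config = new RecordingConfig("?");
        config.update(sampleProperties());
        return config.events;
    }

    private static Dictionary<String, Object> sampleProperties() {
        Dictionary<String, Object> props = new Hashtable<>();
        props.put("some.key", "some value"); // illustrative property only
        return props;
    }

    public static void main(String[] args) {
        System.out.println(resetLocationThenUpdate());  // [CM_LOCATION_CHANGED(properties=null), CM_UPDATED]
        System.out.println(multiLocationThenUpdate());  // [CM_UPDATED]
    }
}
```

In the first pattern a listener sees a location change while the properties are still null, which matches the window Pierre describes; in the second the only event is the update itself. With the real API the second pattern would presumably start from ConfigurationAdmin.getConfiguration(pid, "?"), with the caveat that the location argument only takes effect when the configuration is newly created.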
