Hi David,
1) All integration tests are now passing.
Regarding the OOM, I believe it is likely caused
by the following load tests, which are running in DEBUG mode:
src/test/java/org/apache/felix/scr/integration/ComponentConcurrencyTest.java
src/test/java/org/apache/felix/scr/integration/Felix3680Test.java
src/test/java/org/apache/felix/scr/integration/Felix3680_2Test.java
So I just added this to each file (in a static initializer):
DS_LOGLEVEL = "warn";
And I no longer get the OOM, with this nice test result message:
Tests run: 103, Failures: 0, Errors: 0, Skipped: 0
[INFO] BUILD SUCCESS
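
For readers who have not seen these test classes, the change has roughly the following shape. This is a self-contained sketch, not the real code: `ComponentTestBaseSketch`, `Felix3680TestSketch`, the `"debug"` default and `effectiveLogLevel()` are stand-ins invented for illustration; only the `DS_LOGLEVEL = "warn";` assignment comes from the actual change.

```java
/** Hypothetical stand-in for the shared base class of the integration tests. */
class ComponentTestBaseSketch {
    // Log level passed to the DS runtime under test; the "debug" default is an assumption.
    protected static String DS_LOGLEVEL = "debug";
}

/** Hypothetical stand-in for one of the load tests listed above. */
class Felix3680TestSketch extends ComponentTestBaseSketch {
    static {
        // Lower the log level so the load test does not accumulate DEBUG output.
        DS_LOGLEVEL = "warn";
    }

    // Calling a method declared on this class triggers its static initializer.
    static String effectiveLogLevel() {
        return DS_LOGLEVEL;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLogLevel());
    }
}
```

Note that reading the inherited field through the subclass name alone would not run the subclass initializer in plain Java, which is why the sketch goes through a method declared on the subclass.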
2) Now, I ran non-regression tests again with our applications, and I'm
seeing a new message, logged at ERROR level (but it does not seem to prevent
the application from initializing correctly: all components are satisfied in
the end).
(If you believe that it's now time to release, then you can go ahead, and
we'll investigate this new issue after the release ... I'll let you
decide.)
->
2013-10-29 11:10:02,487 CM Event Dispatcher (Fire ConfigurationEvent:
pid=com.alcatel.as.ioh.impl.server.ServerFactoryImpl) ERROR osgi -
[com.alcatel.as.ioh.impl.server.ServerFactoryImpl(91)] DependencyManager :
invokeBindMethod : timeout on open latch newtcpprocessor
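
To make the message easier to read: it suggests that a bind call waited on a latch and gave up when a timeout expired. The sketch below shows that general pattern with `java.util.concurrent.CountDownLatch`. It is an assumption about what the message means, not code taken from SCR, and `LatchTimeoutSketch` and `awaitOpen` are made-up names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class LatchTimeoutSketch {
    /** Waits for the latch to open; returns false if the timeout elapsed first. */
    static boolean awaitOpen(CountDownLatch latch, long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status and report the wait as unsuccessful.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Nobody counts this latch down, so the wait runs into the timeout.
        CountDownLatch neverOpened = new CountDownLatch(1);
        System.out.println("opened in time: " + awaitOpen(neverOpened, 50));

        // This latch is already open, so the wait returns immediately.
        CountDownLatch opened = new CountDownLatch(1);
        opened.countDown();
        System.out.println("opened in time: " + awaitOpen(opened, 50));
    }
}
```

If this reading is right, the message would mean that whatever was expected to open the latch did not do so in time.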
Kind regards,
/Pierre
On Tue, Oct 29, 2013 at 12:13 AM, David Jencks <[email protected]> wrote:
> Pierre,
>
> I opened https://issues.apache.org/jira/browse/FELIX-4297 and fixed the
> problems I found (for 2). I don't see the OOM often enough to have any
> confidence that anything I do would actually fix it, so I'm inclined to do
> nothing. Is that OK with you?
>
> Unless you can find some more problems :-) I'm planning to try another
> release when the config admin 1.8 gets to maven central. I'm going to
> update the pom to normally run against the CA 1.8 version supporting R5 and
> change the profile so running against R4 requires specifying profiles
> explicitly.
>
> thanks again!
> david jencks
>
> On Oct 28, 2013, at 12:24 AM, David Jencks <[email protected]> wrote:
>
> > Hi Pierre,
> >
> > Much better to find these problems before a release than just after!
> >
> > I saw an OOM once recently but haven't been able to reproduce it.
> >
> > I'm looking into the NPE. I think I see the timing hole it is using but
> need to think about it some more.
> >
> > many thanks!
> > david jencks
> >
> > On Oct 27, 2013, at 2:58 AM, Pierre De Rop <[email protected]>
> wrote:
> >
> >> Hi David,
> >>
> >> Looking at the configurator component we currently use (we will fix it
> >> to use the multi-location "?"), I see this:
> >>
> >> void configure(String pid, Dictionary pidConf) {
> >>     Configuration config = getConfiguration(pid, null);
> >>     if (config.getBundleLocation() != null) {
> >>         config.setBundleLocation(null);
> >>     }
> >>     config.update(pidConf);
> >> }
> >>
> >> So I believe that you are getting a null configuration because there is
> >> a short window between the setBundleLocation(null) call (at which point the
> >> configuration properties are still null) and the config.update(pidConf) call ...
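
The window described above can be illustrated with a toy model. Everything here is an assumption made for illustration: `ToyConfiguration` and `ConfigureSketch` are invented in-memory stand-ins, not the OSGi Configuration Admin API, and the event strings are made up. The sketch only shows the ordering: a location change on a configuration that has no properties yet is observed before the update.

```java
import java.util.ArrayList;
import java.util.Dictionary;
import java.util.Hashtable;
import java.util.List;

/** Toy in-memory model of one configuration; NOT the OSGi Configuration Admin API. */
final class ToyConfiguration {
    private String location;                       // bundle location, may be null
    private Dictionary<String, Object> properties; // stays null until update() is called
    final List<String> events = new ArrayList<>(); // what a listener would observe

    ToyConfiguration(String initialLocation) {
        this.location = initialLocation;
    }

    String getBundleLocation() {
        return location;
    }

    void setBundleLocation(String newLocation) {
        location = newLocation;
        // A listener notified here sees the configuration before any properties exist.
        events.add("LOCATION_CHANGED properties=" + (properties == null ? "null" : "set"));
    }

    void update(Dictionary<String, Object> newProperties) {
        properties = newProperties;
        events.add("UPDATED properties=set");
    }
}

final class ConfigureSketch {
    /** Mirrors the configure() method quoted above. */
    static void configureWithReset(ToyConfiguration config, Dictionary<String, Object> conf) {
        if (config.getBundleLocation() != null) {
            config.setBundleLocation(null);
        }
        config.update(conf);
    }

    public static void main(String[] args) {
        Dictionary<String, Object> conf = new Hashtable<>();
        conf.put("port", 5060);

        // Location already bound before any properties were set.
        ToyConfiguration bound = new ToyConfiguration("some-bundle-location");
        configureWithReset(bound, conf);
        System.out.println(bound.events);

        // Location "?" from the start and never reset: only the update is observed.
        ToyConfiguration multi = new ToyConfiguration("?");
        multi.update(conf);
        System.out.println(multi.events);
    }
}
```

In the first case the listener sees a location event with null properties before the update; in the second case there is no location event at all.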
> >>
> >> So, the good news is that I no longer get any NPE with your latest
> >> commits :-) and I think our application is now fully operational.
> >>
> >> but ... (please don't start to hate me) now, in order to do a final
> >> check, I restarted the integration tests and there are still two
> >> problems:
> >>
> >> 1) I'm sometimes getting some out of memory errors: this is probably
> caused by the ComponentConcurrencyTest/Felix3680Test tests, which are
> currently configured in DEBUG mode ?
> >>
> >> 2) I ran the tests twice, and the second time Felix3680_2Test failed
> >> with this exception:
> >>
> >>
> >> test_concurrent_injection_with_bundleContext(org.apache.felix.scr.integration.Felix3680_2Test) Time elapsed: 36.597 sec <<< ERROR!
> >> java.lang.NullPointerException
> >>     at org.apache.felix.scr.impl.manager.DependencyManager.invokeUnbindMethod(DependencyManager.java:1710)
> >>     at org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUnbindMethod(SingleComponentManager.java:387)
> >>     at org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:355)
> >>     at org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:290)
> >>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1503)
> >>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1398)
> >>     at org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.untrack(ServiceTracker.java:1258)
> >>     at org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1437)
> >>     at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:932)
> >>     at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
> >>     at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
> >>     at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4260)
> >>     at org.apache.felix.framework.Felix.access$000(Felix.java:74)
> >>     at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:390)
> >>     at org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:148)
> >>     at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:127)
> >>     at org.apache.felix.scr.integration.components.felix3680_2.Main$RegistrationHelper$2.run(Main.java:136)
> >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>     at java.lang.Thread.run(Thread.java:722)
> >>
> >> Are you also getting this exception ?
> >>
> >> thanks
> >>
> >> /Pierre
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Sat, Oct 26, 2013 at 6:34 PM, David Jencks <[email protected]>
> wrote:
> >> Hi Pierre,
> >>
> >> Looking at the CA spec it looks like CA is supposed to send out
> CM_LOCATION_CHANGED events even before any properties are set when
> setBundleLocation is called. I added some code to ignore these events.
> Note that DS is "reserving" the configurations for (one of) the
> component(s) that will be consuming them by calling getConfiguration(pid).
> >>
> >> I do wonder how the location gets set to something non-null on your
> >> configurations before the properties are set.
> >>
> >> Waiting for the next bug :-)
> >>
> >> thanks
> >> david jencks
> >>
> >> On Oct 26, 2013, at 3:00 AM, Pierre De Rop <[email protected]>
> wrote:
> >>
> >>> Hello David,
> >>>
> >>> The code we are using to configure our components is old, and at the
> >>> time we wrote it, configadmin did not support multi-location. But I do
> >>> agree, we can now use the "?" multi-location.
> >>>
> >>> Now, I'm sorry but I'm still seeing another NPE (sometimes, not
> always):
> >>>
> >>> 2013-10-26 11:45:44,209 CM Event Dispatcher (Fire ConfigurationEvent:
> >>> pid=sipagent) ERROR osgi - [43] Unexpected problem delivering
> >>> configuration event to [org.osgi.service.cm.ConfigurationListener, id=102,
> >>> bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
> >>>
> >>> java.lang.NullPointerException
> >>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
> >>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
> >>>     at org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:344)
> >>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
> >>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
> >>>     at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
> >>>     at java.lang.Thread.run(Thread.java:722)
> >>>
> >>>
> >>> I'm not sure, but it seems that ConfigAdmin is providing a null
> >>> dictionary when delivering a CM_LOCATION_CHANGED event? If so, is this
> >>> normal behavior?
> >>>
> >>> This is strange; perhaps I should write a new integration test?
> >>>
> >>> /Pierre
> >>>
> >>>
> >>>
> >>>
> >>> On Sat, Oct 26, 2013 at 9:54 AM, David Jencks <[email protected]>
> wrote:
> >>> Hi Pierre,
> >>>
> >>> This pointed out a logic error I introduced for Felix 3651. I opened
> https://issues.apache.org/jira/browse/FELIX-4293 and fixed the error I
> found which I think explains the NPE. Could you check this?
> >>>
> >>> Could I ask what you are trying to do by setting the bundleLocation to
> >>> null? If you want to allow any bundle to receive the configuration you
> >>> could use multi-location support and set the location to "?". With the code
> >>> you have now, if the configuration is already in use by a DS component, the
> >>> location changed event will result in the bundle location being reset back
> >>> to what it was.
> >>>
> >>> thanks!
> >>> david jencks
> >>> On Oct 25, 2013, at 8:32 AM, Pierre De Rop <[email protected]>
> wrote:
> >>>
> >>>> Hi David,
> >>>>
> >>>> Thanks, the fix solves the problem :-)
> >>>>
> >>>> but ... there's now a new, different problem: I'm now sometimes
> >>>> getting this NPE after SCR receives a CM_LOCATION_CHANGED event:
> >>>>
> >>>> 2013-10-25 16:11:44,674 CM Event Dispatcher (Fire ConfigurationEvent:
> >>>> pid=sipagent) ERROR osgi - [43] Unexpected problem delivering
> >>>> configuration event to [org.osgi.service.cm.ConfigurationListener, id=102,
> >>>> bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
> >>>>
> >>>> java.lang.NullPointerException
> >>>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
> >>>>     at org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
> >>>>     at org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:390)
> >>>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
> >>>>     at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
> >>>>     at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
> >>>>     at java.lang.Thread.run(Thread.java:722)
> >>>>
> >>>> Perhaps a new JIRA issue should be opened?
> >>>>
> >>>> I think we are getting a CM_LOCATION_CHANGED event because in our
> >>>> application, we populate configuration admin by doing something like this:
> >>>>
> >>>> Configuration config = cm.getConfiguration(pid, null);
> >>>> if (config.getBundleLocation() != null) {
> >>>>     config.setBundleLocation(null);
> >>>> }
> >>>>
> >>>> The setBundleLocation(null) is probably useless, but this leads to a
> >>>> CM_LOCATION_CHANGED event, which then sometimes ends up with the NPE.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Friday, October 25, 2013, David Jencks <[email protected]>
> wrote:
> >>>>> Hi Pierre,
> >>>>>
> >>>>> You are so good at writing useful tests!!
> >>>>>
> >>>>> I found a place to call setTargets(getProperties()) from inside
> >>>> ComponentFactoryImpl that would have fewer side effects. Could you
> see if
> >>>> this makes your actual applications work properly? I'm uploading a
> >>>> snapshot.
> >>>>>
> >>>>> many thanks
> >>>>> david jencks
> >>>>>
> >>>>> On Oct 24, 2013, at 6:17 AM, Pierre De Rop <[email protected]>
> wrote:
> >>>>>
> >>>>>> Hi David,
> >>>>>>
> >>>>>> Since this application is complex, I'm not able to provide logs
> because
> >>>>>> there are hundreds of components involved which are not mine, and
> for
> >>>> now,
> >>>>>> I'm not able to diagnose the problem.
> >>>>>>
> >>>>>> But I have created FELIX-4290, and attached to it an integration test
> >>>>>> which seems to reproduce the kind of problem I think I'm having in my
> >>>>>> application. I also attached the proposed patch.
> >>>>>>
> >>>>>> I did not have time to test the patch you suggested regarding the
> >>>>>> SingleComponentManager.reconfigure method, so let's continue to
> >>>> investigate
> >>>>>> using the jira issue and the test I attached to it.
> >>>>>>
> >>>>>> Thanks;
> >>>>>>
> >>>>>> /Pierre
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Oct 24, 2013 at 12:27 AM, David Jencks <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Pierre,
> >>>>>>>
> >>>>>>> I believe you that this code path doesn't work :-)
> >>>>>>>
> >>>>>>> I think there should be a less invasive way to fix this. By any
> chance
> >>>>>>> can you get a debug-enabled log from when this problem occurs? It
> would
> >>>>>>> help confirm my suspicions of what might be missing.
> >>>>>>>
> >>>>>>> FWIW I suspect SingleComponentManager.reconfigure is missing a
> check for
> >>>>>>> m_factoryProperties here (line 561):
> >>>>>>>
> >>>>>>>     // nothing to do if there is no configuration (see FELIX-714)
> >>>>>>>     if ( configuration == null && m_configurationProperties == null )
> >>>>>>>     {
> >>>>>>>         log( LogService.LOG_DEBUG, "No configuration provided (or deleted), nothing to do", null );
> >>>>>>>         return;
> >>>>>>>     }
> >>>>>>>
> >>>>>>> Unless we can't figure anything out for sure I'd prefer to fix this
> >>>> before
> >>>>>>> the release.
> >>>>>>>
> >>>>>>> thanks
> >>>>>>> david jencks
> >>>>>>>
> >>>>>>> On Oct 23, 2013, at 3:09 PM, Pierre De Rop <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi David,
> >>>>>>>>
> >>>>>>>> (sorry to make all this noise while you are releasing ...)
> >>>>>>>>
> >>>>>>>> We are indeed using factory components; and today, I finally found and
> >>>>>>>> fixed a cycle, using the Apache Service Diagnostic tool; I'm making
> >>>>>>>> further progress, but now I'm facing another problem which I did not have
> >>>>>>>> in scr 1.6.2.
> >>>>>>>>
> >>>>>>>> So, I would like to discuss this new problem with you before you redo
> >>>>>>>> a release, in order to decide whether this problem (if there really is
> >>>>>>>> one) should be addressed now or after the upcoming release.
> >>>>>>>>
> >>>>>>>> So, in our application, we are extensively using factory components
> >>>>>>>> (@Component(factory="XXX")).
> >>>>>>>> When we instantiate a factory component (using
> >>>>>>>> ComponentFactory.newInstance()), we pass to the newInstance() method some
> >>>>>>>> additional component properties which may also contain some target
> >>>>>>>> filters.
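
As a sketch of what such properties can look like: in Declarative Services, the target filter of a reference is carried by a component property named after the reference with a `.target` suffix. The reference name `processor`, the filter `(type=tcp)` and the helper `instanceProperties` below are hypothetical and chosen only for illustration; the resulting dictionary is the kind of argument one would hand to `ComponentFactory.newInstance()`.

```java
import java.util.Dictionary;
import java.util.Hashtable;

final class TargetFilterPropsSketch {
    /** Builds instance properties carrying a target filter for one reference. */
    static Dictionary<String, Object> instanceProperties(String referenceName, String ldapFilter) {
        Dictionary<String, Object> props = new Hashtable<>();
        // "<reference name>.target" is the property that selects the services to bind.
        props.put(referenceName + ".target", ldapFilter);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical reference name and LDAP filter.
        Dictionary<String, Object> props = instanceProperties("processor", "(type=tcp)");
        System.out.println(props);
    }
}
```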
> >>>>>>>>
> >>>>>>>> This allows us to dynamically configure the filter of some References
> >>>>>>>> declared in the factory component.
> >>>>>>>> In scr 1.6.2, this mechanism was working fine. But using trunk, this
> >>>>>>>> does not work all the time. Some target filters seem to be correctly
> >>>>>>>> configured, and some others are not (I'm not sure, actually, it's late
> >>>>>>>> ...).
> >>>>>>>>
> >>>>>>>> So, it looks like sometimes, some target filters are not updated before
> >>>>>>>> activating components ? or factory components ?
> >>>>>>>>
> >>>>>>>> I'm not sure but this might be related to the old FELIX-3726.
> >>>>>>>> Now, interestingly, I did the following patch and my application is now
> >>>>>>>> working fine: In the AbstractComponentManager class, I systematically
> >>>>>>>> update target filters, like this:
> >>>>>>>>
> >>>>>>>> +++ src/main/java/org/apache/felix/scr/impl/manager/AbstractComponentManager.java
> >>>>>>>>
> >>>
> >>>
> >>
> >>
> >
>
>