Hi Pierre, 

Much better to find these problems before a release than just after!

I saw an OOM once recently but haven't been able to reproduce it.

I'm looking into the NPE.  I think I see the timing hole it is using but need 
to think about it some more.

many thanks!
david jencks

On Oct 27, 2013, at 2:58 AM, Pierre De Rop <[email protected]> wrote:

> Hi David,
> 
> Looking at our configurator component we are currently using (but we will fix 
> it in order to use the multi-location "?"), I see this:
> 
> void configure(String pid, Dictionary pidConf) {
>      Configuration config = getConfiguration(pid, null);
>      if (config.getBundleLocation() != null) {
>          config.setBundleLocation(null);
>      }
>      config.update(pidConf);
> }
> 
> So I believe that you are getting a null configuration because there is a 
> short window between the setBundleLocation(null) (at this point, the 
> configuration is null) and the config.update(pidConf) call ...
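
For reference, a minimal sketch of what the multi-location variant of that configure method might look like. The two interfaces are stand-ins that mirror the org.osgi.service.cm methods involved, so the snippet compiles without an OSGi jar; the in-memory admin, the "port" property and the old bundle location are made up for illustration. As I read the CA spec, the location passed to getConfiguration only applies when the configuration is created, so an existing configuration still has to be rebound once.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Dictionary;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class MultiLocationSketch {

    // Stand-ins mirroring the org.osgi.service.cm methods used below, so the
    // sketch compiles without an OSGi jar. In a bundle, use the real types.
    interface Configuration {
        String getBundleLocation();
        void setBundleLocation(String location);
        void update(Dictionary<String, ?> properties) throws IOException;
    }

    interface ConfigurationAdmin {
        Configuration getConfiguration(String pid, String location) throws IOException;
    }

    static final String MULTI_LOCATION = "?";

    static void configure(ConfigurationAdmin cm, String pid, Dictionary<String, ?> props)
            throws IOException {
        // Assumption: the location argument only matters for a new configuration.
        Configuration config = cm.getConfiguration(pid, MULTI_LOCATION);
        // An existing configuration keeps its old location, so rebind it once.
        if (!MULTI_LOCATION.equals(config.getBundleLocation())) {
            config.setBundleLocation(MULTI_LOCATION);
        }
        config.update(props);
    }

    // Toy in-memory admin (hypothetical), only used to exercise configure().
    static final class InMemoryAdmin implements ConfigurationAdmin {
        final Map<String, InMemoryConfiguration> configs = new HashMap<>();

        @Override
        public Configuration getConfiguration(String pid, String location) {
            return configs.computeIfAbsent(pid, p -> new InMemoryConfiguration(location));
        }
    }

    static final class InMemoryConfiguration implements Configuration {
        String location;
        Dictionary<String, ?> properties;

        InMemoryConfiguration(String location) { this.location = location; }
        @Override public String getBundleLocation() { return location; }
        @Override public void setBundleLocation(String location) { this.location = location; }
        @Override public void update(Dictionary<String, ?> properties) { this.properties = properties; }
    }

    // Runs configure() against the toy admin and returns the resulting location.
    static String demo() {
        InMemoryAdmin admin = new InMemoryAdmin();
        // Simulate a configuration that was previously bound to one bundle.
        admin.configs.put("sipagent", new InMemoryConfiguration("file:some-bundle.jar"));
        Dictionary<String, Object> props = new Hashtable<>();
        props.put("port", 5060); // made-up property
        try {
            configure(admin, "sipagent", props);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return admin.configs.get("sipagent").getBundleLocation();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that rebinding an already-bound configuration would presumably still fire one CM_LOCATION_CHANGED event; the point is only that the location never passes through null.
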
> 
> So, the good news is that I'm no longer getting any NPE with your latest 
> commits :-) and I think our application is now fully operational.
> 
> but ... (please don't start to abominate me  ) now, in order to do a final 
> check, I restarted the integration tests and there are still two problems:
> 
> 1) I'm sometimes getting some out of memory errors: this is probably caused 
> by the ComponentConcurrencyTest/Felix3680Test tests, which are currently 
> configured in DEBUG mode ?
> 
> 2) I ran the tests two times, and the second time, I got this exception with 
> the failing Felix3680_2Test:
> 
> test_concurrent_injection_with_bundleContext(org.apache.felix.scr.integration.Felix3680_2Test)
>   Time elapsed: 36.597 sec  <<< ERROR!
> java.lang.NullPointerException
>         at 
> org.apache.felix.scr.impl.manager.DependencyManager.invokeUnbindMethod(DependencyManager.java:1710)
>         at 
> org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUnbindMethod(SingleComponentManager.java:387)
>         at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:355)
>         at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.removedService(DependencyManager.java:290)
>         at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1503)
>         at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerRemoved(ServiceTracker.java:1398)
>         at 
> org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.untrack(ServiceTracker.java:1258)
>         at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1437)
>         at 
> org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:932)
>         at 
> org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
>         at 
> org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
>         at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4260)
>         at org.apache.felix.framework.Felix.access$000(Felix.java:74)
>         at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:390)
>         at 
> org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:148)
>         at 
> org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:127)
>         at 
> org.apache.felix.scr.integration.components.felix3680_2.Main$RegistrationHelper$2.run(Main.java:136)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> 
> Are you also getting this exception ?
> 
> thanks
> 
> /Pierre
> 
> 
> 
> 
> 
> 
> 
> On Sat, Oct 26, 2013 at 6:34 PM, David Jencks <[email protected]> wrote:
> Hi Pierre,
> 
> Looking at the CA spec it looks like CA is supposed to send out 
> CM_LOCATION_CHANGED events even before any properties are set when 
> setBundleLocation is called.  I added some code to ignore these events.  Note 
> that DS is "reserving" the configurations for (one of) the component(s) that 
> will be consuming them by calling getConfiguration(pid).
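
To illustrate what ignoring those events can look like in a listener, here is a rough sketch. It is not the SCR code: the handle method and its return strings are hypothetical, and the three constants are copied by value from org.osgi.service.cm.ConfigurationEvent as I remember them (real code should reference the constants directly).

```java
import java.util.Dictionary;
import java.util.Hashtable;

class LocationChangedFilterSketch {

    // Values as I recall them from org.osgi.service.cm.ConfigurationEvent;
    // in a bundle, reference the real constants instead of copying them.
    static final int CM_UPDATED = 1;
    static final int CM_DELETED = 2;
    static final int CM_LOCATION_CHANGED = 3;

    // Hypothetical handler: decides what to do for one configuration event.
    // props may be null when no properties have been set on the configuration.
    static String handle(int eventType, Dictionary<String, ?> props) {
        switch (eventType) {
            case CM_LOCATION_CHANGED:
                // Location changes can arrive before any properties exist,
                // so props is never dereferenced on this path.
                return "ignored";
            case CM_DELETED:
                return "deleted";
            case CM_UPDATED:
                // Guard against a missing dictionary instead of assuming one.
                return props == null ? "ignored" : "updated:" + props.size();
            default:
                return "ignored";
        }
    }

    public static void main(String[] args) {
        Dictionary<String, Object> props = new Hashtable<>();
        props.put("key", "value");
        System.out.println(handle(CM_LOCATION_CHANGED, null));
        System.out.println(handle(CM_UPDATED, props));
    }
}
```
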
> 
> I do wonder how the location gets set to something non-null on your 
> configurations before the properties are set.
> 
> Waiting for the next bug :-)
> 
> thanks
> david jencks
> 
> On Oct 26, 2013, at 3:00 AM, Pierre De Rop <[email protected]> wrote:
> 
> > Hello David,
> >
> > The code we are using to configure our components is old, and at the time we 
> > wrote it, configadmin did not support multi-location. But I do agree, we 
> > can now use the "?" multi-location.
> >
> > Now, I'm sorry but I'm still seeing another NPE (sometimes, not always):
> >
> > 2013-10-26 11:45:44,209 CM Event Dispatcher (Fire ConfigurationEvent: 
> > pid=sipagent) ERROR osgi  - [43] Unexpected problem delivering 
> > configuration event to [org.osgi.service.cm.ConfigurationListener, id=102, 
> > bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
> >
> > java.lang.NullPointerException
> >         at 
> > org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
> >         at 
> > org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
> >         at 
> > org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:344)
> >         at 
> > org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
> >         at 
> > org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
> >         at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
> >         at java.lang.Thread.run(Thread.java:722)
> >
> >
> > I'm not sure, but it seems that ConfigAdmin is providing a null dictionary 
> > when delivering a CM_LOCATION_CHANGED event? If so, is this normal 
> > behavior?
> >
> > This is strange; perhaps I shall start a new integration test ?
> >
> > /Pierre
> >
> >
> >
> >
> > On Sat, Oct 26, 2013 at 9:54 AM, David Jencks <[email protected]> 
> > wrote:
> > Hi Pierre,
> >
> > This pointed out a logic error I introduced for FELIX-3651.  I opened 
> > https://issues.apache.org/jira/browse/FELIX-4293 and fixed the error I 
> > found which I think explains the NPE.  Could you check this?
> >
> > Could I ask what you are trying to do by setting the bundleLocation to 
> > null?  If you want to allow any bundle to receive the configuration you 
> > could use multi-location support and set the location to "?"  With the code 
> > you have now, if the configuration is already in use by a DS component, the 
> > location changed event will result in the bundle location being reset back 
> > to what it was.
> >
> > thanks!
> > david jencks
> > On Oct 25, 2013, at 8:32 AM, Pierre De Rop <[email protected]> wrote:
> >
> > > Hi David,
> > >
> > > Thanks, the fix solves the problem :-)
> > >
> > > but ... there's now a new, different problem: I'm now sometimes getting 
> > > this
> > > NPE after SCR receives a CM_LOCATION_CHANGED event:
> > >
> > > 2013-10-25 16:11:44,674 CM Event Dispatcher (Fire ConfigurationEvent:
> > > pid=sipagent) ERROR osgi  - [43] Unexpected problem delivering
> > > configuration event to [org.osgi.service.cm.ConfigurationListener, id=102,
> > > bundle=341/reference:file:/home/nxuser/pp/bundles/custo/org.apache.felix.scr.jar]
> > >
> > > java.lang.NullPointerException
> > >       at
> > > org.apache.felix.scr.impl.manager.ComponentFactoryImpl.getProperties(ComponentFactoryImpl.java:226)
> > >       at
> > > org.apache.felix.scr.impl.manager.ComponentFactoryImpl.configurationUpdated(ComponentFactoryImpl.java:396)
> > >       at
> > > org.apache.felix.scr.impl.config.ConfigurationSupport.configurationEvent(ConfigurationSupport.java:390)
> > >       at
> > > org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2032)
> > >       at
> > > org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2002)
> > >       at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:103)
> > >       at java.lang.Thread.run(Thread.java:722)
> > >
> > > Perhaps a new jira issue shall be opened ?
> > >
> > > I think we are getting a CM_LOCATION_CHANGED event because in our
> > > application, we populate configuration admin by doing something like this:
> > >
> > > Configuration config = cm.getConfiguration(pid, null);
> > > if (config.getBundleLocation() != null) {
> > >     config.setBundleLocation(null);
> > > }
> > >
> > > The setBundleLocation(null) is probably useless, but this leads to a
> > > CM_LOCATION_CHANGED event, which then sometimes ends up with the NPE.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Friday, October 25, 2013, David Jencks <[email protected]> wrote:
> > >> Hi Pierre,
> > >>
> > >> You are so good at writing useful tests!!
> > >>
> > >> I found a place to call setTargets(getProperties()) from inside
> > > ComponentFactoryImpl that would have fewer side effects.  Could you see if
> > > this makes your actual applications work properly?  I'm uploading a
> > > snapshot.
> > >>
> > >> many thanks
> > >> david jencks
> > >>
> > >> On Oct 24, 2013, at 6:17 AM, Pierre De Rop <[email protected]> 
> > >> wrote:
> > >>
> > >>> Hi David,
> > >>>
> > >>> Since this application is complex, I'm not able to provide logs because
> > >>> there are hundreds of components involved which are not mine, and for
> > > now,
> > >>> I'm not able to diagnose the problem.
> > >>>
> > >>> But I have created FELIX-4290, and attached to it an integration test 
> > >>> which
> > >>> seems to reproduce the kind of problem I think I'm having in my
> > >>> application. I also attached the proposed patch.
> > >>>
> > >>> I did not have time to test the patch you suggested regarding the
> > >>> SingleComponentManager.reconfigure method, so let's continue to
> > > investigate
> > >>> using the jira issue and the test I attached to it.
> > >>>
> > >>> Thanks;
> > >>>
> > >>> /Pierre
> > >>>
> > >>>
> > >>> On Thu, Oct 24, 2013 at 12:27 AM, David Jencks <[email protected]
> > >> wrote:
> > >>>
> > >>>> Hi Pierre,
> > >>>>
> > >>>> I believe you that this code path doesn't work :-)
> > >>>>
> > >>>> I think there should be a less invasive way to fix this.  By any chance
> > >>>> can you get a debug-enabled log from when this problem occurs?  It 
> > >>>> would
> > >>>> help confirm my suspicions of what might be missing.
> > >>>>
> > >>>> FWIW I suspect SingleComponentManager.reconfigure is missing a check 
> > >>>> for
> > >>>> m_factoryProperties here (line 561):
> > >>>>
> > >>>>           // nothing to do if there is no configuration (see FELIX-714)
> > >>>>           if ( configuration == null && m_configurationProperties ==
> > >>>> null )
> > >>>>           {
> > >>>>               log( LogService.LOG_DEBUG, "No configuration provided (or
> > >>>> deleted), nothing to do", null );
> > >>>>               return;
> > >>>>           }
> > >>>>
> > >>>> Unless we can't figure anything out for sure I'd prefer to fix this
> > > before
> > >>>> the release.
> > >>>>
> > >>>> thanks
> > >>>> david jencks
> > >>>>
> > >>>> On Oct 23, 2013, at 3:09 PM, Pierre De Rop <[email protected]>
> > > wrote:
> > >>>>
> > >>>>> Hi David,
> > >>>>>
> > >>>>> (sorry to do all this noise while you are releasing ...)
> > >>>>>
> > >>>>> We are indeed using factory components; and today, I finally found and
> > >>>>> fixed a cycle, using the Apache Service Diagnostic tool; and I'm going
> > >>>>> further on but now I'm facing another problem which I did not have in
> > > the
> > >>>>> scr 1.6.2.
> > >>>>>
> > >>>>> So, I would like to discuss this new problem with you before you
> > >>>> redo
> > >>>>> a release, in order to decide if this problem (if there is really one
> > > ?)
> > >>>>> shall be addressed now or after the upcoming release ?
> > >>>>>
> > >>>>> So, in our application, we are extensively using factory components
> > >>>>> (@Component(factory="XXX")).
> > >>>>> When we instantiate a factory component (using
> > >>>>> ComponentFactory.newInstance()), We pass to the newInstance() method
> > > some
> > >>>>> additional component properties which may also contain some target
> > >>>> filters.
> > >>>>>
> > >>>>> This allows us to dynamically configure the filter of some References
> > >>>> declared
> > >>>>> in the factory component.
> > >>>>> In scr 1.6.2, this mechanism was working fine. But using trunk,
> > > this
> > >>>>> does not work all the time. Some target filters seem to be correctly
> > >>>>> configured, and some others are not (I'm not sure, actually, it's late
> > >>>> ...).
> > >>>>>
> > >>>>> So, it looks like sometimes, some target filters are not updated 
> > >>>>> before
> > >>>>> activating components ? or factory components ?
> > >>>>>
> > >>>>> I'm not sure but this might be related to the old FELIX-3726.
> > >>>>> Now, interestingly, I did the following patch and my application is 
> > >>>>> now
> > >>>>> working fine: In the  AbstractComponentManager class, I systematically
> > >>>>> update target filters, like this:
> > >>>>>
> > >>>>> +++
> > >>>>>
> > >>>>
> > > src/main/java/org/apache/felix/scr/impl/manager/AbstractComponentManager.java
> > >>>>>
> >
> >
> 
> 
