Could we get a complete thread dump and open a JIRA issue on this so we can try to figure out who is responsible?

It is not clear to me who is holding which locks from this info.

-> richard

On 9/11/09 10:09, Guillaume Nodet wrote:
I think the problem is that the Felix.refreshPackages holds the global
lock and try to stop the bundle.  This thread is blocked on
AbstractComponentManager.disposeInternal which is a synchronized
method, because the other thread (which has the lock on
AbstractComponentManager) is waiting for the global lock.

This looks like a synchronization / lock / reentrance issue in either
Felix Framework or SCR.  I suspect it can be fixed in the Framework
somehow, as this problem happens with both SCR and Spring-DM and not
on Equinox.

On Fri, Sep 11, 2009 at 15:15, Jeremias Maerki<[email protected]>  wrote:
I tried to scale down the environment in which to reproduce the problem.
I'm now running a bare Karaf (SVN Trunk, default settings) with just SCR
added and a simple bundle publishing a "Greeter" service with an SCR component
factory.

In about 50% of the cases I can force the "cannot be started, since it
is either starting or stopping" error by recompiling SCR Trunk and
copying the JAR into the Karaf deploy directory. 50%. That means it
smells like a multi-threading problem. BTW, the error gets repeated
indefinitely in this case. Karaf cannot be shutdown gracefully anymore. I
ran JConsole and found this when the error occurs:

Name: FelixPackageAdmin
State: BLOCKED on org.apache.felix.scr.impl.manager.componentfactoryi...@c41424 
owned by: SCR Component Actor
Total blocked: 9  Total waited: 4

Stack trace:
org.apache.felix.scr.impl.manager.AbstractComponentManager.disposeInternal(AbstractComponentManager.java:646)
org.apache.felix.scr.impl.manager.AbstractComponentManager.dispose(AbstractComponentManager.java:548)
org.apache.felix.scr.impl.manager.ComponentFactoryImpl.disposeComponents(ComponentFactoryImpl.java:346)
org.apache.felix.scr.impl.BundleComponentActivator.dispose(BundleComponentActivator.java:315)
org.apache.felix.scr.impl.Activator.disposeComponents(Activator.java:285)
org.apache.felix.scr.impl.Activator.bundleChanged(Activator.java:198)
org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:800)
org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:728)
org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:610)
org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:3592)
org.apache.felix.framework.Felix.stopBundle(Felix.java:2063)
org.apache.felix.framework.Felix$RefreshHelper.stop(Felix.java:4271)
org.apache.felix.framework.Felix.refreshPackages(Felix.java:3410)
org.apache.felix.framework.PackageAdminImpl.run(PackageAdminImpl.java:329)
java.lang.Thread.run(Thread.java:619)

and...

Name: SCR Component Actor
State: WAITING on [Ljava.lang.Object;@19de9c9
Total blocked: 2  Total waited: 2

Stack trace:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:4573)
org.apache.felix.framework.Felix.access$400(Felix.java:40)
org.apache.felix.framework.Felix$FelixResolver.resolveDynamicImport(Felix.java:3912)
org.apache.felix.framework.ModuleImpl.searchDynamicImports(ModuleImpl.java:1367)
org.apache.felix.framework.ModuleImpl.findClassOrResourceByDelegation(ModuleImpl.java:677)
org.apache.felix.framework.ModuleImpl.access$100(ModuleImpl.java:60)
org.apache.felix.framework.ModuleImpl$ModuleClassLoader.loadClass(ModuleImpl.java:1629)
java.lang.ClassLoader.loadClass(ClassLoader.java:252)
java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
   - locked org.apache.felix.framework.moduleimpl$moduleclassloa...@2098ff
org.apache.felix.scr.impl.BundleComponentActivator.log(BundleComponentActivator.java:548)
org.apache.felix.scr.impl.manager.AbstractComponentManager.log(AbstractComponentManager.java:787)
org.apache.felix.scr.impl.manager.AbstractComponentManager$Enabled.activateInternal(AbstractComponentManager.java:230)
org.apache.felix.scr.impl.manager.AbstractComponentManager.activateInternal(AbstractComponentManager.java:636)
   - locked org.apache.felix.scr.impl.manager.componentfactoryi...@c41424
org.apache.felix.scr.impl.manager.AbstractComponentManager$1.doRun(AbstractComponentManager.java:501)
org.apache.felix.scr.impl.ComponentActivatorTask.run(ComponentActivatorTask.java:68)
org.apache.felix.scr.impl.ComponentActorThread.run(ComponentActorThread.java:97)

Then I removed SCR 1.0.9.SNAPSHOT and added SCR 1.0.8. Then I created a
copy of that JAR and added a random text file to the JAR so I have a JAR
with a different checksum. If I switch between the two, I get a single
"cannot be started..." every now and then but the system recovers.

I then went back to the latest SCR Trunk and instead switched the
Framework to 2.0.0 and then to 1.6.1. The problem is always reproducible
(after at most 3 attempts).

Switching to Equinox, the problem disappears.

I hope that helps to reproduce the problem on your side.

On 09.09.2009 15:47:26 Jeremias Maerki wrote:
Good idea! However, using Equinox, I'm seeing something else (not always
but most cases):

15:24:55,765|ERROR| derThread-11 | WaiterApplicationContextExecutor | 
WaiterApplicationContextExecutor  432 | Unable to create application context for
[ch.jm.xxx.job.submission.http], unsatisfied dependencies: none
java.lang.IllegalStateException: BundleContext is no longer valid
         at 
org.eclipse.osgi.framework.internal.core.BundleContextImpl.checkValid(BundleContextImpl.java:1000)
         at 
org.eclipse.osgi.framework.internal.core.BundleContextImpl.getServiceReferences(BundleContextImpl.java:571)
         at 
org.springframework.osgi.util.OsgiServiceReferenceUtils.getServiceReferences(OsgiServiceReferenceUtils.java:159)
         at 
org.springframework.osgi.util.OsgiServiceReferenceUtils.getServiceReferences(OsgiServiceReferenceUtils.java:195)
         at 
org.springframework.osgi.util.OsgiServiceReferenceUtils.isServicePresent(OsgiServiceReferenceUtils.java:327)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.MandatoryServiceDependency.isServicePresent(MandatoryServiceDependency.java:82)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager.doFindDependencies(DependencyServiceManager.java:287)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager.access$700(DependencyServiceManager.java:40)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager$1.run(DependencyServiceManager.java:213)
         at 
org.springframework.osgi.extender.internal.util.PrivilegedUtils.executeWithCustomTCCL(PrivilegedUtils.java:124)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager.findServiceDependencies(DependencyServiceManager.java:209)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.stageOne(DependencyWaiterApplicationContextExecutor.java:248)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.refresh(DependencyWaiterApplicationContextExecutor.java:175)
         at 
org.springframework.osgi.context.support.AbstractDelegatedExecutionApplicationContext.refresh(AbstractDelegatedExecutionApplicationContext.java:175)
         at 
org.springframework.osgi.extender.internal.activator.ContextLoaderListener$2.run(ContextLoaderListener.java:718)
         at java.lang.Thread.run(Thread.java:619)
15:24:55,765|ERROR| derThread-11 | ContextLoaderListener            | 
BundleApplicationContextListener   50 | Application context refresh failed
(OsgiBundleXmlApplicationContext(bundle=ch.jm.xxx.job.submission.http, 
config=osgibundle:/META-INF/spring/*.xml))
java.lang.IllegalStateException: BundleContext is no longer valid
         at 
org.eclipse.osgi.framework.internal.core.BundleContextImpl.checkValid(BundleContextImpl.java:1000)
         at 
org.eclipse.osgi.framework.internal.core.BundleContextImpl.getServiceReferences(BundleContextImpl.java:571)
         at 
org.springframework.osgi.util.OsgiServiceReferenceUtils.getServiceReferences(OsgiServiceReferenceUtils.java:159)
         at 
org.springframework.osgi.util.OsgiServiceReferenceUtils.getServiceReferences(OsgiServiceReferenceUtils.java:195)
         at 
org.springframework.osgi.util.OsgiServiceReferenceUtils.isServicePresent(OsgiServiceReferenceUtils.java:327)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.MandatoryServiceDependency.isServicePresent(MandatoryServiceDependency.java:82)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager.doFindDependencies(DependencyServiceManager.java:287)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager.access$700(DependencyServiceManager.java:40)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager$1.run(DependencyServiceManager.java:213)
         at 
org.springframework.osgi.extender.internal.util.PrivilegedUtils.executeWithCustomTCCL(PrivilegedUtils.java:124)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyServiceManager.findServiceDependencies(DependencyServiceManager.java:209)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.stageOne(DependencyWaiterApplicationContextExecutor.java:248)
         at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.refresh(DependencyWaiterApplicationContextExecutor.java:175)
         at 
org.springframework.osgi.context.support.AbstractDelegatedExecutionApplicationContext.refresh(AbstractDelegatedExecutionApplicationContext.java:175)
         at 
org.springframework.osgi.extender.internal.activator.ContextLoaderListener$2.run(ContextLoaderListener.java:718)
         at java.lang.Thread.run(Thread.java:619)

So, something is still off. Still, the application context is
initialized properly in the end, the services active.

I'm not sure if it is a follow-up error of some kind and if it could
have an influence here, but there's a bundle "B" that depends on the one
above bundle "A" (package import) and publishes a couple of services (B1,
B2) via SCR. Another bundle "C" provides (via SCR) a service C1 which is
used by bundle "A". C1 references B1 (dynamic, 0..1) and B2 (dynamic,
0..n). After updating bundle "A", bundle "B" is refreshed (and therefore
republishes its services), but the old B1 and B2 are still reported
available through the SCR webconsole plug-in although B1new and B2new
have been added in the meantime and B1 and B2 are no more in the service
registry. Very weird.

On 09.09.2009 14:20:38 Guillaume Nodet wrote:
Can you try to reproduce the problem using equinox instead ?
Just edit the etc/config.properties file and switch to
"karaf.framework=equinox".

I think the Felix framework may react badly.  Looking at [1], the
framework should try to stop the bundle, which means it must wait
until the bundle is either started or stopped, then stop if if needed.
  Looking at the code, if the bundle is either starting or stopping,
the exception you've seen is thrown.

[1] 
http://www.osgi.org/javadoc/r4v41/org/osgi/framework/Bundle.html#update%28%29

On Wed, Sep 9, 2009 at 12:38, Jeremias Maerki<[email protected]>  wrote:
Keywords: FileInstall, Karaf, SpringDM, CXF

I'm suspecting something wrong in either FileInstall, Felix Framework,
SpringDM, CXF or my spring.xml in a bundle. Not sure which one it is,
yet, though I have the impression that it's one of the first two in the
list. I'm hoping someone might have an idea.

Stopping and starting the bundle manually (via WebConsole) works fine.

But re-deploying the bundle after a new build into the directory
observed by FileInstall (from Karaf Trunk build made today), the Spring
Context doesn't seem to be closing properly. In this case, I'm seeing a
number of these (can be quite many, issued until the bundle is really
stopped):

11:57:19,828|ERROR| ted-bundles} | fileinstall                      | ?         
                          ? | Failed to update artifact
C:\Dev\xxxxx\_container\apache-felix-karaf-0.9.0-SNAPSHOT\..\deploy\ch.jm.xxx.job.submission.http.jar
org.osgi.framework.BundleException: Bundle ch.jm.xxx.job.submission.http [115] 
cannot be update, since it is either starting or stopping.
        at org.apache.felix.framework.Felix.updateBundle(Felix.java:1786)
        at org.apache.felix.framework.BundleImpl.update(BundleImpl.java:908)
        at 
org.apache.felix.fileinstall.internal.DirectoryWatcher.update(DirectoryWatcher.java:762)
        at 
org.apache.felix.fileinstall.internal.DirectoryWatcher.update(DirectoryWatcher.java:621)
        at 
org.apache.felix.fileinstall.internal.DirectoryWatcher.run(DirectoryWatcher.java:286)

Right after that:

11:57:29,828|ERROR| PackageAdmin | RunnableTimedExecution           | 
oncurrent.RunnableTimedExecution  109 | Closing runnable for context 
OsgiBundleXmlApplicationContext(bundle=ch.jm.xxx.job.submissi
on.http, config=osgibundle:/META-INF/spring/*.xml) did not finish in 10000ms; 
consider taking a snapshot and then shutdown the VM in case the thread still 
hangs
11:57:29,843|INFO | PackageAdmin | ultOsgiApplicationContextCreator | 
ultOsgiApplicationContextCreator   67 | Discovered configurations 
{osgibundle:/META-INF/spring/*.xml} in bundle [xxx :: Job :: Sub
mission :: HTTP POST Adapter (ch.jm.xxx.job.submission.http)]
11:57:29,843|INFO | derThread-18 | OsgiBundleXmlApplicationContext  | 
pport.AbstractApplicationContext  411 | Refreshing 
org.springframework.osgi.context.support.osgibundlexmlapplicationcont...@c81850
: display name 
[OsgiBundleXmlApplicationContext(bundle=ch.jm.xxx.job.submission.http, 
config=osgibundle:/META-INF/spring/*.xml)]; startup date [Wed Sep 09 11:57:29 
CEST 2009]; root of context hierarch
y
11:57:29,843|INFO | derThread-18 | OsgiBundleXmlApplicationContext  | 
ractOsgiBundleApplicationContext  359 | Unpublishing application context OSGi 
service for bundle xxx :: Job :: Submission :: HTTP
POST Adapter (ch.jm.xxx.job.submission.http)
11:57:29,859|INFO | ted-bundles} | fileinstall                      | ?         
                          ? | Started bundle: 
file:/C:/Dev/xxxxx/_container/deploy/ch.jm.xxx.job.submission.http.jar
11:57:29,859|INFO | ted-bundles} | fileinstall                      | ?         
                          ? | Started bundle: 
file:/C:/Dev/xxxxx/_container/deploy/ch.jm.xxx.job.submission.http.jar

Note again that there are multiple notifications that the bundle has
been started.

After that I get some follow-up errors because Spring doesn't seem to
have stopped or started gracefully.

I get the impression that in conjunction with FileInstall, an update
doesn't wait until the bundle is properly stopped and already starts the
updated one which causes trouble inside SpringDM.

For completeness, here is one of the error messages from SpringDM:

(the following appears on the console and in the log)
ka...@root>  Exception in thread "SpringOsgiExtenderThread-9" 
java.lang.IllegalStateException: BeanFactory not initialized or already closed - call 
'refresh' before accessing beans via the ApplicationC
ontext
        at 
org.springframework.context.support.AbstractRefreshableApplicationContext.getBeanFactory(AbstractRefreshableApplicationContext.java:153)
        at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.close(DependencyWaiterApplicationContextExecutor.java:345)
        at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.fail(DependencyWaiterApplicationContextExecutor.java:401)
        at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.stageOne(DependencyWaiterApplicationContextExecutor.java:287)
        at 
org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor.refresh(DependencyWaiterApplicationContextExecutor.java:175)
        at 
org.springframework.osgi.context.support.AbstractDelegatedExecutionApplicationContext.refresh(AbstractDelegatedExecutionApplicationContext.java:175)
        at 
org.springframework.osgi.extender.internal.activator.ContextLoaderListener$2.run(ContextLoaderListener.java:718)
        at java.lang.Thread.run(Thread.java:619)

Please note that the errors issued by Spring are not always the same.
I've only listed the consistent exception in that context.

This is not a crucial problem for me as it only appears during a
re-deployment. I can work around that by stopping the bundle manually
before re-deployment. But maybe this gives some hints about a possible
bug somewhere.

Thanks,
Jeremias Maerki


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]




--
Cheers,
Guillaume Nodet
------------------------
Blog: http://gnodet.blogspot.com/
------------------------
Open Source SOA
http://fusesource.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]




Jeremias Maerki


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]




Jeremias Maerki


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]




Reply via email to