Hi Roy,

On Thu, 2017-01-26 at 12:07 +0100, Roy Teeuwen wrote:
> Hey all,
>
> We are in a situation where sometimes we have a deadlock when
> installing bundles. The routine we follow is the following:
>
> - We upload a vlt package that has a filter on
> /apps/myproject/install that removes all existing bundles (it has
> mode="replace"), it is called the cleanup package. We do this to make
> sure that all osgi configs that were manually overridden in the system
> console get removed again and that all old bundles are deleted from
> the repo.
> - After the cleanup is installed, we wait until all bundles are up
> again by checking the system console to see if all bundles are active
> - When all bundles are active, we install our second vlt package that
> contains sub packages with our content, jar bundles,
> configurations,...
>
> This seems to fail sometimes, the bundles don't come up again, we
> can't even go to system/console to check in which state it is, it
> does not open.
> If we then send a shutdown signal to sling, it is stuck in shutdown
> and won't go off, if we go look in the threads, we see the following:
>
> - FelixShutdown thread that switches between waiting and runnable
> - org.apache.sling.installer.core.OsgiInstallerImpl that switches
> between waiting and runnable (if I look at the methods it is stuck in
> RestartActivateBundlesTask.execute => BundleImpl.start)
>
> If I go look in the Felix framework code, I see the following comment
> for bundle.start:
>
> void startBundle(BundleImpl bundle, int options) throws BundleException
> {
>     // CONCURRENCY NOTE:
>     // We will first acquire the bundle lock for the specific bundle
>     // as long as the bundle is INSTALLED, RESOLVED, or ACTIVE. If this
>     // bundle is not yet resolved, then it will be resolved too. In
>     // that case, the global lock will be acquired to make sure no
>     // bundles can be installed or uninstalled during the resolve.
>     ....
>
> So I suspect there to be a sort of deadlock after the shutdown
> between the OsgiInstaller and the FelixShutdown.
> We just don't know where it is happening, seeing as the problem is
> already there before we send a shutdown signal, what could we do to
> investigate / avoid this, any help is appreciated.

A complete stack trace of all threads would help pinpoint the origin,
although this sounds like a bug in Felix.

Robert

>
>
> Thanks!
> Roy
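
For reference, one way to collect such a trace is sketched below. It uses
the standard java.lang.management API to print every live thread with its
full stack and to ask the JVM whether it detects a lock cycle. The class
and method names (ThreadDumper, dumpAllThreads, deadlockedThreadNames) are
made up for this illustration and are not part of Sling or Felix. The code
has to run inside the affected JVM; running `jstack <pid>` against the
Sling process from outside gives comparable output without any code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Sling or Felix.
final class ThreadDumper {

    /** Returns a textual dump of all live threads with their complete stacks. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only request lock details the JVM can actually provide.
        boolean monitors = mx.isObjectMonitorUsageSupported();
        boolean synchronizers = mx.isSynchronizerUsageSupported();

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : mx.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                out.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            // ThreadInfo.toString() truncates the stack, so print each frame.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    /** Returns the names of threads the JVM reports as deadlocked (may be empty). */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names; // no cycle detected
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) { // null if the thread ended in the meantime
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + deadlockedThreadNames());
        System.out.println(dumpAllThreads());
    }
}
```

The JVM's deadlock detection only covers cycles over object monitors and
ownable synchronizers, so an empty result does not rule out threads
blocking each other in some other way. The full dump, taken two or three
times a few seconds apart, is the more useful thing to attach.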