Hi Roy,

On Thu, 2017-01-26 at 12:07 +0100, Roy Teeuwen wrote:
> Hey all,
> 
> We sometimes run into a deadlock when installing bundles. Our
> routine is as follows:
> 
> - We upload a vlt package, called the cleanup package, that has a
> filter on /apps/myproject/install and removes all existing bundles
> (it has mode="replace"). We do this to make sure that all OSGi
> configs that were manually overridden in the system console get
> removed again and that all old bundles are deleted from the repo.
> - After the cleanup is installed, we wait until all bundles are up
> again by checking the system console to see if all bundles are active
> - When all bundles are active, we install our second vlt package that
> contains sub packages with our content, jar bundles,
> configurations,...
> 
> This sometimes fails: the bundles don't come up again, and we can't
> even open system/console to check which state the system is in.
> If we then send a shutdown signal to Sling, it gets stuck in shutdown
> and won't stop. Looking at the threads, we see the following:
> 
> - FelixShutdown thread that switches between waiting and runnable
> - org.apache.sling.installer.core.OsgiInstallerImpl that switches
> between waiting and runnable (if I look at the methods it is stuck in
> RestartActivateBundlesTask.execute => BundleImpl.start)
> 
> If I look at the Felix framework code, I see the following comment
> for bundle.start:
> 
> void startBundle(BundleImpl bundle, int options) throws BundleException
> {
>     // CONCURRENCY NOTE:
>     // We will first acquire the bundle lock for the specific bundle
>     // as long as the bundle is INSTALLED, RESOLVED, or ACTIVE. If this
>     // bundle is not yet resolved, then it will be resolved too. In
>     // that case, the global lock will be acquired to make sure no
>     // bundles can be installed or uninstalled during the resolve.
>               ....
> 
> So I suspect some sort of deadlock after the shutdown between the
> OsgiInstaller and the FelixShutdown thread.
> We just don't know where it happens, since the problem is already
> there before we send the shutdown signal. What could we do to
> investigate or avoid this? Any help is appreciated.

A complete stack trace would help pinpoint the origin, although this
sounds like a bug in Felix.
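
The usual way to get one is to run "jstack -l <pid>" from the JDK
against the Sling process while it hangs. If you prefer to capture it
from inside the JVM, something along these lines might work. It is only
a rough sketch using the standard java.lang.management API; the class
name ThreadDumpSketch and its two helper methods are made up for
illustration and are not part of Sling or Felix:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

// Hypothetical helper, not part of Sling or Felix.
class ThreadDumpSketch {

    /** Builds a text dump of all live threads with their full stack traces. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only ask for lock details the running JVM can actually provide.
        boolean monitors = mx.isObjectMonitorUsageSupported();
        boolean synchronizers = mx.isSynchronizerUsageSupported();

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : mx.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                out.append(" owned by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            // Print every frame ourselves; ThreadInfo.toString() shortens the trace.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    /** Returns the ids of threads the JVM reports as deadlocked, or an empty array. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
        System.out.println("Deadlocked thread ids: "
                + Arrays.toString(deadlockedThreadIds()));
    }
}
```

Note that the JVM's deadlock detection only reports strict lock cycles.
Threads that keep switching between waiting and runnable, as you
describe, may not show up there, so the full dump is the more useful
part.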

Robert

> 
> 
> Thanks!
> Roy
