[ https://issues.apache.org/jira/browse/SLING-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858099#comment-15858099 ]
Karl Pauls commented on SLING-5457:
-----------------------------------

I think I can see what is going on: while the installer is active, a start level change is happening at the same time, and the two are racing for the same bundle. As a result, the interaction is sometimes:

Bundle: ACTIVE
Installer: stop bundle
Bundle: STOPPED
Startlevel: start bundle
Bundle: STARTING
Installer: update bundle
Exception: bundle STARTING
Bundle: ACTIVE

In reality, this can be generalised to any two management agents racing for the same bundle in this sequence. The bundle update doesn't wait for a bundle that is in the STOPPING or STARTING state. Instead, as mentioned in the issue, an exception is thrown. I think that is actually a bug in Felix (technically, it's more a missing feature, but that is beside the point), as newer versions of the spec mandate that on an update the framework should wait for bundles that are STOPPING or STARTING. Hence, the real fix for this issue is to implement that behaviour in the Felix framework.

Additionally, I think that this specific interaction between the start level change and the installer is somewhat unfortunate. It would probably be worthwhile for the installer to try to be active only when there is no start level change going on (I remember another bug report on the Sling dev list recently that I suspect might be related to this interaction).

Implementing a retry as proposed here should be OK as a short-term band-aid. Ultimately, I'd say this should be addressed by an improved Felix framework and possibly by better handling of start level changes in the installer.

I created FELIX-5528 to try to address this in the framework (and to improve the error message as part of FELIX-5138).
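For illustration only, the short-term retry could look roughly like the sketch below. It is not the installer's actual code: `UpdateAction`, `retry` and the attempt/delay values are hypothetical names and numbers chosen for the example, and the action stands in for a call such as `bundle.update()` that fails while the bundle is STARTING or STOPPING. The limit of three attempts mirrors the "maybe 3 times at max" suggestion in the issue.

```java
public class UpdateRetrySketch {

    /** Hypothetical stand-in for an operation such as Bundle.update(); not an OSGi or Sling type. */
    @FunctionalInterface
    public interface UpdateAction {
        void run() throws Exception;
    }

    /**
     * Runs the action up to maxAttempts times, pausing delayMillis after each failed attempt.
     * Returns true as soon as one attempt succeeds, false if all attempts fail
     * or the thread is interrupted while waiting.
     */
    public static boolean retry(int maxAttempts, long delayMillis, UpdateAction action) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return true;
            } catch (Exception e) {
                // Assumption: any failure is treated as transient (e.g. bundle still STARTING).
                if (attempt == maxAttempts) {
                    return false;
                }
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt status and give up.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false; // only reached when maxAttempts < 1
    }

    public static void main(String[] args) {
        // Simulated bundle: the first two update attempts fail, the third succeeds.
        int[] calls = {0};
        boolean updated = retry(3, 10L, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("bundle is STARTING");
            }
        });
        System.out.println("updated=" + updated + " after " + calls[0] + " attempts");
    }
}
```

A retry like this only narrows the window for the race; it does not remove it, which is why the framework-side fix is still the real solution.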
> OsgiInstaller should retry to start bundles on failures
> -------------------------------------------------------
>
>                 Key: SLING-5457
>                 URL: https://issues.apache.org/jira/browse/SLING-5457
>             Project: Sling
>          Issue Type: Bug
>          Components: Installer
>    Affects Versions: Installer Core 3.6.4
>            Reporter: Jörg Hoh
>
> The OsgiInstaller doesn't update a bundle properly, if there's an exception
> from the framework.
> I have this exception:
> {code}
> 11.12.2015 14:09:36.753 *INFO* [FelixStartLevel] my.custom.bundle BundleEvent
> RESOLVED
> 11.12.2015 14:09:36.753 *INFO* [FelixStartLevel] my.custom.bundle BundleEvent
> STARTING
> 11.12.2015 14:09:36.754 INFO [OsgiInstallerImpl]
> org.apache.sling.installer.core.impl.tasks.BundleUpdateTask Removing failing
> update task - unable to retry: BundleUpdateTask:
> TaskResource(url=jcrinstall:/apps/myapp/install/my.custom.bundle-1.5.6-SNAPSHOT.jar,
> entity=bundle:my.custom.bundle, state=INSTALL,
> attributes=[org.apache.sling.installer.api.tasks.ResourceTransformer=:28:84:15:,
> Bundle-SymbolicName=my.custom.bundle, Bundle-Version=1.5.6-SNAPSHOT],
> digest=1449838063263)
> org.osgi.framework.BundleException: Bundle my.custom.bundle [252] cannot be
> update, since it is either starting or stopping.
> at org.apache.felix.framework.Felix.updateBundle(Felix.java:2311)
> at org.apache.felix.framework.BundleImpl.update(BundleImpl.java:995)
> at
> org.apache.sling.installer.core.impl.tasks.BundleUpdateTask.execute(BundleUpdateTask.java:92)
> at
> org.apache.sling.installer.core.impl.OsgiInstallerImpl.doExecuteTasks(OsgiInstallerImpl.java:847)
> at
> org.apache.sling.installer.core.impl.OsgiInstallerImpl.executeTasks(OsgiInstallerImpl.java:689)
> at
> org.apache.sling.installer.core.impl.OsgiInstallerImpl.run(OsgiInstallerImpl.java:265)
> at java.lang.Thread.run(Thread.java:767)
> {code}
> I don't know for what reason the Felix.updateBundle() failed (see also
> FELIX-5138 to get some more information in this case), but from my point of
> view there should be a dedicated error handling just for the
> {code}BundleImpl.update{code} call. Does it make sense to retry the
> installation at a later point in time (maybe 3 times at max)?
> (I got this exception when I deployed a large number of bundles through the
> JCR installer. It happens only once in a while, but it's an annoying task to
> fix it manually.)

--
This message was sent by Atlassian JIRA (v6.3.15#6346)