[ 
https://issues.apache.org/jira/browse/SLING-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hans-Peter Stoerr updated SLING-10372:
--------------------------------------
    Description: 
(As requested in SLING-10362) When starting a Sling Starter 12 with a feature 
archive containing a couple of packages and having a couple of packages 
installed with the Sling [fileinstaller 
provider|https://sling.apache.org/documentation/bundles/file-installer-provider.html],
 I often get an NPE; the stack trace is appended. When this happens, the 
installation of the affected package stops. It isn't about that particular 
package, though: if I take other packages out of the automatic installation, 
or put the package into the fileinstall directory later, it installs happily.

It's rather difficult to give detailed steps to reproduce this, but I have a 
guess at what's happening. I do have a particular setup where it always happens 
on my machine, but that might be sensitive to the speed of my machine and 
whatnot. Basically, I'm starting the feature launcher with a FAR containing 
several packages of ours, and also pass the arguments

-Dsling.fileinstall.dir=launcher/fileinstall -Dfelix.startlevel.bundle=30

to the launcher, having placed several packages in the fileinstall directory. I 
guess the NPE happens only when enough packages are placed there, and it 
happens only on the first startup (i.e., there was no launcher directory yet).

I had a look around with the debugger: it seems the SlingRepository had been 
stopped, but not yet started again for a restart, just before the 
PackageTransformer tried to process the package, probably due to some kind of 
configuration change. The PackageTransformer tries to access the repository via 
a reference to the OakSlingRepository whose manager has already been stopped, 
so that getRepository() returns null. Hence the NPE. Probably the 
org.apache.sling.installer.factory.packages.impl.PackageTransformer should 
somehow handle such temporary failures that don't have anything to do with the 
package? Another way to solve this seems to be to set the start level of the 
org.apache.sling.installer.factory.packages bundle to 21. That is probably 
because so much happens at once when start level 20 is reached that this 
transition is not a good time to install packages.
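
To illustrate what I mean by handling such temporary failures, here is a 
minimal sketch. It is not the actual PackageTransformer code and uses no Sling 
API: the class TransientFailureSketch, the Outcome enum, the PackageInstall 
interface and both methods are hypothetical names made up for this example. 
The only assumption taken from the observation above is that the repository 
lookup can temporarily return null while the service is restarting.

```java
import java.util.function.Supplier;

/** Hypothetical sketch; none of these types are Sling APIs. */
final class TransientFailureSketch {

    /** Possible results of one installation attempt. */
    enum Outcome { INSTALLED, RETRY_LATER, FAILED }

    /** Stand-in for the work done once a repository is available. */
    interface PackageInstall {
        void installInto(Object repository) throws Exception;
    }

    /**
     * Runs one attempt. A null repository is treated as a temporary
     * condition (the service is restarting), not as a fault of the package.
     */
    static Outcome attempt(Supplier<Object> repositoryLookup, PackageInstall install) {
        Object repository = repositoryLookup.get();
        if (repository == null) {
            return Outcome.RETRY_LATER;
        }
        try {
            install.installInto(repository);
            return Outcome.INSTALLED;
        } catch (Exception e) {
            // A failure with a live repository is attributed to the package itself.
            return Outcome.FAILED;
        }
    }

    /** Repeats attempts until one is final or the attempt budget is used up. */
    static Outcome attemptWithRetries(Supplier<Object> lookup, PackageInstall install,
                                      int maxAttempts) {
        Outcome outcome = Outcome.RETRY_LATER;
        for (int i = 0; i < maxAttempts && outcome == Outcome.RETRY_LATER; i++) {
            outcome = attempt(lookup, install);
        }
        return outcome;
    }

    public static void main(String[] args) {
        // Simulate a repository that is unavailable for the first two lookups.
        int[] calls = {0};
        Supplier<Object> flaky = () -> ++calls[0] < 3 ? null : new Object();
        Outcome result = attemptWithRetries(flaky, repo -> { }, 5);
        System.out.println(result + " after " + calls[0] + " lookups");
    }
}
```

The point of the sketch is only the distinction between "repository not there 
right now" and "package is broken". A real fix would presumably wait between 
attempts or hand the task back to the installer for a later retry instead of 
looping immediately.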

Here is the stack trace that marks the error. I'll attach a log file for some 
more context. BTW: the exceptions "Can't create child on a synthetic root" in 
the log file might also be interesting. I receive them regularly during 
startup, but they are probably not related to this problem, as they also occur 
when things work properly.
{code:java}
11.05.2021 13:27:49.462 *ERROR* [OsgiInstallerImpl] 
org.apache.sling.installer.factory.packages.impl.PackageTransformer Error while 
processing install content package task* of 
TaskResource(url=fileinstallff43091e0ee8ac91416c79636bdce5f4:/Users/hps/dev/composum/composum-launch/feature/composumstarter/target/launcher/fileinstall/99/composum-site-app-package-1.0.0-SNAPSHOT.zip,
 entity=content-package:tenants/ist:composum-site-app-package, state=INSTALL, 
attributes=[org.apache.sling.installer.api.tasks.ResourceTransformer=:27:23:1243:,
 package-id=tenants/ist:composum-site-app-package:1.0.0-SNAPSHOT, 
Bundle-Version=1.0.0.SNAPSHOT], digest=1620718306467) due to null, no retry.
 java.lang.NullPointerException: null
         at 
org.apache.sling.jcr.oak.server.internal.OakSlingRepository$2.run(OakSlingRepository.java:99)
 [org.apache.sling.jcr.oak.server:1.2.10]
         at 
org.apache.sling.jcr.oak.server.internal.OakSlingRepository$2.run(OakSlingRepository.java:96)
 [org.apache.sling.jcr.oak.server:1.2.10]
         at java.base/java.security.AccessController.doPrivileged(Native Method)
         at 
java.base/javax.security.auth.Subject.doAsPrivileged(Subject.java:550)
         at 
org.apache.sling.jcr.oak.server.internal.OakSlingRepository.createServiceSession(OakSlingRepository.java:96)
 [org.apache.sling.jcr.oak.server:1.2.10]
         at 
org.apache.sling.jcr.base.AbstractSlingRepository2.createServiceSession(AbstractSlingRepository2.java:166)
 [org.apache.sling.jcr.base:3.1.6]
         at 
org.apache.sling.jcr.base.AbstractSlingRepository2.loginService(AbstractSlingRepository2.java:383)
 [org.apache.sling.jcr.base:3.1.6]
         at 
org.apache.sling.installer.factory.packages.impl.PackageTransformer$AbstractPackageInstallTask.execute(PackageTransformer.java:263)
 [org.apache.sling.installer.factory.packages:1.0.4]
         at 
org.apache.sling.installer.core.impl.OsgiInstallerImpl.doExecuteTasks(OsgiInstallerImpl.java:918)
 [org.apache.sling.installer.core:3.11.4]
         at 
org.apache.sling.installer.core.impl.OsgiInstallerImpl.executeTasks(OsgiInstallerImpl.java:755)
 [org.apache.sling.installer.core:3.11.4]
         at 
org.apache.sling.installer.core.impl.OsgiInstallerImpl.run(OsgiInstallerImpl.java:304)
 [org.apache.sling.installer.core:3.11.4]
         at java.base/java.lang.Thread.run(Thread.java:834)
{code}
I'm not sure whether this is a Minor or a Major: it breaks things during 
startup, but I've found a way to modify the starter to avoid it, see above.


> OSGi Installer: NPE during package installation during startup
> --------------------------------------------------------------
>
>                 Key: SLING-10372
>                 URL: https://issues.apache.org/jira/browse/SLING-10372
>             Project: Sling
>          Issue Type: Bug
>          Components: Installer
>    Affects Versions: Starter 12
>         Environment:  Sling-Starter 12-SNAPSHOT (commit 0e6a8e41) with JDK 11 
> on MacOS
>            Reporter: Hans-Peter Stoerr
>            Priority: Minor
>             Fix For: Installer Packages Factory 1.0.6
>
>         Attachments: error.log
>
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
