[ https://issues.apache.org/jira/browse/KARAF-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vineeth updated KARAF-7948:
---------------------------
Description:
We have a customer case where Apache Karaf (agent) is installed as a *systemd*
[service|https://github.com/apache/karaf/blob/main/assemblies/features/base/src/main/resources/resources/bin/contrib/karaf-service-template.systemd].
When the virtual machine (VM) hosting this service is restarted, the Karaf
process sometimes enters a {*}stuck state{*}: the PID is still running, but
Karaf is completely unresponsive.
There is no definite pattern; it happens intermittently. We examined multiple
heap dumps but found no clear clues. In one dump, however, we observed *two
instances* of the {{FeaturesServiceImpl}} class, which suggests that the
Activator may have initialized {{FeaturesServiceImpl}} {*}twice{*}. We saw
this only once during our tests.
Later, we checked the state of each bundle. The
mvn:org.apache.karaf.features/org.apache.karaf.features.extension/4.2.16 bundle
and the mvn:org.ops4j.pax.logging/pax-logging-log4j2-extra/1.11.15 bundle were
in *state 2* (INSTALLED in the OSGi {{Bundle}} constants), while all other
bundles were in *state 32* (ACTIVE). This points to a possible *race
condition* or a low-level crash occurring during startup.
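
The numeric states can be cross-checked against the integer constants defined on {{org.osgi.framework.Bundle}}. The sketch below is only a lookup aid for reading heap-dump values; the helper name {{bundle_state_name}} is hypothetical and not part of Karaf or OSGi.

```python
# State constants as defined on org.osgi.framework.Bundle (OSGi Core).
BUNDLE_STATES = {
    0x01: "UNINSTALLED",
    0x02: "INSTALLED",
    0x04: "RESOLVED",
    0x08: "STARTING",
    0x10: "STOPPING",
    0x20: "ACTIVE",
}


def bundle_state_name(state: int) -> str:
    """Return the OSGi state name for a raw integer (hypothetical helper)."""
    # Values outside the table are reported as-is rather than guessed.
    return BUNDLE_STATES.get(state, f"UNKNOWN({state})")
```

With this table, {{bundle_state_name(2)}} gives INSTALLED and {{bundle_state_name(32)}} gives ACTIVE.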
We were able to reproduce the issue by looping a restart of the systemd service
using {{systemctl restart agent.service}} every 4 minutes. However, we have
been unable to pinpoint the exact cause of the problem.
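
A minimal sketch of such a restart loop, assuming it runs with sufficient privileges on the affected host. The function name {{restart_loop}} and the {{probe}} callback are hypothetical: because the hang leaves the PID alive, the caller has to supply its own responsiveness check (for example, whether the log file has grown).

```python
import subprocess
import time


def restart_loop(unit, interval_s, iterations, probe,
                 run=subprocess.run, sleep=time.sleep):
    """Restart a systemd unit repeatedly and record the probe result each time.

    probe is a caller-supplied, zero-argument callable (an assumption of this
    sketch) that returns True when Karaf is responsive.
    """
    outcomes = []
    for _ in range(iterations):
        # check=True raises CalledProcessError if systemctl itself fails.
        run(["systemctl", "restart", unit], check=True)
        # Give Karaf time to start before probing (240 s = 4 minutes).
        sleep(interval_s)
        outcomes.append(bool(probe()))
    return outcomes
```

A call such as {{restart_loop("agent.service", 240, 50, probe=my_check)}} would mirror the 4-minute cycle described above; any False entry in the returned list marks an iteration where the probe found Karaf unresponsive.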
Please review this and let us know if you can help us further diagnose the
issue.
Details
The customer uses SUSE Linux on Z; Karaf is installed as an RPM package using zypper.
Apache Karaf version: 4.2.16
Log: Nothing is written to the log after startup.
We can share more details if needed.
was:
We have a customer case where Apache Karaf (agent) is installed as a *systemd*
service. When the virtual machine (VM) hosting this service is restarted, the
Karaf process sometimes enters a {*}stuck state{*}: the PID is still running,
but Karaf is completely unresponsive.
There is no definite pattern; it happens intermittently. We examined multiple
heap dumps but found no clear clues. In one dump, however, we observed *two
instances* of the {{FeaturesServiceImpl}} class, which suggests that the
Activator may have initialized {{FeaturesServiceImpl}} {*}twice{*}. We saw
this only once during our tests.
Later, we checked the state of each bundle. The
mvn:org.apache.karaf.features/org.apache.karaf.features.extension/4.2.16 bundle
and the mvn:org.ops4j.pax.logging/pax-logging-log4j2-extra/1.11.15 bundle were
in *state 2* (INSTALLED in the OSGi {{Bundle}} constants), while all other
bundles were in *state 32* (ACTIVE). This points to a possible *race
condition* or a low-level crash occurring during startup.
We were able to reproduce the issue by looping a restart of the systemd service
using {{systemctl restart agent.service}} every 4 minutes. However, we have
been unable to pinpoint the exact cause of the problem.
Please review this and let us know if you can help us further diagnose the
issue.
Details
The customer uses SUSE Linux on Z; Karaf is installed as an RPM package using zypper.
Apache Karaf version: 4.2.16
Log: Nothing is written to the log after startup.
We can share more details if needed.
> Apache karaf got into stuck state
> ---------------------------------
>
> Key: KARAF-7948
> URL: https://issues.apache.org/jira/browse/KARAF-7948
> Project: Karaf
> Issue Type: Bug
> Components: karaf
> Affects Versions: 4.2.16
> Reporter: Vineeth
> Priority: Major
>
> We have a customer case where Apache Karaf (agent) is installed as a *systemd*
> [service|https://github.com/apache/karaf/blob/main/assemblies/features/base/src/main/resources/resources/bin/contrib/karaf-service-template.systemd].
> When the virtual machine (VM) hosting this service is restarted, the Karaf
> process sometimes enters a {*}stuck state{*}: the PID is still running, but
> Karaf is completely unresponsive.
> There is no definite pattern; it happens intermittently. We examined multiple
> heap dumps but found no clear clues. In one dump, however, we observed *two
> instances* of the {{FeaturesServiceImpl}} class, which suggests that the
> Activator may have initialized {{FeaturesServiceImpl}} {*}twice{*}. We saw
> this only once during our tests.
> Later, we checked the state of each bundle. The
> mvn:org.apache.karaf.features/org.apache.karaf.features.extension/4.2.16
> bundle and the mvn:org.ops4j.pax.logging/pax-logging-log4j2-extra/1.11.15
> bundle were in *state 2* (INSTALLED in the OSGi {{Bundle}} constants), while
> all other bundles were in *state 32* (ACTIVE). This points to a possible
> *race condition* or a low-level crash occurring during startup.
> We were able to reproduce the issue by looping a restart of the systemd
> service using {{systemctl restart agent.service}} every 4 minutes. However,
> we have been unable to pinpoint the exact cause of the problem.
> Please review this and let us know if you can help us further diagnose the
> issue.
> Details
> The customer uses SUSE Linux on Z; Karaf is installed as an RPM package using zypper.
> Apache Karaf version: 4.2.16
> Log: Nothing is written to the log after startup.
> We can share more details if needed.