On Fri, Oct 13, 2017 at 12:59 AM, Muthukumaran K <
muthukumara...@ericsson.com> wrote:

> Thanks a lot for the pointers Daniel and JamO.
>
>
>
> https://git.opendaylight.org/gerrit/gitweb?p=releng/builder.git;a=blob;f=jjb/packaging/stop-odl.sh;h=2e3e7bf15dfbe6e59bddfbfd4ce4805fb47b2a69;hb=refs/heads/master#l27
> which aligns with my thinking too :)
>
>
>
> Just a clarification: can you recollect any situation where the karaf PID
> lingered abnormally long (beyond 10 – 15 mins) during the stop phase? I have
> seen this once using the vanilla distro, but have not been able to reproduce
> it for the past month or so, even after several day-to-day restarts. Maybe
> it was a local environment issue. So I was a bit hesitant about rolling the
> approach of stop followed by waiting till the PID vanishes into
> production.
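
For what it's worth, the "stop, then wait till the PID vanishes" approach can be bounded with a timeout, so that a lingering PID cannot block a restart indefinitely. A minimal sketch, assuming Java 9+ (for ProcessHandle); the class name PidWaiter, the method names and the durations are hypothetical and not part of ODL or Karaf. What to do when the timeout expires (alert, collect a thread dump, escalate) is left to the caller.

```java
import java.time.Duration;

/** Hypothetical helper: polls until a PID disappears or a timeout expires. */
public final class PidWaiter {
    private PidWaiter() {
    }

    /** Returns true if a process with the given PID currently exists and is alive. */
    public static boolean isAlive(long pid) {
        return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
    }

    /**
     * Polls the PID every pollInterval until it is gone or the timeout expires.
     *
     * @return true if the process exited in time, false if it was still alive at the deadline
     */
    public static boolean waitForExit(long pid, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (isAlive(pid)) {
            final long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // still alive at the deadline: the caller decides what to do
            }
            // Never sleep past the deadline, and always sleep at least 1 ms.
            final long sleepMillis = Math.min(pollInterval.toMillis(), remainingNanos / 1_000_000L);
            Thread.sleep(Math.max(1L, sleepMillis));
        }
        return true;
    }
}
```

After issuing the stop, a caller could run something like PidWaiter.waitForExit(karafPid, Duration.ofMinutes(5), Duration.ofSeconds(5)), where karafPid is whatever PID was recorded at startup. One caveat: PIDs can be reused by the OS, so a long wait could in principle end up watching an unrelated process.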
>
>
>
> @Tom, @Robert,
>
>
>
> Not directly related but I will fire away …
>
>
>
> Previously,
> https://github.com/opendaylight/controller/blob/master/opendaylight/md-sal/sal-clustering-commons/src/main/java/org/opendaylight/controller/cluster/common/actor/QuarantinedMonitorActor.java
> used to restart the entire container, and now on master the Quarantined
> state just restarts the ActorSystem. Is my understanding right?
>

It restarts the enclosing bundle:

return QuarantinedMonitorActor.props(() -> {
    // restart the entire karaf container
    LOG.warn("Restarting karaf container");
    System.setProperty("karaf.restart.jvm", "true");
    bundleContext.getBundle().stop();
});

It used to restart bundle 0. Not sure why that was changed....


>
> Regards
>
> Muthu
>
>
>
>
>
>
>
> *From:* Daniel Farrell [mailto:dfarr...@redhat.com]
> *Sent:* Friday, October 13, 2017 6:19 AM
> *To:* Jamo Luhrsen; Muthukumaran K; controller-dev@lists.opendaylight.org;
> integration-...@lists.opendaylight.org
> *Subject:* Re: [controller-dev] Best way to gracefully shutdown Karaf in
> ODL context
>
>
>
> Hey Muthu,
>
>
>
> Yes, I think you should take a look at the systemd configuration we ship
> in ODL's packages. As far as I know it does a good job of
> starting/stopping/restarting ODL's service.
>
>
>
> https://git.opendaylight.org/gerrit/gitweb?p=integration/packaging.git;a=blob;f=packages/rpm/unitfiles/opendaylight.service;h=ac436592d2880047986b856c7dd6810665ba0d3e;hb=refs/heads/master
>
>
>
> Here's a Nitrogen RPM that contains that systemd config:
>
>
>
> http://cbs.centos.org/repos/nfv7-opendaylight-70-release/x86_64/os/Packages/opendaylight-7.0.0-1.el7.noarch.rpm
>
>
>
> This test job shows examples of `sudo systemctl [start, stop, status]`
> working:
>
>
>
> https://jenkins.opendaylight.org/releng/job/packaging-test-rpm-master
>
>
>
> The logic for that job is here:
>
>
>
> https://git.opendaylight.org/gerrit/gitweb?p=releng/builder.git;a=blob;f=jjb/packaging/packaging.yaml;h=e4de235ca543506063b7fb57c3d257f0b983abe3;hb=refs/heads/master#l346
>
>
>
> That systemd config is also exercised in tests for puppet-opendaylight,
> ansible-opendaylight, OPNFV Apex and other OPNFV installers.
>
>
>
> It seems like you've put some good thought into this, so if you have any
> suggestions for things we can do better please let us know. :)
>
>
>
> Daniel
>
>
>
> On Thu, Oct 12, 2017 at 11:47 AM Jamo Luhrsen <jluhr...@gmail.com> wrote:
>
> +Daniel and Integration-dev,
>
> Daniel,
>
> do our rpm package and the systemd work you did for it answer any of
> Muthu's questions below? I'm assuming it *IS* the answer, but you will
> know better.
>
> Thanks,
> JamO
>
> On 10/12/2017 04:56 AM, Muthukumaran K wrote:
> > Hi,
> >
> >
> > *Context*: Figuring out the best possible way to gracefully shut down the
> > Karaf process using standard Karaf commands.
> >
> > This is required because the framework-level shutdown sequence in Karaf
> > gives the framework the opportunity to properly execute bundle lifecycle
> > listeners. What I mean is that an abrupt kill can potentially prevent
> > lifecycle listeners from being properly executed and may also impact any
> > in-flight transactions, which may be in various stages of the replication
> > and/or commit phases. This can in turn lead to trouble during the
> > recovery / restart phase.
> >
> >
> >
> > So, I thought of a middle ground where
> >
> > 1)      We execute karaf stop, followed by
> >
> > 2)      A periodic check of whether the last PID has indeed terminated
> >
> >
> >
> > Doing a straight kill -9 could lead to rare heisenbugs wherein recovery
> > could suffer, since there may not be room for lifecycle listeners to
> > execute (unless Karaf handles it as a unified shutdown hook and executes
> > the same path as that of stop or any other graceful shutdown method).
> >
> >
> >
> > Has anybody tried any better methods without side effects?
> >
> >
> >
> >
> >
> > *An option that was tried, and the observation, are as follows*
> >
> > Using Karaf stop followed by the Karaf status command to check whether the
> > process has come to a graceful termination. But it appears that although
> > the ‘status’ command reports the Karaf instance as ‘Not Running’, the PID
> > still lingers for roughly 2 to 3 mins in the ODL context. I am inclined to
> > think that there are indeed some lifecycle listeners executing … During
> > this ‘PID lingering’ phase, the thread dump hints that the System Bundle
> > Shutdown thread is waiting for the BP container to shut down the
> > components (probably executing the lifecycle listeners at the application
> > and platform levels).
> >
> >
> >
> > "System Bundle Shutdown" #1582 daemon prio=5 os_prio=0 tid=0x00007fb05003d800 nid=0xe68 waiting on condition [0x00007faf77678000]
> >    java.lang.Thread.State: TIMED_WAITING (parking)
> >         at sun.misc.Unsafe.park(Native Method)
> >         - parking to wait for  <0x00000000e9064250> (a com.google.common.util.concurrent.AbstractFuture$Sync)
> >         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> >         at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:268)
> >         at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:96)
> >         at org.opendaylight.openflowplugin.openflow.md.core.MDController.stop(MDController.java:358)
> >         at org.opendaylight.openflowplugin.openflow.md.core.sal.OpenflowPluginProvider.close(OpenflowPluginProvider.java:121)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >         at java.lang.reflect.Method.invoke(Method.java:498)
> >         at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:299)
> >         at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:980)
> >         at org.apache.aries.blueprint.container.BeanRecipe.destroy(BeanRecipe.java:887)
> >         at org.apache.aries.blueprint.container.BlueprintRepository.destroy(BlueprintRepository.java:329)
> >         at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroyComponents(BlueprintContainerImpl.java:765)
> >         at org.apache.aries.blueprint.container.BlueprintContainerImpl.tidyupComponents(BlueprintContainerImpl.java:964)
> >         at org.apache.aries.blueprint.container.BlueprintContainerImpl.destroy(BlueprintContainerImpl.java:909)
> >         at org.apache.aries.blueprint.container.BlueprintExtender$3.run(BlueprintExtender.java:325)
> >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >         at org.apache.aries.blueprint.container.BlueprintExtender.destroyContainer(BlueprintExtender.java:346)
> >         at org.apache.aries.blueprint.container.BlueprintExtender.access$400(BlueprintExtender.java:68)
> >         at org.apache.aries.blueprint.container.BlueprintExtender$BlueprintContainerServiceImpl.destroyContainer(BlueprintExtender.java:624)
> >         at org.opendaylight.controller.blueprint.BlueprintBundleTracker.shutdownAllContainers(BlueprintBundleTracker.java:251)
> >         at org.opendaylight.controller.blueprint.BlueprintBundleTracker.bundleChanged(BlueprintBundleTracker.java:150)
> >         at org.eclipse.osgi.framework.internal.core.BundleContextImpl.dispatchEvent(BundleContextImpl.java:847)
> >         at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
> >         at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148)
> >         at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEventPrivileged(Framework.java:1568)
> >         at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEvent(Framework.java:1504)
> >         at org.eclipse.osgi.framework.internal.core.Framework.publishBundleEvent(Framework.java:1499)
> >         at org.eclipse.osgi.framework.internal.core.Framework.shutdown(Framework.java:681)
> >         - locked <0x000000008060b4d0> (a org.eclipse.osgi.framework.internal.core.Framework)
> >         at org.eclipse.osgi.framework.internal.core.Framework.close(Framework.java:600)
> >         - locked <0x000000008060b4d0> (a org.eclipse.osgi.framework.internal.core.Framework)
> >         at org.eclipse.osgi.framework.internal.core.InternalSystemBundle$1.run(InternalSystemBundle.java:261)
> >         at java.lang.Thread.run(Thread.java:745)
> >
> > "Framework Active Thread" #12 prio=5 os_prio=0 tid=0x00007fb0dc4bd000 nid=0x52a waiting for monitor entry [0x00007fb0c14b0000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >         at java.lang.Object.wait(Native Method)
> >         at org.eclipse.osgi.framework.internal.core.Framework.run(Framework.java:1862)
> >         - locked <0x000000008060b4d0> (a org.eclipse.osgi.framework.internal.core.Framework)
> >         at java.lang.Thread.run(Thread.java:745)
> >
> > "main" #1 prio=5 os_prio=0 tid=0x00007fb0dc00b800 nid=0x514 in Object.wait() [0x00007fb0e5134000]
> >    java.lang.Thread.State: WAITING (on object monitor)
> >         at java.lang.Object.wait(Native Method)
> >         - waiting on <0x000000008060b4d0> (a org.eclipse.osgi.framework.internal.core.Framework)
> >         at org.eclipse.osgi.framework.internal.core.Framework.waitForStop(Framework.java:1884)
> >         - locked <0x000000008060b4d0> (a org.eclipse.osgi.framework.internal.core.Framework)
> >         at org.eclipse.osgi.framework.internal.core.EquinoxLauncher.waitForStop(EquinoxLauncher.java:118)
> >         at org.eclipse.osgi.launch.Equinox.waitForStop(Equinox.java:182)
> >         at org.apache.karaf.main.Main.awaitShutdown(Main.java:487)
> >         at org.apache.karaf.main.Main.main(Main.java:177)
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > Regards
> >
> > Muthu
> >
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > controller-dev mailing list
> > controller-dev@lists.opendaylight.org
> > https://lists.opendaylight.org/mailman/listinfo/controller-dev
> >
