Regarding the Netty io.netty.util.concurrent.FastThreadLocalThread observation:

https://github.com/netty/netty/issues/6565

The issue seems very recent. I am looking further into what the cause could be.

Regards
Muthu


From: controller-dev-boun...@lists.opendaylight.org 
[mailto:controller-dev-boun...@lists.opendaylight.org] On Behalf Of Tom Pantelis
Sent: Monday, May 29, 2017 7:30 PM
To: Michael Vorburger
Cc: controller-dev; <ovsdb-...@lists.opendaylight.org>; 
mdsal-...@lists.opendaylight.org; openflowplugin-dev
Subject: Re: [controller-dev] [mdsal-dev] Bug 7370 OOM due to suspected memory 
leak in akka.dispatch.Dispatcher found by hprof


Yeah, that looks like an issue. DeviceInitializationUtils is doing a blocking 
get on a Future, which is usually not a good thing. It was triggered by an EOS 
data change and is blocking an akka Dispatcher thread.
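
To illustrate the difference, here is a minimal sketch of blocking versus callback-style handling. It uses the JDK's CompletableFuture purely as a stand-in (the stack trace shows Guava's AbstractFuture, where Futures.addCallback would be the comparable tool), and queryDevice, initBlocking and initNonBlocking are hypothetical names, not openflowplugin APIs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class NonBlockingInit {

    // Hypothetical stand-in for the device query whose result the
    // initialization code waits for; not the real openflowplugin API.
    static CompletableFuture<String> queryDevice(ExecutorService ioPool) {
        return CompletableFuture.supplyAsync(() -> "node-info", ioPool);
    }

    // Blocking style: the calling thread is parked until the result arrives.
    // On a dispatcher thread this stalls every message queued behind it.
    static String initBlocking(ExecutorService ioPool) {
        return queryDevice(ioPool).join();
    }

    // Callback style: the caller returns immediately and the continuation
    // runs once the result is available.
    static CompletableFuture<Void> initNonBlocking(ExecutorService ioPool,
                                                   AtomicReference<String> sink) {
        return queryDevice(ioPool).thenAccept(sink::set);
    }

    public static void main(String[] args) {
        ExecutorService ioPool = Executors.newSingleThreadExecutor();
        try {
            System.out.println("blocking: " + initBlocking(ioPool));

            AtomicReference<String> sink = new AtomicReference<>();
            // join() is used here only so the demo can print the result.
            initNonBlocking(ioPool, sink).join();
            System.out.println("non-blocking: " + sink.get());
        } finally {
            ioPool.shutdown();
        }
    }
}
```

In the callback variant the thread that delivers the ownership change is free to process the next message while the device query is still in flight.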

On a side note, there are a lot of 
io.netty.util.concurrent.FastThreadLocalThread threads - not sure if that's normal.
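
One way to quantify "a lot" on a live JVM is to count threads per Thread subclass. The sketch below uses only the JDK; the class name ThreadClassHistogram is made up for illustration, and it has to run inside the JVM being inspected (it does not read an HPROF).

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadClassHistogram {

    // Counts live threads grouped by the runtime class of the Thread object,
    // e.g. java.lang.Thread or io.netty.util.concurrent.FastThreadLocalThread.
    static Map<String, Integer> countByClass() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getClass().getName(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByClass().forEach((cls, n) -> System.out.println(n + "\t" + cls));
    }
}
```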



On Mon, May 29, 2017 at 9:30 AM, Michael Vorburger 
<vorbur...@redhat.com> wrote:
+openflowplugin-dev & +ovsdb-dev:

Tom,

On Mon, May 29, 2017 at 2:57 PM, Tom Pantelis 
<tompante...@gmail.com> wrote:
Thanks a lot for replying, really appreciate it!

It looks like the Dispatcher was for data change notifications. I suspect a 
listener was hung or responding slowly so the actor's mailbox filled up with 
change notifications. I would suggest getting a thread dump next time.
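
For reference, besides running jstack against the process, a thread dump can also be taken from inside the JVM. This is a minimal sketch using the JDK's ThreadMXBean; the class name ThreadDump is just for illustration. It formats the frames itself so that the full stack of each thread is kept.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDump {

    // Returns a plain-text dump of all live threads with their full stacks.
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip locked monitor / synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

A dispatcher thread that shows up as WAITING with a Future.get() frame near the top of its stack would be a candidate for the hung listener.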

Turns out there is no need to wait for next time - I just figured out that we 
can obtain thread dumps a posteriori from an HPROF using MAT... see the [4] 
Bug7370_Threads.zip HTML report just attached to Bug 7370.

It shows 604 threads (a lot?), many of which are e.g. parked ForkJoinPool 
threads, and a number of them related to ovsdb and openflowplugin stuff... so 
what are we looking for in this thread dump? I haven't looked through each 
thread's stack yet, but this one vaguely looks like what you may mean by "a 
listener was hung or responding slowly" (causing "the actor's mailbox filled up 
with change notifications"). Could it possibly be the reason for, or have 
something to do with, this OOM:

opendaylight-cluster-data-akka.actor.default-dispatcher-16
  at sun.misc.Unsafe.park(ZJ)V (Native Method)
  at java.util.concurrent.locks.LockSupport.park(Ljava/lang/Object;)V (LockSupport.java:175)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt()Z (AbstractQueuedSynchronizer.java:836)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(I)V (AbstractQueuedSynchronizer.java:997)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(I)V (AbstractQueuedSynchronizer.java:1304)
  at com.google.common.util.concurrent.AbstractFuture$Sync.get()Ljava/lang/Object; (AbstractFuture.java:285)
  at com.google.common.util.concurrent.AbstractFuture.get()Ljava/lang/Object; (AbstractFuture.java:116)
  at org.opendaylight.openflowplugin.impl.util.DeviceInitializationUtils.initializeNodeInformation(Lorg/opendaylight/openflowplugin/api/openflow/device/DeviceContext;ZLorg/opendaylight/openflowplugin/openflow/md/core/sal/convertor/ConvertorExecutor;)V (DeviceInitializationUtils.java:155)
  at org.opendaylight.openflowplugin.impl.device.DeviceContextImpl.onContextInstantiateService(Lorg/opendaylight/openflowplugin/api/openflow/connection/ConnectionContext;)Z (DeviceContextImpl.java:730)
  at org.opendaylight.openflowplugin.impl.lifecycle.LifecycleServiceImpl.instantiateServiceInstance()V (LifecycleServiceImpl.java:53)
  at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceRegistrationDelegator.instantiateServiceInstance()V (ClusterSingletonServiceRegistrationDelegator.java:46)
  at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.takeOwnership()V (ClusterSingletonServiceGroupImpl.java:291)
  at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.ownershipChanged(Lorg/opendaylight/mdsal/eos/common/api/GenericEntityOwnershipChange;)V (ClusterSingletonServiceGroupImpl.java:237)
  at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.ownershipChanged(Lorg/opendaylight/mdsal/eos/common/api/GenericEntityOwnershipChange;)V (AbstractClusterSingletonServiceProviderImpl.java:145)
  at org.opendaylight.mdsal.singleton.dom.impl.DOMClusterSingletonServiceProviderImpl.ownershipChanged(Lorg/opendaylight/mdsal/eos/dom/api/DOMEntityOwnershipChange;)V (DOMClusterSingletonServiceProviderImpl.java:23)
  at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.onEntityOwnershipChanged(Lorg/opendaylight/mdsal/eos/dom/api/DOMEntityOwnershipChange;)V (EntityOwnershipListenerActor.java:46)
  at org.opendaylight.controller.cluster.datastore.entityownership.EntityOwnershipListenerActor.handleReceive(Ljava/lang/Object;)V (EntityOwnershipListenerActor.java:36)
  at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(Ljava/lang/Object;)V (AbstractUntypedActor.java:26)
  at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(Ljava/lang/Object;Lscala/Function1;)Ljava/lang/Object; (UntypedActor.scala:165)
  at akka.actor.Actor$class.aroundReceive(Lakka/actor/Actor;Lscala/PartialFunction;Ljava/lang/Object;)V (Actor.scala:484)
  at akka.actor.UntypedActor.aroundReceive(Lscala/PartialFunction;Ljava/lang/Object;)V (UntypedActor.scala:95)
  at akka.actor.ActorCell.receiveMessage(Ljava/lang/Object;)V (ActorCell.scala:526)
  at akka.actor.ActorCell.invoke(Lakka/dispatch/Envelope;)V (ActorCell.scala:495)
  at akka.dispatch.Mailbox.processMailbox(IJ)V (Mailbox.scala:257)
  at akka.dispatch.Mailbox.run()V (Mailbox.scala:224)
  at akka.dispatch.Mailbox.exec()Z (Mailbox.scala:234)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec()I (ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(Lscala/concurrent/forkjoin/ForkJoinTask;)V (ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(Lscala/concurrent/forkjoin/ForkJoinPool$WorkQueue;)V (ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run()V (ForkJoinWorkerThread.java:107)




On Mon, May 29, 2017 at 7:52 AM, Michael Vorburger 
<vorbur...@redhat.com> wrote:
Hi guys,
I just ran MAT [1] over an HPROF heap dump taken on OOM in Bug 7370 [2], and it 
(MAT) raises a "leak suspect" in akka.dispatch.Dispatcher - see the [3] 
java_pid19570_Leak_Suspects.zip just attached to Bug 7370 ... questions:
Is this perhaps something you jump at with an "ah that, we know about it and 
already fixed that in ..." ?

If not, how do we go about better understanding the root cause of this, so that 
we can eventually fix it?

My underlying assumption here is that this isn't "normal" and not just "by 
design" - if it is, I'd love some education... like I'm hoping that the 
conclusion here isn't simply that MD SAL's data store is a dumb in-memory 
database which basically just takes many GBs to keep (all) YANG model instances 
on the heap - or is it?

Tx,
M.

[1] https://www.eclipse.org/mat/

[2] https://bugs.opendaylight.org/show_bug.cgi?id=7370

[3] https://bugs.opendaylight.org/attachment.cgi?id=1816

[4] https://bugs.opendaylight.org/attachment.cgi?id=1819

--
Michael Vorburger, Red Hat
vorbur...@redhat.com | IRC: vorburger @freenode | ~ = http://vorburger.ch

_______________________________________________
mdsal-dev mailing list
mdsal-...@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/mdsal-dev



