The long story is below.

-------- Forwarded Message --------
Subject: Re: trinity seems not to reap all childs
Date: Wed, 13 Aug 2014 11:02:31 -0400
From: Dave Jones <da...@redhat.com>
To: Toralf Förster <toralf.foers...@gmx.de>
CC: trin...@vger.kernel.org

On Sat, Aug 09, 2014 at 06:35:45PM +0200, Toralf Förster wrote:
 > Over the last few days I have observed that, under a 32-bit Gentoo UML guest,
 > one trinity job sometimes survives even though all of its parents are
 > already gone.
 > 
 > 
 > The console output on the host system is:
 > 
 > [main] Bailing main loop because Completed maximum number of operations..
 > [watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
 > [init] Ran 100001 syscalls. Successes: 21199  Failures: 78802
 > 
 > 
 > ps shows that there is still one job running in the guest:
 > 
 > $ ssh tfoerste@trinity "ps fx -eo pid,start_time,command | grep -e trinity -e sleep | grep -v grep"
 >  2723 17:55 trinity -C 2 -N 100000 -x mremap -q -V /mnt/ramdisk/victims/v1/v2

If it happens again, grab the output of /proc/2723/stack.
(You might need something that enables CONFIG_STACKTRACE in your kernel,
 or apply the patch below if nothing does -- I still need to get that
 upstream.)
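
A minimal sketch of grabbing that stack inside the guest. It assumes a
root shell and a kernel built with CONFIG_STACKTRACE; the function name
dump_kernel_stack is made up for illustration and is not part of trinity.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel stack of a PID, or say why it cannot.
dump_kernel_stack() {
    pid="$1"
    stack_file="/proc/$pid/stack"

    if [ ! -d "/proc/$pid" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    if [ ! -e "$stack_file" ]; then
        # The file only exists when the kernel has CONFIG_STACKTRACE enabled.
        echo "$stack_file missing; kernel probably lacks CONFIG_STACKTRACE" >&2
        return 2
    fi
    # Reading the file normally requires root.
    if ! cat "$stack_file"; then
        echo "cannot read $stack_file (need root?)" >&2
        return 3
    fi
}

# Example, using the pid from the ps output above:
#   dump_kernel_stack 2723
```

The separate return codes are only there to tell "process is gone" apart
from "kernel has no stack trace support" when this is run over ssh.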


 > [watchdog] 30087 iterations. [F:23623 S:6463 HI:9706]
 > [watchdog] 40138 iterations. [F:31443 S:8694 HI:9706]
 > [watchdog] 50215 iterations. [F:39370 S:10844 HI:9706]
 > [watchdog] 60221 iterations. [F:47228 S:12992 HI:9706]
 > [watchdog] 70225 iterations. [F:55100 S:15124 HI:9706]
 > [watchdog] 80278 iterations. [F:63007 S:17270 HI:9706]
 > [watchdog] 90287 iterations. [F:71013 S:19273 HI:9706]
 > [main] Bailing main loop because Completed maximum number of operations..
 > [watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
 > [init] Ran 100001 syscalls. Successes: 21199  Failures: 78802
 > 
 > Fortunately, killing the job helped:
 > 
 > 
 > $ ssh tfoerste@trinity kill 2723

Puzzling that the watchdog exited while there were still children around.

Something else that might be interesting would be to attach a debugger to
the still-running pid and examine shm->running_childs.
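
One way to do that non-interactively, as a rough sketch: it assumes gdb is
installed in the guest, that trinity was built with debug symbols so the
shm symbol resolves, and that you are allowed to ptrace the process (same
user or root). The wrapper name inspect_running_childs is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: attach gdb to a live trinity pid and print one field.
inspect_running_childs() {
    pid="$1"
    if [ -z "$pid" ]; then
        echo "usage: inspect_running_childs PID" >&2
        return 1
    fi
    # -p attaches to the running process; -batch runs the -ex commands
    # and then exits, which detaches from the process again.
    gdb -p "$pid" -batch -ex 'print shm->running_childs'
}

# Example, using the pid from the ps output above:
#   inspect_running_childs 2723
```

The process is stopped only while gdb is attached, so this should be safe
to run against the leftover child before killing it.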

        Dave

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index cb45f59685e6..38133ddb8bb4 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1008,8 +1008,13 @@ config TRACE_IRQFLAGS
          either tracing or lock debugging.
 
 config STACKTRACE
-       bool
+       bool "Stack backtrace support"
        depends on STACKTRACE_SUPPORT
+       help
+         This option causes the kernel to create a /proc/pid/stack for
+         every process, showing its current stack trace.
+         It is also used by various kernel debugging features that require
+         stack trace generation.
 
 config DEBUG_KOBJECT
        bool "kobject debugging"




------------------------------------------------------------------------------
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
