On Tue, Jul 02, 2002 at 01:32:09PM +0200, Sven Neumann wrote:
> Hi,

Hi,

> you should be using _exit().

You are right.  Using _exit() fixes the problem in my example, where
the child did not exit and so never caused a SIGCHLD.  With _exit(),
the child process now gets created and exits as it should, raising
the SIGCHLD signal in the parent.
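
For reference, here is a minimal sketch of that create-and-exit
pattern.  It is not my actual test program; the function name
fork_and_exit_child() is made up for illustration, and it waits for
the child directly rather than relying on a SIGCHLD handler:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately with `code`.
 * Returns the exit status the parent observes, or -1 on error. */
int fork_and_exit_child(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: _exit() terminates right away.  Unlike exit(), it
         * does not run atexit() handlers or flush stdio buffers
         * that were inherited from the parent. */
        _exit(code);
    }

    /* Parent: wait for this specific child to terminate. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the sketch is only that the child leaves through _exit()
so that none of the parent's cleanup code runs in the child.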

However...

> I'm not sure if this will solve your
> problem. However I'm surprised that you said your example works with
> GTK+-X11 because of the use of exit() instead of _exit().

I should have clarified.  The original (more useful) program complete
with the g_spawn_async_with_pipes() works in X11.  I did not really
test the stripped down "create and exit" child example in X11.  I
didn't think of a reason to.

> This is a
> frequent mistake and it is mentioned in the GTK+ FAQ.

Indeed.  I should have noticed that.

Using _exit() does seem to fix the child-create-and-exit problem, but
things don't go very well after that.  For some reason, the
"wait(&status)" in the SIGCHLD signal handler in the parent blocks,
even though a child process has exited.

I know this is basic fork()/exec() programming but again, running
against the X11 or linuxfb gtk libs does not have this problem.  I
am wondering if it has something to do with the multi-threaded
nature of directfb.

To get back to the real problem (now that child creation/exit seems to
work), here is the meat of my real callback:

      signal(SIGCHLD, child_reaper);
      fprintf(stderr, "about to start mplaying\n");
      if (!g_spawn_async_with_pipes(NULL, argv, NULL,
                                    G_SPAWN_DO_NOT_REAP_CHILD
                                    | G_SPAWN_STDOUT_TO_DEV_NULL
                                    | G_SPAWN_STDERR_TO_DEV_NULL,
                                    NULL, NULL, NULL, &mplayer_cmd, NULL, NULL,
                                    NULL)) {
        fprintf(stderr, "error starting mplayer\n");
        mplaying = FALSE;
      }
      fprintf(stderr, "should be mplaying\n");

And the strace (with annotations below the lines they are annotating):

[ Process startup removed for brevity.  I don't know what the limit on
  message size is on this list.  I would be happy to send a full
  strace if it would be helpful and the list processor doesn't barf on
  it.
]
...
1562  rt_sigaction(SIGCHLD, {0x40548530, [CHLD], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0

my signal handler being installed

1562  write(2, "about to start mplaying\n", 24) = 24
1562  pipe([10, 12])                    = 0
1562  pipe([13, 14])                    = 0

in g_spawn_async_with_pipes(), set up the pipes

1562  fork()                            = 1572
1572  rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_DFL}, 8) = 0
1572  close(10)                         = 0
1572  close(-1)                         = -1 EBADF (Bad file descriptor)
1572  close(14)                         = 0
1572  close(-1)                         = -1 EBADF (Bad file descriptor)
1572  close(-1)                         = -1 EBADF (Bad file descriptor)
1572  getrlimit(0x7, 0xbfffe7a8)        = 0
1572  fcntl64(3, F_SETFD, FD_CLOEXEC)   = 0
1572  fcntl64(4, F_SETFD, FD_CLOEXEC)   = 0
1572  fcntl64(5, F_SETFD, FD_CLOEXEC)   = 0
...
1572  fcntl64(408, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1572  fcntl64(409, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1572  fcntl64(410, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1567  <... read resumed> "000000000000e6ac 01 SPACE MY_REM"..., 128) = 36
1567  read(8,  <unfinished ...>
1572  fcntl64(411, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1572  fcntl64(412, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1572  fcntl64(413, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
...
1572  fcntl64(1021, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1572  fcntl64(1022, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1572  fcntl64(1023, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
1572  dup2(13, 0)                       = 0
1572  close(13)                         = 0
1572  open("/dev/null", O_WRONLY|O_LARGEFILE) = 10
1572  dup2(10, 1)                       = 1
1572  close(10)                         = 0
1572  open("/dev/null", O_WRONLY|O_LARGEFILE) = 10
1572  dup2(10, 2)                       = 2
1572  close(10)                         = 0
1572  write(7, "\200\32U@\2\0\0\0\0\0\0\0\0\20\0\0\0\0\0\0\241\203\0@\0"..., 148) = 148
1565  <... poll resumed> [{fd=6, events=POLLIN, revents=POLLIN}], 1, 2000) = 1
1572  rt_sigprocmask(SIG_SETMASK, NULL,  <unfinished ...>
1565  getppid( <unfinished ...>
1572  <... rt_sigprocmask resumed> [RTMIN], 8) = 0
1565  <... getppid resumed> )           = 1562
1572  rt_sigsuspend([] <unfinished ...>

this is the last time process 1572 executes any system calls.
Why does it get suspended before it gets a chance to do anything?

1565  read(6, "\200\32U@\2\0\0\0\0\0\0\0\0\20\0\0\0\0\0\0\241\203\0@\0"..., 148) = 148
1565  kill(1568, SIGRT_1 <unfinished ...>
1568  <... read resumed> 0xbf3ffacc, 64) = ? ERESTARTSYS (To be restarted)
1568  --- SIGRT_1 (Real-time signal 1) ---
1568  _exit(0)                          = ?

I don't know why this process is exiting (this is a clone() of 1565 --
the main process).

1565  <... kill resumed> )              = 0
1565  --- SIGRT_1 (Real-time signal 1) ---
1565  sigreturn()                       = ? (mask now ~[TRAP KILL STOP])
1565  kill(1567, SIGRT_1 <unfinished ...>
1567  <... read resumed> 0xbf5ffa8c, 128) = ? ERESTARTSYS (To be restarted)
1567  --- SIGRT_1 (Real-time signal 1) ---
1567  _exit(0)                          = ?

Or this one (which is another clone of 1565)

1565  <... kill resumed> )              = 0
1565  --- SIGRT_1 (Real-time signal 1) ---
1565  sigreturn()                       = ? (mask now ~[TRAP KILL STOP])
1565  kill(1566, SIGRT_1 <unfinished ...>
1566  <... read resumed> 0xbf7ffa0c, 256) = ? ERESTARTSYS (To be restarted)
1566  --- SIGRT_1 (Real-time signal 1) ---
1566  _exit(0)                          = ?

Or this one (which is again a clone of 1565)

1565  <... kill resumed> )              = 0
1565  --- SIGRT_1 (Real-time signal 1) ---
1565  sigreturn()                       = ? (mask now ~[TRAP KILL STOP])
1565  wait4(1568,  <unfinished ...>
1562  close(12)                         = 0
1562  close(-1)                         = -1 EBADF (Bad file descriptor)
1562  close(13)                         = 0
1562  close(-1)                         = -1 EBADF (Bad file descriptor)
1562  close(-1)                         = -1 EBADF (Bad file descriptor)
1562  read(10, 

Note that strace spits out the following messages:

Process 1572 attached
Process 1568 detached
Process 1567 detached
Process 1566 detached
Process 1565 suspended

I am not sure why it's reporting process 1565 as being suspended.

Is there some strange interaction going on here between threads and
spawned processes?

b.

-- 
Brian J. Murrell
