Re: PATCH: SysV semaphore race vs SIGSTOP
lør, 05,.02.2005 kl. 09.37 +0100, skrev Manfred Spraul: > Hi Ove, > > >As I mentioned in an earlier mail, there is a race when SIGSTOP-ing a > >process waiting for a SysV semaphore, where if a process holding a > >semaphore suspends another process waiting on the semaphore and then > >releases the semaphore, > > > Your patch looks correct (for 2.4 - 2.6 uses a different approach), but Actually I myself don't think the patch I sent is 100% correct. The most glaring problem is of course that it only handles SIGSTOP and nothing else, but I wasn't sure how to handle anything else. Also, I've since found that the "continue" in there made it impossible to attach debuggers or the like to a process blocked in semop. Finally, it would occasionally continue to loop after the condition is satisfied, e.g. when the semop is 0. So, the if check in my patch is more correct as + if (is_stopping(current) && queue.status == 1) + /* Could either EINTR out or continue. +* I suppose EINTR is the robust choice. */ + queue.status = -EINTR; but that still doesn't handle all signals. I'm thinking that if the SIGSTOP problem can be solved simply by returning EINTR (and let the userspace code deal with that by retrying), then perhaps that can be done for all signals. Summarily, my "perfect" patch for 2.4 might simply look like this (assuming signal_pending does the right thing, which I haven't verified): --- ipc/sem.c.original 2005-01-31 18:17:17.0 -0500 +++ ipc/sem.c 2005-02-06 01:52:01.0 -0500 @@ -961,6 +961,9 @@ error = -EIDRM; goto out_free; } + if (signal_pending(current) && queue.status == 1) + queue.status = -EINTR; + /* * If queue.status == 1 we where woken up and * have to retry else we simply return. As for 2.6, yes, the 2.6 code does seem to be harder to make a simple patch for. Then again, we're thinking of using futexes instead of semaphores if a 2.6 kernel is detected, so if futexes don't have this problem, I guess we can use them to work around this semaphore flaw. > I'm not certain that it's needed: > You assume that signals have an immediate effect. I don't necessarily need it to have an "immediate" effect. Only that pending signals are taken into consideration when the process returns from certain blocking system calls dealing with atomic synchronization primitives such as semaphores. I want to be able to think that synchronization primitives are there to *protect* me against the effects of implementation details such as delayed signal delivery, not that they have their own synchronization issues. After all, the man page for "semop" states that * The calling process catches a signal: the value of semzcnt is decremented and semop() fails, with errno set to EINTR. and I want to be able to expect this to always actually happen. Because semop is supposed to be an *atomic* operation, I don't expect it to *both* catch a signal *and* succeed within the same semop system call (knowing that the signal *must* have been sent by the time the semaphore condition was fulfilled). See where I'm coming from? > Linux ignores that - it delays signal processing under some > circumstances. If a syscall can be completed without blocking, then the > syscall is handled, regardless of any pending signals. The signal is > handled at the syscall return, i.e. after completing the syscall. > That's not just in SysV semaphore - at least pipes are identical: Perhaps. But at least read() doesn't claim to be an atomic synchronization primitive like semop() does. Though I suppose it's occasionally used that way... > pipe_read first check if there is data. If there is some, then it > returns the data. Signals are only checked if there is no pending data. > I'm not sure if this is a bug. But if it's one, then far more than just > sysv sem must be updated. I'd probably only concern myself with things that are supposed to be atomic, really. Anyway, thanks a lot for answering. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: PATCH: SysV semaphore race vs SIGSTOP
Hi Ove, As I mentioned in an earlier mail, there is a race when SIGSTOP-ing a process waiting for a SysV semaphore, where if a process holding a semaphore suspends another process waiting on the semaphore and then releases the semaphore, Your patch looks correct (for 2.4 - 2.6 uses a different approach), but I'm not certain that it's needed: You assume that signals have an immediate effect. Linux ignores that - it delays signal processing under some circumstances. If a syscall can be completed without blocking, then the syscall is handled, regardless of any pending signals. The signal is handled at the syscall return, i.e. after completing the syscall. That's not just in SysV semaphore - at least pipes are identical: pipe_read first check if there is data. If there is some, then it returns the data. Signals are only checked if there is no pending data. I'm not sure if this is a bug. But if it's one, then far more than just sysv sem must be updated. What about other unices? I've attached a test app that tests pipes. Could someone try it? -- Manfred /* * Copyright (C) 1999, 2001,2005 by Manfred Spraul. * * Redistribution of this file is permitted under the terms of the GNU * General Public License (GPL) * $Header: /home/manfred/cvs-tree/pipetest/test8.cpp,v 1.1 2005/02/05 08:35:01 manfred Exp $ */ #include #include #include #include #include #include #include #include #include #define WRITE_LEN 128 /* less than PIPE_BUF */ int buffer[WRITE_LEN]; void dummy(int sig) { } int main() { int pipes[2]; int ret; ret = pipe(pipes); if(ret != 0) { printf("pipe creation failed, ret %d, errno %d.\n", ret, errno); exit(1); } ret = fork(); if(!ret) { /* child: read from pipe */ printf("child: trying to read %d bytes.\n", WRITE_LEN); ret = read(pipes[0], buffer, WRITE_LEN); /* never executed! */ printf("child: read returned %d.\n", ret); printf("SIGSTOP test result: SIGSTOP completely broken\n"); } else { /* synchronize with timer - the following block must be atomic */ sleep(1); /* begin atomic block */ ret = kill(ret, SIGSTOP); if (ret != 0) { printf("Sending SIGSTOP failed, aborting (errno=%d).\n", errno); exit(1); } ret = write(pipes[1], buffer, WRITE_LEN); if (ret != WRITE_LEN) { printf("Writing to pipe buffer failed, aborting (errno=%d).\n", errno); exit(1); } /* end of atomic block */ printf("parent: yielding\n"); sleep(1); ret = fcntl(pipes[0], F_SETFL, O_NONBLOCK); if (ret != 0) { printf("fcntl(,,O_NONBLOCK) failed, aborting (errno=%d).\n", errno); exit(1); } ret = read(pipes[0], buffer, WRITE_LEN); printf("parent: read returned %d.\n", ret); printf("\n\n"); printf("SIGSTOP test result:\n"); printf("Expected values:\n"); printf(" %d for OS with synchroneous SIGSTOP\n", WRITE_LEN); printf(" -1 for OS with asynchroneous SIGSTOP (errno EAGAIN=%d)\n", EAGAIN); printf("Got: %d (errno: %d)\n", ret, errno); close(pipes[1]); close(pipes[0]); } }
Re: PATCH: SysV semaphore race vs SIGSTOP
Hi Ove, As I mentioned in an earlier mail, there is a race when SIGSTOP-ing a process waiting for a SysV semaphore, where if a process holding a semaphore suspends another process waiting on the semaphore and then releases the semaphore, Your patch looks correct (for 2.4 - 2.6 uses a different approach), but I'm not certain that it's needed: You assume that signals have an immediate effect. Linux ignores that - it delays signal processing under some circumstances. If a syscall can be completed without blocking, then the syscall is handled, regardless of any pending signals. The signal is handled at the syscall return, i.e. after completing the syscall. That's not just in SysV semaphore - at least pipes are identical: pipe_read first check if there is data. If there is some, then it returns the data. Signals are only checked if there is no pending data. I'm not sure if this is a bug. But if it's one, then far more than just sysv sem must be updated. What about other unices? I've attached a test app that tests pipes. Could someone try it? -- Manfred /* * Copyright (C) 1999, 2001,2005 by Manfred Spraul. * * Redistribution of this file is permitted under the terms of the GNU * General Public License (GPL) * $Header: /home/manfred/cvs-tree/pipetest/test8.cpp,v 1.1 2005/02/05 08:35:01 manfred Exp $ */ #include unistd.h #include fcntl.h #include errno.h #include stdio.h #include stdlib.h #include string.h #include signal.h #include limits.h #include errno.h #define WRITE_LEN 128 /* less than PIPE_BUF */ int buffer[WRITE_LEN]; void dummy(int sig) { } int main() { int pipes[2]; int ret; ret = pipe(pipes); if(ret != 0) { printf(pipe creation failed, ret %d, errno %d.\n, ret, errno); exit(1); } ret = fork(); if(!ret) { /* child: read from pipe */ printf(child: trying to read %d bytes.\n, WRITE_LEN); ret = read(pipes[0], buffer, WRITE_LEN); /* never executed! */ printf(child: read returned %d.\n, ret); printf(SIGSTOP test result: SIGSTOP completely broken\n); } else { /* synchronize with timer - the following block must be atomic */ sleep(1); /* begin atomic block */ ret = kill(ret, SIGSTOP); if (ret != 0) { printf(Sending SIGSTOP failed, aborting (errno=%d).\n, errno); exit(1); } ret = write(pipes[1], buffer, WRITE_LEN); if (ret != WRITE_LEN) { printf(Writing to pipe buffer failed, aborting (errno=%d).\n, errno); exit(1); } /* end of atomic block */ printf(parent: yielding\n); sleep(1); ret = fcntl(pipes[0], F_SETFL, O_NONBLOCK); if (ret != 0) { printf(fcntl(,,O_NONBLOCK) failed, aborting (errno=%d).\n, errno); exit(1); } ret = read(pipes[0], buffer, WRITE_LEN); printf(parent: read returned %d.\n, ret); printf(\n\n); printf(SIGSTOP test result:\n); printf(Expected values:\n); printf( %d for OS with synchroneous SIGSTOP\n, WRITE_LEN); printf( -1 for OS with asynchroneous SIGSTOP (errno EAGAIN=%d)\n, EAGAIN); printf(Got: %d (errno: %d)\n, ret, errno); close(pipes[1]); close(pipes[0]); } }
Re: PATCH: SysV semaphore race vs SIGSTOP
lør, 05,.02.2005 kl. 09.37 +0100, skrev Manfred Spraul: Hi Ove, As I mentioned in an earlier mail, there is a race when SIGSTOP-ing a process waiting for a SysV semaphore, where if a process holding a semaphore suspends another process waiting on the semaphore and then releases the semaphore, Your patch looks correct (for 2.4 - 2.6 uses a different approach), but Actually I myself don't think the patch I sent is 100% correct. The most glaring problem is of course that it only handles SIGSTOP and nothing else, but I wasn't sure how to handle anything else. Also, I've since found that the continue in there made it impossible to attach debuggers or the like to a process blocked in semop. Finally, it would occasionally continue to loop after the condition is satisfied, e.g. when the semop is 0. So, the if check in my patch is more correct as + if (is_stopping(current) queue.status == 1) + /* Could either EINTR out or continue. +* I suppose EINTR is the robust choice. */ + queue.status = -EINTR; but that still doesn't handle all signals. I'm thinking that if the SIGSTOP problem can be solved simply by returning EINTR (and let the userspace code deal with that by retrying), then perhaps that can be done for all signals. Summarily, my perfect patch for 2.4 might simply look like this (assuming signal_pending does the right thing, which I haven't verified): --- ipc/sem.c.original 2005-01-31 18:17:17.0 -0500 +++ ipc/sem.c 2005-02-06 01:52:01.0 -0500 @@ -961,6 +961,9 @@ error = -EIDRM; goto out_free; } + if (signal_pending(current) queue.status == 1) + queue.status = -EINTR; + /* * If queue.status == 1 we where woken up and * have to retry else we simply return. As for 2.6, yes, the 2.6 code does seem to be harder to make a simple patch for. Then again, we're thinking of using futexes instead of semaphores if a 2.6 kernel is detected, so if futexes don't have this problem, I guess we can use them to work around this semaphore flaw. I'm not certain that it's needed: You assume that signals have an immediate effect. I don't necessarily need it to have an immediate effect. Only that pending signals are taken into consideration when the process returns from certain blocking system calls dealing with atomic synchronization primitives such as semaphores. I want to be able to think that synchronization primitives are there to *protect* me against the effects of implementation details such as delayed signal delivery, not that they have their own synchronization issues. After all, the man page for semop states that * The calling process catches a signal: the value of semzcnt is decremented and semop() fails, with errno set to EINTR. and I want to be able to expect this to always actually happen. Because semop is supposed to be an *atomic* operation, I don't expect it to *both* catch a signal *and* succeed within the same semop system call (knowing that the signal *must* have been sent by the time the semaphore condition was fulfilled). See where I'm coming from? Linux ignores that - it delays signal processing under some circumstances. If a syscall can be completed without blocking, then the syscall is handled, regardless of any pending signals. The signal is handled at the syscall return, i.e. after completing the syscall. That's not just in SysV semaphore - at least pipes are identical: Perhaps. But at least read() doesn't claim to be an atomic synchronization primitive like semop() does. Though I suppose it's occasionally used that way... pipe_read first check if there is data. If there is some, then it returns the data. Signals are only checked if there is no pending data. I'm not sure if this is a bug. But if it's one, then far more than just sysv sem must be updated. I'd probably only concern myself with things that are supposed to be atomic, really. Anyway, thanks a lot for answering. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
PATCH: SysV semaphore race vs SIGSTOP
As I mentioned in an earlier mail, there is a race when SIGSTOP-ing a process waiting for a SysV semaphore, where if a process holding a semaphore suspends another process waiting on the semaphore and then releases the semaphore, the suspended process might still get the semaphore, resulting in a deadlock. My sample test program is at http://www.ping.uio.no/~ovehk/sembug.c Now I've written a patch for this which seems to work for me. It is against 2.4.27, but the semaphore code doesn't seem to change often. And please Cc me on any questions or comments, since I'm not subscribed. Here is the patch: --- ipc/sem.c.original 2005-01-31 18:17:17.0 -0500 +++ ipc/sem.c 2005-01-31 18:17:34.0 -0500 @@ -307,6 +307,18 @@ return result; } +static int is_stopping(struct task_struct *t) +{ + if (t->state == TASK_STOPPED) { + /* shouldn't happen */ + return 1; + } + if (sigismember(>pending.signal, SIGSTOP)) { + return 1; + } + return 0; +} + /* Go through the pending queue for the indicated semaphore * looking for tasks that can be completed. */ @@ -961,6 +973,12 @@ error = -EIDRM; goto out_free; } + + if (is_stopping(current)) + /* Could either EINTR out or continue. +* Currently I've chosen to continue */ + continue; + /* * If queue.status == 1 we where woken up and * have to retry else we simply return. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
PATCH: SysV semaphore race vs SIGSTOP
As I mentioned in an earlier mail, there is a race when SIGSTOP-ing a process waiting for a SysV semaphore, where if a process holding a semaphore suspends another process waiting on the semaphore and then releases the semaphore, the suspended process might still get the semaphore, resulting in a deadlock. My sample test program is at http://www.ping.uio.no/~ovehk/sembug.c Now I've written a patch for this which seems to work for me. It is against 2.4.27, but the semaphore code doesn't seem to change often. And please Cc me on any questions or comments, since I'm not subscribed. Here is the patch: --- ipc/sem.c.original 2005-01-31 18:17:17.0 -0500 +++ ipc/sem.c 2005-01-31 18:17:34.0 -0500 @@ -307,6 +307,18 @@ return result; } +static int is_stopping(struct task_struct *t) +{ + if (t-state == TASK_STOPPED) { + /* shouldn't happen */ + return 1; + } + if (sigismember(t-pending.signal, SIGSTOP)) { + return 1; + } + return 0; +} + /* Go through the pending queue for the indicated semaphore * looking for tasks that can be completed. */ @@ -961,6 +973,12 @@ error = -EIDRM; goto out_free; } + + if (is_stopping(current)) + /* Could either EINTR out or continue. +* Currently I've chosen to continue */ + continue; + /* * If queue.status == 1 we where woken up and * have to retry else we simply return. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/