Re: 1023rd thread crashes 2.4.0-test8 from non-root user (fwd)
On Mon, Sep 25, 2000 at 03:02:05PM -0700, Linus Torvalds wrote:
>	sigdelset(&list->signal, sig);

I just tested this using my perl-5.005-threads program... no change from my last email (only 1023 threads created, program fails to respond to ctrl-c when more than 1023 threads are attempted). This _appears_ to be a bug in perl-5.005-threads as shipped with debian potato.

Using Mark Hahn's test code, I get all 2000 threads successfully created, and they respond properly when killed via ctrl-c. So that appears to fix the problem.

ASSUMING the perl-5.005-thread problem is indeed a perl problem, I think this solves the kernel crash problem.

(NOTE: I have tested this with max_queued_signals at 4096 and 1024... no difference for either perl or Mark's code.)

I'll get the source to perl-5.005-thread and play with it later tonight.

-- 
Ted Deppner
http://www.psyber.com/~ted/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: 1023rd thread crashes 2.4.0-test8 from non-root user
On Mon, Sep 25, 2000 at 10:33:06AM +0200, Ingo Molnar wrote:
> On Mon, 25 Sep 2000, Ted Deppner wrote:
>
> > I ask because on my perl-threads test case, I can't create more than 1023
> > threads, but I get a kernel crash when I've _attempted_ to create more
> > than 1023 and hit ctrl-c.
>
> could you test this with the kernel/signal.c:max_queued_signals
> initialization change i suggested? Does it still crash?

With max_queued_signals=4096, I can still only create 1022 threads under perl-5.005-threads. With more than 1023 threads the process no longer responds to ctrl-c, or to a kill -INT. A kill -9 will kill it, however, with no kernel lockup. Under 1023 threads the process responds to ctrl-c.

It seems like the bug is definitely involved in signal handling, and that max_queued_signals affects it in some way...

My ulimit -a from bash... you can see open files at 1024, but I'm not doing open files stuff in my test program (threadcrash.pl).

core file size (blocks)     0
data seg size (kbytes)      unlimited
file size (blocks)          unlimited
max locked memory (kbytes)  unlimited
max memory size (kbytes)    unlimited
open files                  1024
pipe size (512 bytes)       8
stack size (kbytes)         8192
cpu time (seconds)          unlimited
max user processes          4093
virtual memory (kbytes)     unlimited

I upped my open-files to 2048 and still was unable to get more than 1022 threads running. I wonder if perl-5.005-threads might have a static limit set somewhere inside it. Maybe I'll try to recompile it tonight and see what happens.

-- 
Ted Deppner
http://www.psyber.com/~ted/
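For anyone repeating the limit check above from inside a test program rather than from bash, a minimal sketch using getrlimit() follows. The helper names (read_soft_limit, print_limit, print_thread_related_limits) are made up for this illustration and are not part of threadcrash.pl or Mark's test code; the choice of which three limits to print is likewise an assumption about what might cap thread creation.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Read the soft limit for one resource; returns 0 on success, -1 on error. */
static int read_soft_limit(int resource, rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}

/* Print one limit in roughly the style of `ulimit -a`. */
static void print_limit(const char *label, int resource)
{
    rlim_t value;

    if (read_soft_limit(resource, &value) != 0) {
        perror(label);
        return;
    }
    if (value == RLIM_INFINITY)
        printf("%-20s unlimited\n", label);
    else
        printf("%-20s %llu\n", label, (unsigned long long)value);
}

/* Report the limits assumed here to be the likeliest caps on thread creation. */
static void print_thread_related_limits(void)
{
    print_limit("open files", RLIMIT_NOFILE);
    print_limit("max user processes", RLIMIT_NPROC);
    print_limit("stack size (bytes)", RLIMIT_STACK);
}
```

Calling print_thread_related_limits() at the start of main() records the limits the process actually inherited, which may differ from what an interactive shell reports. Note that getrlimit() reports the stack limit in bytes, whereas bash prints it in kbytes.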
Re: 1023rd thread crashes 2.4.0-test8 from non-root user (fwd)
Duh. This was a really stupid bug.

In kernel/signal.c, collect_signal(), for the case where we don't find a siginfo block, we need to clear the signal set.

In short, add the line

	sigdelset(&list->signal, sig);

just before the first "return 1" in collect_signal(), and all should be well (famous last words - it's untested, but I'm sure that's it).

If I'm right, the kernel didn't properly crash, but it would send the signal on and on again forever, which would basically kill the machine if something like init or X or a number of other important cases got stuck doing nothing.

		Linus
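The failure mode described above can be illustrated with a toy userspace model. This is a sketch only, not the kernel's collect_signal(): the struct, the function names (toy_send, toy_collect, toy_deliveries) and the tiny QUEUE_LIMIT are all invented for the illustration, and the queue limit is modelled per structure purely to keep the example small. It shows a signal that gets a pending bit but no queued info entry, and what happens when the "not found in the queue" path does or does not clear that bit.

```c
#include <stdbool.h>

#define QUEUE_LIMIT 4  /* toy stand-in for max_queued_signals (hypothetical value) */

/* Toy model of a pending-signal set: a bitmask plus a bounded info queue. */
struct toy_pending {
    unsigned long mask;      /* bit N set => signal N is pending */
    int queue[QUEUE_LIMIT];  /* signal numbers that received an info entry */
    int nqueued;
};

/* Mark sig pending; record an info entry only while the queue has room. */
static void toy_send(struct toy_pending *p, int sig)
{
    p->mask |= 1UL << sig;
    if (p->nqueued < QUEUE_LIMIT)
        p->queue[p->nqueued++] = sig;
}

/* Return true if sig was pending. with_fix selects whether the
 * "no info entry found" path clears the pending bit. */
static bool toy_collect(struct toy_pending *p, int sig, bool with_fix)
{
    if (!(p->mask & (1UL << sig)))
        return false;

    for (int i = 0; i < p->nqueued; i++) {
        if (p->queue[i] == sig) {
            p->queue[i] = p->queue[--p->nqueued];  /* drop the entry */
            p->mask &= ~(1UL << sig);
            return true;
        }
    }

    /* No info entry: the queue was full when the signal was sent. */
    if (with_fix)
        p->mask &= ~(1UL << sig);  /* the analogue of the added sigdelset() */
    return true;
}

/* Count how often one overflowed signal is delivered, capped at max_tries. */
static int toy_deliveries(bool with_fix, int max_tries)
{
    struct toy_pending p = { 0 };
    int delivered = 0;

    for (int i = 0; i < QUEUE_LIMIT; i++)
        toy_send(&p, 10);  /* fill the queue with another signal */
    toy_send(&p, 2);       /* gets a pending bit but no queue entry */

    while (delivered < max_tries && toy_collect(&p, 2, with_fix))
        delivered++;
    return delivered;
}
```

In this model, toy_deliveries(false, 100) keeps reporting the signal until the cap is reached, while toy_deliveries(true, 100) reports it once, which mirrors the "send the signal on and on again forever" behaviour in miniature.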
Re: 1023rd thread crashes 2.4.0-test8 from non-root user
btw., maybe it's init that gets those 2000 signals, not bash?

	Ingo
Re: 1023rd thread crashes 2.4.0-test8 from non-root user
indeed, after changing max_queued_signals to 4096, i cannot crash the kernel anymore with 2000 threads.

	Ingo
Re: 1023rd thread crashes 2.4.0-test8 from non-root user
On Mon, 25 Sep 2000, Mark Hahn wrote:

> > The problem is large numbers of threads in 2.4.0-test8 can result in a
> > hard crash of the entire kernel. This can be done as a non-root user.
>
> this appears to be reproducible (128M duron, haven't tried intel UP/SMP):

i've done some experimentation, and to me it appears we overload the queued signal limit of bash, or something like that? The Ctrl-C thing definitely creates a lot of signals. And the default limit for queued signals [kernel/signal.c:max_queued_signals] is 1024 ...

so i think this is threading-unrelated; to me it (tentatively) looks to be a signal handling bug.

	Ingo
Re: 1023rd thread crashes 2.4.0-test8 from non-root user
> The problem is large numbers of threads in 2.4.0-test8 can result in a
> hard crash of the entire kernel. This can be done as a non-root user.

this appears to be reproducible (128M duron, haven't tried intel UP/SMP):

// code derived from a clone demo in lmbench.
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <sched.h>
#include <syscall.h>
#include <errno.h>
#include <sys/time.h>
#include <asm/atomic.h>

int do_clone(void (*fn)(void *), void *data, char *stack) {
    long retval;
    *--(void**)stack = data;
    __asm__ __volatile__(
        "int $0x80\n\t"         /* Linux/i386 system call */
        "testl %0,%0\n\t"       /* check return value */
        "jne 1f\n\t"            /* jump if parent */
        "call *%3\n\t"          /* start subthread function */
        "movl %2,%0\n\t"
        "int $0x80\n"           /* exit system call: exit subthread */
        "1:\t"
        :"=a" (retval)
        :"0" (__NR_clone),"i" (__NR_exit),
         "r" (fn),
         "b" (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD),
         "c" (stack));
    if (retval < 0) {
        errno = -retval;
        retval = -1;
    }
    return retval;
}

atomic_t counter = ATOMIC_INIT(0);
atomic_t die = ATOMIC_INIT(0);

void kid(void *data) {
    atomic_inc(&counter);
    while (!atomic_read(&die))
        sleep(1);
    exit(0);
}

double gtod() {
    struct timeval tv;
    gettimeofday(&tv,0);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
}

int main() {
    const unsigned n = 2000;
    const int stackPerThread = 4096;
    char stack[n * stackPerThread];
    char *stacktop = stack + sizeof(stack) - 1;

    double before = gtod();
    for (unsigned i=0; i<n; i++) {
        if (do_clone(kid, (void*) "hey", stacktop) < 0) {
            perror("clone");
            exit(1);
        }
        stacktop -= 4096;
    }
    double elapsed = gtod() - before;
    printf("OK, created %d threads in %f seconds (%f/second)\n",
           n, elapsed, n/elapsed);
    printf("hit any key to tell them all to die...");
    fflush(stdout);
    getchar();
    atomic_set(&die,1);
    for (int c=0; c<atomic_read(&counter); c++)
        wait(0);
    printf("OK, all dead\n");
    return 0;
}