Hi,

I am running an embedded Linux (kernel 2.4.18) system with a bunch of multithreaded applications compiled against uClibc (0.9.26). Very occasionally, a couple of the processes die with a segmentation fault. The core file shows the process dying on the bnslr instruction inside the uClibc library code, immediately after a system call. I don't know why the kernel returned from the system call, because nothing should have been going on at the time and the process should have been sleeping in kernel space in the accept call.
I would like any pointers from people more knowledgeable about the PPC architecture as to why a bnslr instruction would cause a segmentation fault (the LR register is correct). I am assuming that the registers saved by the OS when the fault occurred are precise (captured at the instant of the fault and not some time after). It also bothers me that the OS returned from the system call at all, as there was no activity going on. Attached below is some debugging info from gdb.

Thanks,
Vijay

GNU gdb 5.2.1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-linux"...
Core was generated by `xmlagent.out 1 2 6500 6500 1'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libpthread.so.0...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libc.so.0...done.
Loaded symbols for /lib/libc.so.0
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1
#0  0x30072000 in __uClibc_syscall () from /lib/libc.so.0
(gdb) bt
#0  0x30072000 in __uClibc_syscall () from /lib/libc.so.0
#1  0x3005f7b0 in accept () from /lib/libc.so.0
#2  0x3002175c in accept () from /lib/libpthread.so.0
#3  0x10002718 in xmlSocketThread (arg=0x200) at xmlMain.c:335
#4  0x3001b3ec in __pthread_manager_event () from /lib/libpthread.so.0
#5  0x30071fe0 in clone () from /lib/libc.so.0
(gdb) info frame
Stack level 0, frame at 0x7f7ffc60:
 pc = 0x30072000 in __uClibc_syscall; saved pc 0x3005f7b0
 (FRAMELESS), called by frame at 0x7f7ffc60
 Arglist at 0x7f7ffc60, args:
 Locals at 0x7f7ffc60, Previous frame's sp is 0x0
(gdb) disassemble
Dump of assembler code for function __uClibc_syscall:
0x30071ffc <__uClibc_syscall>:    sc
0x30072000 <__uClibc_syscall+4>:  bnslr              --------> segmentation fault
0x30072004 <__uClibc_syscall+8>:  b 0x300897f8 <___brk_addr+2276>
End of assembler dump.
(gdb) info registers
r0    0x66        102
r1    0x7f7ffc60  2139094112
r2    0x0         0
r3    0x200       512
r4    0x7f7ffc68  2139094120
r5    0x7f7ffd1c  2139094300
r6    0x8         8
r7    0x38        56
r8    0x21        33
r9    0x7f7ffcc8  2139094216
r10   0x7f7ffc70  2139094128
r11   0x30071ffc  805773308
r12   0x1005a07c  268804220
r13   0x0         0
r14   0x30005140  805327168
r15   0x30005000  805326848
r16   0x2         2
r17   0xe4        228
r18   0x300050a0  805327008
r19   0xe4        228
r20   0x30035bf0  805526512
r21   0x30035d60  805526880
r22   0x0         0
r23   0x0         0
r24   0x100025f4  268445172
r25   0x10076ccc  268922060
r26   0x10070000  268894208
r27   0x7f7ffd1c  2139094300
r28   0x7f7ffcc8  2139094216
r29   0x10        16
r30   0x100726d8  268904152
r31   0x10072744  268904260
pc    0x30072000  805773312
ps    0xd030      53296
cr    0x32002028  838869032
lr    0x3005f7b0  805697456
ctr   0x30071ffc  805773308
xer   0x20000000  536870912
(gdb) disassemble 0x3005f7b0                         --------> LR location
Dump of assembler code for function accept:
0x3005f784 <accept>:     stwu r1,-48(r1)
0x3005f788 <accept+4>:   mflr r0
0x3005f78c <accept+8>:   stw r0,52(r1)
0x3005f790 <accept+12>:  mr r9,r4
0x3005f794 <accept+16>:  mr r0,r3
0x3005f798 <accept+20>:  addi r4,r1,8
0x3005f79c <accept+24>:  li r3,5
0x3005f7a0 <accept+28>:  stw r0,8(r1)
0x3005f7a4 <accept+32>:  stw r9,12(r1)
0x3005f7a8 <accept+36>:  stw r5,16(r1)
0x3005f7ac <accept+40>:  bl 0x30088fb8 <___brk_addr+164>
0x3005f7b0 <accept+44>:  lwz r0,52(r1)
0x3005f7b4 <accept+48>:  addi r1,r1,48
0x3005f7b8 <accept+52>:  mtlr r0
0x3005f7bc <accept+56>:  blr