Hi PPC fellows,

We are facing a crash under high load on our TS console servers (2.2.14 based).
The test used to reproduce the crash involves running SSH connection attempts in a loop from a fast host. After one or two hours of testing, the crash happens. It's still possible to ping the box and it answers to typed keys, but that's all.

The kernel is looping in the page fault handling code as follows, which has been observed from a BDI2000 and gdb:

    (gdb) cont
    Continuing.

(locked here, so I type "ctrl+c" on the gdb session).

    Program received signal SIGSTOP, Stopped (signal).
    local_flush_tlb_page (vma=0xce678200, vmaddr=2147481140) at init.c:549
    549             asm volatile ("tlbia" : : );
    (gdb) bt
    #0  local_flush_tlb_page (vma=0xce678200, vmaddr=2147481140) at init.c:549
    #1  0xc0019368 in handle_mm_fault (tsk=0xce95e000, vma=0xce678200,
        address=2147481140, write_access=33554432) at memory.c:918
    Cannot access memory at address 0xce95fca0
    (gdb) cont
    Continuing.

And it keeps receiving faults from this address (7FFFF634 in this example, sometimes also 7FFFF630), which is part of the process's last VMA. Forever.

    # cat /proc/1/maps
    30023000-30026000 rwxp 00013000 01:00 249        /lib/ld-2.1.3.so
    30026000-30027000 rwxp 00000000 00:00 0
    7fffe000-80000000 rwxp fffff000 00:00 0

The "error_code" passed to "do_page_fault" during this endless loop is either 0xE (14) or 0x82000000 (2181038080).

handle_mm_fault trace for such an "unsuccessful pte bringup":

    #0  handle_mm_fault (tsk=0xce70c000, vma=0xce6188c0, address=2147481140,
        write_access=33554432) at memory.c:901
    903             if (!pte_present(entry)) {
    909             entry = pte_mkyoung(entry);
    910             set_pte(pte, entry);
    911             flush_tlb_page(vma, address);
    912             if (write_access) {
    913                     if (!pte_write(entry))
    303             pte_val(pte) |= _PAGE_DIRTY;
    304             if (pte_val(pte) & _PAGE_RW)
    305                     pte_val(pte) |= _PAGE_HWWRITE;
    918             flush_tlb_page(vma, address);
    916             entry = pte_mkdirty(entry);
    917             set_pte(pte, entry);
    918             flush_tlb_page(vma, address);
    921             return 1;

I should try to figure out why it is faulting. Maybe the pte is not being set up correctly. Any hints are welcome.
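
For reference, here is a small sketch that cross-checks the decimal values gdb prints in the traces above against the hex values quoted in the text. It only does arithmetic on numbers already shown (address, write_access, error_code, and the last VMA range from /proc/1/maps). The helper names are made up for illustration, and treating 0x02000000 as the "store" bit of the fault status word is my assumption, based on write_access being exactly that value.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Bounds copied from the /proc/1/maps line "7fffe000-80000000". */
#define LAST_VMA_START 0x7fffe000u
#define LAST_VMA_END   0x80000000u

/* ASSUMPTION: 0x02000000 is the "store" bit of the fault status word,
 * i.e. the part of error_code that ends up in write_access. */
#define FAULT_STORE_BIT 0x02000000u

/* Hypothetical helper: 1 if addr lies inside the half-open range [start, end). */
static int addr_in_vma(uint32_t addr, uint32_t start, uint32_t end)
{
    return addr >= start && addr < end;
}

/* Hypothetical helper: mask out the assumed store bit from an error code. */
static uint32_t write_access_from_error(uint32_t error_code)
{
    return error_code & FAULT_STORE_BIT;
}

/* Print the traced values in hex and check them against the text. */
static void check_trace_values(void)
{
    uint32_t address      = 2147481140u; /* address= in the gdb backtrace */
    uint32_t write_access = 33554432u;   /* write_access= in the backtrace */
    uint32_t error_code   = 2181038080u; /* one of the two error codes seen */

    printf("address      = 0x%08" PRIx32 "\n", address);
    printf("write_access = 0x%08" PRIx32 "\n", write_access);
    printf("error_code   = 0x%08" PRIx32 "\n", error_code);

    assert(address == 0x7ffff634u);
    assert(error_code == 0x82000000u);
    assert(addr_in_vma(address, LAST_VMA_START, LAST_VMA_END));
    assert(write_access_from_error(error_code) == write_access);
}
```

Calling check_trace_values() from a main() confirms that the faulting address sits inside the last VMA and that write_access matches error_code 0x82000000 masked with 0x02000000. Under the same assumption, the other error code seen (0xE) would give write_access == 0.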
/proc/cpuinfo

    processor       : 0
    cpu             : 8xx
    clock           : 48MHz
    clock           : 48MHz
    bus clock       : 48MHz
    revision        : 0.0
    bogomips        : 47.82
    zero pages      : total 0 (0Kb) current: 0 (0Kb) hits: 0/124087 (0%)

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/