Hi guys,

For a couple of years now I've been trying to find out why our hosting machines reboot randomly. I have posted about this on the list before and got some tips, mostly about hardware.

What happens is that both the main server and the backup server (which is just idling) simply reboot, sometimes after 60 days, sometimes after one day. No logs, no strange traffic patterns, nothing.

I enabled kernel debugging and caught a crash dump on our backup machine, which I will post below. The process that crashed is the CPU monitor for cPanel (dcpumon). After I disabled that one, the crash happened in other processes instead (httpd, perl, etc.). I also tried disabling ACPI, rebuilding world with just -O in make.conf, and so on. This morning the main server rebooted again, and it didn't even leave a dump in /var/crash.

The hardware is not the same. I've seen this behaviour on dual Athlons (two different mainboards) and on dual Xeons. It seems related to the SMP code. I played around with the idle and hyperthreading settings in sysctl too. Nothing seems to make any difference at all. The crash dump is below. Does anyone have ANY idea what might cause this?


I think it has to be the cPanel hosting panel, but such an application shouldn't be able to crash the OS...

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 01
fault virtual address   = 0x98
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc06b7f1e
stack pointer           = 0x28:0xece5f730
frame pointer           = 0x28:0xece5f774
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 69885 (dcpumon)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 2d22h1m13s
Dumping 2047 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 2047MB (523904 pages) 2031 2015 1999 1983 1967 1951 1935 1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) backtrace
#0  doadump () at pcpu.h:165
#1  0xc063efca in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xc063f396 in panic (fmt=0xc0870bd4 "%s") at /usr/src/sys/kern/kern_shutdown.c:555
#3  0xc082e16c in trap_fatal (frame=0xece5f6f0, eva=0) at /usr/src/sys/i386/i386/trap.c:831
#4  0xc082de52 in trap_pfault (frame=0xece5f6f0, usermode=0, eva=152) at /usr/src/sys/i386/i386/trap.c:742
#5  0xc082da02 in trap (frame=
{tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 4, tf_esi = 0, tf_ebp = -320473228, tf_isp = -320473316, tf_ebx = 4098, tf_edx = -1002850048, tf_ecx = 0, tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip = -1066696930, tf_cs = 32, tf_eflags = 66118, tf_esp = -320473100, tf_ss = 1017})
    at /usr/src/sys/i386/i386/trap.c:432
#6  0xc0817d0a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc06b7f1e in vn_lock (vp=0x0, flags=4098, td=0xc439b900) at atomic.h:149
#8  0xc05eee46 in procfs_doprocfile (td=0xc439b900, p=0xc9068830, pn=0xc35f3900, sb=0x4, uio=0x0) at /usr/src/sys/fs/procfs/procfs.c:73
#9  0xc05f3f5b in pfs_readlink (va=0x4) at pcpu.h:162
#10 0xc0841a13 in VOP_READLINK_APV (vop=0x4, a=0xc439b900) at vnode_if.c:1481
#11 0xc06b14e3 in kern_readlink (td=0xc439b900, path=0xc439b900 "<j\006É x\006É", pathseg=3292117248, buf=0x4 <Address 0x4 out of bounds>, bufseg=4,
    count=1024) at vnode_if.h:772
#12 0xc06b13e8 in readlink (td=0x4, uap=0xc439b900) at /usr/src/sys/kern/vfs_syscalls.c:2261
#13 0xc082e573 in syscall (frame=
{tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 135512892, tf_esi = 135663632, tf_ebp = -1077940936, tf_isp = -320471708, tf_ebx = 674109588, tf_edx = -1077941960, tf_ecx = 0, tf_eax = 58, tf_trapno = 0, tf_err = 2, tf_eip = 672579140, tf_cs = 51, tf_eflags = 647, tf_esp = -1077942020, tf_ss = 59})
    at /usr/src/sys/i386/i386/trap.c:976
#14 0xc0817d5f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
#15 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
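
As a reading aid for the dump above (not part of the kgdb output): kgdb prints the trap frame registers as signed decimals, which makes them hard to compare with the hex values in the panic message. The short Python sketch below converts the values from frame #5 back to unsigned hex and decodes the page-fault error code. The helper names to_u32 and decode_pf_error are made up for illustration, and the bit meanings assumed are the standard i386 page-fault error code bits (bit 0 = present, bit 1 = write, bit 2 = user).

```python
# Reading aid for the trap frame in frame #5; helper names are hypothetical.

def to_u32(value):
    """Reinterpret a signed 32-bit integer (as printed by kgdb) as unsigned."""
    return value & 0xFFFFFFFF

def decode_pf_error(err):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "present": bool(err & 0x1),  # False = page not present
        "write": bool(err & 0x2),    # True = fault was caused by a write
        "user": bool(err & 0x4),     # False = fault happened in supervisor mode
    }

# Values copied from frame #5 and from trap_pfault's eva argument in frame #4.
frame5 = {"tf_eip": -1066696930, "tf_ebp": -320473228, "tf_err": 2}
eva = 152

print(hex(to_u32(frame5["tf_eip"])))      # 0xc06b7f1e, the "instruction pointer" line
print(hex(to_u32(frame5["tf_ebp"])))      # 0xece5f774, the "frame pointer" line
print(hex(eva))                           # 0x98, the "fault virtual address" line
print(decode_pf_error(frame5["tf_err"]))  # supervisor write, page not present
```

The decoded values agree with the panic message. A write to address 0x98 together with vp=0x0 in frame #7 would be consistent with vn_lock touching a field at a small offset from a NULL vnode pointer, but that is only an interpretation of the backtrace, and the arguments kgdb shows may not be reliable.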

/Robin
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
