bce kernel page faults and NMIs (was: Strange reboot since 9.1)

2013-06-03 Thread Sebastian Kuzminsky
Howdy folks, this email is a follow-on to a 3-month-old thread about kernel 
page faults from the bce driver[0].

0:  http://lists.freebsd.org/pipermail/freebsd-stable/2013-March/072713.html

Sorry to revive such an old thread, but a couple of bits of new information have 
come to light here that may be useful for others.

The header splitting suggestion that Marius Strobl made[1] did fix the kernel 
page fault rooted in bce_intr() that we were seeing (and that other folks 
reported in the original thread).  I'm no bce expert, but it looks to me like 
the bce driver does not apply the same flow control to its page queue as it 
does to its receive queue; maybe that's related to the problem?

1:  http://lists.freebsd.org/pipermail/freebsd-stable/2013-March/072766.html

After disabling bce header splitting we stopped getting kernel page faults, but 
we still had problems with this NIC (Broadcom NetXtreme II BCM5716 Gigabit 
Ethernet) producing frequent PCI errors and occasional NMIs.

I found this thread[2] that suggests that the NIC firmware version may be 
relevant to the NMI problem.  The Red Hat people are reporting that firmware 
version 6.0.1 is bad and 6.4.5 is good; 9.1 ships with 6.0.17, so who knows 
what that means...  We ended up reverting to the bce driver from FreeBSD 7 and 
that fixed our NMI problems.  (The bce driver from FreeBSD 7 also has header 
splitting disabled by default: Bonus!)

2:  https://bugzilla.redhat.com/show_bug.cgi?id=693542


-- 
Sebastian Kuzminsky
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Strange reboot since 9.1

2013-03-11 Thread Marius Strobl
On Sun, Mar 10, 2013 at 08:19:04PM +0100, Loïc BLOT wrote:
 Hi Marius,
 sorry, but your patch has no effect: another crash with the same
 backtrace.

Okay, thanks. Unfortunately, I'm running out of ideas for now. It
seems that the problem isn't caused by a logic error within the
driver then but rather some incorrect handling of the hardware.
The publicly available documentation for these chips is heavily
sanitized and totally unusable for writing drivers though.
The only remaining thing to test I can think of is whether this
issue is related to the header splitting, which is enabled by
default. To disable it, set the loader tunable hw.bce.hdr_split to 0
or, to be really sure, change the bce_hdr_split default to FALSE
in the driver and recompile it.
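
As a rough sketch of the first option, the tunable can be set from
/boot/loader.conf. The helper name set_hdr_split_off below is made up for
illustration; only hw.bce.hdr_split itself comes from the text above. The
helper appends the setting to a loader.conf-style file unless the tunable
is already present there:

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): append hw.bce.hdr_split="0"
# to a loader.conf-style file unless the tunable is already set there.
set_hdr_split_off() {
    conf="$1"
    if grep -q '^hw\.bce\.hdr_split=' "$conf" 2>/dev/null; then
        echo "hw.bce.hdr_split is already set in $conf; edit it by hand" >&2
        return 1
    fi
    printf 'hw.bce.hdr_split="0"\n' >> "$conf"
}

# Usage (as root):
#   set_hdr_split_off /boot/loader.conf
```

Loader tunables are read at boot, so a reboot is needed before the driver
sees the new value.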

Marius



Re: Strange reboot since 9.1

2013-03-10 Thread Loïc BLOT
Hi Marius,
sorry, but your patch has no effect: another crash with the same
backtrace.
-- 
Best regards,
Loïc BLOT, 
UNIX systems, security and network expert
http://www.unix-experience.fr








Re: Strange reboot since 9.1

2013-03-09 Thread Loïc BLOT
Hi Marius
Thanks for your patch, but it has no effect on stability. The server
rebooted last night after 8h of uptime, and the same backtrace appears.

-- 
Best regards,
Loïc BLOT, 
UNIX systems, security and network expert
http://www.unix-experience.fr





Re: Strange reboot since 9.1

2013-03-09 Thread Marius Strobl
On Sat, Mar 09, 2013 at 09:53:54AM +0100, Loïc BLOT wrote:
 Hi Marius
 Thanks for your patch, but it has no effect on stability. The server
 rebooted last night after 8h of uptime, and the same backtrace appears.

Okay, could you please give the following patch a try instead in order
to test another theory?
http://people.freebsd.org/~marius/bce_rx_corruption.diff

Marius



Re: Strange reboot since 9.1

2013-03-08 Thread Marius Strobl
On Fri, Mar 08, 2013 at 11:32:54AM +0900, YongHyeon PYUN wrote:
 On Thu, Mar 07, 2013 at 08:38:27AM -0800, Jeremy Chadwick wrote:
  On Thu, Mar 07, 2013 at 04:38:54PM +0100, Loïc Blot wrote:

Re: Strange reboot since 9.1

2013-03-07 Thread Loïc Blot
Hello,
I have enabled dumpdev=AUTO and ran kgdb after a reboot.
Here is the backtrace:

root@freebsd-server kgdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for
details.
This GDB was configured as amd64-marcel-freebsd...
#0  sched_switch (td=0x812228a0, newtd=0xfe00051c8000,
flags=Variable flags is not available.
) at /usr/src/sys/kern/sched_ule.c:1927
1927    cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0x812228a0, newtd=0xfe00051c8000,
flags=Variable flags is not available.
) at /usr/src/sys/kern/sched_ule.c:1927
#1  0x808f2d46 in mi_switch (flags=260, newtd=0x0)
at /usr/src/sys/kern/kern_synch.c:485
#2  0x8092ba72 in sleepq_timedwait (wchan=0x81222400,
pri=84) at /usr/src/sys/kern/subr_sleepqueue.c:658
#3  0x808f332f in _sleep (ident=0x81222400, lock=0x0,
priority=Variable priority is not available.
) at /usr/src/sys/kern/kern_synch.c:246
#4  0x80b429db in scheduler (dummy=Variable dummy is not
available.
) at /usr/src/sys/vm/vm_glue.c:788
#5  0x8089c047 in mi_startup ()
at /usr/src/sys/kern/init_main.c:277
#6  0x802b526c in btext ()
at /usr/src/sys/amd64/amd64/locore.S:81
#7  0x0001 in ?? ()
#8  0x81240f80 in tdq_cpu ()
#9  0x812228a0 in proc0 ()
#10 0x in ?? ()
#11 0x81529b90 in ?? ()
#12 0x81529b38 in ?? ()
#13 0xfe00051c8000 in ?? ()
#14 0x8091352e in sched_switch (td=0x0, newtd=0x0,
flags=Variable flags is not available.
) at /usr/src/sys/kern/sched_ule.c:1921
Previous frame inner to this frame (corrupt stack?)
(kgdb) bt f
#0  sched_switch (td=0x812228a0, newtd=0xfe00051c8000,
flags=Variable flags is not available.
) at /usr/src/sys/kern/sched_ule.c:1927
__res = 2
__s = Variable __s is not available.

-- 
Best regards, 

Loïc BLOT, Engineering
UNIX Systems, Security and Networks
http://www.unix-experience.fr


On Wednesday, 06 March 2013 at 11:18 +0200, Marin Atanasov Nikolov wrote:
 
 On Wed, Mar 6, 2013 at 10:55 AM, Loïc Blot
 loic.b...@unix-experience.fr wrote:
 Hello,
 
 Hi,
 
 Since FreeBSD 9.1 I have had strange problems with the distribution.
 Some servers reboot instantly, without any kernel panic. At first I
 thought it was a problem with my KVM system, but one of my FreeBSD
 installs on a Dell R210 has the same problem.
 The servers concerned are now:
 - Monitoring server
 - LDAP test server
 - Some other servers, randomly (not in production).
 At first I thought it was a problem with my FreeBSD install, so I
 downloaded the ISO again, but the problem was still there. Then I
 tried something else, installing 9.0 and upgrading to 9.1, but got
 the same problem.
 How can I get information about this problem?
 
 
 
 I've had similar issues with one of my FreeBSD systems. My system had
 spontaneous reboots without any kernel panic and without any clear
 evidence of why it happened.
 
 After a lot of trials and tests, the root cause turned out to be the
 number of ZFS snapshots I had: more than 1K on an 8G system.
 
 Upgrading from 9.0 to 9.1 didn't solve the issue; I had to clean up
 the ZFS snapshots, and since then the system has run for more than a
 month without any reboots.
 
 A few pointers that you could use: get these systems monitored and
 keep an eye on the monitoring system (CPU usage, memory, processes,
 network traffic, etc.). I noticed that my system was running low on
 free memory, and that later led me to the ZFS snapshots clue.
 
 So my advice is to first get these systems monitored and watch for
 anything unusual happening. Then investigate further.
 
 Good luck.
 
 Regards,
 Marin
 
  
 Thanks in advance.
 --
 Best regards,
 
 Loïc BLOT, Engineering
 UNIX Systems, Security and Networks
 http://www.unix-experience.fr
 
 
 
 
 
 
 -- 
 Marin Atanasov Nikolov
 
 dnaeon AT gmail DOT com
 http://www.unix-heaven.org/


Re: Strange reboot since 9.1

2013-03-07 Thread Andriy Gapon
on 07/03/2013 12:27 Loïc Blot said the following:
 Hello,
 I have enabled dumpdev=AUTO and ran kgdb after a reboot.
 Here is the backtrace:
 
 root@freebsd-server kgdb

It's a stack trace of the first thread in your live running system.
You need to read kgdb(1), inspect your /var/crash directory and pass a proper
vmcore file, if any, to kgdb.
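
For illustration, here is a minimal sketch of picking the newest dump to
hand to kgdb. The latest_vmcore helper is hypothetical; it assumes dumps
are saved as vmcore.N under /var/crash and that the kernel with symbols is
/boot/kernel/kernel.symbols. Adjust both to the system at hand:

```shell
#!/bin/sh
# Hypothetical helper: print the path of the highest-numbered vmcore.N
# in a crash directory (default /var/crash); fail if there is none.
latest_vmcore() {
    dir="${1:-/var/crash}"
    n=$(ls "$dir" 2>/dev/null \
        | sed -n 's/^vmcore\.\([0-9][0-9]*\)$/\1/p' \
        | sort -n | tail -n 1)
    [ -n "$n" ] || return 1
    printf '%s/vmcore.%s\n' "$dir" "$n"
}

# Usage (kernel path is an assumption; adjust to your system):
#   core=$(latest_vmcore /var/crash) && \
#       kgdb /boot/kernel/kernel.symbols "$core"
```

Running kgdb with no arguments debugs the live kernel, which is why the
trace quoted here shows an idle scheduler thread rather than the crash.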

 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you
 are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for
 details.
 This GDB was configured as amd64-marcel-freebsd...
 #0  sched_switch (td=0x812228a0, newtd=0xfe00051c8000,
 flags=Variable flags is not available.
 ) at /usr/src/sys/kern/sched_ule.c:1927
 1927  cpuid = PCPU_GET(cpuid);
 (kgdb) bt
 #0  sched_switch (td=0x812228a0, newtd=0xfe00051c8000,
 flags=Variable flags is not available.
 ) at /usr/src/sys/kern/sched_ule.c:1927
 #1  0x808f2d46 in mi_switch (flags=260, newtd=0x0)
 at /usr/src/sys/kern/kern_synch.c:485
 #2  0x8092ba72 in sleepq_timedwait (wchan=0x81222400,
 pri=84) at /usr/src/sys/kern/subr_sleepqueue.c:658
 #3  0x808f332f in _sleep (ident=0x81222400, lock=0x0,
 priority=Variable priority is not available.
 ) at /usr/src/sys/kern/kern_synch.c:246
 #4  0x80b429db in scheduler (dummy=Variable dummy is not
 available.
 ) at /usr/src/sys/vm/vm_glue.c:788
 #5  0x8089c047 in mi_startup ()
 at /usr/src/sys/kern/init_main.c:277
 #6  0x802b526c in btext ()
 at /usr/src/sys/amd64/amd64/locore.S:81
 #7  0x0001 in ?? ()
 #8  0x81240f80 in tdq_cpu ()
 #9  0x812228a0 in proc0 ()
 #10 0x in ?? ()
 #11 0x81529b90 in ?? ()
 #12 0x81529b38 in ?? ()
 #13 0xfe00051c8000 in ?? ()
 #14 0x8091352e in sched_switch (td=0x0, newtd=0x0,
 flags=Variable flags is not available.
 ) at /usr/src/sys/kern/sched_ule.c:1921
 Previous frame inner to this frame (corrupt stack?)
 (kgdb) bt f
 #0  sched_switch (td=0x812228a0, newtd=0xfe00051c8000,
 flags=Variable flags is not available.
 ) at /usr/src/sys/kern/sched_ule.c:1927
   __res = 2
   __s = Variable __s is not available.
 


-- 
Andriy Gapon

Re: Strange reboot since 9.1

2013-03-07 Thread Loïc Blot
Hi Andriy,
thanks for your help.

Here is the stack backtrace (I have 11 core.txt files, and each shows this
crash). (cat /var/crash/core.txt.11)

panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0x809208a6 at kdb_backtrace+0x66
#1 0x808ea8be at panic+0x1ce
#2 0x80bd8240 at trap_fatal+0x290
#3 0x80bd857d at trap_pfault+0x1ed
#4 0x80bd8b9e at trap+0x3ce
#5 0x80bc315f at calltrap+0x8
#6 0x80a861d5 at udp_input+0x475
#7 0x80a043dc at ip_input+0xac
#8 0x809adafb at netisr_dispatch_src+0x20b
#9 0x809a35cd at ether_demux+0x14d
#10 0x809a38a4 at ether_nh_input+0x1f4
#11 0x809adafb at netisr_dispatch_src+0x20b
#12 0x80438fd7 at bce_intr+0x487
#13 0x808be8d4 at intr_event_execute_handlers+0x104
#14 0x808c0076 at ithread_loop+0xa6
#15 0x808bb9ef at fork_exit+0x11f
#16 0x80bc368e at fork_trampoline+0xe
Uptime: 2h6m59s
Dumping 1177 out of 8162
MB:..2%..11%..21%..32%..41%..51%..62%..71%..81%..92%
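
To check across all the core.txt files that each one shows the same crash,
without reading every file, something like the following could be used. It
is only a sketch: trace_signature is a made-up helper that assumes each
frame line has the form "#N 0xADDR at function+0xOFFSET", as in the trace
above.

```shell
#!/bin/sh
# Hypothetical helper: print one "function+offset" per backtrace frame,
# in order, from a core.txt-style file. Non-frame lines are ignored.
trace_signature() {
    sed -n 's/^#[0-9][0-9]* 0x[0-9a-f][0-9a-f]* at \(.*\)$/\1/p' "$1"
}

# Usage: files whose frame lists are identical get identical checksums.
#   for f in /var/crash/core.txt.*; do
#       printf '%s  %s\n' "$(trace_signature "$f" | cksum)" "$f"
#   done
```

Comparing function+offset pairs rather than raw addresses keeps the check
readable and makes it obvious which frame differs when two traces diverge.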

I can't read vmcore.11 with only this option:

kgdb -d /var/crash/vmcore.11

I read the man page and thought I had to use kgdb -c /var/crash/vmcore.11,
but it's not a suitable image. (kgdb: couldn't find a suitable kernel image)

This server uses UDP packets for SNMP requests ( 1/h), NTP (a
little), and syslog (that's all I remember).
-- 
Best regards, 

Loïc BLOT, Engineering
UNIX Systems, Security and Networks
http://www.unix-experience.fr




Re: Strange reboot since 9.1

2013-03-07 Thread Marcelo Gondim

Hi,

Take a look at this:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html


[]'s
Gondim


Re: Strange reboot since 9.1

2013-03-07 Thread Loïc Blot
Hi Marcelo, thanks. Here is a better trace:

-

kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.11
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for
details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x80a84414
stack pointer           = 0x28:0xff822fc267a0
frame pointer           = 0x28:0xff822fc26830
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq265: bce0)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0x809208a6 at kdb_backtrace+0x66
#1 0x808ea8be at panic+0x1ce
#2 0x80bd8240 at trap_fatal+0x290
#3 0x80bd857d at trap_pfault+0x1ed
#4 0x80bd8b9e at trap+0x3ce
#5 0x80bc315f at calltrap+0x8
#6 0x80a861d5 at udp_input+0x475
#7 0x80a043dc at ip_input+0xac
#8 0x809adafb at netisr_dispatch_src+0x20b
#9 0x809a35cd at ether_demux+0x14d
#10 0x809a38a4 at ether_nh_input+0x1f4
#11 0x809adafb at netisr_dispatch_src+0x20b
#12 0x80438fd7 at bce_intr+0x487
#13 0x808be8d4 at intr_event_execute_handlers+0x104
#14 0x808c0076 at ithread_loop+0xa6
#15 0x808bb9ef at fork_exit+0x11f
#16 0x80bc368e at fork_trampoline+0xe
Uptime: 27m20s
Dumping 1265 out of 8162
MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..92%

#0  doadump (textdump=Variable textdump is not available.
) at pcpu.h:224
224 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt f
#0  doadump (textdump=Variable textdump is not available.
) at pcpu.h:224
No locals.
#1  0x808ea3a1 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:448
_ep = Variable _ep is not available.
(kgdb) bt
#0  doadump (textdump=Variable textdump is not available.
) at pcpu.h:224
#1  0x808ea3a1 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:448
#2  0x808ea897 in panic (fmt=0x1 Address 0x1 out of bounds)
at /usr/src/sys/kern/kern_shutdown.c:636
#3  0x80bd8240 in trap_fatal (frame=0xc, eva=Variable eva is
not available.
) at /usr/src/sys/amd64/amd64/trap.c:857
#4  0x80bd857d in trap_pfault (frame=0xff822fc266f0,
usermode=0) at /usr/src/sys/amd64/amd64/trap.c:773
#5  0x80bd8b9e in trap (frame=0xff822fc266f0)
at /usr/src/sys/amd64/amd64/trap.c:456
#6  0x80bc315f in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:228
#7  0x80a84414 in udp_append (inp=0xfe019e2a1000,
ip=0xfe00444b6c80, n=0xfe00444b6c00, off=20,
udp_in=0xff822fc268a0) at /usr/src/sys/netinet/udp_usrreq.c:252
#8  0x80a861d5 in udp_input (m=0xfe00444b6c00, off=Variable
off is not available.
) at /usr/src/sys/netinet/udp_usrreq.c:618
#9  0x80a043dc in ip_input (m=0xfe00444b6c00)
at /usr/src/sys/netinet/ip_input.c:760
#10 0x809adafb in netisr_dispatch_src (proto=1, source=Variable
source is not available.
) at /usr/src/sys/net/netisr.c:1013
#11 0x809a35cd in ether_demux (ifp=0xfe00053fa000,
m=0xfe00444b6c00) at /usr/src/sys/net/if_ethersubr.c:940
#12 0x809a38a4 in ether_nh_input (m=Variable m is not
available.
) at /usr/src/sys/net/if_ethersubr.c:759
#13 0x809adafb in netisr_dispatch_src (proto=9, source=Variable
source is not available.
) at /usr/src/sys/net/netisr.c:1013
#14 0x80438fd7 in bce_intr (xsc=Variable xsc is not available.
) at /usr/src/sys/dev/bce/if_bce.c:6903
#15 0x808be8d4 in intr_event_execute_handlers (p=Variable p is
not available.
) at /usr/src/sys/kern/kern_intr.c:1262
#16 0x808c0076 in ithread_loop (arg=0xfe00057424e0)
at /usr/src/sys/kern/kern_intr.c:1275
#17 0x808bb9ef in fork_exit (callout=0x808bffd0
ithread_loop, arg=0xfe00057424e0, frame=0xff822fc26c40)
at /usr/src/sys/kern/kern_fork.c:992
#18 0x80bc368e in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:602
#19 0x in ?? ()
#20 0x in ?? ()
#21 0x0001 in ?? ()
#22 0x in ?? ()
#23 0x in ?? ()
#24 0x in ?? ()
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x in ?? ()
#28 0x in ?? ()
#29 0x in ?? ()
#30 

Re: Strange reboot since 9.1

2013-03-07 Thread Jeremy Chadwick
On Thu, Mar 07, 2013 at 04:38:54PM +0100, Loïc Blot wrote:
 Hi Marcelo, thanks. Here is a better trace:
 
 -
 
 kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.11
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you
 are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for
 details.
 This GDB was configured as amd64-marcel-freebsd...
 
 Unread portion of the kernel message buffer:
 
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x0
 fault code              = supervisor read data, page not present
 instruction pointer     = 0x20:0x80a84414
 stack pointer           = 0x28:0xff822fc267a0
 frame pointer           = 0x28:0xff822fc26830
 code segment            = base 0x0, limit 0xf, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 12 (irq265: bce0)
 trap number             = 12
 panic: page fault
 cpuid = 0
 KDB: stack backtrace:
 #0 0x809208a6 at kdb_backtrace+0x66
 #1 0x808ea8be at panic+0x1ce
 #2 0x80bd8240 at trap_fatal+0x290
 #3 0x80bd857d at trap_pfault+0x1ed
 #4 0x80bd8b9e at trap+0x3ce
 #5 0x80bc315f at calltrap+0x8
 #6 0x80a861d5 at udp_input+0x475
 #7 0x80a043dc at ip_input+0xac
 #8 0x809adafb at netisr_dispatch_src+0x20b
 #9 0x809a35cd at ether_demux+0x14d
 #10 0x809a38a4 at ether_nh_input+0x1f4
 #11 0x809adafb at netisr_dispatch_src+0x20b
 #12 0x80438fd7 at bce_intr+0x487
 #13 0x808be8d4 at intr_event_execute_handlers+0x104
 #14 0x808c0076 at ithread_loop+0xa6
 #15 0x808bb9ef at fork_exit+0x11f
 #16 0x80bc368e at fork_trampoline+0xe
 Uptime: 27m20s
 Dumping 1265 out of 8162
 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..92%
 
 #0  doadump (textdump=Variable textdump is not available.
 ) at pcpu.h:224
 224   pcpu.h: No such file or directory.
   in pcpu.h
 (kgdb) bt f
 #0  doadump (textdump=Variable textdump is not available.
 ) at pcpu.h:224
 No locals.
 #1  0x808ea3a1 in kern_reboot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:448
   _ep = Variable _ep is not available.
 (kgdb) bt
 #0  doadump (textdump=Variable textdump is not available.
 ) at pcpu.h:224
 #1  0x808ea3a1 in kern_reboot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:448
 #2  0x808ea897 in panic (fmt=0x1 Address 0x1 out of bounds)
 at /usr/src/sys/kern/kern_shutdown.c:636
 #3  0x80bd8240 in trap_fatal (frame=0xc, eva=Variable eva is
 not available.
 ) at /usr/src/sys/amd64/amd64/trap.c:857
 #4  0x80bd857d in trap_pfault (frame=0xff822fc266f0,
 usermode=0) at /usr/src/sys/amd64/amd64/trap.c:773
 #5  0x80bd8b9e in trap (frame=0xff822fc266f0)
 at /usr/src/sys/amd64/amd64/trap.c:456
 #6  0x80bc315f in calltrap ()
 at /usr/src/sys/amd64/amd64/exception.S:228
 #7  0x80a84414 in udp_append (inp=0xfe019e2a1000,
 ip=0xfe00444b6c80, n=0xfe00444b6c00, off=20,
 udp_in=0xff822fc268a0) at /usr/src/sys/netinet/udp_usrreq.c:252
 #8  0x80a861d5 in udp_input (m=0xfe00444b6c00, off=Variable
 off is not available.
 ) at /usr/src/sys/netinet/udp_usrreq.c:618
 #9  0x80a043dc in ip_input (m=0xfe00444b6c00)
 at /usr/src/sys/netinet/ip_input.c:760
 #10 0x809adafb in netisr_dispatch_src (proto=1, source=Variable
 source is not available.
 ) at /usr/src/sys/net/netisr.c:1013
 #11 0x809a35cd in ether_demux (ifp=0xfe00053fa000,
 m=0xfe00444b6c00) at /usr/src/sys/net/if_ethersubr.c:940
 #12 0x809a38a4 in ether_nh_input (m=Variable m is not
 available.
 ) at /usr/src/sys/net/if_ethersubr.c:759
 #13 0x809adafb in netisr_dispatch_src (proto=9, source=Variable
 source is not available.
 ) at /usr/src/sys/net/netisr.c:1013
 #14 0x80438fd7 in bce_intr (xsc=Variable xsc is not available.
 ) at /usr/src/sys/dev/bce/if_bce.c:6903
 #15 0x808be8d4 in intr_event_execute_handlers (p=Variable p is
 not available.
 ) at /usr/src/sys/kern/kern_intr.c:1262
 #16 0x808c0076 in ithread_loop (arg=0xfe00057424e0)
 at /usr/src/sys/kern/kern_intr.c:1275
 #17 0x808bb9ef in fork_exit (callout=0x808bffd0
 ithread_loop, arg=0xfe00057424e0, frame=0xff822fc26c40)
 at /usr/src/sys/kern/kern_fork.c:992
 #18 0x80bc368e in fork_trampoline ()
 at /usr/src/sys/amd64/amd64/exception.S:602
 #19 0x in ?? ()
 #20 0x in ?? ()
 #21 0x0001 in ?? ()
 #22 0x in ?? ()
 #23 0x in ?? ()
 #24 0x in ?? ()
 #25 
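
For anyone triaging a trace like this one, the frames can be sorted mechanically. The following is only a minimal Python sketch, not something from the original report: it parses the KDB backtrace lines shown in this message, skips the panic/trap handling frames, and reports the first frame below them along with any bce driver frames. The TRAP_FRAMES set and the function names are assumptions chosen for this illustration.

```python
import re

# Frame lines copied from the KDB backtrace in this message.
KDB_TRACE = """\
#0 0x809208a6 at kdb_backtrace+0x66
#1 0x808ea8be at panic+0x1ce
#2 0x80bd8240 at trap_fatal+0x290
#3 0x80bd857d at trap_pfault+0x1ed
#4 0x80bd8b9e at trap+0x3ce
#5 0x80bc315f at calltrap+0x8
#6 0x80a861d5 at udp_input+0x475
#7 0x80a043dc at ip_input+0xac
#8 0x809adafb at netisr_dispatch_src+0x20b
#9 0x809a35cd at ether_demux+0x14d
#10 0x809a38a4 at ether_nh_input+0x1f4
#11 0x809adafb at netisr_dispatch_src+0x20b
#12 0x80438fd7 at bce_intr+0x487
"""

# "#<n> 0x<addr> at <function>+0x<offset>"
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+at\s+(\w+)\+0x[0-9a-f]+$")

# Assumption: these frames belong to panic/trap handling, not to the
# code that actually faulted.
TRAP_FRAMES = {"kdb_backtrace", "panic", "trap_fatal",
               "trap_pfault", "trap", "calltrap"}


def parse_frames(text):
    """Return a list of (frame_number, function_name) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def first_non_trap_frame(frames):
    """Return the first frame that is not part of trap handling."""
    for number, func in frames:
        if func not in TRAP_FRAMES:
            return number, func
    return None


frames = parse_frames(KDB_TRACE)
print(first_non_trap_frame(frames))                     # (6, 'udp_input')
print([f for _, f in frames if f.startswith("bce_")])   # ['bce_intr']
```

The kgdb backtrace narrows the same fault further, to udp_append() called from udp_input(), with bce_intr() lower in the call chain.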

Re: Strange reboot since 9.1

2013-03-07 Thread Loïc Blot
Here is pciconf -lbcv

hostb0@pci0:0:0:0:  class=0x06 card=0x02a51028 chip=0xd1308086
rev=0x11 hdr=0x00
vendor = 'Intel Corporation'
device = 'Core Processor DMI'
class  = bridge
subclass   = HOST-PCI
cap 05[60] = MSI supports 2 messages, vector masks 
cap 10[90] = PCI-Express 1 root port max data 128(128) link x4(x4)
cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000d[150] = unknown 1
ecap 000b[160] = unknown 0
pcib1@pci0:0:3:0:   class=0x060400 card=0x02a51028 chip=0xd1388086
rev=0x11 hdr=0x01
vendor = 'Intel Corporation'
device = 'Core Processor PCI Express Root Port 1'
class  = bridge
subclass   = PCI-PCI
cap 0d[40] = PCI Bridge card=0x02a51028
cap 05[60] = MSI supports 2 messages, vector masks 
cap 10[90] = PCI-Express 2 root port max data 256(256) link x8(x16)
cap 01[e0] = powerspec 3  supports D0 D3  current D0
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 000d[150] = unknown 1
ecap 000b[160] = unknown 0
none0@pci0:0:8:0:   class=0x088000 card=0x chip=0xd1558086
rev=0x11 hdr=0x00
vendor = 'Intel Corporation'
device = 'Core Processor System Management Registers'
class  = base peripheral
cap 10[40] = PCI-Express 2 root endpoint max data 128(128) link
x0(x0)
ecap 000b[100] = unknown 0
none1@pci0:0:8:1:   class=0x088000 card=0x chip=0xd1568086
rev=0x11 hdr=0x00
vendor = 'Intel Corporation'
device = 'Core Processor Semaphore and Scratchpad Registers'
class  = base peripheral
cap 10[40] = PCI-Express 2 root endpoint max data 128(128) link
x0(x0)
ecap 000b[100] = unknown 0
none2@pci0:0:8:2:   class=0x088000 card=0x chip=0xd1578086
rev=0x11 hdr=0x00
vendor = 'Intel Corporation'
device = 'Core Processor System Control and Status Registers'
class  = base peripheral
cap 10[40] = PCI-Express 2 root endpoint max data 128(128) link
x0(x0)
ecap 000b[100] = unknown 0
none3@pci0:0:8:3:   class=0x088000 card=0x chip=0xd1588086
rev=0x11 hdr=0x00
vendor = 'Intel Corporation'
device = 'Core Processor Miscellaneous Registers'
class  = base peripheral
none4@pci0:0:16:0:  class=0x088000 card=0x chip=0xd1508086
rev=0x11 hdr=0x00
vendor = 'Intel Corporation'
device = 'Core Processor QPI Link'
class  = base peripheral
none5@pci0:0:16:1:  class=0x088000 card=0x chip=0xd1518086
rev=0x11 hdr=0x00
vendor = 'Intel Corporation'
device = 'Core Processor QPI Routing and Protocol Registers'
class  = base peripheral
ehci0@pci0:0:26:0:  class=0x0c0320 card=0x02a51028 chip=0x3b3c8086
rev=0x05 hdr=0x00
vendor = 'Intel Corporation'
device = '5 Series/3400 Series Chipset USB2 Enhanced Host
Controller'
class  = serial bus
subclass   = USB
bar   [10] = type Memory, range 32, base 0xdf0fa000, size 1024,
enabled
cap 01[50] = powerspec 2  supports D0 D3  current D0
cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
cap 13[98] = PCI Advanced Features: FLR TP
pcib2@pci0:0:28:0:  class=0x060400 card=0x02a51028 chip=0x3b428086
rev=0x05 hdr=0x01
vendor = 'Intel Corporation'
device = '5 Series/3400 Series Chipset PCI Express Root Port 1'
class  = bridge
subclass   = PCI-PCI
cap 10[40] = PCI-Express 2 root port max data 128(128) link x4(x4)
cap 05[80] = MSI supports 1 message 
cap 0d[90] = PCI Bridge card=0x02a51028
cap 01[a0] = powerspec 2  supports D0 D3  current D0
ehci1@pci0:0:29:0:  class=0x0c0320 card=0x02a51028 chip=0x3b348086
rev=0x05 hdr=0x00
vendor = 'Intel Corporation'
device = '5 Series/3400 Series Chipset USB2 Enhanced Host
Controller'
class  = serial bus
subclass   = USB
bar   [10] = type Memory, range 32, base 0xdf0fc000, size 1024,
enabled
cap 01[50] = powerspec 2  supports D0 D3  current D0
cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
cap 13[98] = PCI Advanced Features: FLR TP
pcib3@pci0:0:30:0:  class=0x060401 card=0x02a51028 chip=0x244e8086
rev=0xa5 hdr=0x01
vendor = 'Intel Corporation'
device = '82801 PCI Bridge'
class  = bridge
subclass   = PCI-PCI
cap 0d[50] = PCI Bridge card=0x02a51028
isab0@pci0:0:31:0:  class=0x060100 card=0x02a51028 chip=0x3b148086
rev=0x05 hdr=0x00
vendor = 'Intel Corporation'
device = '3400 Series Chipset LPC Interface Controller'
class  = bridge
subclass   = PCI-ISA
cap 09[e0] = vendor (length 16) Intel cap 1 version 1
ahci0@pci0:0:31:2:  class=0x010601 card=0x02a51028 chip=0x3b228086
rev=0x05 hdr=0x00
vendor = 'Intel Corporation'
device = '5 Series/3400 Series Chipset 6 port SATA AHCI
Controller'
class  = mass storage
subclass   = SATA
bar   [10] 

Re: Strange reboot since 9.1

2013-03-07 Thread YongHyeon PYUN
On Thu, Mar 07, 2013 at 08:38:27AM -0800, Jeremy Chadwick wrote:
 On Thu, Mar 07, 2013 at 04:38:54PM +0100, Loïc Blot wrote:
  Hi Marcelo, thanks. Here is a better trace:
  
  -
  
  kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.11
  GNU gdb 6.1.1 [FreeBSD]
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "amd64-marcel-freebsd"...
  
  Unread portion of the kernel message buffer:
  
  
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 00
  fault virtual address   = 0x0
  fault code  = supervisor read data, page not present
  instruction pointer = 0x20:0x80a84414
  stack pointer   = 0x28:0xff822fc267a0
  frame pointer   = 0x28:0xff822fc26830
  code segment= base 0x0, limit 0xf, type 0x1b
  = DPL 0, pres 1, long 1, def32 0, gran 1
  processor eflags= interrupt enabled, resume, IOPL = 0
  current process = 12 (irq265: bce0)
  trap number = 12
  panic: page fault
  cpuid = 0
  KDB: stack backtrace:
  #0 0x809208a6 at kdb_backtrace+0x66
  #1 0x808ea8be at panic+0x1ce
  #2 0x80bd8240 at trap_fatal+0x290
  #3 0x80bd857d at trap_pfault+0x1ed
  #4 0x80bd8b9e at trap+0x3ce
  #5 0x80bc315f at calltrap+0x8
  #6 0x80a861d5 at udp_input+0x475
  #7 0x80a043dc at ip_input+0xac
  #8 0x809adafb at netisr_dispatch_src+0x20b
  #9 0x809a35cd at ether_demux+0x14d
  #10 0x809a38a4 at ether_nh_input+0x1f4
  #11 0x809adafb at netisr_dispatch_src+0x20b
  #12 0x80438fd7 at bce_intr+0x487
  #13 0x808be8d4 at intr_event_execute_handlers+0x104
  #14 0x808c0076 at ithread_loop+0xa6
  #15 0x808bb9ef at fork_exit+0x11f
  #16 0x80bc368e at fork_trampoline+0xe
  Uptime: 27m20s
  Dumping 1265 out of 8162
  MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..92%
  
  #0  doadump (textdump=Variable textdump is not available.
  ) at pcpu.h:224
  224 pcpu.h: No such file or directory.
  in pcpu.h
  (kgdb) bt f
  #0  doadump (textdump=Variable textdump is not available.
  ) at pcpu.h:224
  No locals.
  #1  0x808ea3a1 in kern_reboot (howto=260)
  at /usr/src/sys/kern/kern_shutdown.c:448
  _ep = Variable _ep is not available.
  (kgdb) bt
  #0  doadump (textdump=Variable textdump is not available.
  ) at pcpu.h:224
  #1  0x808ea3a1 in kern_reboot (howto=260)
  at /usr/src/sys/kern/kern_shutdown.c:448
  #2  0x808ea897 in panic (fmt=0x1 Address 0x1 out of bounds)
  at /usr/src/sys/kern/kern_shutdown.c:636
  #3  0x80bd8240 in trap_fatal (frame=0xc, eva=Variable eva is
  not available.
  ) at /usr/src/sys/amd64/amd64/trap.c:857
  #4  0x80bd857d in trap_pfault (frame=0xff822fc266f0,
  usermode=0) at /usr/src/sys/amd64/amd64/trap.c:773
  #5  0x80bd8b9e in trap (frame=0xff822fc266f0)
  at /usr/src/sys/amd64/amd64/trap.c:456
  #6  0x80bc315f in calltrap ()
  at /usr/src/sys/amd64/amd64/exception.S:228
  #7  0x80a84414 in udp_append (inp=0xfe019e2a1000,
  ip=0xfe00444b6c80, n=0xfe00444b6c00, off=20,
  udp_in=0xff822fc268a0) at /usr/src/sys/netinet/udp_usrreq.c:252
  #8  0x80a861d5 in udp_input (m=0xfe00444b6c00, off=Variable
  off is not available.
  ) at /usr/src/sys/netinet/udp_usrreq.c:618
  #9  0x80a043dc in ip_input (m=0xfe00444b6c00)
  at /usr/src/sys/netinet/ip_input.c:760
  #10 0x809adafb in netisr_dispatch_src (proto=1, source=Variable
  source is not available.
  ) at /usr/src/sys/net/netisr.c:1013
  #11 0x809a35cd in ether_demux (ifp=0xfe00053fa000,
  m=0xfe00444b6c00) at /usr/src/sys/net/if_ethersubr.c:940
  #12 0x809a38a4 in ether_nh_input (m=Variable m is not
  available.
  ) at /usr/src/sys/net/if_ethersubr.c:759
  #13 0x809adafb in netisr_dispatch_src (proto=9, source=Variable
  source is not available.
  ) at /usr/src/sys/net/netisr.c:1013
  #14 0x80438fd7 in bce_intr (xsc=Variable xsc is not available.
  ) at /usr/src/sys/dev/bce/if_bce.c:6903
  #15 0x808be8d4 in intr_event_execute_handlers (p=Variable p is
  not available.
  ) at /usr/src/sys/kern/kern_intr.c:1262
  #16 0x808c0076 in ithread_loop (arg=0xfe00057424e0)
  at /usr/src/sys/kern/kern_intr.c:1275
  #17 0x808bb9ef in fork_exit (callout=0x808bffd0
  ithread_loop, arg=0xfe00057424e0, frame=0xff822fc26c40)
  at /usr/src/sys/kern/kern_fork.c:992
  #18 0x80bc368e in fork_trampoline ()
  at /usr/src/sys/amd64/amd64/exception.S:602
  #19 0x in ?? ()

Strange reboot since 9.1

2013-03-06 Thread Loïc Blot
Hello,
Since FreeBSD 9.1 I have had strange problems with the distribution. Some
servers reboot instantly, without any kernel panic. At first I
thought it was a problem with my KVM system, but one of my FreeBSD systems
on a Dell R210 has the same problem.
The servers concerned so far are:
- Monitoring server
- LDAP test server
- Some other servers, randomly (not in production).
I then suspected my FreeBSD install and downloaded the ISO again,
but the problem was still there. After that I tried installing 9.0
and upgrading to 9.1, with the same result.
How can I get information about this problem?

Thanks in advance.
-- 
Best regards, 

Loïc BLOT, Engineering
UNIX Systems, Security and Networks
http://www.unix-experience.fr




Re: Strange reboot since 9.1

2013-03-06 Thread Marin Atanasov Nikolov
On Wed, Mar 6, 2013 at 10:55 AM, Loïc Blot loic.b...@unix-experience.fr wrote:

 Hello,


Hi,


 Since FreeBSD 9.1 I have had strange problems with the distribution. Some
 servers reboot instantly, without any kernel panic. At first I
 thought it was a problem with my KVM system, but one of my FreeBSD systems
 on a Dell R210 has the same problem.
 The servers concerned so far are:
 - Monitoring server
 - LDAP test server
 - Some other servers, randomly (not in production).
 I then suspected my FreeBSD install and downloaded the ISO again,
 but the problem was still there. After that I tried installing 9.0
 and upgrading to 9.1, with the same result.
 How can I get information about this problem?


I've had similar issues with one of my FreeBSD systems. My system had
spontaneous reboots without any kernel panic, without any clear evidence of
why it happened.

After a lot of trials and tests, the root cause appeared to be the number of
ZFS snapshots I had: more than 1K on an 8G system.

Upgrading from 9.0 to 9.1 didn't solve the issue; I had to clean up
the ZFS snapshots, and since then it has been more than a month
without any reboots.

A few pointers that you could use -- get these systems monitored and keep an
eye on the monitoring system -- CPU usage, memory, processes, network
traffic, etc. I noticed that my system was running low on free memory,
and that later led me to the ZFS snapshots clue.

So, my advice is to get these systems monitored first and watch for
anything unusual happening. Then further investigate.
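
As one way to act on the monitoring advice above, here is a minimal Python sketch. It is purely illustrative and makes several assumptions: the sustained_low() helper, the 1024 MB threshold and the sample values are all made up for this example, and how the free-memory samples are collected (monitoring agent, cron job, and so on) is left open.

```python
def sustained_low(samples, threshold_mb, min_run=3):
    """Return the start timestamps of runs in which free memory stays
    below threshold_mb for at least min_run consecutive samples.

    samples is an iterable of (timestamp, free_mb) pairs in time order.
    """
    alerts = []
    run_start = None
    run_len = 0
    for timestamp, free_mb in samples:
        if free_mb < threshold_mb:
            if run_len == 0:
                run_start = timestamp
            run_len += 1
            # Report each run once, at the moment it becomes "sustained".
            if run_len == min_run:
                alerts.append(run_start)
        else:
            run_len = 0
    return alerts


# Made-up samples: (minutes since start, free memory in MB).
samples = [(0, 5600), (5, 4100), (10, 900), (15, 450),
           (20, 300), (25, 280), (30, 3000)]

print(sustained_low(samples, threshold_mb=1024))  # [10]
```

Requiring several consecutive low samples is a design choice that keeps a single short dip from raising an alert.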

Good luck.

Regards,
Marin


 Thanks in advance.
 --
 Best regards,

 Loïc BLOT, Engineering
 UNIX Systems, Security and Networks
 http://www.unix-experience.fr







-- 
Marin Atanasov Nikolov

dnaeon AT gmail DOT com
http://www.unix-heaven.org/


Re: Strange reboot since 9.1

2013-03-06 Thread Marcelo Gondim

On 06/03/13 06:18, Marin Atanasov Nikolov wrote:

On Wed, Mar 6, 2013 at 10:55 AM, Loïc Blot loic.b...@unix-experience.fr wrote:


Hello,


Hi,



Since FreeBSD 9.1 I have had strange problems with the distribution. Some
servers reboot instantly, without any kernel panic. At first I
thought it was a problem with my KVM system, but one of my FreeBSD systems
on a Dell R210 has the same problem.
The servers concerned so far are:
- Monitoring server
- LDAP test server
- Some other servers, randomly (not in production).
I then suspected my FreeBSD install and downloaded the ISO again,
but the problem was still there. After that I tried installing 9.0
and upgrading to 9.1, with the same result.
How can I get information about this problem?



I've had similar issues with one of my FreeBSD systems. My system had
spontaneous reboots without any kernel panic, without any clear evidence of
why it happened.

After a lot of trials and tests, the root cause appeared to be the number of
ZFS snapshots I had: more than 1K on an 8G system.

Upgrading from 9.0 to 9.1 didn't solve the issue; I had to clean up
the ZFS snapshots, and since then it has been more than a month
without any reboots.

A few pointers that you could use -- get these systems monitored and keep an
eye on the monitoring system -- CPU usage, memory, processes, network
traffic, etc. I noticed that my system was running low on free memory,
and that later led me to the ZFS snapshots clue.

So, my advice is to get these systems monitored first and watch for
anything unusual happening. Then further investigate.

Good luck.

Regards,
Marin



Thanks in advance.
--
Best regards,

Loïc BLOT, Engineering
UNIX Systems, Security and Networks
http://www.unix-experience.fr

I have the same problem, but I'm using UFS and FreeBSD 9.1-STABLE with
dumpdev enabled (dumpdev=AUTO). After the spontaneous reboots there is
nothing in /var/crash.

Spontaneous reboots always happen between 00:00 and 09:00.

FreeBSD rt01.xxx.com 9.1-STABLE FreeBSD 9.1-STABLE #14 r247497: Thu Feb 
28 21:32:09 BRT 2013 r...@rt01.xxx.com:/usr/obj/usr/src/sys/INTNET amd64


hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU   E5606  @ 2.13GHz
hw.ncpu: 8
hw.byteorder: 1234
hw.physmem: 8509702144
hw.usermem: 7911686144

Handle 0x0003, DMI type 2, 16 bytes
Base Board Information
Manufacturer: Intel Corporation
Product Name: S5500BC
Version: E25124-453
Serial Number: BZBZ04800361
Asset Tag: 
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: Not Specified
Chassis Handle: 0x0004
Type: Motherboard
Contained Object Handles: 0

This motherboard has 2 CPUs.

Regards,
Gondim


Re: Strange reboot since 9.1

2013-03-06 Thread Marcelo Gondim

On 06/03/13 07:55, Marcelo Gondim wrote:

On 06/03/13 06:18, Marin Atanasov Nikolov wrote:
On Wed, Mar 6, 2013 at 10:55 AM, Loïc Blot loic.b...@unix-experience.fr wrote:



Hello,


Hi,



Since FreeBSD 9.1 I have had strange problems with the distribution. Some
servers reboot instantly, without any kernel panic. At first I
thought it was a problem with my KVM system, but one of my FreeBSD systems
on a Dell R210 has the same problem.
The servers concerned so far are:
- Monitoring server
- LDAP test server
- Some other servers, randomly (not in production).
I then suspected my FreeBSD install and downloaded the ISO again,
but the problem was still there. After that I tried installing 9.0
and upgrading to 9.1, with the same result.
How can I get information about this problem?



I've had similar issues with one of my FreeBSD systems. My system had
spontaneous reboots without any kernel panic, without any clear evidence of
why it happened.

After a lot of trials and tests, the root cause appeared to be the number of
ZFS snapshots I had: more than 1K on an 8G system.

Upgrading from 9.0 to 9.1 didn't solve the issue; I had to clean up
the ZFS snapshots, and since then it has been more than a month
without any reboots.

A few pointers that you could use -- get these systems monitored and keep an
eye on the monitoring system -- CPU usage, memory, processes, network
traffic, etc. I noticed that my system was running low on free memory,
and that later led me to the ZFS snapshots clue.

So, my advice is to get these systems monitored first and watch for
anything unusual happening. Then further investigate.

Good luck.

Regards,
Marin



Thanks in advance.
--
Best regards,

Loïc BLOT, Engineering
UNIX Systems, Security and Networks
http://www.unix-experience.fr

I have the same problem, but I'm using UFS and FreeBSD 9.1-STABLE with
dumpdev enabled (dumpdev=AUTO). After the spontaneous reboots there is
nothing in /var/crash.

Spontaneous reboots always happen between 00:00 and 09:00.

FreeBSD rt01.xxx.com 9.1-STABLE FreeBSD 9.1-STABLE #14 r247497: Thu Feb 
28 21:32:09 BRT 2013 r...@rt01.xxx.com:/usr/obj/usr/src/sys/INTNET amd64


hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU   E5606  @ 2.13GHz
hw.ncpu: 8
hw.byteorder: 1234
hw.physmem: 8509702144
hw.usermem: 7911686144

Handle 0x0003, DMI type 2, 16 bytes
Base Board Information
Manufacturer: Intel Corporation
Product Name: S5500BC
Version: E25124-453
Serial Number: BZBZ04800361
Asset Tag: 
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: Not Specified
Chassis Handle: 0x0004
Type: Motherboard
Contained Object Handles: 0

This motherboard has 2 CPUs.

My last log:

boot time  Wed Mar  6 03:14
boot time  Wed Mar  6 02:29
boot time  Tue Mar  5 04:32
boot time  Mon Mar  4 08:16
boot time  Mon Mar  4 07:09
boot time  Mon Mar  4 05:54
boot time  Mon Mar  4 05:14
boot time  Mon Mar  4 04:33
boot time  Mon Mar  4 04:29
boot time  Mon Mar  4 04:10
boot time  Mon Mar  4 04:01
boot time  Mon Mar  4 03:22
boot time  Sun Mar  3 05:55
boot time  Sat Mar  2 08:02
boot time  Sat Mar  2 07:54
boot time  Sat Mar  2 07:11
boot time  Sat Mar  2 05:33
boot time  Sat Mar  2 05:09
boot time  Sat Mar  2 04:56
boot time  Sat Mar  2 04:19
boot time  Sat Mar  2 04:13
boot time  Sat Mar  2 04:04
boot time  Sat Mar  2 03:27
boot time  Sat Mar  2 03:20
boot time  Sat Mar  2 02:51
boot time  Sat Mar  2 02:40
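
A quick way to check the time-of-day pattern in a log like this is to tabulate it. The following is a minimal Python sketch, added for illustration only: the entries are copied from the log above, while the parse_boot_times() helper and the variable names are assumptions of this example.

```python
from collections import Counter

# Entries copied from the boot log above.
BOOT_LOG = """\
boot time  Wed Mar  6 03:14
boot time  Wed Mar  6 02:29
boot time  Tue Mar  5 04:32
boot time  Mon Mar  4 08:16
boot time  Mon Mar  4 07:09
boot time  Mon Mar  4 05:54
boot time  Mon Mar  4 05:14
boot time  Mon Mar  4 04:33
boot time  Mon Mar  4 04:29
boot time  Mon Mar  4 04:10
boot time  Mon Mar  4 04:01
boot time  Mon Mar  4 03:22
boot time  Sun Mar  3 05:55
boot time  Sat Mar  2 08:02
boot time  Sat Mar  2 07:54
boot time  Sat Mar  2 07:11
boot time  Sat Mar  2 05:33
boot time  Sat Mar  2 05:09
boot time  Sat Mar  2 04:56
boot time  Sat Mar  2 04:19
boot time  Sat Mar  2 04:13
boot time  Sat Mar  2 04:04
boot time  Sat Mar  2 03:27
boot time  Sat Mar  2 03:20
boot time  Sat Mar  2 02:51
boot time  Sat Mar  2 02:40
"""


def parse_boot_times(text):
    """Return (day_label, hour, minute) for each 'boot time' line."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 6 and parts[:2] == ["boot", "time"]:
            weekday, month, day, clock = parts[2:]
            hour, minute = (int(x) for x in clock.split(":"))
            entries.append((f"{weekday} {month} {day}", hour, minute))
    return entries


entries = parse_boot_times(BOOT_LOG)
hours = [hour for _, hour, _ in entries]
per_day = Counter(day for day, _, _ in entries)

print(len(entries))            # 26
print(min(hours), max(hours))  # 2 8
print(per_day["Sat Mar 2"], per_day["Mon Mar 4"])  # 13 9
```

All 26 reboots in the log fall between 02:40 and 08:16, and 22 of them happened on just two days (Sat Mar 2 and Mon Mar 4).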


Re: Strange reboot since 9.1

2013-03-06 Thread Service Info
Hi Marin,
I don't use ZFS on this system, only UFS2+J :)
My LDAP servers reboot more often when I compile a program (yesterday
it was while compiling samba36), so I think it happens when the server is
under load (my monitoring server runs 750 NRPE sensors + MRTG under 50
switches all the time, and SNORT).
But CPU and memory are not heavily used:

CPU:  8.4% user,  0.0% nice,  0.6% system,  0.0% interrupt, 91.0% idle
Mem: 709M Active, 606M Inact, 885M Wired, 92M Cache, 826M Buf, 5599M
Free
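
To put a number on the memory line above, here is a minimal Python sketch, added for illustration only. It parses the top(1) "Mem:" line quoted in this message and computes the share of memory reported as free. Leaving "Buf" out of the total is an assumption of this example (it is treated as overlapping the other categories), and parse_mem() is a hypothetical helper.

```python
import re

# The "Mem:" line from the top(1) output above, joined onto one line.
MEM_LINE = ("Mem: 709M Active, 606M Inact, 885M Wired, 92M Cache, "
            "826M Buf, 5599M Free")


def parse_mem(line):
    """Return a {category: megabytes} dict from a top(1) 'Mem:' line."""
    return {name: int(value)
            for value, name in re.findall(r"(\d+)M (\w+)", line)}


mem = parse_mem(MEM_LINE)

# Assumption: "Buf" overlaps other categories, so it is not summed.
total_mb = sum(mb for name, mb in mem.items() if name != "Buf")
free_share = mem["Free"] / total_mb

print(total_mb)                     # 7891
print(round(100 * free_share, 1))   # 71.0
```

Under that assumption roughly 71% of memory was free at the time of the sample, which matches the observation that memory was not under pressure.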

-- 
Best regards, 

Loïc BLOT
UNIX Systems, Security and Networks
01.64.53.31.54
Laboratoire Charles Fabry, CNRS



On Wednesday, 06 March 2013 at 11:18 +0200, Marin Atanasov Nikolov wrote:
 
 
 
 On Wed, Mar 6, 2013 at 10:55 AM, Loïc Blot
 loic.b...@unix-experience.fr wrote:
 Hello,
 
 
 Hi,
 
  
 Since FreeBSD 9.1 I have had strange problems with the distribution. Some
 servers reboot instantly, without any kernel panic. At first I
 thought it was a problem with my KVM system, but one of my FreeBSD systems
 on a Dell R210 has the same problem.
 The servers concerned so far are:
 - Monitoring server
 - LDAP test server
 - Some other servers, randomly (not in production).
 I then suspected my FreeBSD install and downloaded the ISO again,
 but the problem was still there. After that I tried installing 9.0
 and upgrading to 9.1, with the same result.
 How can I get information about this problem?
 
 
 
 I've had similar issues with one of my FreeBSD systems. My system had
 spontaneous reboots without any kernel panic, without any clear
 evidence of why it happened.
 
 
 After a lot of trials and tests, the root cause appeared to be the
 number of ZFS snapshots I had: more than 1K on an 8G system.
 
 Upgrading from 9.0 to 9.1 didn't solve the issue; I had to clean up
 the ZFS snapshots, and since then it has been more than a month
 without any reboots.
 
 A few pointers that you could use -- get these systems monitored and
 keep an eye on the monitoring system -- CPU usage, memory, processes,
 network traffic, etc. I noticed that my system was running low on
 free memory, and that later led me to the ZFS snapshots clue.
 
 So, my advice is to get these systems monitored first and watch for
 anything unusual happening. Then further investigate.
 
 Good luck.
 
 Regards,
 Marin
 
  
 Thanks in advance.
 --
 Best regards,
 
 Loïc BLOT, Engineering
 UNIX Systems, Security and Networks
 http://www.unix-experience.fr
 
 
 
 
 
 
 -- 
 Marin Atanasov Nikolov
 
 dnaeon AT gmail DOT com
 http://www.unix-heaven.org/
