Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On Tue, Jul 21, 2015 at 04:39:07PM +0300, Konstantin Belousov wrote:
> On Tue, Jul 21, 2015 at 05:57:34AM -0700, David Wolfskill wrote:
> > My laptop had no problems, but the build machine has a panic that
> > appears quite reproducible (4 successes out of 4 tries); here's a
> > bit from the core.txt file:
>
> There must be kernel messages before the panic string. They are
> crucial to understand what is going on.
> ...

Sorry I wasn't able to capture those before I needed to do Other
Things.

The machine had a (PCI-attached) serial console that was working for
FreeBSD (thanks mostly to sbruno's help), but Something seems to Have
Happened, and that's not presently working (even in stable/10, where I
first got it working).

I will try to get it working again, but I doubt I will have time to
focus on that until about 9 hours from now.

Peace,
david
--
David H. Wolfskill da...@catwhisker.org
Those who murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On Tue, Jul 21, 2015 at 09:19:27AM -0700, David Wolfskill wrote:
> On Tue, Jul 21, 2015 at 04:39:07PM +0300, Konstantin Belousov wrote:
> > On Tue, Jul 21, 2015 at 05:57:34AM -0700, David Wolfskill wrote:
> > > My laptop had no problems, but the build machine has a panic that
> > > appears quite reproducible (4 successes out of 4 tries); here's a
> > > bit from the core.txt file:
> >
> > There must be kernel messages before the panic string. They are
> > crucial to understand what is going on.
> > ...
>
> Sorry I wasn't able to capture those before I needed to do Other
> Things.
>
> The machine had a (PCI-attached) serial console that was working for
> FreeBSD (thanks mostly to sbruno's help), but Something seems to Have
> Happened, and that's not presently working (even in stable/10, where
> I first got it working).
>
> I will try to get it working again, but I doubt I will have time to
> focus on that until about 9 hours from now.

It's possible to extract log messages leading up to the panic from the
vmcore. From the kgdb prompt, running

(kgdb) printf "%s", msgbufp->msg_ptr

should bring them up.

And, I just noticed that you posted the core.txt, which contains this
info near the end:
http://www.catwhisker.org/~david/FreeBSD/head/core.txt.1
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
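For readers wondering why the raw buffer printed this way may not start at the oldest message: the kernel message buffer is circular, so once it has wrapped, the newest text can sit before older text in memory. The sketch below is a toy model only. struct toy_msgbuf and both functions are hypothetical names made up for illustration, not the kernel's struct msgbuf or its API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified ring buffer; NOT the kernel's struct msgbuf. */
struct toy_msgbuf {
	char	*ptr;	/* backing storage */
	size_t	 size;	/* capacity in bytes */
	size_t	 wseq;	/* total number of bytes ever written */
};

/* Append a string, overwriting the oldest bytes once the buffer is full. */
static void
toy_msgbuf_write(struct toy_msgbuf *mb, const char *s)
{
	assert(mb->size > 0);
	for (; *s != '\0'; s++)
		mb->ptr[mb->wseq++ % mb->size] = *s;
}

/*
 * Copy the buffered text into out, oldest byte first, and NUL-terminate
 * it.  out must have room for mb->size + 1 bytes.  Returns the number
 * of bytes copied.
 */
static size_t
toy_msgbuf_linearize(const struct toy_msgbuf *mb, char *out)
{
	size_t n = mb->wseq < mb->size ? mb->wseq : mb->size;
	size_t start = mb->wseq - n;
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = mb->ptr[(start + i) % mb->size];
	out[n] = '\0';
	return (n);
}
```

With an 8-byte buffer, writing "abcdef" and then "ghij" leaves the storage holding "ijcdefgh", while the linearized copy reads "cdefghij". In practice the core.txt file already carries the messages in order, so this only matters when reading the raw buffer by hand.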
Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On Tue, Jul 21, 2015 at 07:17:43PM +, Mark Johnston wrote:
> On Tue, Jul 21, 2015 at 09:19:27AM -0700, David Wolfskill wrote:
> > On Tue, Jul 21, 2015 at 04:39:07PM +0300, Konstantin Belousov wrote:
> > > On Tue, Jul 21, 2015 at 05:57:34AM -0700, David Wolfskill wrote:
> > > > My laptop had no problems, but the build machine has a panic
> > > > that appears quite reproducible (4 successes out of 4 tries);
> > > > here's a bit from the core.txt file:
> > >
> > > There must be kernel messages before the panic string. They are
> > > crucial to understand what is going on.
> > > ...
> >
> > Sorry I wasn't able to capture those before I needed to do Other
> > Things.
> >
> > The machine had a (PCI-attached) serial console that was working
> > for FreeBSD (thanks mostly to sbruno's help), but Something seems
> > to Have Happened, and that's not presently working (even in
> > stable/10, where I first got it working).
> >
> > I will try to get it working again, but I doubt I will have time to
> > focus on that until about 9 hours from now.
>
> It's possible to extract log messages leading up to the panic from
> the vmcore. From the kgdb prompt, running
>
> (kgdb) printf "%s", msgbufp->msg_ptr
>
> should bring them up.
>
> And, I just noticed that you posted the core.txt, which contains this
> info near the end:
> http://www.catwhisker.org/~david/FreeBSD/head/core.txt.1

Indeed, thank you.

ithread_loop() at ithread_loop+0xa6/frame 0xfe083b9c0a70
fork_exit() at fork_exit+0x84/frame 0xfe083b9c0ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe083b9c0ab0
--- trap 0, rip = 0, rsp = 0xfe083b9c0b70, rbp = 0 ---
suspending ithread with the following locks held:
shared rw udpinp (udpinp) r = 3 (0xf80010c7d7b0) locked @ /usr/src/sys/netinet6/in6_pcb.c:1174
panic: witness_warn
cpuid = 3

So it looks like net swi, leaking some udp6 lock.
Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On Tue, Jul 21, 2015 at 03:21:16PM -0500, Eric van Gyzen wrote:
> ...
> > > So it looks like net swi, leaking some udp6 lock.
> >
> > Curiouser and curiouser...
> >
> > While I'm not taking any special pains to avoid building IPv6, I'm
> > not actively actually doing anything with it (IPv6), either (for
> > both the failing machine and my laptop).
> >
> > Once I'm back home, I should be able to poke around in ddb after
> > re-creating the panic, if that would be a useful thing for me to do
> > (and given some hints as to what to poke).
> >
> > Naturally, I'm also happy to change bits of sources, rebuild, and
> > smoke-test.
> >
> > A quick check from the SVN update output only shows r285710,
> > r285711, and r285740 in the range from (r285685,r285741] -- as the
> > kernel running r285685 had no known issues -- that touched
> > sys/netinet6/*.
>
> It's a multicast destination. Maybe something is using mDNS?
>
> Randall, does the test on line 406 of udp6_usrreq.c need to be
> inverted?
>
> Eric

DING! We have a winner!

FreeBSD freebeast.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1789 r285741M/285741:1100077: Tue Jul 21 14:50:59 PDT 2015 r...@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/GENERIC amd64

freebeast(11.0-C)[3] cd /usr/src
freebeast(11.0-C)[4] svn diff sys/netinet
netinet/ netinet6/
freebeast(11.0-C)[4] svn diff sys/netinet*
Index: sys/netinet6/udp6_usrreq.c
===================================================================
--- sys/netinet6/udp6_usrreq.c	(revision 285741)
+++ sys/netinet6/udp6_usrreq.c	(working copy)
@@ -403,7 +403,7 @@
 		INP_RLOCK(last);
 		INP_INFO_RUNLOCK(pcbinfo);
 		UDP_PROBE(receive, NULL, last, ip6, last, uh);
-		if (udp6_append(last, m, off, fromsa))
+		if (! udp6_append(last, m, off, fromsa))
 			INP_RUNLOCK(last);
 inp_lost:
 		return (IPPROTO_DONE);
freebeast(11.0-C)[5]

Thanks! :-)

Peace,
david
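A toy model of why inverting that test balances the lock may help readers who don't have the source at hand. It rests on one assumption that the thread does not state outright: that the append routine returns nonzero when it has already dropped the inpcb read lock itself (the "inp lost" case), and zero when the caller still owns the lock. Every name below is hypothetical, and the lock is modelled as a plain counter rather than a real rwlock.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an inpcb: the read lock is just a holder count. */
struct toy_inp {
	int	rlock_count;
};

static void
toy_rlock(struct toy_inp *inp)
{
	inp->rlock_count++;
}

static void
toy_runlock(struct toy_inp *inp)
{
	inp->rlock_count--;
}

/*
 * Hypothetical append routine.  ASSUMPTION: returns 1 if it released
 * the lock itself ("inp lost"), 0 if the caller must still release it.
 */
static int
toy_append(struct toy_inp *inp, int lose_inp)
{
	if (lose_inp) {
		toy_runlock(inp);
		return (1);
	}
	return (0);
}

/* Shape of the test at r285741: unlock only when append returned nonzero. */
static int
deliver_original(struct toy_inp *inp, int lose_inp)
{
	assert(inp != NULL);
	toy_rlock(inp);
	if (toy_append(inp, lose_inp))
		toy_runlock(inp);
	return (inp->rlock_count);	/* 0 means balanced */
}

/* Shape of the inverted test from the diff above. */
static int
deliver_inverted(struct toy_inp *inp, int lose_inp)
{
	assert(inp != NULL);
	toy_rlock(inp);
	if (!toy_append(inp, lose_inp))
		toy_runlock(inp);
	return (inp->rlock_count);	/* 0 means balanced */
}
```

Under that assumption, deliver_original() leaves the count at 1 in the ordinary case (the leaked read lock that witness complains about when the ithread suspends) and at -1 in the lost case (a double unlock), while deliver_inverted() returns 0 in both.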
Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On Tue, Jul 21, 2015 at 10:28:32PM +0300, Konstantin Belousov wrote:
> ...
> Indeed, thank you.
>
> ithread_loop() at ithread_loop+0xa6/frame 0xfe083b9c0a70
> fork_exit() at fork_exit+0x84/frame 0xfe083b9c0ab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe083b9c0ab0
> --- trap 0, rip = 0, rsp = 0xfe083b9c0b70, rbp = 0 ---
> suspending ithread with the following locks held:
> shared rw udpinp (udpinp) r = 3 (0xf80010c7d7b0) locked @ /usr/src/sys/netinet6/in6_pcb.c:1174
> panic: witness_warn
> cpuid = 3
>
> So it looks like net swi, leaking some udp6 lock.

Curiouser and curiouser...

While I'm not taking any special pains to avoid building IPv6, I'm not
actively actually doing anything with it (IPv6), either (for both the
failing machine and my laptop).

Once I'm back home, I should be able to poke around in ddb after
re-creating the panic, if that would be a useful thing for me to do
(and given some hints as to what to poke).

Naturally, I'm also happy to change bits of sources, rebuild, and
smoke-test.

A quick check from the SVN update output only shows r285710, r285711,
and r285740 in the range from (r285685,r285741] -- as the kernel
running r285685 had no known issues -- that touched sys/netinet6/*.

Peace,
david
Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On 07/21/2015 15:05, David Wolfskill wrote:
> On Tue, Jul 21, 2015 at 10:28:32PM +0300, Konstantin Belousov wrote:
> > ...
> > Indeed, thank you.
> >
> > ithread_loop() at ithread_loop+0xa6/frame 0xfe083b9c0a70
> > fork_exit() at fork_exit+0x84/frame 0xfe083b9c0ab0
> > fork_trampoline() at fork_trampoline+0xe/frame 0xfe083b9c0ab0
> > --- trap 0, rip = 0, rsp = 0xfe083b9c0b70, rbp = 0 ---
> > suspending ithread with the following locks held:
> > shared rw udpinp (udpinp) r = 3 (0xf80010c7d7b0) locked @ /usr/src/sys/netinet6/in6_pcb.c:1174
> > panic: witness_warn
> > cpuid = 3
> >
> > So it looks like net swi, leaking some udp6 lock.
>
> Curiouser and curiouser...
>
> While I'm not taking any special pains to avoid building IPv6, I'm
> not actively actually doing anything with it (IPv6), either (for both
> the failing machine and my laptop).
>
> Once I'm back home, I should be able to poke around in ddb after
> re-creating the panic, if that would be a useful thing for me to do
> (and given some hints as to what to poke).
>
> Naturally, I'm also happy to change bits of sources, rebuild, and
> smoke-test.
>
> A quick check from the SVN update output only shows r285710, r285711,
> and r285740 in the range from (r285685,r285741] -- as the kernel
> running r285685 had no known issues -- that touched sys/netinet6/*.

It's a multicast destination. Maybe something is using mDNS?

Randall, does the test on line 406 of udp6_usrreq.c need to be
inverted?

Eric
Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On 07/21/2015 15:21, Eric van Gyzen wrote:
> On 07/21/2015 15:05, David Wolfskill wrote:
> > On Tue, Jul 21, 2015 at 10:28:32PM +0300, Konstantin Belousov wrote:
> > > ...
> > > Indeed, thank you.
> > >
> > > ithread_loop() at ithread_loop+0xa6/frame 0xfe083b9c0a70
> > > fork_exit() at fork_exit+0x84/frame 0xfe083b9c0ab0
> > > fork_trampoline() at fork_trampoline+0xe/frame 0xfe083b9c0ab0
> > > --- trap 0, rip = 0, rsp = 0xfe083b9c0b70, rbp = 0 ---
> > > suspending ithread with the following locks held:
> > > shared rw udpinp (udpinp) r = 3 (0xf80010c7d7b0) locked @ /usr/src/sys/netinet6/in6_pcb.c:1174
> > > panic: witness_warn
> > > cpuid = 3
> > >
> > > So it looks like net swi, leaking some udp6 lock.
> >
> > Curiouser and curiouser...
> >
> > While I'm not taking any special pains to avoid building IPv6, I'm
> > not actively actually doing anything with it (IPv6), either (for
> > both the failing machine and my laptop).
> >
> > Once I'm back home, I should be able to poke around in ddb after
> > re-creating the panic, if that would be a useful thing for me to do
> > (and given some hints as to what to poke).
> >
> > Naturally, I'm also happy to change bits of sources, rebuild, and
> > smoke-test.
> >
> > A quick check from the SVN update output only shows r285710,
> > r285711, and r285740 in the range from (r285685,r285741] -- as the
> > kernel running r285685 had no known issues -- that touched
> > sys/netinet6/*.
>
> It's a multicast destination. Maybe something is using mDNS?

Blurf. I wonder if it's a multicast destination. (I need more
chocolate.)

> Randall, does the test on line 406 of udp6_usrreq.c need to be
> inverted?

Eric
Re: panic: witness_warn head/amd64 @r285741 on 1 of 2 machines
On Tue, Jul 21, 2015 at 05:57:34AM -0700, David Wolfskill wrote:
> My laptop had no problems, but the build machine has a panic that
> appears quite reproducible (4 successes out of 4 tries); here's a bit
> from the core.txt file:

There must be kernel messages before the panic string. They are
crucial to understand what is going on.

> freebeast.catwhisker.org dumped core - see /var/crash/vmcore.1
>
> Tue Jul 21 05:36:11 PDT 2015
>
> FreeBSD freebeast.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1787 r285741M/285741:1100077: Tue Jul 21 04:48:37 PDT 2015 r...@freebeast.catwhisker.org:/common/S4/obj/usr/src/sys/GENERIC amd64
>
> panic: witness_warn