Re: -CURRENT Panic at boot at Revision: 237264 mutex gif softc not owned at /usr/src/sys/netinet/in_gif.c:105
Hi, Just a quick ping to make sure this isnt forgotten. Would it help if I open a PR? regards, Vince On 21/06/2012 19:34, John Baldwin wrote: On Thursday, June 21, 2012 12:41:59 pm Vincent Hoffman wrote: Hi again, The 2nd patch (to if.h and if_gif.c) also fixes the panic on boot. Thanks for the quick response as ever. Great, thanks for testing! Randall, do you have any thoughts on these patches? Vince ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: -CURRENT Panic at boot at Revision: 237264 mutex gif softc not owned at /usr/src/sys/netinet/in_gif.c:105
Hi again, The 2nd patch (to if.h and if_gif.c) also fixes the panic on boot. Thanks for the quick response as ever. Vince On 20/06/2012 13:12, John Baldwin wrote: On Tuesday, June 19, 2012 8:05:36 pm Vincent Hoffman wrote: Full dump info at http://unsane.co.uk/crash It seems to have popped up between r236905 (working kernel) and r237264 (this panic) the gif config I have in rc.conf is for a HE ipv6 tunnel Looks like this was broken in r236951 by Randall (cc'd). I think this would fix it: Index: if_gif.c === --- if_gif.c (revision 237227) +++ if_gif.c (working copy) @@ -366,11 +366,12 @@ gif_start(struct ifnet *ifp) return; } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { + GIF_UNLOCK(sc); IFQ_DRV_DEQUEUE(ifp-if_snd, m); + GIF_LOCK(sc); if (m == 0) break; @@ -424,14 +425,12 @@ keep_going: ifp-if_oerrors++; } - GIF_LOCK(sc); if (ifp-if_drv_flags IFF_GIF_WANTED) { /* Someone did a start while * we were unlocked and processing * lets clear the flag and try again. */ ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); goto keep_going; } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; However, unless there is a known LOR, I would be inclined to just hold the lock across IFQ_DRV_DEQUEUE() and dispense with all the 'keep_going', etc. logic. Other NIC drivers tend to just hold their transmit lock for the entire loop in their start routines. That would look like this: Index: if.h === --- if.h (revision 237227) +++ if.h (working copy) @@ -153,7 +153,6 @@ #define IFF_STATICARP 0x8 /* (n) static ARP */ #define IFF_DYING 0x20/* (n) interface is winding down */ #define IFF_RENAMING0x40/* (n) interface is being renamed */ -#define IFF_GIF_WANTED 0x100 /* (n) The gif tunnel is wanted */ /* * Old names for driver flags so that user space tools can continue to use * the old (portable) names. Index: if_gif.c === --- if_gif.c (revision 237227) +++ if_gif.c (working copy) @@ -359,15 +359,7 @@ sc = ifp-if_softc; GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_DRV_OACTIVE) { - /* Already active */ - ifp-if_drv_flags |= IFF_GIF_WANTED; - GIF_UNLOCK(sc); - return; - } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); -keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { IFQ_DRV_DEQUEUE(ifp-if_snd, m); @@ -424,16 +416,6 @@ ifp-if_oerrors++; } - GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_GIF_WANTED) { - /* Someone did a start while - * we were unlocked and processing - * lets clear the flag and try again. - */ - ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); - goto keep_going; - } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; GIF_UNLOCK(sc); return; I would prefer this latter patch if it is ok as it makes the code simpler. Also, IFF_GIF_WANTED as a new iff flag seems really hackish. IFF_* flags are supposed to be interface independent. A flag like that should be in a private field in the gif softc, not something exposed to the entire system. cloned_interfaces=gif0 ifconfig_gif0=tunnel 85.233.185.162 216.66.80.26 ifconfig_gif0_ipv6=inet6 2001:470:1f08:110::2 2001:470:1f08:110::1 prefixlen 128 -accept_rtadv src.conf only has WITHOUT_IPFILTER=true WITHOUT_KERBEROS=true WITHOUT_PROFILE=yes Happy to provide any more info as needed. any suggestions welcome, I'll see if I can track it further with a binary search tomorrow. From dump info file (at above URL) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 266 if (textdump textdump_pending) { (kgdb) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 #1 0x80314740 in db_dump (dummy=Variable dummy is not available. ) at /usr/src/sys/ddb/db_command.c:538 #2 0x80313d31 in db_command (last_cmdp=0x80c52b40, cmd_table=Variable cmd_table is not available. ) at /usr/src/sys/ddb/db_command.c:449 #3 0x80313f80 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502 #4 0x803160d9 in db_trap (type=Variable type is not available. ) at /usr/src/sys/ddb/db_main.c:231 #5 0x80590918 in kdb_trap
Re: -CURRENT Panic at boot at Revision: 237264 mutex gif softc not owned at /usr/src/sys/netinet/in_gif.c:105
On Thursday, June 21, 2012 12:41:59 pm Vincent Hoffman wrote: Hi again, The 2nd patch (to if.h and if_gif.c) also fixes the panic on boot. Thanks for the quick response as ever. Great, thanks for testing! Randall, do you have any thoughts on these patches? Vince On 20/06/2012 13:12, John Baldwin wrote: On Tuesday, June 19, 2012 8:05:36 pm Vincent Hoffman wrote: Full dump info at http://unsane.co.uk/crash It seems to have popped up between r236905 (working kernel) and r237264 (this panic) the gif config I have in rc.conf is for a HE ipv6 tunnel Looks like this was broken in r236951 by Randall (cc'd). I think this would fix it: Index: if_gif.c === --- if_gif.c(revision 237227) +++ if_gif.c(working copy) @@ -366,11 +366,12 @@ gif_start(struct ifnet *ifp) return; } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { + GIF_UNLOCK(sc); IFQ_DRV_DEQUEUE(ifp-if_snd, m); + GIF_LOCK(sc); if (m == 0) break; @@ -424,14 +425,12 @@ keep_going: ifp-if_oerrors++; } - GIF_LOCK(sc); if (ifp-if_drv_flags IFF_GIF_WANTED) { /* Someone did a start while * we were unlocked and processing * lets clear the flag and try again. */ ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); goto keep_going; } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; However, unless there is a known LOR, I would be inclined to just hold the lock across IFQ_DRV_DEQUEUE() and dispense with all the 'keep_going', etc. logic. Other NIC drivers tend to just hold their transmit lock for the entire loop in their start routines. That would look like this: Index: if.h === --- if.h(revision 237227) +++ if.h(working copy) @@ -153,7 +153,6 @@ #defineIFF_STATICARP 0x8 /* (n) static ARP */ #defineIFF_DYING 0x20/* (n) interface is winding down */ #defineIFF_RENAMING0x40/* (n) interface is being renamed */ -#define IFF_GIF_WANTED 0x100 /* (n) The gif tunnel is wanted */ /* * Old names for driver flags so that user space tools can continue to use * the old (portable) names. Index: if_gif.c === --- if_gif.c(revision 237227) +++ if_gif.c(working copy) @@ -359,15 +359,7 @@ sc = ifp-if_softc; GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_DRV_OACTIVE) { - /* Already active */ - ifp-if_drv_flags |= IFF_GIF_WANTED; - GIF_UNLOCK(sc); - return; - } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); -keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { IFQ_DRV_DEQUEUE(ifp-if_snd, m); @@ -424,16 +416,6 @@ ifp-if_oerrors++; } - GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_GIF_WANTED) { - /* Someone did a start while -* we were unlocked and processing -* lets clear the flag and try again. -*/ - ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); - goto keep_going; - } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; GIF_UNLOCK(sc); return; I would prefer this latter patch if it is ok as it makes the code simpler. Also, IFF_GIF_WANTED as a new iff flag seems really hackish. IFF_* flags are supposed to be interface independent. A flag like that should be in a private field in the gif softc, not something exposed to the entire system. cloned_interfaces=gif0 ifconfig_gif0=tunnel 85.233.185.162 216.66.80.26 ifconfig_gif0_ipv6=inet6 2001:470:1f08:110::2 2001:470:1f08:110::1 prefixlen 128 -accept_rtadv src.conf only has WITHOUT_IPFILTER=true WITHOUT_KERBEROS=true WITHOUT_PROFILE=yes Happy to provide any more info as needed. any suggestions welcome, I'll see if I can track it further with a binary search tomorrow. From dump info file (at above URL) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 266 if (textdump textdump_pending) { (kgdb) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 #1 0x80314740 in db_dump (dummy=Variable dummy is not available. ) at /usr/src/sys/ddb/db_command.c:538 #2 0x80313d31 in db_command (last_cmdp=0x80c52b40, cmd_table=Variable cmd_table is not available. ) at /usr/src/sys/ddb/db_command.c:449 #3 0x80313f80 in
Re: -CURRENT Panic at boot at Revision: 237264 mutex gif softc not owned at /usr/src/sys/netinet/in_gif.c:105
On Tuesday, June 19, 2012 8:05:36 pm Vincent Hoffman wrote: Full dump info at http://unsane.co.uk/crash It seems to have popped up between r236905 (working kernel) and r237264 (this panic) the gif config I have in rc.conf is for a HE ipv6 tunnel Looks like this was broken in r236951 by Randall (cc'd). I think this would fix it: Index: if_gif.c === --- if_gif.c(revision 237227) +++ if_gif.c(working copy) @@ -366,11 +366,12 @@ gif_start(struct ifnet *ifp) return; } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { + GIF_UNLOCK(sc); IFQ_DRV_DEQUEUE(ifp-if_snd, m); + GIF_LOCK(sc); if (m == 0) break; @@ -424,14 +425,12 @@ keep_going: ifp-if_oerrors++; } - GIF_LOCK(sc); if (ifp-if_drv_flags IFF_GIF_WANTED) { /* Someone did a start while * we were unlocked and processing * lets clear the flag and try again. */ ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); goto keep_going; } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; However, unless there is a known LOR, I would be inclined to just hold the lock across IFQ_DRV_DEQUEUE() and dispense with all the 'keep_going', etc. logic. Other NIC drivers tend to just hold their transmit lock for the entire loop in their start routines. That would look like this: Index: if.h === --- if.h(revision 237227) +++ if.h(working copy) @@ -153,7 +153,6 @@ #defineIFF_STATICARP 0x8 /* (n) static ARP */ #defineIFF_DYING 0x20/* (n) interface is winding down */ #defineIFF_RENAMING0x40/* (n) interface is being renamed */ -#define IFF_GIF_WANTED 0x100 /* (n) The gif tunnel is wanted */ /* * Old names for driver flags so that user space tools can continue to use * the old (portable) names. Index: if_gif.c === --- if_gif.c(revision 237227) +++ if_gif.c(working copy) @@ -359,15 +359,7 @@ sc = ifp-if_softc; GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_DRV_OACTIVE) { - /* Already active */ - ifp-if_drv_flags |= IFF_GIF_WANTED; - GIF_UNLOCK(sc); - return; - } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); -keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { IFQ_DRV_DEQUEUE(ifp-if_snd, m); @@ -424,16 +416,6 @@ ifp-if_oerrors++; } - GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_GIF_WANTED) { - /* Someone did a start while -* we were unlocked and processing -* lets clear the flag and try again. -*/ - ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); - goto keep_going; - } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; GIF_UNLOCK(sc); return; I would prefer this latter patch if it is ok as it makes the code simpler. Also, IFF_GIF_WANTED as a new iff flag seems really hackish. IFF_* flags are supposed to be interface independent. A flag like that should be in a private field in the gif softc, not something exposed to the entire system. cloned_interfaces=gif0 ifconfig_gif0=tunnel 85.233.185.162 216.66.80.26 ifconfig_gif0_ipv6=inet6 2001:470:1f08:110::2 2001:470:1f08:110::1 prefixlen 128 -accept_rtadv src.conf only has WITHOUT_IPFILTER=true WITHOUT_KERBEROS=true WITHOUT_PROFILE=yes Happy to provide any more info as needed. any suggestions welcome, I'll see if I can track it further with a binary search tomorrow. From dump info file (at above URL) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 266 if (textdump textdump_pending) { (kgdb) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 #1 0x80314740 in db_dump (dummy=Variable dummy is not available. ) at /usr/src/sys/ddb/db_command.c:538 #2 0x80313d31 in db_command (last_cmdp=0x80c52b40, cmd_table=Variable cmd_table is not available. ) at /usr/src/sys/ddb/db_command.c:449 #3 0x80313f80 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502 #4 0x803160d9 in db_trap (type=Variable type is not available. ) at /usr/src/sys/ddb/db_main.c:231 #5 0x80590918 in kdb_trap (type=3, code=0, tf=0xff80ea22ee20) at /usr/src/sys/kern/subr_kdb.c:654 #6 0x80815c9d in trap (frame=0xff80ea22ee20) at
Re: -CURRENT Panic at boot at Revision: 237264 mutex gif softc not owned at /usr/src/sys/netinet/in_gif.c:105
The patch to gif.c does fix it. I'll try the second patch later when I get a chance. Thanks, Vince On 20/06/2012 13:12, John Baldwin wrote: On Tuesday, June 19, 2012 8:05:36 pm Vincent Hoffman wrote: Full dump info at http://unsane.co.uk/crash It seems to have popped up between r236905 (working kernel) and r237264 (this panic) the gif config I have in rc.conf is for a HE ipv6 tunnel Looks like this was broken in r236951 by Randall (cc'd). I think this would fix it: Index: if_gif.c === --- if_gif.c (revision 237227) +++ if_gif.c (working copy) @@ -366,11 +366,12 @@ gif_start(struct ifnet *ifp) return; } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { + GIF_UNLOCK(sc); IFQ_DRV_DEQUEUE(ifp-if_snd, m); + GIF_LOCK(sc); if (m == 0) break; @@ -424,14 +425,12 @@ keep_going: ifp-if_oerrors++; } - GIF_LOCK(sc); if (ifp-if_drv_flags IFF_GIF_WANTED) { /* Someone did a start while * we were unlocked and processing * lets clear the flag and try again. */ ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); goto keep_going; } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; However, unless there is a known LOR, I would be inclined to just hold the lock across IFQ_DRV_DEQUEUE() and dispense with all the 'keep_going', etc. logic. Other NIC drivers tend to just hold their transmit lock for the entire loop in their start routines. That would look like this: Index: if.h === --- if.h (revision 237227) +++ if.h (working copy) @@ -153,7 +153,6 @@ #define IFF_STATICARP 0x8 /* (n) static ARP */ #define IFF_DYING 0x20/* (n) interface is winding down */ #define IFF_RENAMING0x40/* (n) interface is being renamed */ -#define IFF_GIF_WANTED 0x100 /* (n) The gif tunnel is wanted */ /* * Old names for driver flags so that user space tools can continue to use * the old (portable) names. Index: if_gif.c === --- if_gif.c (revision 237227) +++ if_gif.c (working copy) @@ -359,15 +359,7 @@ sc = ifp-if_softc; GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_DRV_OACTIVE) { - /* Already active */ - ifp-if_drv_flags |= IFF_GIF_WANTED; - GIF_UNLOCK(sc); - return; - } ifp-if_drv_flags |= IFF_DRV_OACTIVE; - GIF_UNLOCK(sc); -keep_going: while (!IFQ_DRV_IS_EMPTY(ifp-if_snd)) { IFQ_DRV_DEQUEUE(ifp-if_snd, m); @@ -424,16 +416,6 @@ ifp-if_oerrors++; } - GIF_LOCK(sc); - if (ifp-if_drv_flags IFF_GIF_WANTED) { - /* Someone did a start while - * we were unlocked and processing - * lets clear the flag and try again. - */ - ifp-if_drv_flags = ~IFF_GIF_WANTED; - GIF_UNLOCK(sc); - goto keep_going; - } ifp-if_drv_flags = ~IFF_DRV_OACTIVE; GIF_UNLOCK(sc); return; I would prefer this latter patch if it is ok as it makes the code simpler. Also, IFF_GIF_WANTED as a new iff flag seems really hackish. IFF_* flags are supposed to be interface independent. A flag like that should be in a private field in the gif softc, not something exposed to the entire system. cloned_interfaces=gif0 ifconfig_gif0=tunnel 85.233.185.162 216.66.80.26 ifconfig_gif0_ipv6=inet6 2001:470:1f08:110::2 2001:470:1f08:110::1 prefixlen 128 -accept_rtadv src.conf only has WITHOUT_IPFILTER=true WITHOUT_KERBEROS=true WITHOUT_PROFILE=yes Happy to provide any more info as needed. any suggestions welcome, I'll see if I can track it further with a binary search tomorrow. From dump info file (at above URL) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 266 if (textdump textdump_pending) { (kgdb) #0 doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:266 #1 0x80314740 in db_dump (dummy=Variable dummy is not available. ) at /usr/src/sys/ddb/db_command.c:538 #2 0x80313d31 in db_command (last_cmdp=0x80c52b40, cmd_table=Variable cmd_table is not available. ) at /usr/src/sys/ddb/db_command.c:449 #3 0x80313f80 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502 #4 0x803160d9 in db_trap (type=Variable type is not available. ) at /usr/src/sys/ddb/db_main.c:231 #5 0x80590918 in kdb_trap (type=3, code=0, tf=0xff80ea22ee20)