Panic using QLogic NetXtreme II BCM57810 with latest CURRENT snapshot

2015-05-12 Thread Niclas Zeising
Hi!
I got the following panic with a QLogic NetXtreme II BCM57810 when
trying to assign an IP address using dhclient.  The network card uses
the bxe driver.  The machine in question is a HP DL380 Gen9.

Kernel page fault with the following non-sleepable locks held:
shared rw if_addr_lock (if_addr_lock) locked @ /usr/src/sys/net/if.c:1539
exclusive sleep mutex bxe0_mcast_lock locked @ /usr/src/sys/dev/bxe/bxe.c:12548

See screenshots at the links below for details and a stack trace.
I can provoke this panic at will; let me know if you need more details.
Unfortunately I don't currently have access to a console where I can copy
things out, so screenshots will have to do.

Screenshot 1: https://people.freebsd.org/~zeising/panic1.png
Screenshot 2: https://people.freebsd.org/~zeising/panic2.png

Regards!
-- 
Niclas

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Panic using QLogic NetXtreme II BCM57810 with latest CURRENT snapshot

2015-05-12 Thread Sergey Kandaurov
On 13 May 2015 at 00:21, Niclas Zeising <zeis...@freebsd.org> wrote:
> Hi!
> I got the following panic with a QLogic NetXtreme II BCM57810 when
> trying to assign an IP address using dhclient.  The network card uses
> the bxe driver.  The machine in question is a HP DL380 Gen9.
>
> Kernel page fault with the following non-sleepable locks held:
> shared rw if_addr_lock (if_addr_lock) locked @ /usr/src/sys/net/if.c:1539
> exclusive sleep mutex bxe0_mcast_lock locked @ /usr/src/sys/dev/bxe/bxe.c:12548
>
> See screenshots at the links below for details and a stack trace.
> I can provoke this panic at will, let me know if you need more details.
> Unfortunately I don't have access to a console where I can copy things
> out currently, so screenshots have to do.
>
> Screenshot 1: https://people.freebsd.org/~zeising/panic1.png
> Screenshot 2: https://people.freebsd.org/~zeising/panic2.png


I'm not in any way a network/bxe expert, and this is probably unrelated,
but I see at least one missing unlock on the error path.

Index: sys/dev/bxe/bxe.c
===
--- sys/dev/bxe/bxe.c   (revision 282468)
+++ sys/dev/bxe/bxe.c   (working copy)
@@ -12551,6 +12551,7 @@
 rc = ecore_config_mcast(sc, &rparam, ECORE_MCAST_CMD_DEL);
 if (rc < 0) {
 BLOGE(sc, "Failed to clear multicast configuration: %d\n", rc);
+BXE_MCAST_UNLOCK(sc);
 return (rc);
 }
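
The pattern behind that one-line fix can be sketched in user space. This is
only an illustration: toy_lock, config_step() and clear_config() are
hypothetical names and not bxe code, and the toy lock merely tracks whether
it is held. The point is that every return taken after the lock is acquired
has to release it first.

```c
#include <assert.h>

/* Toy non-recursive lock for illustration only; not a real mutex. */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);   /* taking it twice would deadlock a real mutex */
    l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical stand-in for a configuration call that can fail. */
static int config_step(int fail)
{
    return (fail ? -1 : 0);
}

/* Returns 0 on success or a negative value on failure. */
int clear_config(struct toy_lock *l, int fail)
{
    int rc;

    toy_lock_acquire(l);
    rc = config_step(fail);
    if (rc < 0) {
        toy_lock_release(l);    /* release on the error path as well */
        return (rc);
    }
    toy_lock_release(l);
    return (0);
}
```

Without the release inside the error branch, the lock would stay held after
a failed call and the next acquire would trip the assertion (or, with a real
mutex, block or panic).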

BXE_MCAST_LOCK acquires two locks: the sc mutex and if_maddr_rlock(ifp).

OTOH, in bxe_init_mcast_macs_list(), further down the path, if_maddr_rlock is
acquired (and released) one more time, in the if_multiaddr_array /
if_multiaddr_count functions. Is it recursive?

Another suspect is the bcopy done under the lock. It is probably inlined
into bxe_handle_rx_mode_tq() in the ddb trace, so the actual place
where it's called is not visible.
My guess is the bcopy in bxe_init_mcast_macs_list():

 bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);

Previously, there was a pointer assignment, see stable/10:

mc_mac->mac = (uint8_t *)LLADDR((struct sockaddr_dl *)ifma->ifma_addr);

mc_mac itself is malloc(M_ZERO)'ed, so mc_mac->mac is NULL and the bcopy
writes through a NULL pointer.
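
A small user-space sketch of the difference, under stated assumptions:
struct mac_elem, elem_alloc() and elem_set() are hypothetical stand-ins for
the driver's list element and are not taken from bxe.c, and calloc() plays
the role of malloc(M_ZERO). The element only holds a pointer, so after a
zeroed allocation there is no storage behind it to copy into; assigning the
pointer makes it refer to the existing address table instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ADDR_LEN 6  /* same value as ETHER_ADDR_LEN */

/* Hypothetical element: holds a pointer to an address, not the bytes. */
struct mac_elem {
    uint8_t *mac;
};

/* Zeroed allocation, so the mac pointer starts out NULL. */
struct mac_elem *elem_alloc(void)
{
    return (calloc(1, sizeof(struct mac_elem)));
}

/* Point the element at entry i of a flat address table; nothing is copied. */
void elem_set(struct mac_elem *e, uint8_t *mta, int i)
{
    assert(e != NULL && mta != NULL);
    e->mac = mta + (i * ADDR_LEN);
}
```

Copying ADDR_LEN bytes to e->mac before elem_set() has run would write to
address zero, which is what a page fault in bcopy would look like. Note that
with the assignment the table has to stay valid for as long as the element
is in use.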

Probably the bcopy should be reverted to an assignment (not even compile tested):

Index: sys/dev/bxe/bxe.c
===
--- sys/dev/bxe/bxe.c   (revision 282468)
+++ sys/dev/bxe/bxe.c   (working copy)
@@ -12506,7 +12506,7 @@
   to be  different */
 for(i=0; i < mcnt; i++) {

-bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);
+mc_mac->mac = (uint8_t *)(mta + (i * ETHER_ADDR_LEN));
 ECORE_LIST_PUSH_TAIL(&mc_mac->link, &p->mcast_list);

 BLOGD(sc, DBG_LOAD,

-- 
wbr,
pluknet