Can't boot from NFS with Emulex BE3 (oce) in stable/10 (9 works)
Hi,

I have an Emulex BE3 in an HP BL460c G8 machine. I boot it from PXE/NFS, which works in stable/9 (r248885), but doesn't in stable/10 (r267603).

The relevant output from the boot process:

oce1: Interface Up
Sending DHCP Discover packet from interface oce0 (d8:9d:67:61:c2:a8)
Sending DHCP Discover packet from interface oce1 (d8:9d:67:61:c2:ac)
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
ugen1.2: HP at usbus1
ukbd0: Virtual Keyboard on usbus1
kbd2 at ukbd0
ugen2.2: vendor 0x8087 at usbus2
uhub3: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2 on usbus 2
ugen0.2: vendor 0x8087 at usbus0
uhub4: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2 on usbus 0
uhub4: 6 ports with 6 removable, self powered
uhub3: 8 ports with 8 removable, self powered
ugen2.3: vendor 0x0424 at usbus2
uhub5: vendor 0x0424 product 0x2660, class 9/0, rev 2.00/8.01, addr 3 on usbus 2
uhub5: 2 ports with 1 removable, self powered
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255

The timeout lines then repeat forever. On the DHCP server I can see the DHCPDISCOVERs and also the DHCPOFFERs, but no response from the host.

Skimming through the svn changelog, I couldn't find any clues in the commit messages.

Any ideas?

Thanks,

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.
Hello Adrian,

2014-06-23 12:16 GMT+08:00 Adrian Chadd adr...@freebsd.org:
> ...
> It's an interesting idea, but doing round robin like that may introduce out of order packets.

Actually, the round robin implementation as it is causes out-of-order packets, but almost all the time SACK can recover from it. In my tests using iperf, when we send a bigger number of packets through the same interface before switching to the next one, I see fewer SACK requests, and I believe that is why I reach better throughput.

The test is very simple: iperf -s and iperf -c ip -i 1 -t 10.

As an example:

1) Without changing the number of packets:
43 SACK recovery episodes
187 segment rexmits in SACK recovery episodes
270776 byte rexmits in SACK recovery episodes
172688 SACK options (SACK blocks) received
0 SACK options (SACK blocks) sent
0 SACK scoreboard overflow
0 input SACK chunks
0 output SACKs

2) With 50 packets per interface:
6 SACK recovery episodes
16 segment rexmits in SACK recovery episodes
23168 byte rexmits in SACK recovery episodes
111626 SACK options (SACK blocks) received
0 SACK options (SACK blocks) sent
0 SACK scoreboard overflow
0 input SACK chunks
0 output SACKs

> What's the actual problem you're seeing? Are the transmit queues filling up? Is the distribution with flowid/curcpu not good enough?

I have imported Scott's patch; I believe you are talking about r260070. I didn't pay attention to the flowid/curcpu distribution and I can't tell you whether it is the root cause or not, but in my case it didn't solve the bad performance of round robin. With all the other lagg(4) protocols, the throughput reaches the limit of the NIC.

It is possible that the transmit queue isn't filled up or hangs for some reason; that is something I need to check. My suspicion is the way ixgbe(4) triggers TSO: it seems that the transmit queue is not completely filled up, and that might delay the transmission or lose packets, or perhaps lose the entire queue.
Also, any tips on how to debug TSO would be very welcome.

> Scott saw this happen at Netflix. He added a lagg twiddle to set which set of bits to care about in the flowid when picking an interface to choose. The ixgbe hashing was being done on the low x bits, where x is related to how many CPUs you have (2 CPUs? 1 bit. 8 CPUs? 3 bits. etc.) lagg was doing the same thing on the same low order set of bits. He modified lagg so you could pick some new starting point a few bits up in the flowid to pick a lagg interface with. That fixed the distribution issue and also kept the in-orderness of it all.

I thought that Scott's patch was more focused on LACP; I didn't realize that it would help the other aggregation protocols. Anyway, for round robin, r260070 doesn't change much either way, at least in my environment.

Best Regards,

> 2c,
> -a
>
> On 22 June 2014 19:27, Marcelo Araujo araujobsdp...@gmail.com wrote:
>> Hello guys,
>>
>> I made some changes to the roundrobin protocol so that you can now use sysctl(8) to set a better packet distribution among the interfaces that are part of the lagg(4) group.
>>
>> My motivation for this change was interfaces that use TSO, for example ixgbe(4). The performance is terrible: as we can't completely fill the TSO buffer at once, the throughput drops significantly and we have many more SACKs between hosts.
>>
>> So, with this patch we can set the number of packets that will be sent before switching to the next interface.
>> In my testbed using ixgbe(4), I had very good performance, as you can see below:
>>
>> 1) Without patch:
>> Client connecting to 192.168.1.2, TCP port 5001
>> TCP window size: 32.5 KByte (default)
>> [ 3] local 192.168.1.1 port 32808 connected with 192.168.1.2 port 5001
>> [ ID] Interval       Transfer     Bandwidth
>> [ 3]  0.0- 1.0 sec   406 MBytes   3.40 Gbits/sec
>> [ 3]  1.0- 2.0 sec   391 MBytes   3.28 Gbits/sec
>> [ 3]  2.0- 3.0 sec   406 MBytes   3.41 Gbits/sec
>> [ 3]  3.0- 4.0 sec   585 MBytes   4.91 Gbits/sec
>> [ 3]  4.0- 5.0 sec   477 MBytes   4.00 Gbits/sec
>> [ 3]  5.0- 6.0 sec   429 MBytes   3.60 Gbits/sec
>> [ 3]  6.0- 7.0 sec   520 MBytes   4.36 Gbits/sec
>> [ 3]  7.0- 8.0 sec   385 MBytes   3.23 Gbits/sec
>> [ 3]  8.0- 9.0 sec   414 MBytes   3.48 Gbits/sec
>> [ 3]  9.0-10.0 sec   515 MBytes   4.32 Gbits/sec
>> [ 3]  0.0-10.0 sec   4.42 GBytes  3.80 Gbits/sec
>>
>> 2) With patch:
>> Client connecting to 192.168.1.2, TCP port 5001
>> TCP window size: 32.5 KByte (default)
>> [ 3] local 192.168.1.1 port 10526 connected with 192.168.1.2 port 5001
>> [ ID] Interval
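The behaviour proposed in this message (send a configurable number of packets on one port before moving to the next) can be sketched roughly as follows. This is only an illustration in Python, not the patch itself: the class name `RoundRobin`, the parameter `packets_per_port` and the port names are hypothetical, and the real change lives in the lagg(4) kernel code behind a sysctl whose name is not given in the thread.

```python
class RoundRobin:
    """Toy round-robin port selector with a configurable burst size.

    Hypothetical model: `packets_per_port` plays the role of the
    sysctl-tunable packet count described in the thread.
    """

    def __init__(self, ports, packets_per_port=1):
        if not ports or packets_per_port < 1:
            raise ValueError("need at least one port and a burst >= 1")
        self.ports = list(ports)
        self.packets_per_port = packets_per_port
        self._sent = 0  # packets handed out so far

    def next_port(self):
        # The running packet count divided by the burst size gives the
        # burst index; wrapping it around the port list picks the port.
        idx = (self._sent // self.packets_per_port) % len(self.ports)
        self._sent += 1
        return self.ports[idx]


def schedule(ports, packets_per_port, count):
    """Return the port chosen for each of `count` consecutive packets."""
    rr = RoundRobin(ports, packets_per_port)
    return [rr.next_port() for _ in range(count)]
```

With a burst of 1 this degenerates to classic per-packet round robin; a larger burst keeps consecutive packets of a flow on the same port for longer, which is the effect the message attributes the lower SACK counts to.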
[Bugzilla] Commit Needs MFC
Hi,

You have a bug in the Needs MFC state which has not been touched in 7 or more days. This email serves as a reminder that you may want to MFC this bug or mark it as completed.

In the event you have a longer MFC timeout you may update this bug with a comment and I won't remind you again for 7 days.

This reminder is only sent on Mondays. Please file a bug about concerns you may have.

This search was scheduled by ead...@freebsd.org.

(1 bugs)

Bug 183659: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183659
Severity: Affects Only Me
Priority: Normal
Hardware: Any
Assignee: freebsd-net@FreeBSD.org
Status: Needs MFC
Resolution:
Summary: [tcp] TCP stack lock contention with short-lived connections
Re: Can't boot from NFS with Emulex BE3 (oce) in stable/10 (9 works)
On 06/23/14 08:36, Nagy, Attila wrote:
> I have an Emulex BE3 in a HP BL460c G8 machine. I boot it from PXE/NFS, which works in stable/9 (r248885), but doesn't in stable/10 (r267603).

I've upgraded its firmware from 4.6.95.0 to 4.9.416.0 (hp.com latest) and the driver to 10.0.747.0 from emulex.com without any success.
Re: ifaddr refcount problem
Navdeep,

On Fri, Jun 20, 2014 at 12:15:21PM -0700, Navdeep Parhar wrote:
N> Revision 264905 and 266860 that followed it seem to leak ifaddr
N> references. ifa_ifwithdstaddr and ifa_ifwithnet both install a
N> reference on the ifaddr returned to the caller but ip_output does not
N> release it, eventually leading to a panic when the refcount wraps over
N> to 0 and the ifaddr is freed while it is still on various lists.
N>
N> I'm using this patch for now. Thoughts?
N>
N> Regards,
N> Navdeep
N>
N>
N> diff -r 6dfcecd314af sys/netinet/ip_output.c
N> --- a/sys/netinet/ip_output.c    Fri Jun 20 10:33:22 2014 -0700
N> +++ b/sys/netinet/ip_output.c    Fri Jun 20 12:07:12 2014 -0700
N> @@ -243,6 +243,7 @@ again:
N>          ifp = ia->ia_ifp;
N>          ip->ip_ttl = 1;
N>          isbroadcast = 1;
N> +        ifa_free((void *)ia);
N>      } else if (flags & IP_ROUTETOIF) {
N>          if ((ia = ifatoia(ifa_ifwithdstaddr(sintosa(dst)))) == NULL &&
N>              (ia = ifatoia(ifa_ifwithnet(sintosa(dst), 0))) == NULL) {
N> @@ -253,6 +254,7 @@ again:
N>          ifp = ia->ia_ifp;
N>          ip->ip_ttl = 1;
N>          isbroadcast = in_broadcast(dst->sin_addr, ifp);
N> +        ifa_free((void *)ia);
N>      } else if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr)) &&
N>          imo != NULL && imo->imo_multicast_ifp != NULL) {
N>          /*

The patch shouldn't use void * casts, but instead specify the explicit member:

    ifa_free(&ia->ia_ifa);

Apart from that, the patch looks entirely correct and plugs a leak. Thanks!

--
Totus tuus, Glebius.
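The failure mode described in this message (a leaked reference on every lookup, a counter that eventually wraps to 0, and a later balanced release that frees an object still on its lists) can be illustrated with a small model. This is a sketch under stated assumptions, not kernel code: the counter is assumed to be a 32-bit unsigned integer, and the `Ifaddr` class, its methods and `leaked_lookups_until_wrap` are hypothetical stand-ins rather than the real `struct ifaddr` API.

```python
MASK32 = 0xFFFFFFFF  # assumption: the reference counter is 32 bits wide


class Ifaddr:
    """Toy refcounted object; `freed` models the memory being released."""

    def __init__(self):
        self.refcnt = 1      # reference held by the interface address list
        self.freed = False

    def ref(self, n=1):
        # Take n references; the counter silently wraps modulo 2**32.
        self.refcnt = (self.refcnt + n) & MASK32

    def free(self):
        # Drop one reference and free the object when the count hits 0.
        self.refcnt = (self.refcnt - 1) & MASK32
        if self.refcnt == 0:
            self.freed = True


def leaked_lookups_until_wrap(initial=1):
    """Leaked references needed before the counter wraps around to 0."""
    return (MASK32 + 1) - initial


def leaky_then_balanced():
    ifa = Ifaddr()
    ifa.ref(leaked_lookups_until_wrap())  # every lookup leaked its reference
    ifa.ref()                             # a later, well-behaved caller
    ifa.free()                            # its release drops the count to 0
    return ifa


def always_balanced(lookups):
    ifa = Ifaddr()
    for _ in range(lookups):
        ifa.ref()
        ifa.free()                        # reference released after use
    return ifa
```

In the leaky case the object ends up marked freed even though the list reference from the constructor was never dropped, which corresponds to the "freed while it is still on various lists" condition reported above.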
Weird Xen networking issue with PV interfaces passing traffic to other PV's...
Hi,

I originally posted to freebsd-xen about this (and I've raised a PR):

pr188261 - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=188261

It's been suggested I should ask in FreeBSD-NET to see if someone can look at this and suggest how to proceed.

In a nutshell: the FreeBSD PV network code won't pass packets properly if you set up two VMs on the same XenServer and try to use one of the VMs as a 'router' (i.e. passing traffic). Swap it for Linux - works. Disable PV - works. The details are in the PR.

Could anyone who has a chance to look at it suggest what I can do to debug this further? I've done packet captures from the VMs (they show a lot of retransmits), but I'm not sure how I can go any further in looking at this, or what the problem is likely to be (it's been suggested that it's checksum related, which '-txcsum' seems to address for clients, but not for the router).

If some kind soul could have a look at this and suggest anything, I'd appreciate it.

Cheers,

-Karl
[Bug 184311] [bge] [panic] kernel panic with bge(4) on SunFire X2100
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=184311

John Baldwin j...@freebsd.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |j...@freebsd.org

--- Comment #6 from John Baldwin j...@freebsd.org ---
Can you capture a verbose dmesg (boot -v) with the ASF tunable set (so that it works) and in the stock setup (where it breaks)?

--
You are receiving this mail because:
You are the assignee for the bug.
Re: Ordering problem in if_detach_internal regarding if_bridge
On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
> Hello,
>
> I've stumbled across the following panic when testing Xen netback with if_bridge:
>
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xf80006306c18) locked @ /usr/src/sys/m
> KDB: stack backtrace:
> X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfe213490
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe213540
> witness_warn() at witness_warn+0x4a8/frame 0xfe213600
> trap() at trap+0xc9d/frame 0xfe2136a0
> trap() at trap+0x669/frame 0xfe2138b0
> calltrap() at calltrap+0x8/frame 0xfe2138b0
> --- trap 0xc, rip = 0x8221a0ef, rsp = 0xfe213970, rbp = 0xfe2139e0 ---
> bridge_input() at bridge_input+0x5ff/frame 0xfe2139e0
> ether_vlanencap() at ether_vlanencap+0x4a3/frame 0xfe213a10
> netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xfe213a80
> ether_ifattach() at ether_ifattach+0x19f/frame 0xfe213ab0
> ath_dfs_get_thresholds() at ath_dfs_get_thresholds+0x81ce/frame 0xfe213b30
> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame 0xfe213b70
> db_dump_intr_event() at db_dump_intr_event+0x796/frame 0xfe213bb0
> fork_exit() at fork_exit+0x84/frame 0xfe213bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe213bf0
> --- trap 0, rip = 0, rsp = 0xfe213cb0, rbp = 0 ---
>
> I've tracked this down to if_detach_internal setting ifp->if_addr to NULL before calling EVENTHANDLER_INVOKE(ifnet_departure_event..., which causes a panic in GRAB_OUR_PACKETS in the if_bridge code when it tries to perform IF_LLADDR on an interface that's in the process of being destroyed (ifp->if_addr set to NULL, but the ifnet_departure_event event has not fired yet).
>
Hmmm, I have no idea if this is ok or not. I do think the route message should go out at the same time as the devctl_notify() call however. My guess is it is actually better to do this earlier so that we allow outside consumers to detach from an interface before it is destroyed. I'm not sure if it would break things, but I would be tempted to move this even earlier right after it is removed from the global ifnet list but before the taskqueue_drain, etc.

--
John Baldwin
Re: Ordering problem in if_detach_internal regarding if_bridge
On 23.06.2014 19:32, John Baldwin wrote:
> On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
>> Hello,
>>
>> I've stumbled across the following panic when testing Xen netback with if_bridge:
>>
>> Kernel page fault with the following non-sleepable locks held:
>> exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xf80006306c18) locked @ /usr/src/sys/m
>> KDB: stack backtrace:
>> X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfe213490
>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe213540
>> witness_warn() at witness_warn+0x4a8/frame 0xfe213600
>> trap() at trap+0xc9d/frame 0xfe2136a0
>> trap() at trap+0x669/frame 0xfe2138b0
>> calltrap() at calltrap+0x8/frame 0xfe2138b0
>> --- trap 0xc, rip = 0x8221a0ef, rsp = 0xfe213970, rbp = 0xfe2139e0 ---
>> bridge_input() at bridge_input+0x5ff/frame 0xfe2139e0
>> ether_vlanencap() at ether_vlanencap+0x4a3/frame 0xfe213a10
>> netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xfe213a80
>> ether_ifattach() at ether_ifattach+0x19f/frame 0xfe213ab0
>> ath_dfs_get_thresholds() at ath_dfs_get_thresholds+0x81ce/frame 0xfe213b30
>> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame 0xfe213b70
>> db_dump_intr_event() at db_dump_intr_event+0x796/frame 0xfe213bb0
>> fork_exit() at fork_exit+0x84/frame 0xfe213bf0
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe213bf0
>> --- trap 0, rip = 0, rsp = 0xfe213cb0, rbp = 0 ---
>>
>> I've tracked this down to if_detach_internal setting ifp->if_addr to NULL before calling EVENTHANDLER_INVOKE(ifnet_departure_event..., which causes a panic in GRAB_OUR_PACKETS in the if_bridge code when it tries to perform IF_LLADDR on an interface that's in the process of being destroyed (ifp->if_addr set to NULL, but the ifnet_departure_event event has not fired yet).
>>
>> I have the following naive patch that moves the firing of the event before if_addr is set to NULL, but I'm not familiar with the ordering in if_detach_internal, so I'm not sure if this might cause problems in other parts of the code, could someone familiar with the net stuff comment on the best way to deal with it?

We should notify kernel customers only when we are really taking this interface down and every other subsystem cannot add any new state to the interface. In this patch you're sending the notification before taking the ifnet down, removing its L3 addresses, routes, and so on. This can easily lead to a panic in, for example, the BPF subsystem (since BPF state is freed in the bpf_ifdetach() handler). Additionally, this will introduce ifaddr / iface message reversal for rtsock.

It looks like we'd better fix if_bridge (and it is still using mutexes, what a shame!). Can you send me a trace with line numbers?

> Hmmm, I have no idea if this is ok or not. I do think the route message should go out at the same time as the devctl_notify() call however. My guess is it is actually better to do this earlier so that we allow outside consumers to detach from an interface before it is destroyed. I'm not sure if it would break things, but I would be tempted to move this even earlier right after it is removed from the global ifnet list but before the taskqueue_drain, etc.
Re: ifaddr refcount problem
On Mon, Jun 23, 2014 at 2:52 AM, Gleb Smirnoff gleb...@freebsd.org wrote:
> Navdeep,
>
> On Fri, Jun 20, 2014 at 12:15:21PM -0700, Navdeep Parhar wrote:
> N> Revision 264905 and 266860 that followed it seem to leak ifaddr
> N> references. ifa_ifwithdstaddr and ifa_ifwithnet both install a
> N> reference on the ifaddr returned to the caller but ip_output does not
> N> release it, eventually leading to a panic when the refcount wraps over
> N> to 0 and the ifaddr is freed while it is still on various lists.
> N>
> N> I'm using this patch for now. Thoughts?
> N>
> N> Regards,
> N> Navdeep
> N>
> N>
> N> diff -r 6dfcecd314af sys/netinet/ip_output.c
> N> --- a/sys/netinet/ip_output.c    Fri Jun 20 10:33:22 2014 -0700
> N> +++ b/sys/netinet/ip_output.c    Fri Jun 20 12:07:12 2014 -0700
> N> @@ -243,6 +243,7 @@ again:
> N>          ifp = ia->ia_ifp;
> N>          ip->ip_ttl = 1;
> N>          isbroadcast = 1;
> N> +        ifa_free((void *)ia);
> N>      } else if (flags & IP_ROUTETOIF) {
> N>          if ((ia = ifatoia(ifa_ifwithdstaddr(sintosa(dst)))) == NULL &&
> N>              (ia = ifatoia(ifa_ifwithnet(sintosa(dst), 0))) == NULL) {
> N> @@ -253,6 +254,7 @@ again:
> N>          ifp = ia->ia_ifp;
> N>          ip->ip_ttl = 1;
> N>          isbroadcast = in_broadcast(dst->sin_addr, ifp);
> N> +        ifa_free((void *)ia);
> N>      } else if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr)) &&
> N>          imo != NULL && imo->imo_multicast_ifp != NULL) {
> N>          /*
>
> The patch shouldn't use void * casts, but instead specify the explicit member:
>
>     ifa_free(&ia->ia_ifa);
>
> Apart from that, the patch looks entirely correct and plugs a leak. Thanks!

I still don't see how this patch would work without breaking stuff like the statistics collection at line 673 of ip_output.c. If we call ifa_free immediately after choosing our ifp, then ia won't be available at lines 630 or 673, and ip_output will never record statistics, right?

-Alan

> --
> Totus tuus, Glebius.
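The concern raised here is about where the reference is dropped: releasing it right after the interface is chosen leaves no valid reference for later uses such as the statistics update. A minimal sketch of the two placements follows. Everything in it is hypothetical (the `Ifaddr` and `Handle` classes, `count_packet`, and the two `send_*` functions); the model deliberately treats any use after release as an error, which is stricter than what the kernel itself would detect, and it makes no claim about what ip_output.c actually does at the line numbers quoted above.

```python
class Ifaddr:
    """Toy refcounted address with an output-packet counter."""

    def __init__(self):
        self.refcnt = 1      # reference owned by the interface address list
        self.opackets = 0


class Handle:
    """A caller's reference; using it after release() is an error."""

    def __init__(self, ifa):
        ifa.refcnt += 1
        self._ifa = ifa

    def count_packet(self):
        if self._ifa is None:
            raise RuntimeError("ifaddr used after its reference was released")
        self._ifa.opackets += 1

    def release(self):
        if self._ifa is not None:
            self._ifa.refcnt -= 1
            self._ifa = None


def send_release_early(ifa):
    h = Handle(ifa)
    h.release()          # released right after the interface is picked
    h.count_packet()     # later statistics update has no reference left


def send_release_at_exit(ifa):
    h = Handle(ifa)
    try:
        h.count_packet() # reference is still held while statistics update
    finally:
        h.release()      # single release on every exit path
```

Releasing once on the exit path keeps the count balanced, so the leak is still plugged, while every intermediate use runs with a held reference.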
Re: Ordering problem in if_detach_internal regarding if_bridge
On 23.06.2014 20:39, Alexander V. Chernikov wrote:
> On 23.06.2014 19:32, John Baldwin wrote:
>> On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
>>> Hello,
>>>
>>> I've stumbled across the following panic when testing Xen netback with if_bridge:
>>>
>>> Kernel page fault with the following non-sleepable locks held:
>>> exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xf80006306c18) locked @ /usr/src/sys/m
>>> KDB: stack backtrace:
>>> X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfe213490
>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe213540
>>> witness_warn() at witness_warn+0x4a8/frame 0xfe213600
>>> trap() at trap+0xc9d/frame 0xfe2136a0
>>> trap() at trap+0x669/frame 0xfe2138b0
>>> calltrap() at calltrap+0x8/frame 0xfe2138b0
>>> --- trap 0xc, rip = 0x8221a0ef, rsp = 0xfe213970, rbp = 0xfe2139e0 ---
>>> bridge_input() at bridge_input+0x5ff/frame 0xfe2139e0
>>> ether_vlanencap() at ether_vlanencap+0x4a3/frame 0xfe213a10
>>> netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xfe213a80
>>> ether_ifattach() at ether_ifattach+0x19f/frame 0xfe213ab0
>>> ath_dfs_get_thresholds() at ath_dfs_get_thresholds+0x81ce/frame 0xfe213b30
>>> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame 0xfe213b70
>>> db_dump_intr_event() at db_dump_intr_event+0x796/frame 0xfe213bb0
>>> fork_exit() at fork_exit+0x84/frame 0xfe213bf0
>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe213bf0
>>> --- trap 0, rip = 0, rsp = 0xfe213cb0, rbp = 0 ---
>>>
>>> I've tracked this down to if_detach_internal setting ifp->if_addr to NULL before calling EVENTHANDLER_INVOKE(ifnet_departure_event..., which causes a panic in GRAB_OUR_PACKETS in the if_bridge code when it tries to perform IF_LLADDR on an interface that's in the process of being destroyed (ifp->if_addr set to NULL, but the ifnet_departure_event event has not fired yet).
>>>
>>> I have the following naive patch that moves the firing of the event before if_addr is set to NULL, but I'm not familiar with the ordering in if_detach_internal, so I'm not sure if this might cause problems in other parts of the code, could someone familiar with the net stuff comment on the best way to deal with it?
>
> We should notify kernel customers only when we are really taking this interface down and every other subsystem cannot add any new state to the interface. In this patch you're sending the notification before taking the ifnet down, removing its L3 addresses, routes, and so on. This can easily lead to a panic in, for example, the BPF subsystem (since BPF state is freed in the bpf_ifdetach() handler). Additionally, this will introduce ifaddr / iface message reversal for rtsock.

Whoops. I misread the patch. It should be OK.

> It looks like we'd better fix if_bridge (and it is still using mutexes, what a shame!). Can you send me a trace with line numbers?

However, these two still stand. (And I'm wondering how you're getting any traffic on a down/dying interface.)

>> Hmmm, I have no idea if this is ok or not. I do think the route message should go out at the same time as the devctl_notify() call however. My guess is it is actually better to do this earlier so that we allow outside consumers to detach from an interface before it is destroyed. I'm not sure if it would break things, but I would be tempted to move this even earlier right after it is removed from the global ifnet list but before the taskqueue_drain, etc.
Re: Ordering problem in if_detach_internal regarding if_bridge
On 23/06/14 18:49, Alexander V. Chernikov wrote:
> On 23.06.2014 20:39, Alexander V. Chernikov wrote:
>> On 23.06.2014 19:32, John Baldwin wrote:
>>> On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
>>>> Hello,
>>>>
>>>> I've stumbled across the following panic when testing Xen netback with if_bridge:
>>>>
>>>> Kernel page fault with the following non-sleepable locks held:
>>>> exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xf80006306c18) locked @ /usr/src/sys/m
>>>> KDB: stack backtrace:
>>>> X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfe213490
>>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe213540
>>>> witness_warn() at witness_warn+0x4a8/frame 0xfe213600
>>>> trap() at trap+0xc9d/frame 0xfe2136a0
>>>> trap() at trap+0x669/frame 0xfe2138b0
>>>> calltrap() at calltrap+0x8/frame 0xfe2138b0
>>>> --- trap 0xc, rip = 0x8221a0ef, rsp = 0xfe213970, rbp = 0xfe2139e0 ---
>>>> bridge_input() at bridge_input+0x5ff/frame 0xfe2139e0
>>>> ether_vlanencap() at ether_vlanencap+0x4a3/frame 0xfe213a10
>>>> netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xfe213a80
>>>> ether_ifattach() at ether_ifattach+0x19f/frame 0xfe213ab0
>>>> ath_dfs_get_thresholds() at ath_dfs_get_thresholds+0x81ce/frame 0xfe213b30
>>>> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame 0xfe213b70
>>>> db_dump_intr_event() at db_dump_intr_event+0x796/frame 0xfe213bb0
>>>> fork_exit() at fork_exit+0x84/frame 0xfe213bf0
>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe213bf0
>>>> --- trap 0, rip = 0, rsp = 0xfe213cb0, rbp = 0 ---
>>>>
>>>> I've tracked this down to if_detach_internal setting ifp->if_addr to NULL before calling EVENTHANDLER_INVOKE(ifnet_departure_event..., which causes a panic in GRAB_OUR_PACKETS in the if_bridge code when it tries to perform IF_LLADDR on an interface that's in the process of being destroyed (ifp->if_addr set to NULL, but the ifnet_departure_event event has not fired yet).
>>>>
>>>> I have the following naive patch that moves the firing of the event before if_addr is set to NULL, but I'm not familiar with the ordering in if_detach_internal, so I'm not sure if this might cause problems in other parts of the code, could someone familiar with the net stuff comment on the best way to deal with it?
>>
>> We should notify kernel customers only when we are really taking this interface down and every other subsystem cannot add any new state to the interface. In this patch you're sending the notification before taking the ifnet down, removing its L3 addresses, routes, and so on. This can easily lead to a panic in, for example, the BPF subsystem (since BPF state is freed in the bpf_ifdetach() handler). Additionally, this will introduce ifaddr / iface message reversal for rtsock.
>
> Whoops. I misread the patch. It should be OK.
>
>> It looks like we'd better fix if_bridge (and it is still using mutexes, what a shame!). Can you send me a trace with line numbers?
>
> However, these two still stand. (And I'm wondering how you're getting any traffic on a down/dying interface.)

I'm not getting the traffic from the dying interface. I'm getting the traffic from another interface on the bridge (a physical bce interface), which injects traffic into the bridge. That calls bridge_input, which tries to read ifp->if_addr->ifa_addr from the dying interface, and that leads to the panic.

Line numbers:

/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:2410 (bridge_input)
/usr/src/sys/net/if_ethersubr.c:543 (ether_input_internal)
/usr/src/sys/net/netisr.c:972 (netisr_dispatch_src)
/usr/src/sys/net/if_ethersubr.c:674 (ether_input)
/usr/src/sys/dev/bce/if_bce.c:6861 (bce_rx_intr)

Roger.
[Bug 190785] [em] cpu affinity not working in FreeBSD 10-STABLE
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=190785

John Baldwin j...@freebsd.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |j...@freebsd.org

--- Comment #3 from John Baldwin j...@freebsd.org ---
Can you provide more details? The em/igb drivers create additional taskqueue threads for each queue, but 'cpuset -x' is only going to pin the interrupt thread associated with that IRQ, not other threads the em driver may create.

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.
Hi,

No, don't introduce out of order behaviour. Ever.

You may not think it's a problem for TCP, but UDP things and VPN things will start getting very angry. There are VPN configurations out there that will drop the VPN if frames are out of order.

The ixgbe driver is setting the flowid to the msix queue ID, rather than a 32 bit unique flow id hash value for the flow. That makes it hard to do traffic distribution where the flowid is available.

There's an lagg option to re-hash the mbuf rather than rely on the flowid for outbound port choice - have you looked at using that? Did that make any difference?

-a
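The distribution problem discussed in this thread (the NIC queue and the lagg port both being derived from the same low-order flowid bits, and the fix of starting a few bits higher) can be shown with a small calculation. This is an illustrative sketch only: `rx_queue`, `lagg_port` and `ports_seen_on_queue` are hypothetical helpers, the queue count is assumed to be a power of two, and the exact expressions used by ixgbe(4) and lagg(4) are not taken from their sources.

```python
def rx_queue(flow_hash, nqueues):
    # Assumption: the queue is picked from the low bits of the flow hash
    # (nqueues is a power of two).
    return flow_hash & (nqueues - 1)


def lagg_port(flowid, nports, shift=0):
    # Pick the outbound port from the flowid, optionally skipping the
    # low `shift` bits that the NIC already consumed.
    return (flowid >> shift) % nports


def ports_seen_on_queue(queue, nqueues, nports, shift, hashes):
    """Set of lagg ports used by all flows that landed on one NIC queue."""
    return {
        lagg_port(h, nports, shift)
        for h in hashes
        if rx_queue(h, nqueues) == queue
    }


hashes = range(256)

# Same low bits for both decisions: every flow on queue 0 uses one port.
same_bits = ports_seen_on_queue(0, 8, 2, 0, hashes)

# Skipping the 3 bits used for 8 queues spreads those flows over both ports.
shifted = ports_seen_on_queue(0, 8, 2, 3, hashes)

# If the flowid is only the queue ID (0..7), shifting leaves nothing to
# distribute on: every flow maps to the same port.
queue_id_only = {lagg_port(q, 2, 3) for q in range(8)}
```

The last case is why a flowid that is just the queue ID is hard to use for distribution, and why re-hashing the packet in lagg instead of trusting the flowid is suggested as an alternative.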
Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.
2014-06-24 6:54 GMT+08:00 Adrian Chadd adr...@freebsd.org:
> Hi,
>
> No, don't introduce out of order behaviour. Ever.

Yes, it has out-of-order behavior; with my patch, much less. I uploaded two pcap files so you can see for yourself, if you don't believe what I'm saying.

Test done using: iperf -s and iperf -c ip -i 1 -t 10.

1) Without changing the number of packets (default round robin behavior):
http://people.freebsd.org/~araujo/lagg/lagg-nop.cap
8 out of order packets.
Several SACKs.

2) With the number of packets set to 50:
http://people.freebsd.org/~araujo/lagg/lagg.cap
0 out of order packets.
Fewer SACKs.

> You may not think it's a problem for TCP, but UDP things and VPN things will start getting very angry. There are VPN configurations out there that will drop the VPN if frames are out of order.

I'm not thinking that it will be a problem for TCP, but in some way it is, with less throughput as I showed before, and fewer SACKs.

About the VPN, please tell me which software, and let me know where I can get a sample to build a testbed. However, to be very honest, I don't believe anyone here builds such an extensive testbed when changing something in the network protocols. It is almost impossible to predict which software will work and which will not, and I don't believe anyone here has all of this at hand.

> The ixgbe driver is setting the flowid to the msix queue ID, rather than a 32 bit unique flow id hash value for the flow. That makes it hard to do traffic distribution where the flowid is available.

Thanks for the explanation.

> There's an lagg option to re-hash the mbuf rather than rely on the flowid for outbound port choice - have you looked at using that? Did that make any difference?

Yes, I set net.link.lagg.0.use_flowid to 0. It makes a little difference with the default round robin implementation, but I still can't reach more than 5 Gbit/s. With my patch and the number of packets set to 50, it improved a bit too.
So, thank you so much for the review. I don't know if you have the time and a testbed to run a real test, as I'm doing. I would be happy if you or more people could test the patch. Also, I have only ixgbe(4) to test with, so I would appreciate it if this patch could be tested with other NICs too.

Best Regards,

--
Marcelo Araujo
ara...@freebsd.org
http://www.FreeBSD.org
Power To Server.
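One way to reason about why larger bursts could reduce reordering is a toy model in which the two ports differ only by a fixed delivery delay. The sketch below is an assumption-laden illustration, not a reproduction of the pcap results above: `count_reordered`, the one-packet-per-time-unit send rate and the latency values are all hypothetical, and the real cause of the behaviour seen with ixgbe(4) and TSO remains open in this thread.

```python
def count_reordered(count, burst, latencies):
    """Count packets that arrive after a higher-numbered packet.

    Packets are sent one per time unit, `burst` at a time on each port
    in turn; port i adds a fixed delay of latencies[i].
    """
    arrivals = []
    for seq in range(count):
        port = (seq // burst) % len(latencies)
        arrivals.append((seq + latencies[port], seq))
    arrivals.sort()               # order in which the receiver sees them

    highest = -1
    reordered = 0
    for _, seq in arrivals:
        if seq < highest:
            reordered += 1        # arrived behind a later packet
        else:
            highest = seq
    return reordered


# Hypothetical setup: the second port is 2.5 time units slower.
per_packet = count_reordered(200, 1, [0.0, 2.5])
per_burst = count_reordered(200, 50, [0.0, 2.5])
```

In this model reordering only happens where the sender switches from the slower port back to the faster one, so fewer switches mean fewer reordered packets. It also shows that bursting reduces reordering rather than eliminating it, which is consistent with both positions taken in the discussion.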