Can't boot from NFS with Emulex BE3 (oce) in stable/10 (9 works)

2014-06-23 Thread Nagy, Attila

Hi,

I have an Emulex BE3 in an HP BL460c G8 machine. I boot it from PXE/NFS, 
which works in stable/9 (r248885), but doesn't in stable/10 (r267603).


The relevant output from the boot process:
oce1: Interface Up
Sending DHCP Discover packet from interface oce0 (d8:9d:67:61:c2:a8)
Sending DHCP Discover packet from interface oce1 (d8:9d:67:61:c2:ac)
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
ugen1.2: HP at usbus1
ukbd0: Virtual Keyboard  on usbus1
kbd2 at ukbd0
ugen2.2: vendor 0x8087 at usbus2
uhub3: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2 on usbus2
ugen0.2: vendor 0x8087 at usbus0
uhub4: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2 on usbus0
uhub4: 6 ports with 6 removable, self powered
uhub3: 8 ports with 8 removable, self powered
ugen2.3: vendor 0x0424 at usbus2
uhub5: vendor 0x0424 product 0x2660, class 9/0, rev 2.00/8.01, addr 3 on usbus2
uhub5: 2 ports with 1 removable, self powered
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255

And these lines forever.

On the DHCP server I can see the DHCPDISCOVERs and also DHCPOFFERs 
without any response from the host.
Skimming through the svn changelog I couldn't find any clues from the 
commit messages.


Any ideas?

Thanks,
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.

2014-06-23 Thread Marcelo Araujo
Hello Adrian,


2014-06-23 12:16 GMT+08:00 Adrian Chadd adr...@freebsd.org:

 ...

 It's an interesting idea, but doing round robin like that may
 introduce out of order packets.


Actually, the round robin implementation as it stands already causes
out-of-order packets, although SACK can recover from them almost all of the
time.

In my tests using iperf, when we send a larger number of packets through the
same interface before switching to the next one, I see fewer SACK
recoveries, and I believe that is why I reach better throughput.
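
The "N packets per interface before switching" idea can be sketched in
userland C as follows. This is only an illustration of the selection logic,
not the code from the patch: the struct, its field names and rr_next_port()
are hypothetical, and locking and the sysctl(8) plumbing are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for a burst-aware round robin selector. */
struct rr_state {
	uint32_t nports;	/* number of ports in the lagg group */
	uint32_t burst;		/* packets to send on one port before moving on */
	uint32_t cur;		/* index of the port currently in use */
	uint32_t sent;		/* packets already sent on the current port */
};

/* Return the port index for the next packet and advance the state. */
static uint32_t
rr_next_port(struct rr_state *rr)
{
	uint32_t port;

	assert(rr->nports > 0 && rr->burst > 0);
	port = rr->cur;
	if (++rr->sent >= rr->burst) {
		/* Burst finished: start counting again on the next port. */
		rr->sent = 0;
		rr->cur = (rr->cur + 1) % rr->nports;
	}
	return (port);
}
```

With burst set to 1 this degenerates to plain per-packet round robin; a
larger burst keeps consecutive segments of one flow on the same port for
longer, which is the effect the SACK counters below are measuring.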

The test is very simple: iperf -s and iperf -c ip -i 1 -t 10.

As an example:
1) Without changing the number of packets:
43 SACK recovery episodes
187 segment rexmits in SACK recovery episodes
270776 byte rexmits in SACK recovery episodes
172688 SACK options (SACK blocks) received
0 SACK options (SACK blocks) sent
0 SACK scoreboard overflow
0 input SACK chunks
0 output SACKs

2) Set 50 packets per interface:
6 SACK recovery episodes
16 segment rexmits in SACK recovery episodes
23168 byte rexmits in SACK recovery episodes
111626 SACK options (SACK blocks) received
0 SACK options (SACK blocks) sent
0 SACK scoreboard overflow
0 input SACK chunks
0 output SACKs




 What's the actual problem you're seeing? Are the transmit queues
 filling up? Is the distribution with flowid/curcpu not good enough?


I have imported Scott's patch; I believe you are talking about r260070. I
didn't pay attention to the flowid/curcpu distribution, so I can't tell you
whether it is the root cause or not, but in my case it didn't solve the bad
performance of round robin. With all the other lagg(4) protocols, the
throughput reaches the limit of the NIC.

It may be that the transmit queue isn't filling up, or that it hangs for
some reason; that is something I need to check.

My suspicion is about how ixgbe(4) triggers TSO: it seems that the transmit
queue is not completely filled, which might delay the transmission or lose
packets, or perhaps lose the entire queue. Also, any tips on how to debug
TSO would be very welcome.



 Scott saw this happen at Netflix. He added a lagg twiddle to set which
 set of bits to care about in the flowid when picking an interface to
 choose. The ixgbe hashing was being done on the low x bits, where x is
 related to how many CPUs you have (2 CPUs? 1 bit. 8 CPUs? 3 bits.
 etc.) lagg was doing the same thing on the same low order set of bits.
 He modified lagg so you could pick some new starting point a few bits
 up in the flowid to pick a lagg interface with. That fixed the
 distribution issue and also kept the in-orderness of it all.
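
The bit-selection idea described above could be sketched roughly like this.
It is a userland illustration only and not the code from r260070: the
function names and the shift parameter are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* NIC side: the queue is chosen from the low bits of the flow hash. */
static uint32_t
nic_queue(uint32_t flowid, uint32_t nqueues)
{
	assert(nqueues > 0);
	return (flowid % nqueues);
}

/* lagg side: the port is chosen after discarding `shift` low bits. */
static uint32_t
lagg_port(uint32_t flowid, uint32_t shift, uint32_t nports)
{
	assert(nports > 0 && shift < 32);
	return ((flowid >> shift) % nports);
}
```

With 8 queues (3 bits) and 2 ports, flowids 0x0 and 0x8 both map to queue 0.
With shift 0 they also both map to port 0, so the port choice is tied to the
queue choice; with shift 3 they map to ports 0 and 1. Each flow still always
uses the same port, so ordering within a flow is preserved.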


I thought that Scott's patch was more focused on LACP; I didn't realize that
it would help the other aggregation protocols. Anyway, for round robin,
having r260070 or not doesn't change much, at least in my environment.

Best Regards,



 2c,


 -a

 On 22 June 2014 19:27, Marcelo Araujo araujobsdp...@gmail.com wrote:
  Hello guys,
 
  I made some changes to the roundrobin protocol so that you can now use
  sysctl(8) to set a better packet distribution among the interfaces that
  are part of the lagg(4) group.
 
  My motivation for this change was interfaces that use TSO, for example
  ixgbe(4): the performance is terrible because we can't completely fill
  the TSO buffer at once, so the throughput drops significantly and we see
  many more SACKs between hosts.
 
  So, with this patch we can set the number of packets that will be sent
  before switching to the next interface.
 
  In my testbed using ixgbe(4), I got very good performance, as you can
  see below:
 
  1) Without patch:
  
  Client connecting to 192.168.1.2, TCP port 5001
  TCP window size: 32.5 KByte (default)
  
  [  3] local 192.168.1.1 port 32808 connected with 192.168.1.2 port 5001
  [ ID] Interval   Transfer Bandwidth
  [  3]  0.0- 1.0 sec   406 MBytes  3.40 Gbits/sec
  [  3]  1.0- 2.0 sec   391 MBytes  3.28 Gbits/sec
  [  3]  2.0- 3.0 sec   406 MBytes  3.41 Gbits/sec
  [  3]  3.0- 4.0 sec   585 MBytes  4.91 Gbits/sec
  [  3]  4.0- 5.0 sec   477 MBytes  4.00 Gbits/sec
  [  3]  5.0- 6.0 sec   429 MBytes  3.60 Gbits/sec
  [  3]  6.0- 7.0 sec   520 MBytes  4.36 Gbits/sec
  [  3]  7.0- 8.0 sec   385 MBytes  3.23 Gbits/sec
  [  3]  8.0- 9.0 sec   414 MBytes  3.48 Gbits/sec
  [  3]  9.0-10.0 sec   515 MBytes  4.32 Gbits/sec
  [  3]  0.0-10.0 sec  4.42 GBytes  3.80 Gbits/sec
 
  2) With patch:
  
  Client connecting to 192.168.1.2, TCP port 5001
  TCP window size: 32.5 KByte (default)
  
  [  3] local 192.168.1.1 port 10526 connected with 192.168.1.2 port 5001
  [ ID] Interval   

[Bugzilla] Commit Needs MFC

2014-06-23 Thread bugzilla-noreply
Hi,

You have a bug in the Needs MFC state which has not been touched in 7 or more 
days. This email serves as a reminder that you may want to MFC this bug or 
mark it as completed.

In the event you have a longer MFC timeout you may update this bug with a 
comment and I won't remind you again for 7 days.

This reminder is only sent on Mondays.  Please file a bug about concerns you 
may have.

  This search was scheduled by ead...@freebsd.org.


 (1 bugs)

Bug 183659:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=183659
Severity: Affects Only Me
Priority: Normal
Hardware: Any
Assignee: freebsd-net@FreeBSD.org
  Status: Needs MFC
  Resolution: 
 Summary: [tcp] TCP stack lock contention with short-lived connections



Re: Can't boot from NFS with Emulex BE3 (oce) in stable/10 (9 works)

2014-06-23 Thread Nagy, Attila

On 06/23/14 08:36, Nagy, Attila wrote:


I have an Emulex BE3 in a HP BL460c G8 machine. I boot it from 
PXE/NFS, which works in stable/9 (r248885), but doesn't in stable/10 
(r267603).
I've upgraded its firmware from 4.6.95.0 to 4.9.416.0 (hp.com latest) 
and the driver to 10.0.747.0 from emulex.com without any success.



Re: ifaddr refcount problem

2014-06-23 Thread Gleb Smirnoff
  Navdeep,

On Fri, Jun 20, 2014 at 12:15:21PM -0700, Navdeep Parhar wrote:
N Revision 264905 and 266860 that followed it seem to leak ifaddr
N references.  ifa_ifwithdstaddr and ifa_ifwithnet both install a
N reference on the ifaddr returned to the caller but ip_output does not
N release it, eventually leading to a panic when the refcount wraps over
N to 0 and the ifaddr is freed while it is still on various lists.
N 
N I'm using this patch for now.  Thoughts?
N 
N Regards,
N Navdeep
N 
N 
N diff -r 6dfcecd314af sys/netinet/ip_output.c
N --- a/sys/netinet/ip_output.c Fri Jun 20 10:33:22 2014 -0700
N +++ b/sys/netinet/ip_output.c Fri Jun 20 12:07:12 2014 -0700
N @@ -243,6 +243,7 @@ again:
N  ifp = ia->ia_ifp;
N  ip->ip_ttl = 1;
N  isbroadcast = 1;
N +ifa_free((void *)ia);
N  } else if (flags & IP_ROUTETOIF) {
N  if ((ia = ifatoia(ifa_ifwithdstaddr(sintosa(dst)))) == NULL &&
N  (ia = ifatoia(ifa_ifwithnet(sintosa(dst), 0))) == NULL) {
N @@ -253,6 +254,7 @@ again:
N  ifp = ia->ia_ifp;
N  ip->ip_ttl = 1;
N  isbroadcast = in_broadcast(dst->sin_addr, ifp);
N +ifa_free((void *)ia);
N  } else if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr)) &&
N  imo != NULL && imo->imo_multicast_ifp != NULL) {
N  /*

The patch shouldn't use void * casts, but instead specify the explicit member:

ifa_free(&ia->ia_ifa);

Apart from that, the patch looks entirely correct and plugs a leak.
Thanks!
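
The point about lookups that return a referenced object, and about releasing
through the embedded member instead of a cast, can be modelled in userland
roughly as below. The toy_* types and functions are hypothetical stand-ins,
not the kernel's struct ifaddr, struct in_ifaddr or ifa_free().

```c
#include <assert.h>

/* Toy stand-in for the generic address structure. */
struct toy_ifaddr {
	int refcnt;
};

/* Toy stand-in for the IPv4 address structure that embeds the generic one. */
struct toy_in_ifaddr {
	struct toy_ifaddr ia_ifa;	/* embedded generic part */
	int ia_flags;
};

/* Lookup that hands back a referenced object; the caller owns one ref. */
static struct toy_in_ifaddr *
toy_lookup(struct toy_in_ifaddr *ia)
{
	ia->ia_ifa.refcnt++;
	return (ia);
}

/* Drop one reference; takes the generic type, so no cast is needed. */
static void
toy_ifa_free(struct toy_ifaddr *ifa)
{
	assert(ifa->refcnt > 0);
	ifa->refcnt--;
}
```

A caller would pair each toy_lookup() with toy_ifa_free(&ia->ia_ifa). If the
release is forgotten, the count only ever grows, which is the leak the patch
is addressing.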

-- 
Totus tuus, Glebius.


Weird Xen networking issue with PV interfaces passing traffic to other PV's...

2014-06-23 Thread Karl Pielorz


Hi,

I originally posted to freebsd-xen about this (and I've raised a PR) 
-pr188261 - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=188261


It's been suggested I should ask in FreeBSD-NET to see if someone can look 
at this, and suggest how to proceed...


In a nutshell - the FreeBSD PV network code won't pass packets properly if 
you set up two VMs on the same XenServer and try to use one of the VMs as a 
'router' (i.e. passing traffic). Swap it for Linux - works. Disable PV - 
works - anyway, the details are in the PR.


Could anyone have a look at that and suggest what I can do to debug this 
further? I've done packet captures from the VMs (they show a lot of 
retransmits) - but I'm not sure how I can go any further in looking at 
this - or what the problem is likely to be (it's been suggested it's 
checksum related - which '-txcsum' seems to address for clients, but not 
the router).


If some kind soul can have a look at this - and suggest anything?

Cheers,

-Karl


[Bug 184311] [bge] [panic] kernel panic with bge(4) on SunFire X2100

2014-06-23 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=184311

John Baldwin j...@freebsd.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |j...@freebsd.org

--- Comment #6 from John Baldwin j...@freebsd.org ---
Can you capture a verbose dmesg (boot -v) with the ASF tunable set (so that it
works) and in the stock setup (where it breaks)?

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: Ordering problem in if_detach_internal regarding if_bridge

2014-06-23 Thread John Baldwin
On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
 Hello,
 
 I've stumbled across the following panic when testing Xen netback with 
 if_bridge:
 
 Kernel page fault with the following non-sleepable locks held:
 exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xf80006306c18) locked @ /usr/src/sys/m
 KDB: stack backtrace:
 X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfe213490
 kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe213540
 witness_warn() at witness_warn+0x4a8/frame 0xfe213600
 trap() at trap+0xc9d/frame 0xfe2136a0
 trap() at trap+0x669/frame 0xfe2138b0
 calltrap() at calltrap+0x8/frame 0xfe2138b0
 --- trap 0xc, rip = 0x8221a0ef, rsp = 0xfe213970, rbp = 0xfe2139e0 ---
 bridge_input() at bridge_input+0x5ff/frame 0xfe2139e0
 ether_vlanencap() at ether_vlanencap+0x4a3/frame 0xfe213a10
 netisr_dispatch_src() at netisr_dispatch_src+0x90/frame 0xfe213a80
 ether_ifattach() at ether_ifattach+0x19f/frame 0xfe213ab0
 ath_dfs_get_thresholds() at ath_dfs_get_thresholds+0x81ce/frame 0xfe213b30
 intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame 0xfe213b70
 db_dump_intr_event() at db_dump_intr_event+0x796/frame 0xfe213bb0
 fork_exit() at fork_exit+0x84/frame 0xfe213bf0
 fork_trampoline() at fork_trampoline+0xe/frame 0xfe213bf0
 --- trap 0, rip = 0, rsp = 0xfe213cb0, rbp = 0 ---
 
 I've tracked this down to if_detach_internal setting ifp->if_addr to 
 NULL before calling EVENTHANDLER_INVOKE(ifnet_departure_event..., which 
 causes a panic in GRAB_OUR_PACKETS in the if_bridge code when it tries 
 to perform IF_LLADDR on an interface that's in the process of being 
 destroyed (ifp->if_addr set to NULL, but the ifnet_departure_event event 
 has not fired yet).
 
 I have the following naive patch that moves the firing of the event 
 before if_addr is set to NULL, but I'm not familiar with the ordering 
 in if_detach_internal, so I'm not sure if this might cause problems in 
 other parts of the code, could someone familiar with the net stuff 
 comment on the best way to deal with it?

Hmmm, I have no idea if this is ok or not.  I do think the route message 
should go out at the same time as the devctl_notify() call however.  My guess 
is it is actually better to do this earlier so that we allow outside consumers
to detach from an interface before it is destroyed.  I'm not sure if it would
break things, but I would be tempted to move this even earlier right after it
is removed from the global ifnet list but before the taskqueue_drain, etc.

-- 
John Baldwin


Re: Ordering problem in if_detach_internal regarding if_bridge

2014-06-23 Thread Alexander V. Chernikov
On 23.06.2014 19:32, John Baldwin wrote:
 On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
 [...]

We should notify kernel customers only when we are really taking this
interface down and every other subsystem cannot add any new state to the
interface.

In this patch you're sending notification before taking ifnet down,
removing its L3 addresses, routes, and so on.

This can easily lead to panic in, for example, BPF subsystem (since BPF
state is freed in bpf_ifdetach() handler).

Additionally, this will introduce ifaddr / iface message reversal for
rtsock.

It looks like we'd better fix if_bridge (and it is still using mutexes,
what a shame!).

Can you send me a trace with line numbers?

 
 Hmmm, I have no idea if this is ok or not.  I do think the route message 
 should go out at the same time as the devctl_notify() call however.  My guess 
 is it is actually better to do this earlier so that we allow outside consumers
 to detach from an interface before it is destroyed.  I'm not sure if it would
 break things, but I would be tempted to move this even earlier right after it
 is removed from the global ifnet list but before the taskqueue_drain, etc.
 



Re: ifaddr refcount problem

2014-06-23 Thread Alan Somers
On Mon, Jun 23, 2014 at 2:52 AM, Gleb Smirnoff gleb...@freebsd.org wrote:
   Navdeep,

 On Fri, Jun 20, 2014 at 12:15:21PM -0700, Navdeep Parhar wrote:
 [...]

 The patch shouldn't use void * casts, but instead specify explicit member:

 ifa_free(&ia->ia_ifa);

 Apart from that, the patch looks entirely correct and plugs a leak.
 Thanks!

I still don't see how this patch would work without breaking stuff
like the statistics collection at line 673 of ip_output.c.  If we call
ifa_free immediately after choosing our ifp, then ia won't be
available at lines 630 or 673, and ip_output will never record
statistics, right?
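
One way to picture the alternative raised here, keeping the reference until
the last use and releasing it at a single exit point, is the userland sketch
below. It is not the real ip_output(): the toy_* names, the opackets counter
and the fail flag are all assumptions made for the illustration.

```c
#include <assert.h>

/* Toy address object with a reference count and an output counter. */
struct toy_addr {
	int refcnt;
	unsigned long opackets;	/* hypothetical per-address statistic */
};

/* Take a reference on the object and return it. */
static struct toy_addr *
toy_ref(struct toy_addr *a)
{
	a->refcnt++;
	return (a);
}

/* Drop a reference previously taken with toy_ref(). */
static void
toy_unref(struct toy_addr *a)
{
	assert(a->refcnt > 0);
	a->refcnt--;
}

/* Send path: hold the reference until the statistics update is done. */
static int
toy_output(struct toy_addr *entry, int fail)
{
	struct toy_addr *ia = toy_ref(entry);
	int error = 0;

	if (fail) {
		error = -1;
		goto done;
	}
	ia->opackets++;		/* our reference is still held here */
done:
	toy_unref(ia);		/* single release point on every path */
	return (error);
}
```

Whether the real code needs this shape depends on what else keeps the
address alive at that point, which is the question being asked above.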

-Alan


 --
 Totus tuus, Glebius.


Re: Ordering problem in if_detach_internal regarding if_bridge

2014-06-23 Thread Alexander V. Chernikov
On 23.06.2014 20:39, Alexander V. Chernikov wrote:
 On 23.06.2014 19:32, John Baldwin wrote:
 On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
 [...]
 
 We should notify kernel customers only when we are really taking this
 interface down and every other subsystem cannot add any new state to the
 interface.
 
 In this patch you're sending notification before taking ifnet down,
 removing its L3 addresses, routes, and so on.
 
 This can easily lead to panic in, for example, BPF subsystem (since BPF
 state is freed in bpf_ifdetach() handler).
 
 Additionally, this will introduce ifaddr / iface message reversal for
 rtsock.
Whoops. I misread the patch.
It should be OK.

 
 It looks like we'd better fix if_bridge (and it is still using mutexes,
 what a shame!).
 
 Can you send me trace with line numbers?
However, these two still stand.
(And I'm wondering how you're getting any traffic on down/dying interface).
 

 Hmmm, I have no idea if this is ok or not.  I do think the route message 
 should go out at the same time as the devctl_notify() call however.  My 
 guess 
 is it is actually better to do this earlier so that we allow outside 
 consumers
 to detach from an interface before it is destroyed.  I'm not sure if it would
 break things, but I would be tempted to move this even earlier right after it
 is removed from the global ifnet list but before the taskqueue_drain, etc.

 


Re: Ordering problem in if_detach_internal regarding if_bridge

2014-06-23 Thread Roger Pau Monné
On 23/06/14 18:49, Alexander V. Chernikov wrote:
 On 23.06.2014 20:39, Alexander V. Chernikov wrote:
 On 23.06.2014 19:32, John Baldwin wrote:
 On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote:
 [...]

 We should notify kernel customers only when we are really taking this
 interface down and every other subsystem cannot add any new state to the
 interface.

 In this patch you're sending notification before taking ifnet down,
 removing its L3 addresses, routes, and so on.

 This can easily lead to panic in, for example, BPF subsystem (since BPF
 state is freed in bpf_ifdetach() handler).

 Additionally, this will introduce ifaddr / iface message reversal for
 rtsock.
 Whoops. I misread the patch.
 It should be OK.
 

 It looks like we'd better fix if_bridge (and it is still using mutexes,
 what a shame!).

 Can you send me trace with line numbers?
 However, these two still stand.
 (And I'm wondering how you're getting any traffic on down/dying interface).

I'm not getting the traffic from the dying interface, I'm getting the
traffic from another interface on the bridge (a physical bce interface),
which injects traffic into the bridge, which calls bridge_input, which
tries to read ifp->if_addr->ifa_addr from the dying interface, and that
leads to the panic.

Line numbers:

/usr/src/sys/modules/if_bridge/../../net/if_bridge.c:2410 (bridge_input)
/usr/src/sys/net/if_ethersubr.c:543 (ether_input_internal)
/usr/src/sys/net/netisr.c:972 (netisr_dispatch_src)
/usr/src/sys/net/if_ethersubr.c:674 (ether_input)
/usr/src/sys/dev/bce/if_bce.c:6861 (bce_rx_intr)
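
The failure mode can be modelled in userland as a scan over bridge members
where one member's link-layer address pointer has already been cleared. The
sketch below is not if_bridge code and is not proposed as the fix: the toy_*
names are hypothetical, and a NULL check alone does not close the race
without proper synchronization with the detach path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ETHER_ADDR_LEN 6

/* Toy interface: only the link-layer address pointer is modelled. */
struct toy_ifnet {
	const unsigned char *if_lladdr;	/* NULL once detach has cleared it */
};

/*
 * Return the index of the member whose link-layer address equals dst,
 * or -1 if none matches. Members with a cleared address are skipped
 * instead of being dereferenced.
 */
static int
toy_find_member(const struct toy_ifnet *members, size_t n,
    const unsigned char *dst)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (members[i].if_lladdr == NULL)
			continue;	/* interface is being torn down */
		if (memcmp(members[i].if_lladdr, dst,
		    TOY_ETHER_ADDR_LEN) == 0)
			return ((int)i);
	}
	return (-1);
}
```

Without the NULL test, the memcmp() on the first member in the test setup
would dereference a null pointer, which corresponds to the page fault in
bridge_input reported above.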

Roger.



[Bug 190785] [em] cpu affinity not working in FreeBSD 10-STABLE

2014-06-23 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=190785

John Baldwin j...@freebsd.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |j...@freebsd.org

--- Comment #3 from John Baldwin j...@freebsd.org ---
Can you provide more details?  The em/igb drivers create additional taskqueue
threads for each queue, but 'cpuset -x' is only going to pin the interrupt
thread associated with that IRQ, not other threads the em driver may create.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.

2014-06-23 Thread Adrian Chadd
Hi,

No, don't introduce out of order behaviour. Ever. You may not think
it's a problem for TCP, but UDP things and VPN things will start
getting very angry. There are VPN configurations out there that will
drop the VPN if frames are out of order.

The ixgbe driver is setting the flowid to the msix queue ID, rather
than a 32 bit unique flow id hash value for the flow. That makes it
hard to do traffic distribution where the flowid is available.

There's an lagg option to re-hash the mbuf rather than rely on the
flowid for outbound port choice - have you looked at using that? Did
that make any difference?



-a


Re: [patch][lagg] - Set a better granularity and distribution on roundrobin protocol.

2014-06-23 Thread Marcelo Araujo
2014-06-24 6:54 GMT+08:00 Adrian Chadd adr...@freebsd.org:

 Hi,

 No, don't introduce out of order behaviour. Ever.


Yes, round robin already has out-of-order behavior; with my patch there is
much less of it. I uploaded two pcap files so you can see for yourself if
you don't believe what I'm saying.

Test done using: iperf -s and iperf -c ip -i 1 -t 10.

1) Default round robin behavior (number of packets unchanged).
http://people.freebsd.org/~araujo/lagg/lagg-nop.cap
8 out of order packets.
Several SACKs.

2) Set the number of packets to 50.
http://people.freebsd.org/~araujo/lagg/lagg.cap
0 out of order packets.
Fewer SACKs.


 You may not think
 it's a problem for TCP, but UDP things and VPN things will start
 getting very angry. There are VPN configurations out there that will
 drop the VPN if frames are out of order.


I don't think it will be a problem for TCP, but in some way it already is:
less throughput, as I showed before, and with the patch fewer SACKs. About
the VPN: please tell me which software, and let me know where I can get a
sample to build a testbed.

However, to be very honest, I don't believe anyone here builds such an
extensive testbed when changing something in the network protocols. It is
almost impossible to predict which software will work and which will not,
and I don't believe anyone here has all of that at hand.



 The ixgbe driver is setting the flowid to the msix queue ID, rather
 than a 32 bit unique flow id hash value for the flow. That makes it
 hard to do traffic distribution where the flowid is available.


Thanks for the explanation.



 There's an lagg option to re-hash the mbuf rather than rely on the
 flowid for outbound port choice - have you looked at using that? Did
 that make any difference?


Yes, I set net.link.lagg.0.use_flowid to 0. It makes a small difference
with the default round robin implementation, but I still can't reach more
than 5 Gbit/s. With my patch and the packet count set to 50, it improved a
bit too.

So, thank you very much for the review. I don't know if you have the time
and a testbed to run a real test, as I am doing. I would be happy if you or
other people could test the patch. Also, I only have ixgbe(4) to test with,
so I would appreciate it if this patch could be tested with other NICs too.

Best Regards,

-- 
Marcelo Araujo
ara...@freebsd.org
http://www.FreeBSD.org
Power To Server.