Re: Cwnd grows slowly during slow-start due to LRO of the receiver side.

2023-05-02 Thread Hans Petter Selasky

On 5/2/23 11:14, Hans Petter Selasky wrote:

Hi Chen!

The FreeBSD mbufs carry the number of ACKs that have been joined 
together into the following field:


m->m_pkthdr.lro_nsegs

Can this value be of any use to cc_newreno ?

--HPS


Hi Chen,

Have you tested using FreeBSD main / 14?

The "nsegs" are passed along like this:

nsegs = max(1, m->m_pkthdr.lro_nsegs);

...

cc_ack_received(tp, th, nsegs, CC_ACK);

...

(Newreno - FreeBSD-14)

incr = min(ccv->bytes_this_ack,
    ccv->nsegs * abc_val *
    CCV(ccv, t_maxseg));

And in FreeBSD-10, the version mentioned in your article:

(Newreno - FreeBSD-10)

incr = min(ccv->bytes_this_ack,
    V_tcp_abc_l_var * CCV(ccv, t_maxseg));


There is no such "nsegs" factor there.

This issue may already have been fixed!
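
To make the difference concrete, here is a minimal Python sketch of the two "incr" formulas quoted above. It is an illustration only, not kernel code: the function names are made up, abc_val is assumed to be 2, and nsegs is taken to be the number of ACKs joined together by LRO on the sender, as described for m->m_pkthdr.lro_nsegs.

```python
MAXSEG = 1448   # t_maxseg as seen in the trace (assumption for this sketch)
ABC_VAL = 2     # assumed ABC limit, matching the abc_l_var default


def incr_freebsd10(bytes_this_ack, abc_l_var=ABC_VAL, maxseg=MAXSEG):
    """FreeBSD-10 style: the cap does not depend on how many ACKs were merged."""
    return min(bytes_this_ack, abc_l_var * maxseg)


def incr_freebsd14(bytes_this_ack, nsegs, abc_val=ABC_VAL, maxseg=MAXSEG):
    """FreeBSD-14 style: the cap scales with nsegs (clamped to at least 1)."""
    nsegs = max(1, nsegs)
    return min(bytes_this_ack, nsegs * abc_val * maxseg)


# One ACK covering 10 segments (14480 bytes):
print(incr_freebsd10(14480))       # 2896
print(incr_freebsd14(14480, 1))    # 2896
print(incr_freebsd14(14480, 5))    # 14480
```

With nsegs = 1 both versions add the same 2 * MSS, so the newer formula only changes the result when several ACKs were actually merged before reaching the congestion control code.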

--HPS


On 5/2/23 09:46, Chen Shuo wrote:

As per newreno_ack_received() in sys/netinet/cc/cc_newreno.c,
FreeBSD TCP sender strictly follows RFC 5681 with the RFC 3465 extension.
That is, during slow-start, when receiving an ACK of 'bytes_acked':

 cwnd += min(bytes_acked, abc_l_var * SMSS);  // abc_l_var = 2 dflt

As discussed in sec3.2 of RFC 3465, L=2*SMSS bytes exactly balances
the negative impact of the delayed ACK algorithm.  RFC 5681 also
requires that a receiver SHOULD generate an ACK for at least every
second full-sized segment, so bytes_acked per ACK is at most 2 * SMSS.
If both sender and receiver follow it, cwnd should grow exponentially
during slow-start:

 cwnd *= 2    (per RTT)

However, LRO and TSO are widely used today, so the receiver may generate
far fewer ACKs than it used to.  As I observed, both FreeBSD and
Linux generate at most one ACK per segment assembled by LRO/GRO.
The worst case is one ACK per 45 MSS, as 45 * 1448 = 65160 < 65535.

Sending 1MB over a link of 100ms delay from FreeBSD 13.2:

  0.000 IP sender > sink: Flags [S], seq 205083268, win 65535, options [mss 1460,nop,wscale 10,sackOK,TS val 495212525 ecr 0], length 0
  0.100 IP sink > sender: Flags [S.], seq 708257395, ack 205083269, win 65160, options [mss 1460,sackOK,TS val 563185696 ecr 495212525,nop,wscale 7], length 0
  0.100 IP sender > sink: Flags [.], ack 1, win 65, options [nop,nop,TS val 495212626 ecr 563185696], length 0
  // TSopt omitted below for brevity.

  // cwnd = 10 * MSS, sent 10 * MSS
  0.101 IP sender > sink: Flags [.], seq 1:14481, ack 1, win 65, length 14480

  // got one ACK for 10 * MSS, cwnd += 2 * MSS, sent 12 * MSS
  0.201 IP sink > sender: Flags [.], ack 14481, win 427, length 0
  0.201 IP sender > sink: Flags [.], seq 14481:31857, ack 1, win 65, length 17376

  // got ACK of 12*MSS above, cwnd += 2 * MSS, sent 14 * MSS
  0.301 IP sink > sender: Flags [.], ack 31857, win 411, length 0
  0.301 IP sender > sink: Flags [.], seq 31857:52129, ack 1, win 65, length 20272

  // got ACK of 14*MSS above, cwnd += 2 * MSS, sent 16 * MSS
  0.402 IP sink > sender: Flags [.], ack 52129, win 395, length 0
  0.402 IP sender > sink: Flags [P.], seq 52129:73629, ack 1, win 65, length 21500
  0.402 IP sender > sink: Flags [.], seq 73629:75077, ack 1, win 65, length 1448


As a consequence, instead of growing exponentially, cwnd grows
more-or-less quadratically during slow-start, unless abc_l_var is
set to a sufficiently large value.
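
The effect in the trace above can be reproduced with a small Python model. This is only a sketch under simplifying assumptions: everything is counted in whole segments, the whole window is sent and ACKed once per RTT, there is no loss, and the receiver sends one ACK per "segs_per_ack" segments. The function and parameter names are invented for the illustration.

```python
def slow_start(rtts, segs_per_ack, abc_l=2, init_cwnd=10):
    """Return cwnd (in segments) at the start of each RTT.

    Each ACK may grow cwnd by at most abc_l segments, following
    cwnd += min(bytes_acked, abc_l_var * SMSS).
    """
    cwnd = init_cwnd
    history = [cwnd]
    for _ in range(rtts):
        remaining = cwnd          # segments in flight, all ACKed this RTT
        incr = 0
        while remaining > 0:
            acked = min(remaining, segs_per_ack)
            remaining -= acked
            incr += min(acked, abc_l)
        cwnd += incr
        history.append(cwnd)
    return history


# Receiver ACKs every second segment (classic delayed ACK): cwnd doubles.
print(slow_start(3, segs_per_ack=2))    # [10, 20, 40, 80]

# Receiver ACKs once per LRO/GRO aggregate of up to 45 segments.
print(slow_start(3, segs_per_ack=45))   # [10, 12, 14, 16]
```

The second line matches the 10, 12, 14, 16 * MSS bursts in the capture. Raising abc_l in the model plays the same role as raising abc_l_var.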

NewReno took more than 20 seconds to ramp up throughput to 100Mbps
over an emulated 100ms delay link, while Linux took ~2 seconds.
I can provide the pcap file if anyone is interested.

Switching to CUBIC won't help, because it uses the logic in NewReno
ack_received() for slow start.

Is this a well-known issue, and is abc_l_var the only cure for it?
https://calomel.org/freebsd_network_tuning.html

Thank you!

Best,
Shuo Chen









Re: Ifconfig limitations

2023-04-19 Thread Hans Petter Selasky

On 4/19/23 00:19, Mina Galić wrote:

Do you know how the sysctl entries interact with renaming an interface?


Hi,

I think there is no interaction there.

We do have a sysctl_rename_oid() function, so it is technically possible.

Feel free to work on it, if you find any issues!

--HPS



Re: Ifconfig limitations

2023-04-18 Thread Hans Petter Selasky

On 4/18/23 11:44, Mina Galić wrote:

Hi HPS,

I don't see those sysctl entries for regular devices.
Is this infiniband specific?
Or is there anything I need to enable to get these sysctls?

Kind regards,



Hi,

The sysctl entries you are referring to are created by ibcore.ko. 
Unless the network card driver attaches to those APIs, as mlx5ib does, 
no such entries exist.

In upstream FreeBSD there is no software infiniband or RoCE support.

--HPS



Re: Ifconfig limitations

2023-04-18 Thread Hans Petter Selasky

Hi,

All the `/sys/class/net//*` entries are sysctl(8) entries, as 
Sobczak pointed out. They are converted simply by replacing "/" with 
".", and there are some helper functions in:


contrib/ofed/libibumad/sysfs.c:	if (sysctlbyname(PATH_TO_SYS(path), str, , NULL, 0) == -1)


to do this conversion automagically. You specify the Linux equivalent as 
a "const char *" pointer, and then it looks up the value for you under 
FreeBSD.


We may not have all the entries, but most of what you need is there, along 
with some additions specific to FreeBSD.
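
As a rough illustration of the "/" to "." mapping described above, here is a short Python sketch. It is not the libibumad helper itself: the function name is made up, dropping the leading "/" is an assumption, and the example path and device name are hypothetical.

```python
def sysfs_path_to_sysctl(path: str) -> str:
    """Map a Linux-style sysfs path to a sysctl name by swapping '/' for '.'."""
    # Assumption: the leading slash is dropped so the name starts with "sys".
    return path.strip("/").replace("/", ".")


# Hypothetical example path; the device name is invented for illustration.
print(sysfs_path_to_sysctl("/sys/class/infiniband/mlx5_0/node_type"))
# -> sys.class.infiniband.mlx5_0.node_type
```

The resulting string is what one would then pass to sysctl(8) or sysctlbyname(3), provided the corresponding entry exists on the system.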


--HPS



Re: Too aggressive TCP ACKs

2022-10-26 Thread Hans Petter Selasky

On 10/26/22 10:57, Tom Jones wrote:

It focuses on QUIC, but congestion control dynamics don't change with
the protocol. You should be able to read there, but if not I'm happy to
send anyone a pdf.


If QUIC doesn't support TSO (Large Send Offload), I think it cannot be 
compared.


--HPS



Re: Too aggressive TCP ACKs

2022-10-22 Thread Hans Petter Selasky

Hi,

Some thoughts about this topic.

Delaying ACKs means loss of performance when using Gigabit TCP 
connections in data centers. There it is important to ACK the data as 
quickly as possible, to avoid running out of TCP window space. Think 
about TCP connections at 30 GBit/s and above!


I think the implementation should be exactly like it is.

There is a software LRO in FreeBSD to coalesce the ACKs before they hit 
the network stack, so there are no real problems there.


--HPS





Re: Help wanted with MFC 256820

2022-10-18 Thread Hans Petter Selasky

On 10/18/22 17:16, Koichiro Iwao wrote:

On Mon, Oct 17, 2022 at 09:16:12AM +0200, Hans Petter Selasky wrote:
  

Send me the "git show" output of the commit before you push it, and I'll
review it for you.

--HPS


I would like to MFC it to stable/12, too. See attached patches for both
branches.



stable/12 and stable/13

Approved.

Make sure the NOINET version of the kernel still builds, and also LINT 
(for amd64).


--HPS



Re: Help wanted with MFC 256820

2022-10-17 Thread Hans Petter Selasky

On 10/17/22 01:59, Koichiro Iwao wrote:

On Sun, Oct 16, 2022 at 06:12:40PM +0200, Hans Petter Selasky wrote:


Hi,

I think you should do:

git cherry-pick -x 9823a0c0acf4fc277a71336ea737e1de7c65742f

Then:

git commit --amend

And remove the "MFC after" tag.

Then push it to stable/13. You may need to specify the override authors
flag before pushing.

--HPS


Hi Hans,

Thanks for the advice. Do you mean you have approved me to do that?



Send me the "git show" output of the commit before you push it, and I'll 
review it for you.


--HPS



Re: Help wanted with MFC 256820

2022-10-16 Thread Hans Petter Selasky

On 10/16/22 18:07, Koichiro Iwao wrote:

Hi,

I only have ports commit bit. I would like to ship this 1-year-old commit
to each stable branch (at least stable/13). I'm happy to help with 
anything.


I would like to try it if I'm permitted to do that with someone's approval.
I would appreciate it if someone let me know the procedure.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256820

Thanks,


Hi,

I think you should do:

git cherry-pick -x 9823a0c0acf4fc277a71336ea737e1de7c65742f

Then:

git commit --amend

And remove the "MFC after" tag.

Then push it to stable/13. You may need to specify the override 
authors flag before pushing.


--HPS



Re: Is there a way to deterministically bring up two usb ethernet interfaces?

2022-06-21 Thread Hans Petter Selasky

Hi,

On 6/21/22 22:01, Bakul Shah wrote:

I think the problem is that the two interfaces don't always come up in the 
right sequence so which is ue0 and which is ue1 changes but they are connected 
to specific networks.
Thanks


Most likely not. Maybe devd events or sysctls can help you:

net.ue.0.%parent: axge0

This is a general problem with pluggable USB devices. Usually if the 
devices are connected to the same USB HUB, they will come up 
deterministically.
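
One way to use the %parent sysctl from a script is sketched below in Python. It only parses text in the format shown above; the function name is made up, the second sample line is hypothetical, and the approach only distinguishes adapters whose parent device names differ.

```python
def ue_by_parent(sysctl_text: str) -> dict:
    """Map a parent device name (e.g. 'axge0') to its ue interface (e.g. 'ue0')."""
    mapping = {}
    for line in sysctl_text.splitlines():
        name, sep, value = line.partition(":")
        parts = name.strip().split(".")
        # Expect names of the form net.ue.<unit>.%parent
        if sep and len(parts) == 4 and parts[:2] == ["net", "ue"] and parts[3] == "%parent":
            mapping[value.strip()] = "ue" + parts[2]
    return mapping


# First line is from the message above; the second one is a made-up example.
sample = """net.ue.0.%parent: axge0
net.ue.1.%parent: ure0"""
print(ue_by_parent(sample))   # {'axge0': 'ue0', 'ure0': 'ue1'}
```

A startup script could look up the interface by parent name and then apply the network configuration to whichever ueN it turned out to be.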


--HPS



Re: Poor performance with stable/13 and Mellanox ConnectX-6 (mlx5)

2022-06-13 Thread Hans Petter Selasky

On 6/13/22 20:25, Mike Jakubik wrote:

Hello,

I have two new servers with a Mellanox ConnectX-6 card linked at 25Gb/s, 
however, I am unable to get much more than 6Gb/s when testing with iperf3.



The servers are Lenovo SR665 (2 x AMD EPYC 7443 24-Core Processor, 256 GB RAM, 
Mellanox ConnectX-6 Lx 10/25GbE SFP28 2-port OCP Ethernet Adapter)



They are connected to a Dell N3224PX-ON switch. Both servers are idle and not 
in use, with a fresh install of stable/13-ebea872f8, nothing running on them 
except ssh, sendmail, etc.



When I test with iperf3 I am unable to get a higher avg than about 6Gb/s. I 
have tried just about every knob listed in 
https://calomel.org/freebsd_network_tuning.html with little impact on the 
performance. The network cards have HW LRO enabled as per the driver 
documentation (though this only seems to lower IRQ usage with no impact on 
actual throughput).



The same exact servers tested on Linux (Fedora 34) produced nearly 3x the 
performance (see attached screenshots). I was able to get a steady 14.6Gb/s 
rate with nearly 0 retries shown in iperf. The performance on FreeBSD seems to 
avg at around 6Gb/s, but it is very sporadic during the iperf run.



I have run out of ideas, any suggestions are welcome. Considering Netflix uses 
very similar HW and they push 400 Gb/s, this tells me there is something really 
wrong here or Netflix isn't sharing all their secret sauce.




Some ideas:

Try to disable "rxpause,txpause" when setting the media.

Keep HW LRO off for now; it doesn't work for a large number of connections.

What is the CPU usage during test? Is iperf3 running on a CPU-core which 
has direct access to the NIC's numa domain?


Is the NIC installed in the "correct" PCI high-performance slot?

There are some sysctl knobs which may tell where the problem is, if it's 
PCI backpressure or something else.


sysctl -a | grep diag_pci_enable
sysctl -a | grep diag_general_enable

Set these two to 1, then run some traffic and dump all mce sysctls:

sysctl -a | grep mce > dump.txt

--HPS



Re: recommended USB-c Ethernet adapter for laptop

2022-05-05 Thread Hans Petter Selasky

On 5/5/22 15:08, Ludovit Koren wrote:


Hello,

I am using FreeBSD 14.0-CURRENT #0 main-n253876-1e9ce60a6d7 on my laptop
and am trying to find a reliable USB-c Ethernet adapter. I have tried the following:

ure0 on uhub0
ure0:  on usbus1
miibus0:  on ure0
rgephy0:  PHY 0 on miibus0
rgephy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 
1000baseT-FDX-master, auto
ue0:  on ure0
ue0: Ethernet address: 00:e0:4c:68:04:20

and

ugen1.4:  at usbus1
axge0 on uhub3
axge0:  on usbus1
miibus0:  on axge0
ukphy0:  PHY 3 on miibus0
ukphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 
1000baseT-FDX-master, auto, auto-flow
ue0:  on axge0
ue0: Ethernet address: 50:a0:30:02:b7:f6

Both of them stop working when there is significant traffic on the
interface, e.g. when copying an ISO image to a different computer with scp.

Please, could you advise a good, reliable USB-c Ethernet interface?

Thank you very much.

Regards,

lk



Hi,

I think the older axge's are OK.


axge0 on uhub0
axge0:  on usbus0
miibus0:  on axge0
rgephy0:  PHY 3 on miibus0
rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 
100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master, 
1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
ue0:  on axge0
ue0: Ethernet address: 3c:2e:be:a0:b5:db


BTW: We should debug the issue with the newer axge devices.

--HPS



Re: 60+% ping packet loss on Pi3 under -current and stable-13

2022-05-02 Thread Hans Petter Selasky

On 5/2/22 18:43, Bakul Shah wrote:

This is due to tcpdump.


On May 2, 2022, at 8:53 AM, bob prohaska  wrote:

One new oddity is seeing the following lines in the daily security report:
www.zefox.org  kernel log messages:
+ue0: promiscuous mode enabled
+ue0: promiscuous mode disabled
+ue0: promiscuous mode enabled
+ue0: promiscuous mode disabled
+ue0: promiscuous mode enabled
+ue0: promiscuous mode disabled





Hi,

You may also want to correlate:

usbconfig dump_stats

with netstat for ue0.

Especially the BULK column. If something is wrong on the USB level, it 
will show up there.


--HPS



Re: 60+% ping packet loss on Pi3 under -current and stable-13

2022-05-02 Thread Hans Petter Selasky

On 5/2/22 03:13, bob prohaska wrote:

On Sun, May 01, 2022 at 05:10:59PM -0700, Mark Millard wrote:
[reply at end]

On 2022-May-1, at 16:27, bob prohaska  wrote:


On Sun, May 01, 2022 at 12:58:45PM -0700, Mark Millard wrote:


Looks like there is some problem getting past
gig1-1-1.gw.davsca11.sonic.net .



That seems independent of my own internal connection problems,
but worth taking up with my ISP on Monday. Meanwhile, can you
ping any other hosts in the 50.1.20.31-24 range? All are up
at the moment. Hosts 28 and 24 are the troublemakers.

If anybody cares there's an ascii-art network diagram at
http://www.zefox.net/~fbsd/netmap

Not sure it'll survive the mailing list, but here goes:
dsl_modem-switch-router-lan---wifi-pi4_workstation
  |  | |
  |  | |---Mac workstation
  |  |
  |  |--printer
--|
|
|--50.1.20.30 ns1.zefox.net Pi2 12.3 usb-serial---50.1.20.27
|--50.1.20.29 ns2.zefox.net Pi2 12.3 usb-serial---50.1.20.30
|--50.1.20.27 www.zefox.net Pi2 12.3 usb-serial---50.1.20.26
|--50.1.20.26 www.zefox.com Pi2 -current usb-serial---50.1.20.24
|--50.1.20.24 pelorus.zefox.org Pi3 13.1 usb-serial---50.1.20.28
switch
|--50.1.20.25 nemesis.zefox.com Pi4 -current usb-serial---50.1.20.29
|--50.1.20.28 www.zefox.org Pi3 -current usb-serial---50.1.20.25


For ns1.zefox.net there is no problem and
it looks like:

                              My traceroute  [v0.95]
amd64_ZFS (192.168.1.120) -> ns1.zefox.net (50.1.20.29)    2022-05-01T16:52:27-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                  Packets              Pings
    Host                                        Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                  0.0%    53    1.2   0.8   0.1   1.4   0.4
 2. 172.30.26.67                                 0.0%    53   11.8  25.0  11.8  61.0  11.4
 3. 68.85.243.125                                0.0%    53   10.0  10.0   7.7  46.9   5.3
 4. 96.216.60.165                                0.0%    53    8.8   9.3   7.8  12.1   0.9
 5. 68.85.243.197                                0.0%    53    8.6  13.2   8.6  28.3   4.2
 6. be-36231-cs03.seattle.wa.ibone.comcast.net   0.0%    53   15.3  14.8  13.0  16.9   1.0
 7. be-2312-pe12.seattle.wa.ibone.comcast.net    0.0%    53   16.2  15.9  12.9  59.8   6.5
 8. (waiting for reply)
 9. be3717.ccr22.sfo01.atlas.cogentco.com        0.0%    53   29.8  30.9  26.5  97.9  10.1
10. be2430.ccr31.sjc04.atlas.cogentco.com        0.0%    53   29.0  29.0  26.6  39.3   1.8
11. 38.104.141.82                                0.0%    53   28.9  33.8  26.1 115.0  17.0
12. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net       0.0%    53   32.1  31.3  29.2  33.9   1.0
13. 0.xe-0-0-0.cr1.scrmca13.sonic.net            0.0%    53   30.5  32.1  29.2  57.6   4.3
14. gig1-1-1.gw.wscrca11.sonic.net               0.0%    53   31.8  32.0  28.8  43.7   2.0
15. gig1-1-1.gw.davsca11.sonic.net               0.0%    52   31.0  32.4  30.2  38.4   1.4
16. ns1.zefox.net                                0.0%    52   51.4  51.1  49.8  53.4   0.8

ns2.zefox.net and others got a 17. instead of
a 16. An example is:

                              My traceroute  [v0.95]
amd64_ZFS (192.168.1.120) -> ns2.zefox.net (50.1.20.30)    2022-05-01T16:58:45-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                  Packets              Pings
    Host                                        Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                  0.0%    55    0.3   0.9   0.1   1.4   0.4
 2. 172.30.26.66                                 0.0%    55   13.5  26.4  10.4  54.7  10.1
 3. 68.85.243.77                                 0.0%    55   10.5   9.1   7.9  10.5   0.6
 4. 24.124.129.106                               0.0%    54    8.3   9.5   8.2  13.4   1.0
 5. 96.216.60.165                                0.0%    54    8.8   9.8   7.8  22.8   2.2
 6. 68.85.243.197                                0.0%    54   17.1  15.1   9.0  37.3   5.9
 7. be-36241-cs04.seattle.wa.ibone.comcast.net   0.0%    54   15.2  15.0  13.2  17.8   0.9
 8. be-2412-pe12.seattle.wa.ibone.comcast.net    0.0%    54   15.0  14.8  13.2  17.1   1.0
 9. (waiting for reply)
10. be2075.ccr21.sfo01.atlas.cogentco.com        0.0%    54   28.4  29.2  26.9  36.8   1.4
11. be2379.ccr31.sjc04.atlas.cogentco.com        0.0%    54   29.8  30.0  27.3  84.2   7.6

Re: epoch callback panic

2022-04-01 Thread Hans Petter Selasky

On 4/1/22 22:33, Hans Petter Selasky wrote:

Hi,

Maybe you need to grab the lock before destroying it?

Is this easily reproducible?

--HPS


Can you figure out the owner of the lock?

I guess the owner is not in an epoch section like it should be!

--HPS



Re: epoch callback panic

2022-04-01 Thread Hans Petter Selasky

On 4/1/22 19:07, Peter Holm wrote:

markj@ asked me to post this one:

panic: rw lock 0xf801bccb1410 not unlocked
cpuid = 4
time = 1648770125
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00e48a3d10
vpanic() at vpanic+0x17f/frame 0xfe00e48a3d60
panic() at panic+0x43/frame 0xfe00e48a3dc0
_rw_destroy() at _rw_destroy+0x35/frame 0xfe00e48a3dd0
in_lltable_destroy_lle_unlocked() at in_lltable_destroy_lle_unlocked+0x1a/frame 
0xfe00e48a3df0
epoch_call_task() at epoch_call_task+0x13a/frame 0xfe00e48a3e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame 0xfe00e48a3ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfe00e48a3ef0
fork_exit() at fork_exit+0x80/frame 0xfe00e48a3f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe00e48a3f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

Details @ https://people.freebsd.org/~pho/stress/log/log0275.txt



Hi,

Maybe you need to grab the lock before destroying it?

Is this easily reproducible?

--HPS




Re: Receive Side Coalescing(RSC) and LRO

2022-02-09 Thread Hans Petter Selasky

On 2/9/22 10:28, Wei Hu wrote:

But maybe it is the way to go as FreeBSD lacks system wide support to 
differentiate hardware and software LRO.


Hi Wei,

Software LRO has been found superior to hardware LRO, especially when 
hundreds of connections are involved. The hardware is simply not able 
to keep up.


For few connections, hardware LRO may give some performance benefits, 
but I would rather recommend RoCE/Infiniband for such use-cases, because 
at the high rates involved, even a single packet loss will cause 
terrible performance degradation.


Personally, I don't see a need for hardware LRO.

BTW: kib@ is working on adding more capability bits for ifconfig via nv 
lists. You might be interested in that:


https://reviews.freebsd.org/D32551

--HPS




Re: Receive Side Coalescing(RSC) and LRO

2022-02-08 Thread Hans Petter Selasky

On 2/8/22 16:32, Wei Hu wrote:

Hi,

I am trying to find the term that FreeBSD uses for the network offloading 
feature like RSC. RSC is Microsoft's term which is essentially the same as LRO 
in Linux, in which the packet aggregation happens on the hardware NIC.

The LRO on FreeBSD seems different. It looks to be the GRO in Linux, in which 
the packet aggregation happens in software above the NIC driver.  There is a 
feature bit IFCAP_LRO in net/if.h.

So, is there a different feature bit on FreeBSD which means only for the 
hardware RSC/LRO? Or does the IFCAP_LRO mean both hardware and software LRO? 
What I want to achieve is to let the user disable the hardware RSC/LRO and leave 
software LRO untouched on FreeBSD. What is the proper way to differentiate 
these two on FreeBSD?

Thanks,
Wei



Adding:

RSS assisted sorted LRO

New child needs new name?

--HPS



Re: LAN ure interface problem

2021-10-22 Thread Hans Petter Selasky

On 10/22/21 16:00, Ludovit Koren wrote:


Hi,

I have installed FreeBSD 14.0-CURRENT #1 main-n250134-225639e7db6-dirty
on my notebook HP EliteBook 830 G7 and I am using RealTek usb LAN
interface:

ure0 on uhub0
ure0:  on usbus1
miibus0:  on ure0
rgephy0:  PHY 0 on miibus0
rgephy0: OUI 0x00e04c, model 0x, rev. 0
rgephy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 
1000baseT-FDX-master, auto
ue0:  on ure0
ue0: bpf attached
ue0: Ethernet address: 00:e0:4c:68:04:20


When there is bigger load on the interface, for example rsync of the big
directory, the carrier is lost. The only solution I found is to remove
and insert the usb interface; ifconfig ue0 down, ifconfig ue0 up did not
help. The output of the ifconfig:

ue0: flags=8843 metric 0 mtu 1500
 
options=68009b
 ether 00:e0:4c:68:04:20
 inet 192.168.1.18 netmask 0xff00 broadcast 192.168.1.255
 media: Ethernet autoselect (100baseTX )
 status: active
 nd6 options=29

I do not know, and could not find anything relevant about, whether the driver
is buggy or the hardware has some problems. Please advise.

Regards,



Not the same device, but similar issue:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258057

--HPS




Re: Creating/destroying bulk VLAN interfaces takes too long

2021-08-30 Thread Hans Petter Selasky

On 8/30/21 9:13 AM, Özkan KIRIK wrote:

Hello,

I'm using FreeBSD stable/12. Creating/destroying bulk vlan interfaces takes
too long to finish. Running parallel doesn't matter.
Is there any fast way to create 100 vlan interfaces?

seq 1 100 | /usr/bin/time xargs -t -n 1 -I % ifconfig em1.% create
...
ifconfig em1.99 create
ifconfig em1.100 create
14.78 real 0.03 user 1.19 sys

with 4 parallel workers:
seq 1 100 | /usr/bin/time xargs -t -P4 -n 1 -I % ifconfig em1.% create
...
ifconfig em1.99 create
ifconfig em1.100 create
14.46 real 0.03 user 1.20 sys

destroying:
ifconfig -g vlan | /usr/bin/time xargs -t -n 1 -I % ifconfig % destroy
...
ifconfig em1.98 destroy
ifconfig em1.100 destroy
21.89 real 0.03 user 1.64 sys

Any suggestions?



Creating VLAN interfaces sometimes involves firmware commands on the 
network devices, which take time.


--HPS



Re: fib[46]_lookup_rt usage in netflow.c, sa_len comparison with AF_INET

2021-05-18 Thread Hans Petter Selasky

On 5/18/21 5:46 PM, Guy Yur wrote:

Hi,

I was looking for examples on how to use fib6_lookup_rt and noticed
there are comparisons between sa_len and AF_ flags in netflow.c
if (nh->gw_sa.sa_len == AF_INET)
if (nh->gw_sa.sa_len == AF_INET6)

Are these typos for sa_family?


Hi,

According to:

sys/net/route/nhop.c

Yes. Both should be sa_family.
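
Why the sa_len comparison matters can be shown with plain numbers. The Python sketch below uses constants as they are defined on FreeBSD (stated here as assumptions, since AF_INET6 and the sockaddr sizes differ between systems); the dictionaries merely stand in for the gateway sockaddr.

```python
# Values as defined on FreeBSD (assumed for this sketch).
AF_INET = 2
AF_INET6 = 28
SIZEOF_SOCKADDR_IN = 16     # sa_len of an IPv4 sockaddr
SIZEOF_SOCKADDR_IN6 = 28    # sa_len of an IPv6 sockaddr

gw4 = {"sa_len": SIZEOF_SOCKADDR_IN, "sa_family": AF_INET}
gw6 = {"sa_len": SIZEOF_SOCKADDR_IN6, "sa_family": AF_INET6}

# The comparisons as written in netflow.c:
print(gw4["sa_len"] == AF_INET)      # False: 16 != 2, so the IPv4 branch never matches
print(gw6["sa_len"] == AF_INET6)     # True, but only because both happen to be 28

# The intended comparisons:
print(gw4["sa_family"] == AF_INET)   # True
print(gw6["sa_family"] == AF_INET6)  # True
```

Under these assumptions the IPv6 check works by coincidence while the IPv4 check cannot succeed, which is consistent with both being typos for sa_family.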

--HPS



https://cgit.freebsd.org/src/tree/sys/netgraph/netflow/netflow.c#n363
https://cgit.freebsd.org/src/tree/sys/netgraph/netflow/netflow.c#n437

Added in 
https://cgit.freebsd.org/src/commit/sys/netgraph/netflow/netflow.c?id=4e19e0d92ac6dfa5d2df6d525922f1e60487a9cc 


--HPS
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: rip6_output not in net epoch across call to ip6_setpktopts()

2021-05-18 Thread Hans Petter Selasky

On 5/18/21 9:18 PM, Ryan Stone wrote:

The issue seems to be that rip6_output() calls into ip6_setpktopts()
outside of the net epoch.  Should I just wrap the setpktopts call in a
net epoch enter/exit, or does anybody think that there's something
cleverer that should be done there?


Hi,

Epoch automagically detects recursion, and optimizes for that. If there 
is an inner section of enter/exit, covered by an outer section, the 
overhead will be less for the inner section.


--HPS


Re: RSS on FreeBSD stable/12 gateway

2021-03-07 Thread Hans Petter Selasky

On 3/7/21 10:03 PM, Özkan KIRIK wrote:

Any suggestions to enable RSS ?


I found that RSS hardware computed checksums are not correct when using 
iflib (intel hardware), compared to what the software expects, so 
traffic goes to the wrong queue and simply gets dropped. Maybe you see 
something similar.


--HPS


Re: ims_merge in in_mcast.c

2020-10-13 Thread Hans Petter Selasky

On 2020-10-12 19:11, Dheeraj Kandula wrote:

On lines 987 and 991, shouldn't the index be 0 instead of 1?

i.e. ims->ims_st[0].ex -= n;
and
ims->ims_st[0].in -= n;

On a rollback, the entry at index 0 is incremented and the entry at index 1
is decremented.

On a non-rollback merge, the entry at index 0 is decremented and the entry
at index 1 is incremented.


Hi,

If you look at inm_commit() you see that [0] is overwritten by [1], so I 
believe the current code is correct. The same goes for both IPv4 and IPv6. 
Are you seeing an actual multicast problem that led you to investigate this?


--HPS


Re: mlx5 irq

2020-10-01 Thread Hans Petter Selasky

On 2020-10-01 18:57, Slawa Olhovchenkov wrote:

Do you plan to use more descriptive IRQ names?


The kernel doesn't support more than X number of bytes per name 
unfortunately.


--HPS


Re: mlx5 irq

2020-10-01 Thread Hans Petter Selasky

On 2020-10-01 11:13, Michal Vančo via freebsd-net wrote:

On 01/10/2020 10:52, Hans Petter Selasky wrote:

On 2020-10-01 10:24, Michal Vančo wrote:

But why is the actual number of IRQ lines bigger than number of CPU
cores?


There are some dedicated IRQ's used for firmware management.

Else the driver will use the number of online CPU's by default as the
number of rings, if the hardware supports it.


Thanks for clarification. Is there any way to optimize this? In my case
I have 2 CPU sockets with 8 cores each (SMT is disabled). NIC is
connected via PCIe to the first CPU socket (numa domain 0). In this
case, wouldn't it be better if all interrupts were firing only on cores
of first socket?



Hi,

You can use "cpuset" to bind those IRQ threads to the right core.

There is no automatic way :-)
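
A small helper can at least generate the cpuset invocations. The Python sketch below is only an illustration: it parses "vmstat -i" style text, assigns the IRQs of one device round-robin to a given list of cores, and prints "cpuset -l <cpu> -x <irq>" command lines. The function name, the sample text and the choice of cores are assumptions; check the commands before running them.

```python
import re


def cpuset_commands(vmstat_text, device, cpus):
    """Build cpuset command lines binding each IRQ of `device` to one core."""
    pattern = r"irq(\d+): " + re.escape(device) + r"\b"
    irqs = [int(m.group(1)) for m in re.finditer(pattern, vmstat_text)]
    return ["cpuset -l %d -x %d" % (cpus[i % len(cpus)], irq)
            for i, irq in enumerate(irqs)]


# Shortened sample in the format of "vmstat -i | grep mlx5_core".
sample = """irq320: mlx5_core0 1 0
irq321: mlx5_core0 18646775 84
irq322: mlx5_core0 21 0
irq339: mlx5_core1 1 0"""

# Assumption for the example: cores 0 and 1 belong to the NIC's NUMA domain.
for cmd in cpuset_commands(sample, "mlx5_core0", cpus=[0, 1]):
    print(cmd)
```

For the sample this prints bindings for IRQs 320, 321 and 322 only, alternating between cores 0 and 1.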

--HPS



Re: mlx5 irq

2020-10-01 Thread Hans Petter Selasky

On 2020-10-01 10:24, Michal Vančo wrote:

But why is the actual number of IRQ lines bigger than number of CPU cores?


There are some dedicated IRQ's used for firmware management.

Else the driver will use the number of online CPU's by default as the 
number of rings, if the hardware supports it.


--HPS


Re: mlx5 irq

2020-10-01 Thread Hans Petter Selasky

On 2020-10-01 09:39, Michal Vančo via freebsd-net wrote:

Hi


Hi Michal,


I have a server with one Mellanox ConnectX-4 adapter and the following
CPU configuration (SMT disabled):

# dmesg | grep SMP
FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads
FreeBSD/SMP Online: 2 package(s) x 8 core(s)

What I don't understand is the number of IRQs allocated for each mlx5_core:

# vmstat -i | grep mlx5_core
irq320: mlx5_core0 1  0
irq321: mlx5_core0  18646775 84
irq322: mlx5_core0    21  0
irq323: mlx5_core0 97793  0
irq324: mlx5_core0 84685  0
irq325: mlx5_core0 89288  0
irq326: mlx5_core0 93564  0
irq327: mlx5_core0 86892  0
irq328: mlx5_core0 99141  0
irq329: mlx5_core0 86695  0
irq330: mlx5_core0    104023  0
irq331: mlx5_core0 85238  0
irq332: mlx5_core0 88387  0
irq333: mlx5_core0  93310221    420


^^^ it appears you have some application which is using a single TCP 
connection heavily. Then the traffic doesn't get distributed.



irq334: mlx5_core0   1135906  5
irq335: mlx5_core0 85394  0
irq336: mlx5_core0 88361  0
irq337: mlx5_core0 88826  0
irq338: mlx5_core0  17909515 81
irq339: mlx5_core1 1  0
irq340: mlx5_core1  18646948 84
irq341: mlx5_core1    25  0
irq342: mlx5_core1    208684  1
irq343: mlx5_core1 91567  0
irq344: mlx5_core1 88340  0
irq345: mlx5_core1 92597  0
irq346: mlx5_core1 85108  0
irq347: mlx5_core1 98858  0
irq348: mlx5_core1 88103  0
irq349: mlx5_core1    104906  0
irq350: mlx5_core1 84947  0
irq351: mlx5_core1 99767  0
irq352: mlx5_core1   9482571 43
irq353: mlx5_core1   1724267  8
irq354: mlx5_core1 96698  0
irq355: mlx5_core1    473324  2
irq356: mlx5_core1 86760  0
irq357: mlx5_core1  11590861 52

I expected number of IRQs to be equal number of CPUS. According to
Mellanox docs, I should be able to pin each interrupt to specific core
to loadbalance. How can I do this in this case when number of IRQs is
larger than number of cores? Is there any way to lower the number of
interrupts?



You can lower the number of interrupts by changing the coalescing 
sysctls in the dev.mce.N.conf tree.


dev.mce.0.conf.tx_coalesce_pkts: 32
dev.mce.0.conf.tx_coalesce_usecs: 16
dev.mce.0.conf.rx_coalesce_pkts: 32
dev.mce.0.conf.rx_coalesce_usecs: 3

For example 1024 pkts and 125 us.

And also set the queue size bigger than 1024 pkts:

dev.mce.0.conf.rx_queue_size: 1024
dev.mce.0.conf.tx_queue_size: 1024

--HPS
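
As a rough aid for picking coalescing values, the arithmetic can be sketched in C. This is only a model: it assumes an interrupt fires when either the packet threshold or the timer is reached, whichever comes first, and the helper name est_irq_rate() is hypothetical, not part of the mlx5en driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: estimate interrupts per second for one queue,
 * assuming an interrupt fires when either `coalesce_pkts` packets have
 * arrived or `coalesce_usecs` microseconds have elapsed, whichever
 * happens first.
 */
static uint64_t
est_irq_rate(uint64_t pkts_per_sec, uint64_t coalesce_pkts,
    uint64_t coalesce_usecs)
{
	uint64_t by_timer, by_count, rate;

	assert(coalesce_pkts > 0 && coalesce_usecs > 0);

	by_timer = 1000000 / coalesce_usecs;	 /* rate if the timer fires first */
	by_count = pkts_per_sec / coalesce_pkts; /* rate if the count is hit first */

	/* Whichever condition triggers sooner sets the rate ... */
	rate = by_timer > by_count ? by_timer : by_count;
	/* ... but there cannot be more interrupts than packets. */
	return (rate < pkts_per_sec ? rate : pkts_per_sec);
}
```

Under these assumptions, a queue receiving 1,000,000 packets per second with a 3 us timer is timer-bound at about 333,333 interrupts per second, while 1024 pkts and 125 us bring the bound down to about 8,000.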


Re: bridge/igb panic: sleepq_add: td 0xfffffe01bbce5300 to sleep on wchan 0xffffffff8157d9a0 with sleeping prohibited

2020-09-11 Thread Hans Petter Selasky

On 2020-09-11 14:08, Hans Petter Selasky wrote:

I think this is another variant of:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232362


Also adding this one for the record:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240609

--HPS


Re: bridge/igb panic: sleepq_add: td 0xfffffe01bbce5300 to sleep on wchan 0xffffffff8157d9a0 with sleeping prohibited

2020-09-11 Thread Hans Petter Selasky

On 2020-09-11 13:47, xto...@mm.st wrote:

xto...@mm.st wrote:
Updating from latest CURRENT snapshot 
(FreeBSD-13.0-CURRENT-amd64-20200910-1544934ffb2) to r365620 broke the 
bridges with igb (I350-T2) for me.  Booting to kernel.old and/or 
commenting the entries in rc.conf helps.


rc.conf:

cloned_interfaces="bridge0 bridge1 tap0 tap1 tap2 tap3"
ifconfig_em0="inet ..."
ifconfig_igb0="up"
ifconfig_igb1="up"
ifconfig_bridge0="addm igb0 addm tap0 addm tap1"
ifconfig_bridge1="addm igb1 addm tap2 addm tap3"


NICs (em0 is on-board, igb0/igb1 is addon I350-T2 card):

em0:  mem 0x92d0-0x92d1 
at device 31.6 numa-domain 0 on pci0

em0: Using 1024 TX descriptors and 1024 RX descriptors
em0: Using an MSI interrupt
em0: Ethernet address: e0:d5:5e:6c:aa:36
em0: netmap queues/slots: TX 1/1024, RX 1/1024
igb0:  mem 
0xfbb0-0xfbbf,0xfbc84000-0xfbc87fff at device 0.0 numa-domain 
0 on pci16

igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 8 RX queues 8 TX queues
igb0: Using MSI-X interrupts with 9 vectors
igb0: Ethernet address: a0:36:9f:0a:cf:42
igb0: netmap queues/slots: TX 8/1024, RX 8/1024
igb1:  mem 
0xfba0-0xfbaf,0xfbc8-0xfbc83fff at device 0.1 numa-domain 
0 on pci16

igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 8 RX queues 8 TX queues
igb1: Using MSI-X interrupts with 9 vectors
igb1: Ethernet address: a0:36:9f:0a:cf:43
igb1: netmap queues/slots: TX 8/1024, RX 8/1024


panic:

panic: sleepq_add: td 0xfffffe01bbce5300 to sleep on wchan 
0xffffffff8157d9a0 with sleeping prohibited

cpuid = 16
time = 1599808542
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe01ba658c40

vpanic() at vpanic+0x182/frame 0xfe01ba658c90
panic() at panic+0x43/frame 0xfe01ba658cf0
sleepq_add() at sleepq_add+0x359/frame 0xfe01ba658d40
_sleep() at _sleep+0x20c/frame 0xfe01ba658df0
pause_sbt() at pause_sbt+0xfe/frame 0xfe01ba658e20
e1000_reset_hw_82580() at e1000_reset_hw_82580+0x1c8/frame 
0xfe01ba658e60

em_if_stop() at em_if_stop+0x1b/frame 0xfe01ba658e80
iflib_stop() at iflib_stop+0xbd/frame 0xfe01ba658ed0
iflib_if_ioctl() at iflib_if_ioctl+0x397/frame 0xfe01ba658f40
bridge_mutecaps() at bridge_mutecaps+0x145/frame 0xfe01ba658fb0
bridge_ioctl_add() at bridge_ioctl_add+0x468/frame 0xfe01ba659000
bridge_ioctl() at bridge_ioctl+0x32b/frame 0xfe01ba6590d0
in_control() at in_control+0x322/frame 0xfe01ba659180
ifioctl() at ifioctl+0x3e8/frame 0xfe01ba659250
kern_ioctl() at kern_ioctl+0x28e/frame 0xfe01ba6592c0
sys_ioctl() at sys_ioctl+0x127/frame 0xfe01ba659390
amd64_syscall() at amd64_syscall+0x140/frame 0xfe01ba6594b0
fast_syscall_common() at fast_syscall_common+0xf8/frame 
0xfe01ba6594b0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8004b4aba, rsp = 
0x7fffe2b8, rbp = 0x7fffe360 ---

Uptime: 14s
Dumping 3794 out of 97961 
MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%






Hi,

I think this is another variant of:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232362

--HPS



Re: Ipv6 neighbor limit

2020-09-03 Thread Hans Petter Selasky

On 2020-09-03 14:34, Cristian Cardoso wrote:

Hi
Would anyone know if there is any limit in the FreeBSD kernel for IPv6
neighbors? I checked the ndp documentation and found nothing, looking
at the return of the sysctl command I also did not find anything
explicit.



Hi,

There is something called:

sys/netinet/in.h:#define IP_MAX_MEMBERSHIPS 4095
sys/netinet6/in6.h:#define IPV6_MAX_MEMBERSHIPS 4095

Is this what you are looking for?

--HPS


Re: somewhat reproducable vimage panic

2020-08-10 Thread Hans Petter Selasky

On 2020-07-23 21:26, Bjoern A. Zeeb wrote:
That’ll probably work;  still, the deferred teardown work seems wrong to 
me;  I haven’t investigated;  the patch kind-of says exactly that as 
well: if “wait until deferred stuff is done” is all we are doing, why 
can we not do it on the spot then?


Hi Bjoern,

Trying to move the discussion over to Phabricator at:
https://reviews.freebsd.org/D24914

The answer to your question I believe is this commit:

https://svnweb.freebsd.org/base/head/sys/netinet/in_mcast.c?revision=333175&view=markup

It affects both IPv4 and IPv6.

I know that sometimes multicast entries can be freed from timer 
callbacks. I think having a task for network-related configuration is 
acceptable, and one is probably enough. With D24914 there will be two 
threads for teardown, which is probably overkill, but it makes for a 
solid solution for now.


I don't know why Stephen didn't think about draining those tasks. I know 
some people are not actively using VIMAGE and that might be the reason.


--HPS


Re: Multicast issue, interface not leaving Mutlicast Group

2020-08-08 Thread Hans Petter Selasky

On 2020-08-07 15:25, Abelenda Diego wrote:

Hello,

I have discovered that I had a multicast issue for years I did not know about. 
I use a FreeBSD (opnsense) setup as router for my home network and have 
igmpproxy for IPTV. Somehow everything seems to work, until I realized that my 
ISP was making a DoS with multicast. It is pretty much what was described years 
ago here: 
https://forum.netgate.com/topic/62591/igmp-issues-causing-isp-to-perform-multicast-dos-on-my-pfsense/7.
 But the solution of not using FreeBSD seem weird. So dug a lot learning about 
Multicast IGMPv{2,3} etc in the process. Here is an abstract of what I found:



Which version of FreeBSD is this (uname -a) ?

There have been some fixes in the multicast area from time to time, and 
you should make sure you've got all the fixes incorporated in the kernel 
you are using, typically by testing a kernel based on a -stable or 
-current branch of FreeBSD.


--HPS



Re: somewhat reproducable vimage panic

2020-08-04 Thread Hans Petter Selasky

On 2020-07-25 21:21, John-Mark Gurney wrote:

So far so good...  I am getting these on occasion:
in6_purgeaddr: err=65, destination address delete failed


Maybe you could add a "kdb_backtrace()" call when that error happens?

--HPS


Re: somewhat reproducable vimage panic

2020-07-27 Thread Hans Petter Selasky

On 2020-07-25 21:21, John-Mark Gurney wrote:

Yeah, agreed. I think hselasky has a better fix:
https://reviews.freebsd.org/D24914

I just saw his e-mail in a different thread.

I'm testing out this patch now, and let people know how it goes.. It'll
be nice to not have to worry about these panics..

So far so good...  I am getting these on occasion:
in6_purgeaddr: err=65, destination address delete failed

But that's more that the patch prevented a panic.

The other issue that I'm now seeing is that because we don't forcefully
clear out the multicast task, it can take a good 20+ seconds from the
time a jail is destroyed to the interface appearing again in vnet0.
Pretty sure this is related to the dmesg from above...


Hi,

D24914 just ensures proper draining. Feel free to accept the patch if 
you think I should submit it. It fixes some problems seen at work too!


--HPS


Re: Abandoning ifnet work

2020-07-23 Thread Hans Petter Selasky

On 2020-07-23 05:45, Kyle Evans wrote:

abandoning because the review process for this area is simply not


Are you looking for this:

https://reviews.freebsd.org/D24914

--HPS


Re: Multicast: membership to (*, G) group after leaving a (S, G) group

2020-07-07 Thread Hans Petter Selasky

On 2020-07-07 13:55, Fabrice Colliot wrote:

Sorry, I forgot to mention it. I've tried on FreeBSD 11.3 and FreeBSD 12.0.



Can you try 12.0 using a 12-stable kernel and see if there are any 
differences?


--HPS



Re: Multicast: membership to (*, G) group after leaving a (S, G) group

2020-07-07 Thread Hans Petter Selasky

On 2020-07-07 12:01, Fabrice Colliot wrote:

Hi,

I'm using smcroute to join and leave multicast groups and I don't
understand the behavior of FreeBSD when the group is left.

Here is what I do:

smcroute join em1 10.3.4.5 224.0.55.55
ifmcstat -i em1
em1:
 inet 10.10.0.1
 igmpv3 rv 2 qi 125 qri 10 uri 3
 group 224.0.55.55 mode include
 mcast-macaddr 01:00:5e:00:37:37
 group 224.0.0.1 mode exclude
 mcast-macaddr 01:00:5e:00:00:01

smcroute leave em1 10.3.4.5 224.0.55.55
ifmcstat -i em1
em1:
 inet 10.10.0.1
 igmpv3 rv 2 qi 125 qri 10 uri 3
 group 224.0.55.55 mode undefined
 mcast-macaddr 01:00:5e:00:37:37
 group 224.0.0.1 mode exclude
 mcast-macaddr 01:00:5e:00:00:01

At this point, I expected to have no membership left on em1 for 224.0.55.55
but ifmcstat shows that the interface is still a member of the group but in
undefined mode.

I was wondering if anybody could tell me why the group membership seems to
be transitioned to a (*, G) membership when all the (S, G) memberships are
removed.



Which version of FreeBSD is this?

--HPS
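
For readers comparing the mcast-macaddr lines in the ifmcstat output above with the group addresses, the standard IPv4-to-Ethernet multicast mapping (RFC 1112) can be sketched in C. The function name is hypothetical; the mapping itself is the fixed prefix 01:00:5e followed by the low 23 bits of the group address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map an IPv4 multicast group (host byte order) to its Ethernet
 * multicast address: prefix 01:00:5e plus the low 23 bits of the
 * group address.
 */
static void
mcast_ip_to_mac(uint32_t group, uint8_t mac[6])
{
	/* Only 224.0.0.0/4 addresses are multicast groups. */
	assert((group >> 28) == 0xe);

	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (group >> 16) & 0x7f;	/* top bit of this byte is dropped */
	mac[4] = (group >> 8) & 0xff;
	mac[5] = group & 0xff;
}
```

With 224.0.55.55 (0xE0003737) this yields 01:00:5e:00:37:37, matching the address listed for that group. The MAC address stays programmed as long as the interface keeps any membership record for the group, whatever its filter mode.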


Re: LTE modem?

2020-05-05 Thread Hans Petter Selasky

On 2020-05-05 08:12, Shamim Shahriar wrote:

Personally I prefer the ones that offer an Ethernet port;


The Huawei E3372 LTE USB stick has a FreeBSD-compatible USB Ethernet port.

But it does not support voice calls, only SMS.

--HPS


Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]

2020-04-07 Thread Hans Petter Selasky

On 2020-04-08 01:23, Mark Johnston wrote:

On Mon, Apr 06, 2020 at 02:34:50PM -0700, Eric Joyner wrote:

On Mon, Apr 6, 2020 at 2:29 PM Mark Johnston  wrote:


On Mon, Apr 06, 2020 at 02:19:25PM -0700, Eric Joyner wrote:

Mark,

I think I was mistaken about the backtrace looking the same. I was looking
at it from within ddb, and I think I focused on the
epoch_block_handler_preempt line and didn't notice that it only stopped
there this time. Here's the new one I've got from kgdb:


Thanks.  Could you try to print "td->td_name" from frame 4?  It should
also be available as er->er_blockedtd.  Basically, I'm trying to verify
that the interrupt thread itself isn't the one that we're waiting for,
else there is another bug to be fixed.

If you can provide kernel symbols and vmcore, I'd be happy to look at it
myself.



Here's what I get:

(kgdb) frame 4
#4  epoch_block_handler_preempt (global=0xf80003de0100,
cr=0xfe00dee85900, arg=0x0) at /usr/src/sys/kern/subr_epoch.c:507
507 }
(kgdb) print td->td_name
$1 = "if_io_tqg_31\000\000\000\000\000\000\000"
(kgdb) print er->er_blockedtd
$2 = (struct thread *) 0x0


I spent some time looking at the core.  It looks like we have yet
another problem: the gtaskqueue code won't exit the net epoch if it is
constantly running a net task.  Could you please retry with the patches
from before, and this one included?



There is the same issue in kern_intr.c (FYI).

--HPS



Re: additional ifreq accessors?

2020-04-07 Thread Hans Petter Selasky

On 2020-04-07 19:26, Poul-Henning Kamp wrote:


In message <20200407172151.gb72...@spindle.one-eyed-alien.net>, Brooks Davis 
writes:


My question for the lists is: should we adopt the
more-technically-correct accessors in FreeBSD or stick with
slightly-cheaper and more conventional aliasing approach[0]?


The accessors buys us much more code-isolation, so that would be my vote.



Is there a reason for using "void *" here?

char *ifr_addr_get_data(void *ifrp);

--HPS


Re: [PATCH]: ipoib with mlx4 initialisation ordering

2020-02-22 Thread Hans Petter Selasky

On 2020-02-22 01:48, Andreas Kempe wrote:

Hello everyone,

We have had issues with our machine using IPoIB on FreeBSD with the
mlx4 driver. The machine would hang on shutdown.

We traced the issue to IPoIB registering multicast groups that
increase the reference count of the port in the ib_multicast client.
When shutting down the machine, the kernel tore down the ib_multicast
before it tore down IPoIB, causing it to wait forever for the
references to disappear before it deleted the multicast client.

This issue can be remedied by changing the initialisation of the IPoIB
module to happen after the mlx4 driver is initialised. By doing this,
all multicast groups will be cleaned up before the ib_multicast client
is destroyed.

See patch attached. Sponsored by: Lysator ACS

Cordially,
Andreas Kempe


I'll have a closer look on Monday.

--HPS



Re: epoch and ath(4) - what should we be doing?

2020-02-20 Thread Hans Petter Selasky

On 2020-02-20 02:01, Adrian Chadd wrote:

Questions:

* are these things recursive?


Yes.


* what are the rules around sleeping? I've seen some ... discussions
that were quite animated around this.


Any non-sleepable lock is allowed under EPOCH(9).


* what should I be doing as an epoch tracker if I could call the
receive routine from multiple paths. I see a few drivers have a single
place where they're doing EPOCH_ENTER/EPOCH_EXIT using an epoch
tracker allocated in the interrupt handler stack, but what if I also
want to call that receive path from another function path too? Can I
just stuff an epoch_tracker on the stack and it'll DTRT ?


Try:
https://reviews.freebsd.org/D23674


* .. is there some updated doc or brain dump somewhere I can read? I'd
like to go add this to a couple out of tree wifi drivers under
development so this would make that whole thing much easier.


Gleb ???

--HPS


Re: Does sosend() need CURVNET_SET/CURVNET_RESTORE?

2020-02-02 Thread Hans Petter Selasky

On 2020-02-02 22:22, Rick Macklem wrote:

Hi,

The current krpc code calls sosend() and soreceive() without any
CURVNET_SET()/CURVNET_RESTORE() wrapped around them.

When I recently used sosend_generic(), it panic'd without them.

Do they need to be added around sosend()/soreceive()?

I'll admit to knowing nothing about vnet.

Thanks, rick


What is the panic backtrace?

Usually one of these three variants is used:

CURVNET_SET(TD_TO_VNET(td));
CURVNET_SET(ifp->if_vnet);
CURVNET_SET(so->so_vnet);

--HPS
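
The set/restore pattern from the three variants above can be modelled in userland C. This is a sketch under stated assumptions: the structs and the model_* functions are hypothetical stand-ins, the macros only mimic the save-and-restore behaviour of the real ones in sys/net/vnet.h, and the assert is one plausible way a missing vnet context could surface, not a diagnosis of the reported panic.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-ins; the real definitions live in sys/net/vnet.h. */
struct vnet { int id; };
struct socket { struct vnet *so_vnet; };

static struct vnet *curvnet;	/* per-thread in the kernel, global here */

/* Simplified models: save the previous context, set, later restore. */
#define CURVNET_SET(arg) \
	struct vnet *saved_vnet = curvnet; curvnet = (arg)
#define CURVNET_RESTORE() \
	curvnet = saved_vnet

/* Hypothetical stand-in for socket code that relies on curvnet. */
static int
model_sosend(struct socket *so)
{
	assert(curvnet != NULL);	/* no vnet context set by the caller */
	return (curvnet == so->so_vnet ? 0 : -1);
}

/* Hypothetical caller that wraps the send in the socket's vnet. */
static int
model_rpc_send(struct socket *so)
{
	int error;

	CURVNET_SET(so->so_vnet);
	error = model_sosend(so);
	CURVNET_RESTORE();
	return (error);
}
```

The so->so_vnet variant is used here because the caller already holds the socket; the other two variants differ only in where the vnet pointer comes from.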




Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]

2020-01-29 Thread Hans Petter Selasky

On 2020-01-29 22:44, Eric Joyner wrote:

On Wed, Jan 29, 2020 at 1:41 PM Hans Petter Selasky  wrote:


On 2020-01-29 22:30, Eric Joyner wrote:

Hi freebsd-net,

We've encountered an issue with unloading the iavf(4) driver on FreeBSD
12.1 (and stable). On a VM with two iavf(4) interfaces, if we send heavy
traffic to iavf1 and try to kldunload the driver, the kldunload process
hangs on iavf0 until iavf1 stops receiving traffic.

After some debugging, it looks like epoch_drain_callbacks() [via
if_detach_internal()] tries to switch CPUs to run on one that iavf1 is
using for RX processing, but since iavf1 is busy, it can't make the switch,
so cpu_switch() just hangs and nothing happens until iavf1's RX thread
stops being busy.

I can work around this by inserting a kern_yield(PRI_USER) somewhere in one
of the iavf txrx functions that iflib calls into (e.g.
iavf_isc_rxd_available), but that's not a proper fix. Does anyone know what
to do to prevent this from happening?

Wildly guessing, does maybe epoch_drain_callbacks() need a higher priority
than the PI_SOFT used in the group taskqueues used in iflib's RX processing?




Hi,

Which scheduler is this? ULE or BSD?

EPOCH(9) expects some level of round-robin scheduling on the same
priority level. Setting a higher priority on EPOCH(9) might cause epoch
to start spinning w/o letting the lower priority thread which holds the
EPOCH() section to finish.

--HPS



Hi Hans,

kern.sched.name gives me "ULE"



Hi Eric,

epoch_drain_callbacks() depends on epoch_call_task() getting execution 
time, and epoch_call_task() is executed from a GTASKQUEUE at PI_SOFT. 
Also, epoch_drain_callbacks() runs at the priority of the calling thread, 
and if this is lower than PI_SOFT while a gtaskqueue is spinning heavily, 
then that won't work.


On a single-CPU system you will be toast in this situation regardless, 
if there is no free CPU time for EPOCH().


In general, if epoch_call_task() doesn't get execution time, you will 
have a problem.


Maybe add a flag to iflib which stops the grouptasks before detaching 
the network interface?


--HPS


Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]

2020-01-29 Thread Hans Petter Selasky

On 2020-01-29 22:30, Eric Joyner wrote:

Hi freebsd-net,

We've encountered an issue with unloading the iavf(4) driver on FreeBSD
12.1 (and stable). On a VM with two iavf(4) interfaces, if we send heavy
traffic to iavf1 and try to kldunload the driver, the kldunload process
hangs on iavf0 until iavf1 stops receiving traffic.

After some debugging, it looks like epoch_drain_callbacks() [via
if_detach_internal()] tries to switch CPUs to run on one that iavf1 is
using for RX processing, but since iavf1 is busy, it can't make the switch,
so cpu_switch() just hangs and nothing happens until iavf1's RX thread
stops being busy.

I can work around this by inserting a kern_yield(PRI_USER) somewhere in one
of the iavf txrx functions that iflib calls into (e.g.
iavf_isc_rxd_available), but that's not a proper fix. Does anyone know what
to do to prevent this from happening?

Wildly guessing, does maybe epoch_drain_callbacks() need a higher priority
than the PI_SOFT used in the group taskqueues used in iflib's RX processing?



Hi,

Which scheduler is this? ULE or BSD?

EPOCH(9) expects some level of round-robin scheduling on the same 
priority level. Setting a higher priority on EPOCH(9) might cause epoch 
to start spinning w/o letting the lower priority thread which holds the 
EPOCH() section to finish.


--HPS



Re: Strange logic in r336438

2020-01-17 Thread Hans Petter Selasky

On 2020-01-17 00:31, Eric van Gyzen wrote:

I was just reviewing r336438:

https://svnweb.freebsd.org/base?view=revision&revision=336438

In bxe_interrupt_detach(), the nested loops over sc->num_queues don't 
look right.  We drain the taskqueues for queue 0, but then free the 
taskqueues for queues 1-N without draining them.  Should the second loop 
come _after_ the first loop, instead of _in_ it?




Hi,

taskqueue_free() will do some kind of last-minute draining, if you look 
at the implementation.


However, if you want to ensure all tasks are completed, calling 
taskqueue_drain() before taskqueue_free() is preferred.


--HPS
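
The drain-before-free ordering can be illustrated with a small userland model. Everything here is a hypothetical stand-in (struct model_tq and the model_* functions are not the taskqueue(9) API, and this is not the bxe code); it only shows each queue being drained and then freed, one queue at a time, rather than draining one queue and freeing all of them.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-in for a per-queue taskqueue. */
struct model_tq {
	int pending;	/* tasks queued but not yet completed */
	int freed;	/* set once the queue has been released */
};

/* Models a drain: returns only after the pending work has run. */
static void
model_drain(struct model_tq *tq)
{
	assert(!tq->freed);	/* never touch a queue after freeing it */
	tq->pending = 0;
}

/* Models a free: the queue must not be used afterwards. */
static void
model_free(struct model_tq *tq)
{
	tq->freed = 1;
}

/*
 * Tear down all queues, draining each one before freeing it.
 * Returns how many queues were freed with work still pending.
 */
static int
model_detach(struct model_tq *tqs, size_t nqueues)
{
	int undrained = 0;

	for (size_t i = 0; i < nqueues; i++) {
		model_drain(&tqs[i]);
		if (tqs[i].pending != 0)
			undrained++;
		model_free(&tqs[i]);
	}
	return (undrained);
}
```

A single loop that handles both steps per queue avoids the situation raised in the question, where queues other than the first could be freed without an explicit drain.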



Re: Intel ix staled under heavy load

2020-01-15 Thread Hans Petter Selasky

On 2020-01-14 19:26, Nick Rogers wrote:

On Tue, Jan 14, 2020 at 10:09 AM Hans Petter Selasky 
wrote:


On 2020-01-14 16:07, Slawa Olhovchenkov wrote:

this is a known issue in iflib.

Unresolved?


See mail I sent off-list.



I would be interested to know if this is resolved or not as well.



No, not yet.

Slawa, can you dump the iflib sysctls when the card is in the stalled 
state?


--HPS



Re: Intel ix staled under heavy load

2020-01-14 Thread Hans Petter Selasky

On 2020-01-14 16:07, Slawa Olhovchenkov wrote:

this is a known issue in iflib.

Unresolved?


See mail I sent off-list.

--HPS


Re: Intel ix staled under heavy load

2020-01-14 Thread Hans Petter Selasky

On 2020-01-14 15:54, Slawa Olhovchenkov wrote:

What is problem? How to resolve this?


Iff you do "ifconfig xxx down" and then "ifconfig xxx up" and the 
interface comes back, this is a known issue in iflib.


--HPS


Re: [PATCH] ipoib: Patch for crash in icmp_error, fault trap 12

2020-01-11 Thread Hans Petter Selasky

Thank you for your patch:

https://svnweb.freebsd.org/changeset/base/356633

--HPS


[IFLIB] When system runs out of mbufs incoming network traffic stops entirely

2019-12-03 Thread Hans Petter Selasky

Hi,

It appears iflib has a little defect. When the system temporarily runs 
out of mbufs, iflib-based network drivers stop receiving packets 
forever, even when the mbuf zone recovers.


In mlx5en(4), which doesn't use iflib, we have a special watchdog/callout 
to retry filling the RX DMA ring when we are out of mbufs. Can iflib do 
the same?



[zone: mbuf_cluster] kern.ipc.nmbclusters limit reached


Simply doing "ifconfig down" and "ifconfig up" recovers the adapter:


igb0: link state changed to DOWN
igb0: link state changed to UP


--HPS


Re: Problems with Multicast (IGMP) since upgrade from 11.3 to 12.1

2019-12-01 Thread Hans Petter Selasky

FYI:

Solution is here:

https://reviews.freebsd.org/D22595

Working on getting it into base.

--HPS


Re: ix0 and ix1 ifconfig options different on Supermicro board

2019-11-28 Thread Hans Petter Selasky

On 2019-11-27 18:50, BulkMailForRudy wrote:

   iperf3 -c 10.1.1.1 -P 4  --->  5.1Gbps


I think iperf3 handles multiple connections in a single thread, while 
iperf uses multiple threads.


--HPS


Re: Still em(4) broken with epoch changes

2019-10-23 Thread Hans Petter Selasky

On 2019-10-23 09:33, Konstantin Belousov wrote:

I tried to netboot my test box today with r353914 kernel, and still I get
the following panic on machine attempt to start multiuser:

Feeding entropy: .
lo0: link state changed to UP
uhub0: 3 ports with 3 removable, self powered
uhub1: 3 ports with 3 removable, self powered
panic: sleeping in an epoch section
cpuid = 7
time = 1571815747
KDB: stack backtrace:
db_trace_self_wrapper() at 0x802d070b = 
db_trace_self_wrapper+0x2b/frame 0xfe000ee9fba0
vpanic() at 0x803b79bd = vpanic+0x19d/frame 0xfe000ee9fbf0
panic() at 0x803b7753 = panic+0x43/frame 0xfe000ee9fc50
_sleep() at 0x803c38b5 = _sleep+0x4a5/frame 0xfe000ee9fcf0
pause_sbt() at 0x803c3cff = pause_sbt+0x10f/frame 0xfe000ee9fd30
e1000_write_phy_reg_mdic() at 0x80d1332e = 
e1000_write_phy_reg_mdic+0xee/frame 0xfe000ee9fd70
e1000_enable_phy_wakeup_reg_access_bm() at 0x80d16b6b = 
e1000_enable_phy_wakeup_reg_access_bm+0x2b/frame 0xfe000ee9fd90
e1000_update_mc_addr_list_pch2lan() at 0x80d3119a = 
e1000_update_mc_addr_list_pch2lan+0x3a/frame 0xfe000ee9fdd0
em_if_multi_set() at 0x80d0cb49 = em_if_multi_set+0x1c9/frame 
0xfe000ee9fe20
iflib_if_ioctl() at 0x80d87b70 = iflib_if_ioctl+0x100/frame 
0xfe000ee9fe90
if_addmulti() at 0x804b1e9f = if_addmulti+0x2af/frame 0xfe000ee9ff30
in6_joingroup_locked() at 0x8058094e = in6_joingroup_locked+0x2ee/frame 
0xfe000eea0030
in6_joingroup() at 0x80580634 = in6_joingroup+0x44/frame 
0xfe000eea0060
in6_update_ifa() at 0x80579c4c = in6_update_ifa+0x177c/frame 
0xfe000eea0210
nd6_ra_input() at 0x8059fd7f = nd6_ra_input+0x108f/frame 
0xfe000eea04b0
icmp6_input() at 0x8057359a = icmp6_input+0x69a/frame 0xfe000eea0640
ip6_input() at 0x8058b63e = ip6_input+0xb2e/frame 0xfe000eea0720
netisr_dispatch_src() at 0x804bb84a = netisr_dispatch_src+0x9a/frame 
0xfe000eea0790
ether_demux() at 0x804b6d0f = ether_demux+0x15f/frame 0xfe000eea07c0
ether_nh_input() at 0x804b814f = ether_nh_input+0x39f/frame 
0xfe000eea0810
netisr_dispatch_src() at 0x804bb84a = netisr_dispatch_src+0x9a/frame 
0xfe000eea0880
ether_input() at 0x804b71d8 = ether_input+0x58/frame 0xfe000eea08d0
_task_fn_rx() at 0x80d83610 = _task_fn_rx+0xb20/frame 0xfe000eea09e0
gtaskqueue_run_locked() at 0x80402819 = 
gtaskqueue_run_locked+0xf9/frame 0xfe000eea0a30
gtaskqueue_thread_loop() at 0x804025d8 = 
gtaskqueue_thread_loop+0x88/frame 0xfe000eea0a60
fork_exit() at 0x8037b31c = fork_exit+0xcc/frame 0xfe000eea0ab0
fork_trampoline() at 0x8063114e = fork_trampoline+0xe/frame 
0xfe000eea0ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100024 ]
Stopped at  0x80403cfb = kdb_enter+0x3b:movq
$0,0x808e3968 = kdb_why

Can we please get this sorted out ?


PR is here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241223

--HPS



Re: panic: sleeping in an epoch section

2019-10-09 Thread Hans Petter Selasky

On 2019-10-09 15:56, Mark Johnston wrote:

On Wed, Oct 09, 2019 at 10:40:04AM +0200, Hans Petter Selasky wrote:

On 2019-10-09 06:36, Yuri Pankov wrote:

Tried updating from r353072 to r353334 and getting the following panic
reproducibly on boot (starting dhclient?):

panic: sleeping in an epoch section
cpuid = 5
time = 1570591558
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe00af780140
vpanic() at vpanic+0x19d/frame 0xfe00af780190
panic() at panic+0x43/frame 0xfe00af7801f0
_sleep() at _sleep+0x463/frame 0xfe00af780290
pause_sbt() at pause_sbt+0x10f/frame 0xfe00af7802d0
e1000_write_phy_reg_mdic() at e1000_write_phy_reg_mdic+0xee/frame
0xfe00af780310
e1000_enable_phy_wakeup_reg_access_bm() at
e1000_enable_phy_wakeup_reg_access_bm+0x2b/frame 0xfe00af780330
e1000_update_mc_addr_list_pch2lan() at
e1000_update_mc_addr_list_pch2lan+0x3a/frame 0xfe00af780370
em_if_multi_set() at em_if_multi_set+0x1d4/frame 0xfe00af7803c0
iflib_if_ioctl() at iflib_if_ioctl+0x100/frame 0xfe00af780430
if_addmulti() at if_addmulti+0x2af/frame 0xfe00af7804d0
in_joingroup_locked() at in_joingroup_locked+0x235/frame 0xfe00af780570
in_joingroup() at in_joingroup+0x5c/frame 0xfe00af7805d0
in_control() at in_control+0xadf/frame 0xfe00af780680
ifioctl() at ifioctl+0x40f/frame 0xfe00af780750
kern_ioctl() at kern_ioctl+0x295/frame 0xfe00af7807b0
sys_ioctl() at sys_ioctl+0x15d/frame 0xfe00af780880
amd64_syscall() at amd64_syscall+0x2b9/frame 0xfe00af7809b0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00af7809b0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80048051a, rsp =
0x7fffe3e8, rbp = 0x7fffe430 ---


The SIOCADDMULTI if_ioctl() is not allowed to sleep, because it can be
called from the fast-path, so this is a bug in e1000 driver. Does the
attached patch workaround the issue?


What fast path are you referring to?  The locking protocol used by the
multicast code was changed specifically to allow for sleeps in driver
ioctl handlers.


I recall seeing, a long time ago, that input packet processing may end up 
calling if_ioctl's. Things may have changed since then though.


--HPS



Re: panic: sleeping in an epoch section

2019-10-09 Thread Hans Petter Selasky

On 2019-10-09 06:36, Yuri Pankov wrote:
Tried updating from r353072 to r353334 and getting the following panic 
reproducibly on boot (starting dhclient?):


panic: sleeping in an epoch section
cpuid = 5
time = 1570591558
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xfe00af780140

vpanic() at vpanic+0x19d/frame 0xfe00af780190
panic() at panic+0x43/frame 0xfe00af7801f0
_sleep() at _sleep+0x463/frame 0xfe00af780290
pause_sbt() at pause_sbt+0x10f/frame 0xfe00af7802d0
e1000_write_phy_reg_mdic() at e1000_write_phy_reg_mdic+0xee/frame 
0xfe00af780310
e1000_enable_phy_wakeup_reg_access_bm() at 
e1000_enable_phy_wakeup_reg_access_bm+0x2b/frame 0xfe00af780330
e1000_update_mc_addr_list_pch2lan() at 
e1000_update_mc_addr_list_pch2lan+0x3a/frame 0xfe00af780370

em_if_multi_set() at em_if_multi_set+0x1d4/frame 0xfe00af7803c0
iflib_if_ioctl() at iflib_if_ioctl+0x100/frame 0xfe00af780430
if_addmulti() at if_addmulti+0x2af/frame 0xfe00af7804d0
in_joingroup_locked() at in_joingroup_locked+0x235/frame 0xfe00af780570
in_joingroup() at in_joingroup+0x5c/frame 0xfe00af7805d0
in_control() at in_control+0xadf/frame 0xfe00af780680
ifioctl() at ifioctl+0x40f/frame 0xfe00af780750
kern_ioctl() at kern_ioctl+0x295/frame 0xfe00af7807b0
sys_ioctl() at sys_ioctl+0x15d/frame 0xfe00af780880
amd64_syscall() at amd64_syscall+0x2b9/frame 0xfe00af7809b0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00af7809b0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80048051a, rsp = 
0x7fffe3e8, rbp = 0x7fffe430 ---


The SIOCADDMULTI if_ioctl() is not allowed to sleep, because it can be 
called from the fast-path, so this is a bug in the e1000 driver. Does the 
attached patch work around the issue?


--HPS
Index: sys/dev/e1000/e1000_osdep.h
===
--- sys/dev/e1000/e1000_osdep.h	(revision 353336)
+++ sys/dev/e1000/e1000_osdep.h	(working copy)
@@ -82,7 +82,7 @@
 
 static inline void
 safe_pause_us(int x) {
-	if (cold) {
+	if (cold || in_epoch(net_epoch_preempt)) {
 		DELAY(x);
 	} else {
 		pause("e1000_delay", max(1,  x/(1000000/hz)));
@@ -91,7 +91,7 @@
 
 static inline void
 safe_pause_ms(int x) {
-	if (cold) {
+	if (cold || in_epoch(net_epoch_preempt)) {
 		DELAY(x*1000);
 	} else {
 		pause("e1000_delay", ms_scale(x));
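
To illustrate the decision the patch makes, here is a minimal user-space
sketch in C. It assumes two plain booleans standing in for the kernel's
cold flag and the in_epoch(net_epoch_preempt) check; delay_kind and
choose_delay() are hypothetical names for this sketch and are not part of
the e1000 driver.

```c
#include <assert.h>
#include <stdbool.h>

/* How the delay would be carried out; hypothetical names for this sketch. */
enum delay_kind {
	DELAY_BUSY_WAIT,	/* spin, as DELAY() does */
	DELAY_SLEEP		/* yield the CPU, as pause() does */
};

/*
 * Pick a delay strategy for a request of `usec` microseconds.
 * `cold` and `in_epoch_section` are plain booleans standing in for the
 * kernel's `cold` flag and in_epoch(net_epoch_preempt).
 */
static enum delay_kind
choose_delay(int usec, bool cold, bool in_epoch_section)
{
	assert(usec > 0);	/* assumption of this sketch, not of the driver */

	/* Sleeping is only an option once booted and outside an epoch section. */
	if (cold || in_epoch_section)
		return (DELAY_BUSY_WAIT);
	return (DELAY_SLEEP);
}
```

With this model, the panic in the backtrace corresponds to the case where
the old code chose to sleep although the caller was inside an epoch
section.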


Re: Infiniband: Mellanox MT26418 in ethernet mode causes crash on shutdown

2019-02-25 Thread Hans Petter Selasky

Hi,

Your patch looks good and Mellanox will test it a bit internally before 
pushing upstream.


I think the if_down() call is not strictly needed. ether_ifdetach() 
already does this. Can you test the patch w/o the if_down() call?


--HPS


Re: Infiniband: Mellanox MT26418 in ethernet mode causes crash on shutdown

2019-02-24 Thread Hans Petter Selasky

On 2/24/19 1:23 AM, Andreas Kempe wrote:

Hello,

When running a Mellanox MT26418 in ethernet mode, the kernel crashes
with the following stack trace on system shutdown:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80e3f5f4
stack pointer   = 0x28:0xfe064abec6e0
frame pointer   = 0x28:0xfe064abec700
code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1 (init)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0x80b4c5b7 at kdb_backtrace+0x67
#1 0x80b05b57 at vpanic+0x177
#2 0x80b059d3 at panic+0x43
#3 0x8106efdf at trap_fatal+0x35f
#4 0x8106f039 at trap_pfault+0x49
#5 0x8106e807 at trap+0x2c7
#6 0x8104f03c at calltrap+0x8
#7 0x80e3fae2 at mlx4_en_stop_port+0x3d2
#8 0x80e40ff6 at mlx4_en_destroy_netdev+0x1e6
#9 0x80e3e47d at mlx4_en_remove+0xcd
#10 0x80e1ab01 at mlx4_remove_device+0xb1
#11 0x80e1b0b8 at mlx4_unregister_device+0x98
#12 0x80e1c5c5 at mlx4_unload_one+0x85
#13 0x80e23543 at mlx4_shutdown+0x83
#14 0x80d6b6e9 at linux_pci_shutdown+0x39
#15 0x80b4004a at bus_generic_shutdown+0x5a
#16 0x80b4004a at bus_generic_shutdown+0x5a
#17 0x80b4004a at bus_generic_shutdown+0x5a


I've traced the issue to the following lines of code in
sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c in mlx4_en_destroy_netdev():

 /* Unregister device - this will close the port if it was up */
 if (priv->registered) {
         mutex_lock(&mdev->state_lock);
         ether_ifdetach(dev);
         mutex_unlock(&mdev->state_lock);
 }

 mutex_lock(&mdev->state_lock);
 mlx4_en_stop_port(dev);
 mutex_unlock(&mdev->state_lock);



The issue is that mlx4_en_stop_port() follows the call chain below and
tries to fetch the MAC address of the device in mlx4_en_put_qp():
mlx4_en_destroy_netdev->mlx4_en_stop_port->mlx4_en_put_qp

The sequence above causes the kernel to choke because the MAC address
was freed in the previous call to ether_ifdetach in if_detach_internal
with the following call chain:
mlx4_en_destroy_netdev->ether_ifdetach->if_detach->if_detach_internal

I've written a small workaround that works on our test machine, although
I suspect this could potentially cause issues as we're destroying the
port before we destroy the interface. Please see the attached patch for
the workaround.
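
The ordering problem described above can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions: ifnet_model,
detach_model() and stop_port_model() are hypothetical stand-ins for the
interface, ether_ifdetach() and mlx4_en_stop_port(), and the real driver
does far more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the interface state relevant to the crash. */
struct ifnet_model {
	unsigned char *lladdr;	/* MAC address storage, freed on detach */
	bool port_up;
};

/* Builds a fresh model interface with a zeroed 6-byte address. */
static struct ifnet_model
ifnet_model_new(void)
{
	struct ifnet_model ifp = { calloc(6, 1), true };

	assert(ifp.lladdr != NULL);
	return (ifp);
}

/* Models ether_ifdetach(): the link-level address goes away. */
static void
detach_model(struct ifnet_model *ifp)
{
	free(ifp->lladdr);
	ifp->lladdr = NULL;
}

/*
 * Models mlx4_en_stop_port(): it needs the MAC address while tearing the
 * port down. Returns false where the real code would read freed memory.
 */
static bool
stop_port_model(struct ifnet_model *ifp)
{
	if (ifp->lladdr == NULL)
		return (false);
	ifp->port_up = false;
	return (true);
}
```

Calling stop_port_model() before detach_model() succeeds, while the
reverse order fails, which is the order the workaround changes.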

Cordially,
Andreas Kempe
Lysator ACS


CC'ing FreeBSD-drivers at Mellanox.

Thank you for your patch. We'll have a look at it.

--HPS



Re: IPv6 Broken in 12

2019-01-04 Thread Hans Petter Selasky

On 1/4/19 3:29 PM, Shamim Shahriar wrote:

Dear List members, good afternoon and happy new year

I am trying to setup a FreeBSD server v12 amd64, and it appears that IPv6
on that is actually broken. I have confirmed that by having same hardware
running v11.2 (amd64), and that is working without any issue.

Preamble:
The "infrastructure" in question is running mostly Juniper devices, and the
routers are advertising everything. So the only thing (related to IPv6) I
have in my rc.conf are

ifconfig_em0_ipv6="inet6 accept_rtadv"
rtsold_enable="YES"

This gives the machines an IPv6 from the intended subnet, and also
configures the defaultroute for the devices

em0: flags=8843 metric 0 mtu 1500

options=81209b
 ether 52:54:00:1a:a4:1a
 inet6 fe80::e:f:11:12:a41a%em0 prefixlen 64 scopeid 0x1
 inet6 a:b:c:d:e:f:11:12:a41a prefixlen 64 autoconf
 inet 172.16.1.23 netmask 0xffffff00 broadcast 172.16.1.255
 media: Ethernet autoselect (1000baseT )
 status: active
 nd6 options=23


# netstat -nr
Routing tables

Internet:
Destination                 Gateway                   Flags  Netif Expire
default                     172.16.1.1                UGS    em0
127.0.0.1                   link#2                    UH     lo0
172.16.1.0/24               link#1                    U      em0
172.16.1.23                 link#1                    UHS    lo0

Internet6:
Destination                 Gateway                   Flags  Netif Expire
::/96                       ::1                       UGRS   lo0
default                     fe80::e:f:11:12:200%em0   UG     em0
::1                         link#2                    UH     lo0
::ffff:0.0.0.0/96           ::1                       UGRS   lo0
a:b:c:d::/64                link#1                    U      em0
a:b:c:d:e:f:11:12:a41a      link#1                    UHS    lo0
fe80::/10                   ::1                       UGRS   lo0
fe80::%em0/64               link#1                    U      em0
fe80::e:f:11:12:a41a%em0    link#1                    UHS    lo0
fe80::%lo0/64               link#2                    U      lo0
fe80::1%lo0                 link#2                    UHS    lo0
ff02::/16


Problem:
In FreeBSD v12, if I do a tcpdump, it appears that the router is constantly
asking who has my IP, and the machine is not responding to it at all.

(running on two different console on the same machine)
ping6 mx1
PING6(56=40+8+8 bytes) a:b:c:d:5054:ff:fe1a:a41a --> :aa:bb:cc::72


# tcpdump -ni em0 icmp6
14:17:18.980755 IP6 a:b:c:d:e:f:11:12:a41a > :aa:bb:cc::72: ICMP6, echo
request, seq 0, length 16
14:17:19.617708 IP6 fe80::200:5eff:fe00:200 > ff02::1:ff1a:a41a: ICMP6,
neighbor solicitation, who has a:b:c:d:e:f:11:12:a41a, length 32
14:17:20.003172 IP6 a:b:c:d:e:f:11:12:a41a > :aa:bb:cc::72: ICMP6, echo
request, seq 1, length 16
14:17:20.617615 IP6 fe80::200:5eff:fe00:200 > ff02::1:ff1a:a41a: ICMP6,
neighbor solicitation, who has a:b:c:d:e:f:11:12:a41a, length 32
14:17:21.023423 IP6 a:b:c:d:e:f:11:12:a41a > :aa:bb:cc::72: ICMP6, echo
request, seq 2, length 16


Whereas, if I am running FreeBSD v11.2, it is working alright. I am getting
the ping response and what not.

NOTE: Both the v12 and v11.2 were downloaded as of today (in a matter of
minutes -- not even hours) for setting up the machines.

Could someone please confirm if what I am seeing is expected? If yes, how
soon is this likely to be fixed?

If you require further information or need me to run more tests, please do
let me know. I will have the machines running for some time (reasonable
time), before I decide which of the two will prevail :D

Thanks and regards


Hi,

Can you try the second debug patch mentioned here:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233535

Is this issue isolated to Link-Local IPv6 or is global IPv6 involved as well?

--HPS


Re: Deadlock in VLAN 12-current w/patch

2018-10-12 Thread Hans Petter Selasky

On 10/8/18 5:36 PM, Hans Petter Selasky wrote:

Hi Matthew,

There is a deadlock when destroying VLANs after the epoch changes were 
made. Can you have a look and consider the attached patch for 12-current?


Thank you!



Hi,

Differential review is here:
https://reviews.freebsd.org/D17496

--HPS



Deadlock in VLAN 12-current w/patch

2018-10-08 Thread Hans Petter Selasky

Hi Matthew,

There is a deadlock when destroying VLANs after the epoch changes were 
made. Can you have a look and consider the attached patch for 12-current?


Thank you!

--HPS

Thread 1:

epoch_block_handler_preempt() at epoch_block_handler_preempt+0x90/frame 0xfe00261b3970  
ck_epoch_synchronize_wait() at ck_epoch_synchronize_wait+0x9d/frame 0xfe00261b39c0  
epoch_wait_preempt() at epoch_wait_preempt+0x170/frame 0xfe00261b3a20   
vlan_setmulti() at vlan_setmulti+0xb4/frame 0xfe00261b3a70  
vlan_ioctl() at vlan_ioctl+0x83/frame 0xfe00261b3ad0
in6m_release_task() at in6m_release_task+0x32e/frame 0xfe00261b3b30 
gtaskqueue_run_locked() at gtaskqueue_run_locked+0xf9/frame 0xfe00261b3b80  
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfe00261b3bb0
fork_exit() at fork_exit+0x84/frame 0xfe00261b3bf0  
fork_trampoline() at fork_trampoline+0xe/frame 0xfe00261b3bf0   
--- trap 0, rip = 0, rsp = 0, rbp = 0 --- 


Thread 2:

sleepq_switch() at sleepq_switch+0x10d/frame 0xfe002bffd3f0 
sleepq_wait() at sleepq_wait+0x43/frame 0xfe002bffd420  
_sx_xlock_hard() at _sx_xlock_hard+0x4a6/frame 0xfe002bffd4c0   
_sx_xlock() at _sx_xlock+0xc1/frame 0xfe002bffd500  
in6_leavegroup() at in6_leavegroup+0x27/frame 0xfe002bffd520
in6_purgeaddr() at in6_purgeaddr+0xc2/frame 0xfe002bffd6a0  
if_purgeaddrs() at if_purgeaddrs+0x11e/frame 0xfe002bffd750 
if_detach_internal() at if_detach_internal+0x709/frame 0xfe002bffd7d0   
if_detach() at if_detach+0x3d/frame 0xfe002bffd7f0  
vlan_clone_destroy() at vlan_clone_destroy+0x21/frame 0xfe002bffd820
if_clone_destroyif() at if_clone_destroyif+0x175/frame 0xfe002bffd870   
if_clone_destroy() at if_clone_destroy+0x205/frame 0xfe002bffd8c0   
ifioctl() at ifioctl+0x582/frame 0xfe002bffd990 
kern_ioctl() at kern_ioctl+0x2ba/frame 0xfe002bffd9f0   
sys_ioctl() at sys_ioctl+0x15e/frame 0xfe002bffdac0 
amd64_syscall() at amd64_syscall+0x28c/frame 0xfe002bffdbf0 
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe002bffdbf0 
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80047b92a, rsp = 

Re: ib_unregister_device - OFED related question

2018-07-07 Thread Hans Petter Selasky

On 07/07/18 02:49, Somayajulu, David wrote:

We see that module_unload() grabs the Giant lock prior to invoking UNLOAD. 
Isn't this a problem with cma_process_remove() or am I missing something?


Hi,

The LinuxKPI should DROP_GIANT and PICKUP_GIANT when sleeping. I haven't 
checked FreeBSD 10 recently, but this is the case for FreeBSD 11 and 
FreeBSD 12.


--HPS


Re: kldload ibcore.ko fails in snapshot: FreeBSD-12.0-CURRENT-amd64-20180329-r331740-disc1

2018-04-27 Thread Hans Petter Selasky

On 04/26/18 22:45, Somayajulu, David wrote:

Thanks Hans and Julian.
I did the following and still see the problem

#cd /usr/src
#make buildworld WITH_OFED=yes
#make installworld WITH_OFED=yes
#reboot
#cd /usr/src
#make buildkernel WITH_OFED=yes KERNCONF=MYKERNEL  ; 
MYKERNEL content is shown below in case it is a cause
#make installkernel WITH_OFED=yes  KERNCONF=MYKERNEL
#reboot
#cd /usr/src/sys/modules/linuxkpi
#make clean && make WITH_OFED=yes


Hi,

WITH_OFED=YES is only valid for buildworld.

Please add DEBUG_FLAGS="-DVIMAGE=1" whenever you are building modules 
outside buildworld.
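
As background on why the flag matters: my reading of sys/net/vnet.h is
that a virtualized global such as if_index is given a prefixed symbol name
when VIMAGE is defined and its plain name otherwise, so a module built
without VIMAGE looks for a symbol that a VIMAGE kernel does not export.
Treat that as an assumption rather than a statement about the exact
headers. The sketch below only demonstrates the naming effect in
user-space C; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for how a virtualized global is named. */
#define NAME_WITH_VIMAGE(n)	vnet_entry_##n
#define NAME_WITHOUT_VIMAGE(n)	n

/* Two-level stringification so the macro argument is expanded first. */
#define STR_(x)	#x
#define STR(x)	STR_(x)

/* Symbol a module would reference for if_index in each build mode. */
static const char *
if_index_symbol(int with_vimage)
{
	assert(with_vimage == 0 || with_vimage == 1);
	return (with_vimage ? STR(NAME_WITH_VIMAGE(if_index))
	    : STR(NAME_WITHOUT_VIMAGE(if_index)));
}

/* Returns nonzero when both build modes agree on the symbol name. */
static int
symbols_match(void)
{
	return (strcmp(if_index_symbol(0), if_index_symbol(1)) == 0);
}
```

In this model the two names differ, which matches the "symbol if_index
undefined" message seen when the module and kernel disagree on VIMAGE.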


--HPS


Re: kldload ibcore.ko fails in snapshot: FreeBSD-12.0-CURRENT-amd64-20180329-r331740-disc1

2018-04-25 Thread Hans Petter Selasky

On 04/25/18 16:12, Julian Elischer wrote:

On 24/4/18 3:15 pm, Hans Petter Selasky wrote:

On 04/24/18 01:33, Somayajulu, David wrote:

Hi All,
kldload ibcore.ko
fails in the above snapshot with the following error.

# kldload -v /usr/obj/usr/src/amd64.amd64/sys/modules/ibcore/ibcore.ko
kldload: an error occurred while loading module 
/usr/obj/usr/src/amd64.amd64/sys/modules/ibcore/ibcore.ko. Please 
check dmesg(8) for more details.


/var/log/messages indicates the following.

Apr 23 16:28:07 bsd25_12 kernel: link_elf_obj: symbol if_index undefined
Apr 23 16:28:07 bsd25_12 kernel: linker_load_file: 
/usr/obj/usr/src/amd64.amd64/sys/modules/ibcore/ibcore.ko - 
unsupported file type


What am I missing?



Hi,

It looks like ibcore.ko was built w/o DEBUG_FLAGS="-DVIMAGE=1"


that shouldn't be in debug flags..   Not sure WHERE it should be, but I 
think that isn't it.

(may work though)



If you build outside the "buildkernel WITH_OFED=YES" target these flags 
must be specified manually in my experience.


--HPS



Re: kldload ibcore.ko fails in snapshot: FreeBSD-12.0-CURRENT-amd64-20180329-r331740-disc1

2018-04-24 Thread Hans Petter Selasky

On 04/24/18 01:33, Somayajulu, David wrote:

Hi All,
kldload ibcore.ko
fails in the above snapshot with the following error.

# kldload -v /usr/obj/usr/src/amd64.amd64/sys/modules/ibcore/ibcore.ko
kldload: an error occurred while loading module 
/usr/obj/usr/src/amd64.amd64/sys/modules/ibcore/ibcore.ko. Please check 
dmesg(8) for more details.

/var/log/messages indicates the following.

Apr 23 16:28:07 bsd25_12 kernel: link_elf_obj: symbol if_index undefined
Apr 23 16:28:07 bsd25_12 kernel: linker_load_file: 
/usr/obj/usr/src/amd64.amd64/sys/modules/ibcore/ibcore.ko - unsupported file 
type

What am I missing?



Hi,

It looks like ibcore.ko was built w/o DEBUG_FLAGS="-DVIMAGE=1"

--HPS



Re: mlx4 weird error "Failed to map EQ context memory" after update

2018-02-17 Thread Hans Petter Selasky

On 02/17/18 14:51, Greg V wrote:

On 01/20/2018 12:18, Hans Petter Selasky wrote:

On 01/20/18 00:17, Greg V via freebsd-net wrote:


On 01/19/2018 12:54, Hans Petter Selasky wrote:

On 01/18/18 14:11, Greg V wrote:
Hi. I've upgraded CURRENT from December 19 
(https://github.com/freebsd/freebsd/commit/fd53ccf393f4f8ac1948e97eca108) 
to today 
(https://github.com/freebsd/freebsd/commit/391a83c86bb91ae3840cf37b7de478f42cc97e2a) 
and my Mellanox ConnectX-2 network card stopped working:


mlx4_core0:  mem 
0xfe10-0xfe1f,0xf080-0xf0ff irq 32 at device 0.0 on 
pci7

mlx4_core: Mellanox ConnectX core driver v3.4.1 (October 2017)
mlx4_core: Initializing mlx4_core
mlx4_core0: command 0xffa failed: fw status = 0x1
mlx4_core0: Failed to map EQ context memory, aborting
device_attach: mlx4_core0 attach returned 12


Loading the OLD mlx4.ko and mlx4en.ko on the NEW kernel actually 
does work fine!


Reverting all mlx4 changes between then and now (no big changes, 
mostly just the 1 << 31 thing from D13858) and rebuilding the mlx4 
module with CC=clang50 does not help.


What happened?!

Upgraded CURRENT again today, the problem went away :)


OK, nice to know.

--HPS



Re: mlx4 weird error "Failed to map EQ context memory" after update

2018-01-20 Thread Hans Petter Selasky

On 01/20/18 00:17, Greg V via freebsd-net wrote:


On 01/19/2018 12:54, Hans Petter Selasky wrote:

On 01/18/18 14:11, Greg V wrote:
Hi. I've upgraded CURRENT from December 19 
(https://github.com/freebsd/freebsd/commit/fd53ccf393f4f8ac1948e97eca108) 
to today 
(https://github.com/freebsd/freebsd/commit/391a83c86bb91ae3840cf37b7de478f42cc97e2a) 
and my Mellanox ConnectX-2 network card stopped working:


mlx4_core0:  mem 
0xfe10-0xfe1f,0xf080-0xf0ff irq 32 at device 0.0 on pci7

mlx4_core: Mellanox ConnectX core driver v3.4.1 (October 2017)
mlx4_core: Initializing mlx4_core
mlx4_core0: command 0xffa failed: fw status = 0x1
mlx4_core0: Failed to map EQ context memory, aborting
device_attach: mlx4_core0 attach returned 12


Loading the OLD mlx4.ko and mlx4en.ko on the NEW kernel actually does 
work fine!


Reverting all mlx4 changes between then and now (no big changes, 
mostly just the 1 << 31 thing from D13858) and rebuilding the mlx4 
module with CC=clang50 does not help.


What happened?!


Hi,

Can you do:

objdump -Dx /boot/kernel/mlx4.ko > mlx4.ko.txt
objdump -Dx /boot/kernel/mlx4en.ko > mlx4en.ko.txt

And diff the text result between working and non-working ko's.
That results in 180883 lines (9.2 megabytes) of diff for mlx4.ko. The 
CC=clang50 one is only a bit better at 7.6 MB :(


Can you open this diff using "meld" and look for instructions which 
have changed, not only their location?


--HPS



Re: mlx4 weird error "Failed to map EQ context memory" after update

2018-01-19 Thread Hans Petter Selasky

On 01/18/18 14:11, Greg V wrote:
Hi. I've upgraded CURRENT from December 19 
(https://github.com/freebsd/freebsd/commit/fd53ccf393f4f8ac1948e97eca108) to 
today 
(https://github.com/freebsd/freebsd/commit/391a83c86bb91ae3840cf37b7de478f42cc97e2a) 
and my Mellanox ConnectX-2 network card stopped working:


mlx4_core0:  mem 0xfe10-0xfe1f,0xf080-0xf0ff 
irq 32 at device 0.0 on pci7

mlx4_core: Mellanox ConnectX core driver v3.4.1 (October 2017)
mlx4_core: Initializing mlx4_core
mlx4_core0: command 0xffa failed: fw status = 0x1
mlx4_core0: Failed to map EQ context memory, aborting
device_attach: mlx4_core0 attach returned 12


Loading the OLD mlx4.ko and mlx4en.ko on the NEW kernel actually does 
work fine!


Reverting all mlx4 changes between then and now (no big changes, mostly 
just the 1 << 31 thing from D13858) and rebuilding the mlx4 module with 
CC=clang50 does not help.


What happened?!


Hi,

Can you do:

objdump -Dx /boot/kernel/mlx4.ko > mlx4.ko.txt
objdump -Dx /boot/kernel/mlx4en.ko > mlx4en.ko.txt

And diff the text result between working and non-working ko's.

Can you also make sure that /boot/modules does not contain anything *mlx4* ?

--HPS


Re: ConnectX ethernet card: how do I get the driver for it.

2017-09-27 Thread Hans Petter Selasky

On 09/27/17 17:13, Ben RUBSON wrote:

Hi Eugene,

cd /usr/src/sys/modules/mlx4
make
make install
make clean
cd /usr/src/sys/modules/mlxen
make
make install
make clean

Add this to /boot/loader.conf :
mlx4_load="YES"
mlxen_load="YES"



In 12-current it is installed by default as mlx4en. Prior to 12-current 
the module is named mlxen.


--HPS



Re: mbuf_jumbo_9k & iSCSI failing

2017-09-22 Thread Hans Petter Selasky

On 09/22/17 22:33, Ben RUBSON wrote:

On 22 Sep 2017, at 20:48, Ryan Stone  wrote:

Hans and I have proposed different approaches to the problem.  I was
taken off this issue at $WORK for a while, but coincidentally I just
picked it up again in the last week or so.  I'm working on evaluating
the performance characteristics of the two approaches and once I'm
satisfied with that I'll work with Hans to get a solution into the
tree.


Many thanks for the update Ryan !
Good timing again on the email :)



Hi Ryan,

If I didn't send you patches for testing for JUMBO 9K and acceleration 
of small packet RX - ping me off-list.


--HPS


[Differential] D1777: Associated fix for arp/nd6 timer usage.

2017-08-10 Thread hselasky (Hans Petter Selasky)
hselasky added a subscriber: glebius.
hselasky added a comment.


  @oleg : Beware of the callout return value differences between FreeBSD 
9-10-11 and 12 !
  @glebius

REVISION DETAIL
  https://reviews.freebsd.org/D1777

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: rrs, kib, jhb, imp, lstewart, gnn, sbruno, bz, adrian, rwatson
Cc: glebius, oleg, ae, bz, freebsd-net-list, emaste, hiren, julian, hselasky


Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 13:56, Slawa Olhovchenkov wrote:

On Tue, Aug 08, 2017 at 01:49:08PM +0200, Hans Petter Selasky wrote:


On 08/08/17 13:33, Slawa Olhovchenkov wrote:

TW_RUNLOCK(V_tw_lock);
and
if (INP_INFO_TRY_WLOCK(&V_tcbinfo)) {

`inp` can be invalidated, freed and this pointer may be invalid?


If you look one line up there is a pcbref ??


Yes.
Can a different thread take this inp and free it?
Maybe the timer thread?


No, it cannot be freed while there is a ref.

Some lines down the ref is dropped once the inp pointer is no longer needed.
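
The reference-counting argument can be shown with a small user-space model
in C. It is a single-threaded sketch under stated assumptions: pcb_model,
pcb_ref() and pcb_rele() are hypothetical names that only loosely mirror
in_pcbref() and in_pcbrele_wlocked(), and no locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded model of a reference-counted object. */
struct pcb_model {
	int refcount;
	bool freed;
};

/* Loosely models in_pcbref(): take an extra reference on a live object. */
static void
pcb_ref(struct pcb_model *p)
{
	assert(!p->freed && p->refcount > 0);
	p->refcount++;
}

/*
 * Loosely models in_pcbrele_wlocked(): drop one reference and report
 * whether it was the last one, in which case the object is gone.
 */
static bool
pcb_rele(struct pcb_model *p)
{
	assert(!p->freed && p->refcount > 0);
	if (--p->refcount > 0)
		return (false);
	p->freed = true;
	return (true);
}
```

As long as the scanning code holds its own reference, a release from
another holder cannot be the last one, so the object stays valid until the
scanner drops its reference.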

--HPS


Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 13:33, Slawa Olhovchenkov wrote:

TW_RUNLOCK(V_tw_lock);
and
if (INP_INFO_TRY_WLOCK(&V_tcbinfo)) {

`inp` can be invalidated, freed and this pointer may be invalid?


If you look one line up there is a pcbref ??

--HPS
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 10:06, Ben RUBSON wrote:

On 08 Aug 2017, at 10:02, Hans Petter Selasky <h...@selasky.org> wrote:

On 08/08/17 10:00, Ben RUBSON wrote:

(kgdb) print *twq_2msl.tqh_first
$2 = {
   tw_inpcb = 0xf8031c570740,


print *twq_2msl.tqh_first->tw_inpcb


(kgdb) print *twq_2msl.tqh_first->tw_inpcb
$3 = {
   inp_hash = {
 le_next = 0x0,
 le_prev = 0xfe000f78adb8
   },
   inp_pcbgrouphash = {
 le_next = 0x0,
 le_prev = 0x0
   },
   inp_list = {
 le_next = 0xf80c2a07f570,
 le_prev = 0x81e15e20
   },
   inp_ppcb = 0xf80d1bf12210,
   inp_pcbinfo = 0x81e15e28,
   inp_pcbgroup = 0x0,
   inp_pcbgroup_wild = {
 le_next = 0x0,
 le_prev = 0x0
   },
   inp_socket = 0x0,
   inp_cred = 0xf804ae6ca400,
   inp_flow = 0,
   inp_flags = 92274688,
   inp_flags2 = 16,
   inp_vflag = 0 '\0',
   inp_ip_ttl = 64 '@',
   inp_ip_p = 0 '\0',
   inp_ip_minttl = 0 '\0',
   inp_flowid = 946611505,
   inp_refcount = 2,
   inp_pspare = 0xf8031c5707c0,
   inp_flowtype = 191,
   inp_rss_listen_bucket = 0,
   inp_ispare = 0xf8031c5707f0,
   inp_inc = {
 inc_flags = 0 '\0',
 inc_len = 0 '\0',
 inc_fibnum = 0,
 inc_ie = {
   ie_fport = 53987,
   ie_lport = 47873,
   ie_dependfaddr = {
 ie46_foreign = {
   ia46_pad32 = 0xf8031c570808,
   ia46_addr4 = {
 s_addr = 3011802202
   }
 },
 ie6_foreign = {
   __u6_addr = {
 __u6_addr8 = 0xf8031c570808 "",
 __u6_addr16 = 0xf8031c570808,
 __u6_addr32 = 0xf8031c570808
   }
 }
   },
   ie_dependladdr = {
 ie46_local = {
   ia46_pad32 = 0xf8031c570818,
   ia46_addr4 = {
 s_addr = 4068705883
   }
 },
 ie6_local = {
   __u6_addr = {
 __u6_addr8 = 0xf8031c570818 "",
 __u6_addr16 = 0xf8031c570818,
 __u6_addr32 = 0xf8031c570818
   }
 }
   },
   ie6_zoneid = 0
 }
   },
   inp_label = 0x0,
   inp_sp = 0x0,
   inp_depend4 = {
 inp4_ip_tos = 0 '\0',
 inp4_options = 0x0,
 inp4_moptions = 0x0
   },
   inp_depend6 = {
 inp6_options = 0x0,
 inp6_outputopts = 0x0,
 inp6_moptions = 0x0,
 inp6_icmp6filt = 0x0,
 inp6_cksum = 0,
 inp6_hops = 0
   },
   inp_portlist = {
 le_next = 0xf80274298ae0,
 le_prev = 0xf800454999b0
   },
   inp_phd = 0xf800454999a0,
   inp_gencnt = 2119756,
   inp_lle = 0x0,
   inp_lock = {
 lock_object = {
   lo_name = 0x814e6940 "tcpinp",
   lo_flags = 90898432,
   lo_data = 0,
   lo_witness = 0x0
 },
 rw_lock = 18446735277871559936
   },
   inp_rt_cookie = 10,
   inp_rtu = {
 inpu_route = {
   ro_rt = 0x0,
   ro_lle = 0x0,
   ro_prepend = 0x0,
   ro_plen = 0,
   ro_flags = 384,
   ro_mtu = 0,
   spare = 0,
   ro_dst = {
 sa_len = 16 '\020',
 sa_family = 2 '\002',
 sa_data = 0xf8031c5708f2 ""
   }
 },
 inpu_route6 = {
   ro_rt = 0x0,
   ro_lle = 0x0,
   ro_prepend = 0x0,
   ro_plen = 0,
   ro_flags = 384,
   ro_mtu = 0,
   spare = 0,
   ro_dst = {
 sin6_len = 16 '\020',
 sin6_family = 2 '\002',
 sin6_port = 0,
 sin6_flowinfo = 3011802202,
 sin6_addr = {
   __u6_addr = {
 __u6_addr8 = 0xf8031c5708f8 "",
 __u6_addr16 = 0xf8031c5708f8,
 __u6_addr32 = 0xf8031c5708f8
   }
 },
 sin6_scope_id = 0
   }
 }
   }
}
(kgdb)



Hi,

Here is the conclusion:

The following code is going in an infinite loop:



for (;;) {
	TW_RLOCK(V_tw_lock);
	tw = TAILQ_FIRST(&V_twq_2msl);
	if (tw == NULL || (!reuse && (tw->tw_time - ticks) > 0)) {
		TW_RUNLOCK(V_tw_lock);
		break;
	}
	KASSERT(tw->tw_inpcb != NULL, ("%s: tw->tw_inpcb == NULL",
	    __func__));

	inp = tw->tw_inpcb;
	in_pcbref(inp);
	TW_RUNLOCK(V_tw_lock);

	if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {

		INP_WLOCK(inp);
		tw = intotw(inp);
		if (in_pcbrele_wlocked(inp)) {


in_pcbrele_wlocked() returns (1) because INP_FREED (16) is set in 
inp->inp_flags2. I guess you have invariants disabled, because the 
KASSERT() below should have caused a panic.



KASSERT(tw == NULL, ("%s: held last inp "
"reference but tw not NULL", __func__));
INP_INFO_RUNLOCK(_tcbi
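
The shape of the infinite loop described above can be reproduced in a
small user-space model in C. This is a simplified sketch, not the kernel
code: tw_model and tw_scan_model() are hypothetical, the tw_time exit
condition and all locking are left out, and an iteration cap stands in for
"spins forever".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one timewait queue entry. */
struct tw_model {
	bool freed;		/* models INP_FREED being set on the inpcb */
	struct tw_model *next;	/* next entry on the queue */
};

/*
 * Scan the queue the way the quoted loop does: always look at the head,
 * and "continue" without unlinking it when the release reports a freed
 * inpcb. Returns the number of iterations used, or -1 if max_iter was
 * reached.
 */
static int
tw_scan_model(struct tw_model *head, int max_iter)
{
	int iter = 0;

	assert(max_iter > 0);
	while (head != NULL) {
		if (iter == max_iter)
			return (-1);
		iter++;
		if (head->freed)
			continue;	/* head stays queued: same entry next time */
		head = head->next;	/* normal path: entry is closed and removed */
	}
	return (iter);
}
```

A queue of healthy entries drains in one iteration per entry, while a
freed entry at the head makes every iteration see the same entry again.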

Re: mlx4en, timer irq @100%...

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 10:00, Ben RUBSON wrote:

(kgdb) print *twq_2msl.tqh_first
$2 = {
   tw_inpcb = 0xf8031c570740,


print *twq_2msl.tqh_first->tw_inpcb

--HPS


Re: mlx4en, timer irq @100%...

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 09:43, Ben RUBSON wrote:

OK.
I'm quite (well, absolutely) new to kgdb, some clue on how I should proceed ?
Thank you !
Ben


print twq_2msl
print *twq_2msl.tqh_first


--HPS


Re: mlx4en, timer irq @100%...

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 09:37, Ben RUBSON wrote:



On 08 Aug 2017, at 09:33, Hans Petter Selasky <h...@selasky.org> wrote:

On 08/08/17 09:04, Ben RUBSON wrote:

"print V_twq_2msl" returns the following :
No symbol "V_twq_2msl" in current context.


Are you using VIMAGE ?


No, GENERIC FreeBSD 11.0 on a physical server.


Can you try to figure out where this symbol is located. We need to have 
a dump of it.


--HPS



Re: mlx4en, timer irq @100%...

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 09:04, Ben RUBSON wrote:

Here is vmstat -z :
https://benrubson.github.io/vmstatz.log


From what I can see there are no TCP allocation failures. This rules 
out one class of bugs:




ITEM        SIZE    LIMIT  USED  FREE      REQ FAIL SLEEP
socket:      864, 2092652,  105,  371, 2318298,   0,   0
unpcb:       240, 2092656,   14, 1506,   70953,   0,   0
ipq:          56,  127374,    0,    0,       0,   0,   0
udp_inpcb:   464, 2092656,   12, 1044, 1515824,   0,   0
udpcb:        32, 2092750,   12, 7363, 1515824,   0,   0
tcp_inpcb:   464, 2092656,  448,  432,  731341,   0,   0
tcpcb:      1040, 2092653,   35,  205,  731341,   0,   0
tcptw:        88,   27810,  413, 3862,  161063,   0,   0
syncache:    168,   15364,    0, 1357,   17611,   0,   0
hostcache:   128,       0,    0,    0,       0,   0,   0
sackhole:     32,       0,    0, 5625,   42406,   0,   0
tcpreass:     40,  254700,    0, 6900, 2276087,   0,   0
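
For anyone who wants to check such output mechanically, here is a minimal
C sketch that pulls the failure count out of one "vmstat -z" line. It
assumes the seven comma-separated values after the zone name are SIZE,
LIMIT, USED, FREE, REQ, FAIL and SLEEP, in that order; zone_fail_count()
is a hypothetical helper, not an existing tool.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one zone line of the form "name: v1, v2, v3, v4, v5, v6, v7"
 * and return the sixth value (FAIL). Returns -1 if the line does not
 * have that shape.
 */
static long
zone_fail_count(const char *line)
{
	char name[64];
	long size, limit, used, nfree, req, fail, nsleep;
	int n;

	assert(line != NULL);
	n = sscanf(line, " %63[^:]: %ld, %ld, %ld, %ld, %ld, %ld, %ld",
	    name, &size, &limit, &used, &nfree, &req, &fail, &nsleep);
	if (n != 8)
		return (-1);
	return (fail);
}
```

A result of 0 for the tcp_inpcb, tcpcb and tcptw lines is what "no TCP
allocation failures" refers to here.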




--HPS


Re: mlx4en, timer irq @100%...

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 09:04, Ben RUBSON wrote:

"print V_twq_2msl" returns the following :
No symbol "V_twq_2msl" in current context.


Are you using VIMAGE ?

--HPS


Re: mlx4en, timer irq @100%...

2017-08-08 Thread Hans Petter Selasky

On 08/08/17 01:52, Ben RUBSON wrote:



On 07 Aug 2017, at 19:57, Hans Petter Selasky <h...@selasky.org> wrote:

On 08/07/17 19:19, Ben RUBSON wrote:

On 07 Aug 2017, at 18:19, Matt Joras <mjo...@freebsd.org> wrote:

On 08/07/2017 09:11, Hans Petter Selasky wrote:

Hi,

Try to enter "kgdb" and run:

thread apply all bt

Look for the callout function in question.

--HPS


If you don't have a way to attach kgdb handy you could also break into
ddb(4) and run "alltrace". Though gdb would be more useful for an
ongoing session if we need more than the backtrace since you could
switch to that thread and investigate it directly.


Hi Hans & Matt,
Thank you for your answers, glad to hear from you :)
So here is the full kgdb(thread apply all bt) command log :
https://benrubson.github.io/kgdb.log
We found the faulty thread :
# procstat -ak | grep "swi4.*tcp"
12 100029 intr swi4: clock (0)  tcp_tw_2msl_scan pfslowtimo 
softclock_call_cc softclock intr_event_execute_handlers ithread_loop fork_exit 
fork_trampoline
# kgdb
(...)
Thread 747 (Thread 100029):
#0  sched_switch (td=0xf8000f337500, newtd=0xf8010e144000, flags=) at /usr/src/sys/kern/sched_ule.c:1973
#1  0xfe1000f92d80 in ?? ()
#2  0xfe0f8f74b6e0 in ?? ()
#3  0x810bd274 in handleevents (now=, fake=Error 
accessing memory address 0xffcc: Bad address.
) at /usr/src/sys/kern/kern_clocksource.c:223
Previous frame inner to this frame (corrupt stack?)
(...)
Of course let me know if you need further info.


Can you try to dump "td":

set print pretty on
thread 747
frame 0
print *td

It might give some more clues.


Here it is :
https://benrubson.github.io/td.log

Thx !



Can you show output from:

vmstat -z

Can you from kgdb do:

print V_twq_2msl

And follow the next link field and see where it goes?

--HPS


Re: mlx4en, timer irq @100%...

2017-08-07 Thread Hans Petter Selasky

On 08/04/17 21:09, Ben RUBSON wrote:

On 04 Aug 2017, at 19:42, Ben RUBSON  wrote:

Feel free to ask me for whatever you need to investigate this!
I left this (production :/) server in this state to have a chance to get 
interesting traces.


The server is no longer in production; I moved the service to the standby node.
So we can do everything :)
swi4 still burning !

What is strange is that according to my monitoring, it began exactly at 
15:00:00 UTC.
Certainly a coincidence, as I can't find anything related.



Hi,

Try to enter "kgdb" and run:

thread apply all bt

Look for the callout function in question.
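
Once the backtraces are saved to a file, the thread running a given function can be picked out with a small filter. This is only a sketch: threads_with_frame is a hypothetical helper name, and it assumes each thread's backtrace starts with a "Thread N (Thread TID):" header line.

```shell
# Print the header of every thread whose backtrace mentions the given function.
# Usage: threads_with_frame FUNCTION < kgdb.log
# Assumption: kgdb prints one "Thread N (Thread TID):" header per thread.
threads_with_frame() {
    awk -v fn="$1" '
        /^Thread [0-9]+ \(/ { hdr = $0; printed = 0; next }   # start of a new thread block
        hdr != "" && !printed && index($0, fn) {               # first frame naming the function
            print hdr
            printed = 1
        }
    '
}
```

For example, "threads_with_frame tcp_tw_2msl_scan < kgdb.log" would list the threads to switch to in kgdb. If the stack is corrupt, the function may not appear in the backtrace at all.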

--HPS



Re: mlx4en, timer irq @100%...

2017-08-04 Thread Hans Petter Selasky

On 08/04/17 19:42, Ben RUBSON wrote:



On 04 Aug 2017, at 19:31, Hans Petter Selasky <h...@selasky.org> wrote:

On 08/04/17 19:13, Ben RUBSON wrote:

12 100029 intr swi4: clock (0)  tcp_tw_2msl_scan pfslowtimo 
softclock_call_cc softclock intr_event_execute_handlers ithread_loop fork_exit 
fork_trampoline


Hi,

Can you run "procstat -ak" a few times and grep for swi4? If the entry above does 
not disappear, this is the culprit. Either the callout list is corrupted or there is an 
issue inside tcp_tw_2msl_scan().


Still here since my log catch 30 minutes ago, and sounds like it does not want 
to disappear.


I'm CC'ing Glebius since he's been involved with timer issues in the kernel 
earlier.


Feel free to ask me for whatever you need to investigate this!
I left this (production :/) server in this state to have a chance to get 
interesting traces.



Hi,

I guess we need to involve kgdb to get the full backtrace. Let's wait 
and see if anyone here knows how to do it right, so that the machine 
doesn't crash and lose its state.


--HPS



Re: mlx4en, timer irq @100%...

2017-08-04 Thread Hans Petter Selasky

On 08/04/17 19:13, Ben RUBSON wrote:

12 100029 intr swi4: clock (0)  tcp_tw_2msl_scan pfslowtimo 
softclock_call_cc softclock intr_event_execute_handlers ithread_loop fork_exit 
fork_trampoline


Hi,

Can you run "procstat -ak" a few times and grep for swi4? If the entry above 
does not disappear, this is the culprit. Either the callout list is 
corrupted or there is an issue inside tcp_tw_2msl_scan().
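
The repeated sampling can be scripted. A minimal sketch, assuming "procstat -ak" prints one line per thread with the kernel stack on that line; swi4_hits and sample_swi4 are hypothetical helper names.

```shell
# Count threads in one "procstat -ak" sample (read from stdin) whose
# swi4 line contains the given kernel function name.
swi4_hits() {
    awk -v fn="$1" '/swi4/ && index($0, fn) { n++ } END { print n + 0 }'
}

# Take several samples one second apart and report in how many of them
# the function was on a swi4 stack.
# Usage: sample_swi4 FUNCTION [SAMPLES]
sample_swi4() {
    fn=$1
    samples=${2:-5}
    seen=0
    i=0
    while [ "$i" -lt "$samples" ]; do
        hits=$(procstat -ak | swi4_hits "$fn")
        [ "$hits" -gt 0 ] && seen=$((seen + 1))
        i=$((i + 1))
        sleep 1
    done
    echo "$fn seen in $seen of $samples samples"
}
```

Running "sample_swi4 tcp_tw_2msl_scan 10" and seeing the function in every sample would suggest the thread never leaves it.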



I'm CC'ing Glebius since he's been involved with timer issues in the 
kernel earlier.



--HPS


Re: mlx4en, timer irq @100%...

2017-08-04 Thread Hans Petter Selasky

On 08/04/17 18:59, Ben RUBSON wrote:

Hello,

Not sure this is the right list, but as it seems related to a mlx4en device...

# vmstat -i 1
(...)
interrupt  total   rate
cpu23:timer 1198   1127

# top -P ALL
(...)
CPU 23:  0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle

# netstat -I mlxen0 -d -w 1
            input         mlxen0           output
   packets  errs idrops      bytes    packets  errs      bytes colls drops
(and no output at all, same for mlxen1!)

# uname -sr
FreeBSD 11.0-RELEASE-p9

So, as you can see, one of my CPUs has suddenly been at 100% on timer
interrupts for about 2 hours.
Initiating network connections to this server is now slow.
I also found that I can't use netstat on my 2 mlx4en devices anymore
(so my monitoring tool is no longer fed).

sysctl hw.mlxen0 is OK, no errors, and traffic counters grow slowly.

What should I do ?
How to investigate on this ?

Thank you very much for your help & support,



Hi,

Try "procstat -ak". It should give an idea what is going on.

What version of FreeBSD is this?

Is this a regression issue?

--HPS


Re: memory leaks in 11.0?

2017-07-11 Thread Hans Petter Selasky

On 07/11/17 15:56, Kajetan Staszkiewicz wrote:

Hello,

I finally upgraded one of many of my routers to 11.0.

Unfortunately after running fine for a month it ran out of memory. "wired"
memory slowly grows up to allocating all memory in system when no more memory
is left for other programs. Things first get swapped and eventually die.

The router runs BIRD which has not much to do, it is for internal networks
only, pf, pfsync (currently disabled via `ifconfig pfsync0 down`), filebeat,
smokeping, ntp, nrpe and custom python cron job for sending data to Graphite.

`vmstat -z` shows constantly increasing allocation of "512" and "UMA Slabs".
Memory allocated for all pf-related things seems fine. I have Graphite graphs
for every `vmstat -z` item, and "512" grows in a similar way to "wired"
memory. "512" has 2 917 392 used objects allocated at this moment, "UMA Slabs"
has 379 006, and there is 2636MiB of "wired" memory.

How can I debug which part of kernel is responsible for this? I run GENERIC
kernel with ixl driver 1.7.11 from Intel, as the one in GENERIC had issues
detecting links on my x710 NIC.

I ask here, because it is a router, mostly being busy with his network cards,
routing and pf. Please direct me to a better group if you can.

I can crash this system if needed and dump memory (I hope that is possible on
GENERIC) for analysis.



Hi,

Last time I traced memory leaks I used some dtrace scripts to trace all 
allocations and frees and then analyzed the result using a perl script 
which I found on the internet.
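
Before tracing individual allocations, it can help to log which UMA zone is growing. A minimal sketch, assuming "vmstat -z" prints one "NAME: SIZE, LIMIT, USED, ..." line per zone; zone_used is a hypothetical helper name.

```shell
# Print the USED count of one zone from "vmstat -z" output read on stdin.
# Usage: vmstat -z | zone_used "UMA Slabs"
# Assumption: each zone line looks like "NAME: SIZE, LIMIT, USED, FREE, ...".
zone_used() {
    awk -F'[:,]' -v zone="$1" '
        ($1 "") == (zone "") {      # force a string comparison on the zone name
            gsub(/ /, "", $4)       # strip column padding from the USED field
            print $4
        }
    '
}
```

Calling this from cron for "512" and "UMA Slabs" gives a growth curve to compare against the dtrace results.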


--HPS


Re: AW: axge0 and AX88179

2017-07-05 Thread Hans Petter Selasky

On 07/05/17 07:28, Oleg Lelchuk wrote:

Yes, I am having exactly the same problem that Shteryana described. I also
got messages about wrong ip length when I started dhclient for ue0. If I
plug my device into a usb 2.0 port and enable flow control, I get
networking speeds that are around 250 Mbit/sec. If flow control is disabled
and the device is still connected to the usb 2.0 port, then I get speeds
around 200 Mbit/sec. When the device is attached to a usb 3.0 port and the
flow control is enabled, I get speeds that are around 128 Mbit/sec. The
speeds drop to something like 40 Mbit/sec when the flow control is disabled
and the device is still attached to the usb 3.0 port. But I enabled flow
control only on one machine, the one to which the device was attached.
I didn't enable flow control on the other machine on the network, but I
am sure my speeds won't improve much even if it's enabled on both
machines.
Since I just joined the list, let me copy and paste Shteryana's email
message that she wrote a while back:

Hi all,

I've experienced a similar problem but didn't get to analyzing it
deeper (or reporting) unfortunately ; the device is

ugen0.8:  at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=ON (124mA)

   bLength = 0x0012
   bDescriptorType = 0x0001
   bcdUSB = 0x0300
   bDeviceClass = 0x00ff  
   bDeviceSubClass = 0x00ff
   bDeviceProtocol = 0x
   bMaxPacketSize0 = 0x0009
   idVendor = 0x0b95
   idProduct = 0x1790
   bcdDevice = 0x0100
   iManufacturer = 0x0001  
   iProduct = 0x0002  
   iSerialNumber = 0x0003  
   bNumConfigurations = 0x0001

what I've noticed so far -

dhclient complains about wrong ip length -

% sudo dhclient ue1
DHCPDISCOVER on ue1 to 255.255.255.255 port 67 interval 7
ifconDHCPDISCOVER on ue1 to 255.255.255.255 port 67 interval 15
ip length 328 disagrees with bytes received 332.
accepting packet with data after udp payload.
DHCPOFFER from 192.168.10.1
DHCPREQUEST on ue1 to 255.255.255.255 port 67
ip length 328 disagrees with bytes received 332.
accepting packet with data after udp payload.
DHCPACK from  192.168.10.1
bound to  192.168.10.7 -- renewal in 3600 seconds.

Running iperf3 & watching netstat -i at the same time on the interface
in question shows increasing InErrors on the interface -
Name    Mtu Network          Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
...
ue1    1500                  00:0e:c6:c6:db:ea    66333   354     0    35939     0     0
ue1       - 192.168.10.0/24  192.168.10.7         66323     -     -    35927     -     -

Iperf reports ~37.1 Mbits/sec on the interface.

If I attach the device on a USB2 port (or force USB2 speed on the
port) I was able to get ~ 150-200Mbits on the same system with the
same device.

This particular system has an Intel Lynx Point USB 3.0 controller

pciconf -lv | grep -A 4 xhci
xhci0@pci0:0:20:0: class=0x0c0330 card=0x8179103c chip=0x8c318086 rev=0x05 hdr=0x00
 vendor = 'Intel Corporation'
 device = '8 Series/C220 Series Chipset Family USB xHCI'
 class  = serial bus
 subclass   = USB

and is running FreeBSD 12.0-CURRENT #19 r315483M (M stands for a patch
similar to https://people.freebsd.org/~syrinx/freebsd_xhci-20170318-01.diff
needed to trick the controller on that particular system to actually
use USB3.0 speeds).

The same ASIX AX88179 device worked just fine reaching over 500
Mbits/sec on another system running 11.0-RELEASE-p7 with a different
USB 3 controller (will double check and report back at first
opportunity).

Hope this helps.


Hi,

Maybe you can try to use "usbdump -i usbusX -f Y -s 65536" to compare 
the USB traffic on both of these controllers in USB 3.0 mode. Make some 
stats on the transfer lengths and look for USB errors.
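
A rough way to make those stats from a saved capture is sketched below. It assumes each completed transfer appears as a line containing "DONE-", an "SLEN=<bytes>" field and an "ERR=<status>" field where "0" means success. That layout is an assumption about usbdump's text output, so check it against a real capture first; usbdump_stats is a hypothetical helper name.

```shell
# Histogram of transfer lengths and a count of failed transfers,
# computed from usbdump text output read on stdin.
# Assumption: completed transfers are lines containing "DONE-" with
# "SLEN=<n>" and "ERR=<status>" fields, and ERR=0 means no error.
usbdump_stats() {
    awk '
        /DONE-/ {
            if (match($0, /SLEN=[0-9]+/)) {
                len = substr($0, RSTART + 5, RLENGTH - 5)
                count[len]++
            }
            if (match($0, /ERR=[^,]+/) && substr($0, RSTART + 4, RLENGTH - 4) != "0")
                errors++
        }
        END {
            for (len in count)
                print "SLEN=" len, count[len]
            print "errors", errors + 0
        }
    ' | LC_ALL=C sort
}
```

Save the capture first (for example "usbdump -i usbusX -f Y -s 65536 > usb3.log"), then run "usbdump_stats < usb3.log" for each controller and compare the two summaries.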


--HPS



Re: AW: axge0 and AX88179

2017-06-15 Thread Hans Petter Selasky

On 06/15/17 20:29, Tom Huerlimann wrote:

Hello HPS

Thank you for your help and your investigation on this.

I will start a couple of additional tests in the next few days and let you know if I 
can find additional details.

Just to be sure:
  What was the FreeBSD version you have tested with?

Best regards
Tom


FreeBSD-12-current.

--HPS


Re: axge0 and AX88179

2017-06-14 Thread Hans Petter Selasky

Hi Tom,

Thanks for shipping me your device.

I've now done some basic tests and your device shows varying results.

When I connect it to a GBit-capable ethernet port using a short cable, 
it ends up negotiating 10MBit link speed, whilst on another port it 
negotiates 1GBit link speed. When I connect another such device using 
the same chip and PHY, only with a different PCB layout, 1GBit/s is 
always negotiated.


Running a simple back2back iperf test gives me:

iperf -i 1 -P4 -c 1.1.1.1

Client connecting to 1.1.1.1, TCP port 5001
TCP window size: 32.8 KByte (default)

[  3] local 1.1.1.2 port 20458 connected with 1.1.1.1 port 5001
[ ID] Interval   Transfer Bandwidth
[  3]  0.0- 1.0 sec  81.4 MBytes   683 Mbits/sec
[  3]  1.0- 2.0 sec  86.2 MBytes   724 Mbits/sec
[  3]  2.0- 3.0 sec  86.2 MBytes   724 Mbits/sec
[  3]  3.0- 4.0 sec  86.4 MBytes   725 Mbits/sec
[  3]  4.0- 5.0 sec  86.2 MBytes   724 Mbits/sec
[  3]  5.0- 6.0 sec  86.4 MBytes   725 Mbits/sec
[  3]  6.0- 7.0 sec  86.0 MBytes   721 Mbits/sec
[  3]  7.0- 8.0 sec  86.4 MBytes   725 Mbits/sec
[  3]  8.0- 9.0 sec  86.1 MBytes   722 Mbits/sec
[  3]  9.0-10.0 sec  86.1 MBytes   722 Mbits/sec
[  3]  0.0-10.0 sec   858 MBytes   719 Mbits/sec

1) Can you try to override the link speed negotiated:
ifconfig ueX media 100baseTX mediaopt full-duplex

2) Can you try to enable flowcontrol:
ifconfig ueX media autoselect mediaopt flowcontrol

3) The full list of accepted media is available by entering:
ifconfig -m ueX
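
After each change it is worth checking what the link actually negotiated. A small sketch, assuming ifconfig prints the autoselect form "media: Ethernet autoselect (1000baseT <full-duplex>)"; active_media is a hypothetical helper name, and it prints nothing when the media is forced and no parenthesised part is shown.

```shell
# Print the active (negotiated) media from ifconfig output read on stdin.
# Usage: ifconfig ueX | active_media
# Assumption: the line has the form "media: ... autoselect (ACTIVE MEDIA)".
active_media() {
    sed -n 's/^[[:space:]]*media:.*(\(.*\))[[:space:]]*$/\1/p'
}
```

A result such as "10baseT/UTP <half-duplex>" on a gigabit port would match the negotiation problem described above.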

If none of the above helps, I'm afraid your device might suffer from an 
electrical design problem.


--HPS


Re: AW: AW: axge0 and AX88179

2017-05-25 Thread Hans Petter Selasky

Hi,



Does someone have an idea what I did forget to check/verify?



You can try to enable debugging:

sysctl hw.usb.axge.debug=255

Or:

Try to log the USB traffic using "usbdump":

usbdump -i usbusX -f Y -s 65536

And look for errors like "ERR".

Did you test two such adapters back to back with iperf, checking for 
packet loss and other issues?


--HPS


Re: AW: axge0 and AX88179

2017-05-25 Thread Hans Petter Selasky

On 05/25/17 20:37, Tom Huerlimann wrote:

Hi all,

I have the problem, that I cannot reach more than 20-40Mbit/s when using the
AX88179 chip (1Gbit/s NIC) on a USB 3.0 SuperSpeed Port (same on a 480Mbps
High Speed USB v2.0-Port).

# usbconfig dump_device_desc
(...)
ugen0.7:  at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=ON (124mA)

   bLength = 0x0012
   bDescriptorType = 0x0001
   bcdUSB = 0x0300
   bDeviceClass = 0x00ff  
   bDeviceSubClass = 0x00ff
   bDeviceProtocol = 0x
   bMaxPacketSize0 = 0x0009
   idVendor = 0x0b95
   idProduct = 0x1790
   bcdDevice = 0x0100
   iManufacturer = 0x0001  
   iProduct = 0x0002  
   iSerialNumber = 0x0003  <01>
   bNumConfigurations = 0x0001

plugged into this USB controller:

# dmesg | grep -i usb
xhci0:  mem 0xd080-0xd080 irq 20 at device 20.0 on pci0
usbus0 on xhci0
ehci0:  mem 0xd0815000-0xd08153ff irq 23 at device 29.0 on pci0
usbus1: waiting for BIOS to give up control
usbus1: timed out waiting for BIOS
usbus1: EHCI version 1.0
usbus1 on ehci0
usbus0: 5.0Gbps Super Speed USB v3.0
usbus1: 480Mbps High Speed USB v2.0
ugen1.1:  at usbus1
uhub0:  on usbus1
ugen0.1: <0x8086> at usbus0
uhub1: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
ugen0.2:  at usbus0
umass0:  on usbus0
ugen0.3:  at usbus0
uhub2:  on usbus0
ugen1.2:  at usbus1
uhub3:  on usbus1
ugen0.4:  at usbus0
ukbd0:  on usbus0
uhid0:  on usbus0
ugen0.5:  at usbus0
uhub4:  on usbus0
ugen0.6:  at usbus0
ukbd1:  on usbus0
uhid1:  on usbus0
ugen0.7:  at usbus0
axge0:  on usbus0
ue0:  on axge0

# dmesg | grep -i axge
axge0:  on usbus0
miibus2:  on axge0
ue0:  on axge0
axge0: at uhub1, port 7, addr 6 (disconnected)
axge0:  on usbus0
miibus2:  on axge0
ue0:  on axge0
axge0:  on usbus0
miibus2:  on axge0
ue0:  on axge0

I'm using FreeBSD 10.3-RELEASE-p19.

Has anyone ever managed to reach a higher bandwidth with the axge driver
and the AX88179 chipset?


Yes, you can reach more than 100Mbit/s with USB 3.0.

What does ifconfig say about this device?

--HPS



Re: Question on taskqueue_drain

2017-04-19 Thread Hans Petter Selasky

On 04/19/17 19:36, Somayajulu, David wrote:

Hans,
Thanks for the info.

Functions that may sleep, like taskqueue_drain(), must not be called while an 
MTX_DEF lock is held.

I am guessing this is true irrespective of whether the taskqueue is "fast" or 
not.
Thanks


Yes, that is correct.

--HPS



Re: Question on taskqueue_drain

2017-04-18 Thread Hans Petter Selasky

On 04/19/17 05:37, Sepherosa Ziehau wrote:

On Wed, Apr 19, 2017 at 10:39 AM, Somayajulu, David
 wrote:

Sorry what I meant to ask was, whether it is O.K to call taskqueue_drain(), 
when an MTX_DEF lock is grabbed prior to calling taskqueue_drain().



You will hit WITNESS, if the drain needs to wait; that's probably the
best case.  If the lock will be acquired in the task being drained,
this leads to deadlock.



Hi,

Functions that may sleep, like taskqueue_drain(), must not be called while an 
MTX_DEF lock is held.


--HPS



Re: Question on contrib/ofed

2017-02-13 Thread Hans Petter Selasky

On 02/14/17 00:26, Somayajulu, David wrote:

Hi All,
I have been trying building the OFED user mode libraries/apps in FreeBSD11 and 
have not been successful (please see below).  Are there any configure/set up 
commands that need to be run ?

I would appreciate any help.


Try to add:

WITH_OFED=YES

to the make command.

--HPS



Re: RTL8153 Gigabit Ethernet USB Adapter

2017-01-23 Thread Hans Petter Selasky

On 01/23/17 21:06, diffusae wrote:

Hi!

Maybe a noobs question but I am mostly familiar with Linux.

Currently there is no driver for the RTL8153 Gigabit Ethernet Adapter.

Bus 001 Device 004: ID 0bda:8153 Realtek Semiconductor Corp.

https://www.freebsd.org/relnotes/CURRENT/hardware/support.html

RealTek has a Unix (Linux) driver here:

http://www.realtek.com/downloads/downloadsView.aspx?Langid=1=56=56=5=4=3=false#RTL8153

How can I compile a custom kernel with this driver?

I am using FreeBSD 11.0-STABLE (RPI-B) #0 r308738



Hi,

Have a look in sys/dev/usb/net and see if you find any similar devices. 
I think your device is already supported. You need to:


1) kldload usb_quirk
2) kldload if_ure
3) replug your device and it should attach (FreeBSD-12 at least)

grep -r 8153 /usr/src/sys/dev/usb

--HPS



Re: decent 40G network adapters

2017-01-18 Thread Hans Petter Selasky

On 01/18/17 10:48, Eugene M. Zheganin wrote:

Hi.

Could someone recommend decent 40Gbit adapters that are proven to
work under FreeBSD? The intended purpose is iSCSI traffic: not much
pps, but rates definitely above 10G. I've tried Supermicro-manufactured
Intel XL710 ones (two boards, different servers - same sad story:
packets loss, server unresponsive, spikes), seems like they have a
problem in a driver (or firmware), and though Intel support states this
is because the Supermicro tampered with the adapter, I'm still
suspicious about ixl(4). I've also seen in the ML a guy reported the
exact same problem with ixl(4) as I have found.

So, what would you say ? Chelsio ?



Hi,

I think the Mellanox mlx4 and mlx5 drivers also support this. Are 
you using InfiniBand or TCP for the backend?


--HPS



[Differential] D8685: Fix a false positive in a buf_ring assert

2016-12-01 Thread hselasky (Hans Petter Selasky)
hselasky accepted this revision.
hselasky added a reviewer: hselasky.
hselasky added a comment.
This revision has a positive review.


  Looks good to me.

REVISION DETAIL
  https://reviews.freebsd.org/D8685


To: rstone, hselasky
Cc: hselasky, freebsd-net-list, emaste


[Differential] D8685: Fix a false positive in a buf_ring assert

2016-11-30 Thread hselasky (Hans Petter Selasky)
hselasky added inline comments.

INLINE COMMENTS

> buf_ring.h:71
> + if (br->br_cons_head != br->br_prod_head) {
> + for (i = br->br_cons_head + 1; i != br->br_prod_head;
> + i = ((i + 1) & br->br_cons_mask))

Should "br->br_cons_head + 1" be masked by br->br_cons_mask?
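
The concern can be illustrated with plain integer arithmetic. A sketch, assuming the ring size is a power of two and the mask equals the size minus one; ring_next is a hypothetical helper, not part of buf_ring.h.

```shell
# Advance a ring index by one slot, wrapping around with the mask.
# Usage: ring_next INDEX MASK   (MASK is assumed to be ring size - 1)
ring_next() {
    echo $(( ($1 + 1) & $2 ))
}
```

With a ring of 8 slots (mask 7) and the consumer head at slot 7, the unmasked start value 7 + 1 = 8 lies outside the valid range 0..7, while "ring_next 7 7" wraps to slot 0.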

REVISION DETAIL
  https://reviews.freebsd.org/D8685


To: rstone
Cc: hselasky, freebsd-net-list, emaste


Re: Adding RTL8153 support to rue(4) (actually cdce(4)) USB to Ethernet driver [SOLVED]

2016-10-25 Thread Hans Petter Selasky

On 10/25/16 07:08, David Horwitt wrote:

... or, at least, worked around.

I added a SetEthernetPacketFilter request with wValue PACKET_TYPE_PROMISCUOUS 
in cdce_init() (right before the
cdce_start() call) and joy ensued.

Note that (PACKET_TYPE_DIRECTED | PACKET_TYPE_BROADCAST) did _not_ work, but 
setting the promiscuous bit was the
key.
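
For reference, the wValue of a SetEthernetPacketFilter request is a bitmap. The sketch below computes the two values mentioned above; the bit positions are my understanding of the USB CDC Ethernet packet filter bitmap and should be verified against the specification, and packet_filter_value is a hypothetical helper name.

```shell
# Packet filter bits (assumed from the USB CDC Ethernet bitmap; verify
# against the specification before relying on them).
PACKET_TYPE_PROMISCUOUS=$((1 << 0))
PACKET_TYPE_ALL_MULTICAST=$((1 << 1))
PACKET_TYPE_DIRECTED=$((1 << 2))
PACKET_TYPE_BROADCAST=$((1 << 3))
PACKET_TYPE_MULTICAST=$((1 << 4))

# OR the given bits together and print the resulting wValue in hex.
packet_filter_value() {
    value=0
    for bit in "$@"; do
        value=$((value | bit))
    done
    printf '0x%04x\n' "$value"
}
```

Under these assumptions the combination that did not work is 0x000c (directed plus broadcast), while the promiscuous setting is 0x0001.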

HPS: thanks again for your help.



Do you want to submit a patch upstream?

Maybe as a quirk or a tunable sysctl?

--HPS


