[Bug 220076] [patch] [panic] [netgraph] repeatable kernel panic due to a race in ng_iface(4)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220076

Mark Linimon changed:

           What    |Removed                  |Added
   ----------------------------------------------------------------
           CC      |freebsd-net@FreeBSD.org  |
           Assignee|freebsd-b...@freebsd.org |freebsd-net@FreeBSD.org
[Bug 200382] Loading netgraph via bsnmpd, etc can cause domain to be registered after domain_finalize has been called
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200382

Mark Linimon changed:

           What    |Removed          |Added
   ----------------------------------------------------------------
           Assignee|n...@freebsd.org |freebsd-net@FreeBSD.org

--- Comment #3 from Mark Linimon ---
Canonicalize assignment.
Problem reports for freebsd-net@FreeBSD.org that need special attention
To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id)

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering all
versions, including experimental development code and obsolete releases.

Status      |Bug Id | Description
------------+-------+---------------------------------------------------
In Progress |165622 | [ndis][panic][patch] Unregistered use of FPU in k
In Progress |206581 | bxe_ioctl_nvram handler is faulty
New         |204438 | setsockopt() handling of kern.ipc.maxsockbuf limi
New         |205592 | TCP processing in IPSec causes kernel panic
New         |206053 | kqueue support code of netmap causes panic
New         |213410 | [carp] service netif restart causes hang only whe
New         |215874 | [patch] [icmp] [mbuf_tags] teach icmp_error() opt
New         |217748 | sys/dev/ixgbe/if_ix.c: PVS-Studio: Assignment to
Open        |173444 | socket: IPV6_USE_MIN_MTU and TCP is broken
Open        |193452 | Dell PowerEdge 210 II -- Kernel panic bce (broadc
Open        |194485 | Userland cannot add IPv6 prefix routes
Open        |194515 | Fatal Trap 12 Kernel with vimage
Open        |199136 | [if_tap] Added down_on_close sysctl variable to t
Open        |202510 | [CARP] advertisements sourced from CARP IP cause
Open        |206544 | sendmsg(2) (sendto(2) too?) can fail with EINVAL;
Open        |211031 | [panic] in ng_uncallout when argument is NULL
Open        |211962 | bxe driver queue soft hangs and flooding tx_soft_
Open        |218653 | Intel e1000 network link drops under high network

18 problems total for which you should take action.
Re: mbuf_jumbo_9k & iSCSI failing
> On 25 Jun 2017, at 17:32, Ryan Stone wrote:
>
> Having looked at the original email more closely, I see that you showed an
> mlxen interface with a 9020 MTU. Seeing allocation failures of 9k mbuf
> clusters increase while you are far below the zone's limit means that you
> are definitely running into the bug I'm describing, and this bug could
> plausibly cause the iSCSI errors that you describe.
>
> The issue is that the newer version of the driver tries to allocate a
> single buffer to accommodate an MTU-sized packet. Over time, however,
> memory becomes fragmented, and eventually it can become impossible to
> allocate a 9k physically contiguous buffer. When this happens, the driver
> is unable to allocate buffers to receive packets and is forced to drop
> them. Presumably, if iSCSI suffers too many packet drops it will terminate
> the connection. The older version of the driver limited itself to
> page-sized buffers, so it was immune to memory fragmentation.

Thank you for your explanation, Ryan.

You say "over time", and you're right: I have to wait several days (here, 88)
before the problem occurs. Strange, however, that with 2500 MB of free memory
the system is unable to find 9k of physically contiguous space. But we never
know :)

Let's then wait for your patch! (and reboot for now)

Many thanks!

Ben
Re: mbuf_jumbo_9k & iSCSI failing
Having looked at the original email more closely, I see that you showed an
mlxen interface with a 9020 MTU. Seeing allocation failures of 9k mbuf
clusters increase while you are far below the zone's limit means that you
are definitely running into the bug I'm describing, and this bug could
plausibly cause the iSCSI errors that you describe.

The issue is that the newer version of the driver tries to allocate a single
buffer to accommodate an MTU-sized packet. Over time, however, memory becomes
fragmented, and eventually it can become impossible to allocate a 9k
physically contiguous buffer. When this happens, the driver is unable to
allocate buffers to receive packets and is forced to drop them. Presumably,
if iSCSI suffers too many packet drops it will terminate the connection. The
older version of the driver limited itself to page-sized buffers, so it was
immune to memory fragmentation.
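For illustration, here is a minimal sketch of the fragmentation-immune
strategy the older driver used: covering an MTU-sized frame with a chain of
page-sized jumbo clusters instead of one physically contiguous 9k cluster.
This is not the actual mlx4_en code; rx_refill_chain() is a hypothetical
helper built only from the standard mbuf(9) KPI.

    #include <sys/param.h>
    #include <sys/mbuf.h>

    /*
     * Hypothetical rx-ring refill helper: build an mbuf chain of
     * MJUMPAGESIZE (page-sized) clusters large enough to hold one
     * MTU-sized frame.  Each cluster occupies a single physical
     * page, so the allocation cannot fail because of physical
     * memory fragmentation the way a 9k (MJUM9BYTES, several
     * contiguous pages) allocation can.
     */
    static struct mbuf *
    rx_refill_chain(int frame_len)
    {
            struct mbuf *m, *top, **tail;
            int left;

            top = NULL;
            tail = &top;
            for (left = frame_len; left > 0; left -= MJUMPAGESIZE) {
                    /* M_NOWAIT: refill runs in the rx interrupt path. */
                    m = m_getjcl(M_NOWAIT, MT_DATA,
                        top == NULL ? M_PKTHDR : 0, MJUMPAGESIZE);
                    if (m == NULL) {
                            m_freem(top);   /* free any partial chain */
                            return (NULL);  /* caller drops the frame */
                    }
                    m->m_len = MIN(left, MJUMPAGESIZE);
                    *tail = m;
                    tail = &m->m_next;
            }
            return (top);
    }

The trade-off is that the NIC must support scatter/gather receive so the
hardware can split an incoming frame across the chain's page-sized segments.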
Re: mbuf_jumbo_9k & iSCSI failing
> On 25 Jun 2017, at 17:14, Ryan Stone wrote:
>
> Is this setup using the mlx4_en driver? If so, recent versions of that
> driver have a regression when using MTUs greater than the page size (4096
> on i386/amd64). The bug will cause the card to drop packets when the
> system is under memory pressure, and in certain cases the card can get
> into a state where it is no longer able to receive packets. I am working
> on a fix; I can post a patch when it's complete.

Thank you very much for your feedback, Ryan.

Yes, my system is using the mlx4_en driver, the one directly from the
FreeBSD 11.0 source tree.

Is there any indicator I could watch to be sure I'm experiencing the issue
you are working on? It sounds like I may be suffering from it anyway...

Of course, I would be glad to help test your patch when it's complete.

Thank you again,

Ben
Re: mbuf_jumbo_9k & iSCSI failing
Is this setup using the mlx4_en driver? If so, recent versions of that driver
have a regression when using MTUs greater than the page size (4096 on
i386/amd64). The bug will cause the card to drop packets when the system is
under memory pressure, and in certain cases the card can get into a state
where it is no longer able to receive packets. I am working on a fix; I can
post a patch when it's complete.
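By contrast with the page-sized chain sketch above, here is a minimal sketch
of the strategy that triggers the problem: one physically contiguous cluster
sized for the whole MTU. Again hypothetical (rx_refill_single() is an
invented name); only the standard mbuf(9) KPI is assumed.

    #include <sys/param.h>
    #include <sys/mbuf.h>

    /*
     * Hypothetical sketch: pick one cluster big enough for the
     * whole frame.  With a 9020-byte MTU this selects MJUM9BYTES
     * (9k), which the allocator must back with physically
     * contiguous pages; once physical memory is fragmented,
     * m_getjcl() returns NULL and the FAIL column of "vmstat -z"
     * for mbuf_jumbo_9k ticks up.
     */
    static struct mbuf *
    rx_refill_single(int frame_len)
    {
            int size;

            if (frame_len <= MCLBYTES)
                    size = MCLBYTES;
            else if (frame_len <= MJUMPAGESIZE)
                    size = MJUMPAGESIZE;
            else if (frame_len <= MJUM9BYTES)
                    size = MJUM9BYTES;
            else
                    size = MJUM16BYTES;

            return (m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, size));
    }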
mbuf_jumbo_9k & iSCSI failing
> On 30 Dec 2016, at 22:55, Ben RUBSON wrote:
>
> Hello,
>
> 2 FreeBSD 11.0-p3 servers, one iSCSI initiator, one target.
> Both with Mellanox ConnectX-3 40G.
>
> For a few days now, sometimes, under undetermined circumstances, as soon
> as there is some (very low) iSCSI traffic, some of the disks get
> disconnected:
>
>   kernel: WARNING: 192.168.2.2 (iqn..): no ping reply (NOP-Out) after 5
>   seconds; dropping connection
>
> At the same moment, the sysctl counters hw.mlxen1.stat.rx_ring*.error grow
> on the initiator side.
>
> I then tried to reproduce these network errors by saturating the link at
> 40G full-duplex using iPerf, but I did not manage to increase these error
> counters.
>
> It's strange because it's a sporadic issue: I can have traffic on iSCSI
> disks without any issue, and sometimes they get disconnected with the
> errors growing.

> On 01 Jan 2017, at 09:16, Meny Yossefi wrote:
>
> Any chance you ran out of mbufs in the system?

> On 02 Jan 2017, at 12:09, Ben RUBSON wrote:
>
> I think you are right, this could be an mbufs issue.
> Here are some more numbers:
>
> # vmstat -z | grep -v "0,   0$"
> ITEM            SIZE   LIMIT   USED   FREE         REQ      FAIL SLEEP
> 4 Bucket:         32,      0,  2673, 28327,   88449799,    17317,    0
> 8 Bucket:         64,      0,   449, 15609,   13926386,     4871,    0
> 12 Bucket:        96,      0,   335,  5323,   10293892,   142872,    0
> 16 Bucket:       128,      0,   533,  6070,    7618615,   472647,    0
> 32 Bucket:       256,      0,  8317, 22133,   36020376,   563479,    0
> 64 Bucket:       512,      0,  1238,  3298,   20138111, 11430742,    0
> 128 Bucket:     1024,      0,  1865,  2963,   21162182,   158752,    0
> 256 Bucket:     2048,      0,  1626,   450,   80253784,  4890164,    0
> mbuf_jumbo_9k:  9216, 603712, 16400,  8744, 4128521064,     2661,    0

> On 03 Jan 2017, at 07:27, Meny Yossefi wrote:
>
> Have you tried increasing the mbufs limit?
> (sysctl) kern.ipc.nmbufs (Maximum number of mbufs allowed)

> On 04 Jan 2017, at 14:47, Ben RUBSON wrote:
>
> No, I did not try this yet.
> However, from the numbers above (and below), I think I should increase
> kern.ipc.nmbjumbo9 instead?

> On 30 Jan 2017, at 15:36, Ben RUBSON wrote:
>
> So, to give some news: increasing kern.ipc.nmbjumbo9 helped a lot.
> Just one very little issue (compared to the others before) over the last
> 3 weeks.

Hello,

I'm back today with this issue. Above is my discussion with Meny from
Mellanox at the beginning of 2017. (The topic was "iSCSI failing, MLX
rx_ring errors ?", on the freebsd-net list.)

This morning the issue came up again: some of my iSCSI disks were
disconnected. Below are some numbers.

# vmstat -z | grep -v "0,   0$"
ITEM            SIZE    LIMIT     USED   FREE         REQ     FAIL SLEEP
8 Bucket:         64,       0,     654,  8522,   28604967,      11,    0
12 Bucket:        96,       0,     976,  5092,   23758734,      78,    0
32 Bucket:       256,       0,     789,  4491,   43446969,     137,    0
64 Bucket:       512,       0,     666,  2750,   47568959, 1272018,    0
128 Bucket:     1024,       0,    1047,  1249,   28774042,  232504,    0
256 Bucket:     2048,       0,    1611,   369,  139988097, 8931139,    0
vmem btag:        56,       0, 2949738, 15506,   18092235,   20908,    0
mbuf_jumbo_9k:  9216, 2037529,   16400,  8776, 8610737115,     297,    0

# uname -rs
FreeBSD 11.0-RELEASE-p8

# uptime
3:34p.m. up 88 days, 15:57, 2 users, load averages: 0.95, 0.67, 0.62

# grep kern.ipc.nmb /boot/loader.conf
kern.ipc.nmbjumbo9=2037529
kern.ipc.nmbjumbo16=1

# sysctl kern.ipc | grep mb
kern.ipc.nmbufs: 26080380
kern.ipc.nmbjumbo16: 4
kern.ipc.nmbjumbo9: 6112587
kern.ipc.nmbjumbop: 2037529
kern.ipc.nmbclusters: 4075060
kern.ipc.maxmbufmem: 33382887424

# ifconfig mlxen1
mlxen1: flags=8843 metric 0 mtu 9020
        options=ed07bb
        nd6 options=29
        media: Ethernet autoselect (40Gbase-CR4 )
        status: active

I just caught the issue growing:

# vmstat -z | grep mbuf_jumbo_9k
ITEM            SIZE    LIMIT   USED  FREE         REQ  FAIL SLEEP
mbuf_jumbo_9k:  9216, 2037529, 16415, 7316, 8735246407,  665,    0
mbuf_jumbo_9k:  9216, 2037529, 16411, 7320, 8735286748,  665,    0
mbuf_jumbo_9k:  9216, 2037529, 16415, 7316, 8735298937,  667,    0
mbuf_jumbo_9k:  9216, 2037529, 16438, 7293, 8735337634,  667,    0
mbuf_jumbo_9k:  9216, 2037529, 16407, 7324, 8735354339,  668,    0
mbuf_jumbo_9k:  9216, 2037529, 16400, 7331, 8735382105,  669,    0
mbuf_jumbo_9k:  9216, 2037529, 16402, 7329, 8735392836,  671,    0
mbuf_jumbo_9k:  9216, 2037529, 16400, 7331, 8735423910,  671,    0
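For anyone who wants to script this kind of check, here is a tiny userland
sketch (assuming only the standard sysctlbyname(3) interface) that reads the
kern.ipc.nmbjumbo9 tunable backing the mbuf_jumbo_9k LIMIT column above;
watching the FAIL column of "vmstat -z" alongside it is what exposed the
problem here.

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Sketch: read kern.ipc.nmbjumbo9 the same way sysctl(8)
     * does.  The mbuf_jumbo_9k zone's LIMIT column in "vmstat -z"
     * reflects this tunable (set from /boot/loader.conf above).
     */
    int
    main(void)
    {
            int limit;
            size_t len = sizeof(limit);

            if (sysctlbyname("kern.ipc.nmbjumbo9", &limit, &len,
                NULL, 0) == -1) {
                    perror("sysctlbyname(kern.ipc.nmbjumbo9)");
                    return (EXIT_FAILURE);
            }
            printf("kern.ipc.nmbjumbo9 = %d\n", limit);
            return (EXIT_SUCCESS);
    }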