Re: Traffic "corruption" in 12-stable
> On Aug 4, 2020, at 11:51, Mark Johnston wrote:
>
> On Mon, Aug 03, 2020 at 05:22:37PM -0400, Joe Clarke wrote:
>>> On Jul 27, 2020, at 15:41, Joe Clarke wrote:
>>>> On Jul 27, 2020, at 15:01, Mark Johnston wrote:
>>>> There are some fixes for vmx not present in stable/12 (yet). I did a
>>>> merge of a number of outstanding revisions. Would you be able to test
>>>> the patch? I haven't observed any problems with it on a host using igb,
>>>> but I have no ability to test vmx at the moment.
>>>
>>> I’m down to test anything. I did notice quite a few vmxnet3 changes around
>>> performance that appealed to me. I tried a few of them on my last kernel.
>>> That took much longer to exhibit the problem, but eventually did.
>>>
>>> I can tell you I don’t have all of these patches in, though. I’ll build
>>> with this diff and start running it now. I’ll let you know how it goes.
>>
>> So it’s been just over a week of runtime with this full patch set. I have
>> seen no further issues with ingress packet “truncation”, and performance has
>> been what I expect. I’m going to keep running, but I think this seems like
>> a good set to MFC.
>
> Done in r363844, thanks.

Thank you. On day 8, and still no issues.

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Traffic "corruption" in 12-stable
On Mon, Aug 03, 2020 at 05:22:37PM -0400, Joe Clarke wrote:
> > On Jul 27, 2020, at 15:41, Joe Clarke wrote:
> >> On Jul 27, 2020, at 15:01, Mark Johnston wrote:
> >> There are some fixes for vmx not present in stable/12 (yet). I did a
> >> merge of a number of outstanding revisions. Would you be able to test
> >> the patch? I haven't observed any problems with it on a host using igb,
> >> but I have no ability to test vmx at the moment.
> >
> > I’m down to test anything. I did notice quite a few vmxnet3 changes around
> > performance that appealed to me. I tried a few of them on my last kernel.
> > That took much longer to exhibit the problem, but eventually did.
> >
> > I can tell you I don’t have all of these patches in, though. I’ll build
> > with this diff and start running it now. I’ll let you know how it goes.
>
> So it’s been just over a week of runtime with this full patch set. I have
> seen no further issues with ingress packet “truncation”, and performance has
> been what I expect. I’m going to keep running, but I think this seems like a
> good set to MFC.

Done in r363844, thanks.
Re: Traffic "corruption" in 12-stable
> On Jul 27, 2020, at 15:41, Joe Clarke wrote:
>
>> On Jul 27, 2020, at 15:01, Mark Johnston wrote:
>>
>> On Sun, Jul 26, 2020 at 06:16:07PM -0400, Joe Clarke wrote:
>>> About two weeks ago, I upgraded from the latest 11-stable to the latest
>>> 12-stable. After that, I periodically see the network throughput come to a
>>> near standstill. This FreeBSD machine is an ESXi VM with two interfaces.
>>> It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It
>>> runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a
>>> tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN
>>> side) uses the default 1500.
>>>
>>> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN
>>> ping times), I know the problem has occurred because my lldpd reports:
>>>
>>> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>>>
>>> And if I turn on ipfw verbose messages, I see tons of:
>>>
>>> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>>>
>>> This leads me to believe packets are being corrupted on ingress. I’ve
>>> applied all the recent iflib changes, but the problem persists. What causes
>>> it, I don’t know.
>>>
>>> The only thing that changed (and yes, it’s a big one) is I upgraded to
>>> 12-stable. Meaning, the rest of the network infra and topology has
>>> remained the same. This did not happen at all in 11-stable.
>>>
>>> I’m open to suggestions.
>>
>> There are some fixes for vmx not present in stable/12 (yet). I did a
>> merge of a number of outstanding revisions. Would you be able to test
>> the patch? I haven't observed any problems with it on a host using igb,
>> but I have no ability to test vmx at the moment.
>
> I’m down to test anything. I did notice quite a few vmxnet3 changes around
> performance that appealed to me. I tried a few of them on my last kernel.
> That took much longer to exhibit the problem, but eventually did.
>
> I can tell you I don’t have all of these patches in, though. I’ll build with
> this diff and start running it now. I’ll let you know how it goes.

So it’s been just over a week of runtime with this full patch set. I have
seen no further issues with ingress packet “truncation”, and performance has
been what I expect. I’m going to keep running, but I think this seems like a
good set to MFC.

Thanks again for your help.

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc
Re: Traffic "corruption" in 12-stable
> On Jul 27, 2020, at 15:01, Mark Johnston wrote:
>
> On Sun, Jul 26, 2020 at 06:16:07PM -0400, Joe Clarke wrote:
>> About two weeks ago, I upgraded from the latest 11-stable to the latest
>> 12-stable. After that, I periodically see the network throughput come to a
>> near standstill. This FreeBSD machine is an ESXi VM with two interfaces.
>> It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It
>> runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a
>> tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN
>> side) uses the default 1500.
>>
>> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN
>> ping times), I know the problem has occurred because my lldpd reports:
>>
>> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>>
>> And if I turn on ipfw verbose messages, I see tons of:
>>
>> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>>
>> This leads me to believe packets are being corrupted on ingress. I’ve
>> applied all the recent iflib changes, but the problem persists. What causes
>> it, I don’t know.
>>
>> The only thing that changed (and yes, it’s a big one) is I upgraded to
>> 12-stable. Meaning, the rest of the network infra and topology has remained
>> the same. This did not happen at all in 11-stable.
>>
>> I’m open to suggestions.
>
> There are some fixes for vmx not present in stable/12 (yet). I did a
> merge of a number of outstanding revisions. Would you be able to test
> the patch? I haven't observed any problems with it on a host using igb,
> but I have no ability to test vmx at the moment.

I’m down to test anything. I did notice quite a few vmxnet3 changes around
performance that appealed to me. I tried a few of them on my last kernel.
That took much longer to exhibit the problem, but eventually did.

I can tell you I don’t have all of these patches in, though. I’ll build with
this diff and start running it now. I’ll let you know how it goes.

Thanks!

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc
Re: Traffic "corruption" in 12-stable
On Sun, Jul 26, 2020 at 06:16:07PM -0400, Joe Clarke wrote:
> About two weeks ago, I upgraded from the latest 11-stable to the latest
> 12-stable. After that, I periodically see the network throughput come to a
> near standstill. This FreeBSD machine is an ESXi VM with two interfaces. It
> acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs
> ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2
> VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses
> the default 1500.
>
> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN ping
> times), I know the problem has occurred because my lldpd reports:
>
> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>
> And if I turn on ipfw verbose messages, I see tons of:
>
> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>
> This leads me to believe packets are being corrupted on ingress. I’ve
> applied all the recent iflib changes, but the problem persists. What causes
> it, I don’t know.
>
> The only thing that changed (and yes, it’s a big one) is I upgraded to
> 12-stable. Meaning, the rest of the network infra and topology has remained
> the same. This did not happen at all in 11-stable.
>
> I’m open to suggestions.

There are some fixes for vmx not present in stable/12 (yet). I did a
merge of a number of outstanding revisions. Would you be able to test
the patch? I haven't observed any problems with it on a host using igb,
but I have no ability to test vmx at the moment.

https://people.freebsd.org/~markj/patches/iflib-stable12.diff
Re: Traffic "corruption" in 12-stable
> On Jul 27, 2020, at 01:00, Eugene Grosbein wrote:
>
> 27.07.2020 5:16, Joe Clarke wrote:
>
>> About two weeks ago, I upgraded from the latest 11-stable to the latest
>> 12-stable. After that, I periodically see the network throughput come to a
>> near standstill. This FreeBSD machine is an ESXi VM with two interfaces.
>> It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It
>> runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a
>> tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN
>> side) uses the default 1500.
>>
>> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN
>> ping times), I know the problem has occurred because my lldpd reports:
>>
>> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>>
>> And if I turn on ipfw verbose messages, I see tons of:
>>
>> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>>
>> This leads me to believe packets are being corrupted on ingress. I’ve
>> applied all the recent iflib changes, but the problem persists. What causes
>> it, I don’t know.
>>
>> The only thing that changed (and yes, it’s a big one) is I upgraded to
>> 12-stable. Meaning, the rest of the network infra and topology has remained
>> the same. This did not happen at all in 11-stable.
>>
>> I’m open to suggestions.
>
> First, try:
>
> ifconfig $ifname -rxcsum -txcsum

Thanks for the suggestion. I should have mentioned I’ve been initializing
these two interfaces since 11-stable with:

ifconfig_vmx0="up mtu 9000 -tso -lro -vlanhwtso -rxcsum -txcsum -rxcsum6 -txcsum6 -tso4 -tso6 -vlanhwcsum"
ifconfig_vmx1="DHCP -tso -lro -vlanhwtso -rxcsum -txcsum -rxcsum6 -txcsum6 -tso4 -tso6 -vlanhwcsum"

And I’m running:

FreeBSD namale.marcuscom.com 12.1-STABLE FreeBSD 12.1-STABLE NAMALE amd64 1201520 1201520

I most recently built this yesterday, but the previous kernel that exhibited
the problem was built about a week ago. It had the fragment fixes for
iflib.c.

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc
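One way to double-check that the offload flags in the rc.conf lines above actually took effect is to look at the options= line that ifconfig prints for the interface. What follows is only a minimal sketch: the function name enabled_offloads is hypothetical, and it assumes the usual FreeBSD output format of options=<hex><CAP1,CAP2,...> with capability names such as RXCSUM and TSO4.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read ifconfig output on stdin and
# print, one per line, each offload capability that is still listed as enabled.
enabled_offloads() {
    # Pull the comma-separated capability list out of the "options=" line.
    # The "nd6 options=" line does not match because of the leading anchor.
    opts=$(sed -n 's/^[[:space:]]*options=[0-9a-fA-F]*<\(.*\)>.*$/\1/p' | head -n 1)

    # Capabilities corresponding to the flags disabled in the rc.conf lines.
    for cap in RXCSUM TXCSUM RXCSUM_IPV6 TXCSUM_IPV6 TSO4 TSO6 LRO \
               VLAN_HWTSO VLAN_HWCSUM; do
        # Wrap both sides in commas so RXCSUM does not match RXCSUM_IPV6.
        case ",$opts," in
            *",$cap,"*) printf '%s\n' "$cap" ;;
        esac
    done
}
```

It would be used as "ifconfig vmx0 | enabled_offloads"; no output means none of the listed offloads is reported as enabled.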
Re: Traffic "corruption" in 12-stable
On 27/7/20 3:00 pm, Eugene Grosbein wrote:
> 27.07.2020 5:16, Joe Clarke wrote:
>
>> About two weeks ago, I upgraded from the latest 11-stable to the latest
>> 12-stable. After that, I periodically see the network throughput come to a
>> near standstill. This FreeBSD machine is an ESXi VM with two interfaces.
>> It acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It
>> runs ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a
>> tap0 L2 VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN
>> side) uses the default 1500.
>>
>> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN
>> ping times), I know the problem has occurred because my lldpd reports:
>>
>> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>>
>> And if I turn on ipfw verbose messages, I see tons of:
>>
>> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>>
>> This leads me to believe packets are being corrupted on ingress. I’ve
>> applied all the recent iflib changes, but the problem persists. What causes
>> it, I don’t know.
>>
>> The only thing that changed (and yes, it’s a big one) is I upgraded to
>> 12-stable. Meaning, the rest of the network infra and topology has remained
>> the same. This did not happen at all in 11-stable.
>>
>> I’m open to suggestions.
>
> First, try:
>
> ifconfig $ifname -rxcsum -txcsum

And possibly " -vlanhwtso -tso4" as well.

Graham
Re: Traffic "corruption" in 12-stable
27.07.2020 5:16, Joe Clarke wrote:

> About two weeks ago, I upgraded from the latest 11-stable to the latest
> 12-stable. After that, I periodically see the network throughput come to a
> near standstill. This FreeBSD machine is an ESXi VM with two interfaces. It
> acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs
> ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2
> VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses
> the default 1500.
>
> Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN ping
> times), I know the problem has occurred because my lldpd reports:
>
> Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0
>
> And if I turn on ipfw verbose messages, I see tons of:
>
> Jul 26 16:02:23 namale kernel: ipfw: pullup failed
>
> This leads me to believe packets are being corrupted on ingress. I’ve
> applied all the recent iflib changes, but the problem persists. What causes
> it, I don’t know.
>
> The only thing that changed (and yes, it’s a big one) is I upgraded to
> 12-stable. Meaning, the rest of the network infra and topology has remained
> the same. This did not happen at all in 11-stable.
>
> I’m open to suggestions.

First, try:

ifconfig $ifname -rxcsum -txcsum
Traffic "corruption" in 12-stable
About two weeks ago, I upgraded from the latest 11-stable to the latest
12-stable. After that, I periodically see the network throughput come to a
near standstill. This FreeBSD machine is an ESXi VM with two interfaces. It
acts as a router. It uses vmxnet3 interfaces for both LAN and WAN. It runs
ipfw with in-kernel NAT. The LAN side uses a bridge with vmx0 and a tap0 L2
VPN interface. My LAN side uses an MTU of 9000, and my vmx1 (WAN side) uses
the default 1500.

Besides seeing massive packet loss and huge latency (~ 200 ms for on-LAN ping
times), I know the problem has occurred because my lldpd reports:

Jul 26 15:47:03 namale lldpd[1126]: frame too short for tlv received on bridge0

And if I turn on ipfw verbose messages, I see tons of:

Jul 26 16:02:23 namale kernel: ipfw: pullup failed

This leads me to believe packets are being corrupted on ingress. I’ve
applied all the recent iflib changes, but the problem persists. What causes
it, I don’t know.

The only thing that changed (and yes, it’s a big one) is I upgraded to
12-stable. Meaning, the rest of the network infra and topology has remained
the same. This did not happen at all in 11-stable.

I’m open to suggestions.

Thanks.

Joe

---
PGP Key : http://www.marcuscom.com/pgp.asc
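The two log messages quoted in this report are the visible signature of the problem, so counting them in a syslog file gives a rough way to tell whether a given kernel build still exhibits it. This is a sketch only: the function name count_symptoms is hypothetical, and the log path shown in the usage line is an assumption about where syslog writes on the machine in question.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count the two symptom messages
# from the report in a syslog-style file given as the first argument.
count_symptoms() {
    logfile=$1
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi

    # grep -c prints the number of matching lines but exits non-zero when the
    # count is 0; "|| true" keeps that from aborting a caller using "set -e".
    pullup=$(grep -c 'ipfw: pullup failed' "$logfile" || true)
    lldp=$(grep -c 'frame too short for tlv' "$logfile" || true)

    printf 'pullup_failed=%s frame_too_short=%s\n' "$pullup" "$lldp"
}
```

For example, "count_symptoms /var/log/messages" prints both counts on one line. The "pullup failed" lines only appear while ipfw verbose messages are turned on, as noted in the report, so a zero count there is meaningful only with verbose logging enabled.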