Re: NFSv3 issues with latest -current
On Wed, 1 Nov 2017 00:27:50 +, Rick Macklem wrote:
> Rodney W. Grimes wrote:
> [stuff snipped]
>> I wrote:
>>> Btw, NFS often causes this because...
>>> - Typically TSO is limited to a 64K packet (including TCP/IP and MAC
>>>   headers).
>>> - When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC
>>>   headers for an RPC (or a multiple of 64K like 128K).
>>>   --> This results in tcp_output() generating a 64K TSO segment followed
>>>       by a small TCP segment (since another RPC message doesn't usually
>>>       end up queued quickly enough to fill in the rest of the second TCP
>>>       segment).
>>> - Also, at the end of file, you can get an RPC which is just under 64K
>>>   including NFS and TCP/IP headers. (The drivers often broke when adding
>>>   the MAC header bumped this case to > 64K.)
>>>
>>> Thanks go to Yuri for diagnosing this, rick
>>
>> Just a thought, not asking anyone to write one :-)
>>
>> It would be handy to have some sh(1) scripts that can exercise this bug
>> case and have it readily available to network driver authors for testing
>> the tso (or other large segment) code.
>
> You can't easily reproduce this from userland. It depends on the way NFS
> fills in the mbuf chain for I/O RPCs. (iSCSI does something similar.)
> However, if your shell script does an NFS mount and then writes/reads a
> file just under 64K in size on the mount...

Yes, I should be able to test this; it's not a production system in any
case.

And just in case: it's not related to NFS (sorry for jumping to guesses,
Rick). scp behaves the same, giving a transfer rate of 10kbps, and 10MBps
with that change backed out.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: NFSv3 issues with latest -current
Rodney W. Grimes wrote:
[stuff snipped]
> I wrote:
>> Btw, NFS often causes this because...
>> - Typically TSO is limited to a 64K packet (including TCP/IP and MAC
>>   headers).
>> - When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC
>>   headers for an RPC (or a multiple of 64K like 128K).
>>   --> This results in tcp_output() generating a 64K TSO segment followed
>>       by a small TCP segment (since another RPC message doesn't usually
>>       end up queued quickly enough to fill in the rest of the second TCP
>>       segment).
>> - Also, at the end of file, you can get an RPC which is just under 64K
>>   including NFS and TCP/IP headers. (The drivers often broke when adding
>>   the MAC header bumped this case to > 64K.)
>>
>> Thanks go to Yuri for diagnosing this, rick
>
> Just a thought, not asking anyone to write one :-)
>
> It would be handy to have some sh(1) scripts that can exercise this bug
> case and have it readily available to network driver authors for testing
> the tso (or other large segment) code.

You can't easily reproduce this from userland. It depends on the way NFS
fills in the mbuf chain for I/O RPCs. (iSCSI does something similar.)
However, if your shell script does an NFS mount and then writes/reads a
file just under 64K in size on the mount...

rick
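The "NFS mount, then write/read a file just under 64K" idea might be sketched
roughly as follows. This is only an illustration, not a tested driver
regression script: the function name and the list of sizes are assumptions,
and it expects the NFS mount to exist already and to be passed in as a
directory.

```sh
#!/bin/sh
# Sketch only: copy files whose sizes sit just under and at 64K onto an
# already-mounted NFS directory, then compare them with local originals.
# The function name and the size list are illustrative assumptions.

exercise_near_64k() {
    mnt=$1                          # directory on the NFS mount under test
    ref=$(mktemp -d) || return 1    # local scratch dir for reference copies
    rc=0
    for size in 65000 65400 65500 65535 65536; do
        # Random data makes corruption easier to spot than zeros would.
        head -c "$size" /dev/urandom > "$ref/f.$size"
        cp "$ref/f.$size" "$mnt/f.$size" || rc=1        # write path
        if cmp -s "$ref/f.$size" "$mnt/f.$size"; then   # read path
            echo "size $size: ok"
        else
            echo "size $size: MISMATCH"
            rc=1
        fi
        rm -f "$mnt/f.$size"
    done
    rm -rf "$ref"
    return $rc
}
```

It could be called as `exercise_near_64k /mnt/nfs` (a hypothetical mount
point). With the kind of breakage described in this thread the copy may simply
stall rather than report a mismatch, so a hang is itself a result. The
read-back may also be served from the client's cache, so the read path is
better exercised after a remount or from a second client.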
Re: NFSv3 issues with latest -current
> Cy Schubert wrote:
> [stuff snipped]
> >The sysctl is net.inet.tcp.tso. You can also disable tso through ifconfig
> >for an interface.
> >
> For testing this case, I'd recommend using the sysctl. Since the net device
> driver is often the culprit, that device driver might not handle the
> "ifconfig" correctly either.
>
> Btw, NFS often causes this because...
> - Typically TSO is limited to a 64K packet (including TCP/IP and MAC
>   headers).
> - When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC
>   headers for an RPC (or a multiple of 64K like 128K).
>   --> This results in tcp_output() generating a 64K TSO segment followed
>       by a small TCP segment (since another RPC message doesn't usually
>       end up queued quickly enough to fill in the rest of the second TCP
>       segment).
> - Also, at the end of file, you can get an RPC which is just under 64K
>   including NFS and TCP/IP headers. (The drivers often broke when adding
>   the MAC header bumped this case to > 64K.)
>
> Thanks go to Yuri for diagnosing this, rick

Just a thought, not asking anyone to write one :-)

It would be handy to have some sh(1) scripts that can exercise this bug
case and have it readily available to network driver authors for testing
the tso (or other large segment) code.

--
Rod Grimes    rgri...@freebsd.org
Re: NFSv3 issues with latest -current
Cy Schubert wrote:
[stuff snipped]
>The sysctl is net.inet.tcp.tso. You can also disable tso through ifconfig
>for an interface.
>
For testing this case, I'd recommend using the sysctl. Since the net device
driver is often the culprit, that device driver might not handle the
"ifconfig" correctly either.

Btw, NFS often causes this because...
- Typically TSO is limited to a 64K packet (including TCP/IP and MAC
  headers).
- When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC
  headers for an RPC (or a multiple of 64K like 128K).
  --> This results in tcp_output() generating a 64K TSO segment followed
      by a small TCP segment (since another RPC message doesn't usually
      end up queued quickly enough to fill in the rest of the second TCP
      segment).
- Also, at the end of file, you can get an RPC which is just under 64K
  including NFS and TCP/IP headers. (The drivers often broke when adding
  the MAC header bumped this case to > 64K.)

Thanks go to Yuri for diagnosing this, rick
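The size arithmetic in the list above can be made concrete with a small
sketch. It is only an illustration: `split_rpc` is a hypothetical helper, all
byte counts passed to it are assumptions, and it ignores MSS rounding and TCP
options, which the real tcp_output() has to deal with.

```sh
#!/bin/sh
# Sketch only: arithmetic for how one NFS I/O RPC might be cut into a
# full-size TSO burst plus a small trailing TCP segment.
# split_rpc is a hypothetical helper; every size passed in is an assumption.

split_rpc() {
    rpc_bytes=$1    # NFS data plus RPC/NFS header, as handed to TCP
    tso_limit=$2    # largest packet accepted for TSO, headers included
    hdr_bytes=$3    # header bytes counted against that limit
    first=$((tso_limit - hdr_bytes))    # TCP payload that fits in one burst
    if [ "$rpc_bytes" -le "$first" ]; then
        echo "$rpc_bytes 0"             # whole RPC fits in a single burst
    else
        echo "$first $((rpc_bytes - first))"
    fi
}
```

Assuming 64K of data plus a 160-byte RPC header and 54 bytes of MAC + IP + TCP
headers, `split_rpc 65696 65536 54` prints `65482 214`: one large burst and a
214-byte leftover. The end-of-file case shows up when the header count
differs: `split_rpc 65490 65536 40` prints `65490 0` (fits when only TCP/IP
headers are counted), while `split_rpc 65490 65536 54` prints `65482 8` (no
longer fits once the MAC header is counted too).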
Re: NFSv3 issues with latest -current
On Sun, 29 Oct 2017 18:11:41 +0300, Yuri Pankov wrote:
> On Sun, 29 Oct 2017 13:13:31 +, Rick Macklem wrote:
>> Yuri Pankov wrote:
>>> All file operations (e.g. copying the file over NFSv3 for me) seem to be
>>> stuck running the latest -current (r325100). Reverting just the kernel
>>> to r323779 (arbitrarily chosen) seems to help. I noticed the "Stale file
>>> handle when mounting nfs" message but I don't get the "stale file
>>> handle" messages from mount, probably as I'm not running any Linux
>>> clients.
>>
>> These kinds of problems are usually related to your net interface device
>> driver or the TCP stack.
>>
>> A couple of things to try:
>> - Disable TSO (look for a sysctl with "tso" in it).
>> - Try using mount options rsize=32768,wsize=32768 to reduce the I/O size.
>>   Some device drivers don't handle long chains of mbufs well, especially
>>   when the size is near 64K. (These issues have been fixed in current,
>>   but if a bug slips into a net driver update or ???)
>> - Look at recent changes to the net device driver you are using and try
>>   reverting those changes if you can do so.
>> - Capture packets and look at them in wireshark (which knows NFS) and see
>>   what is going on the wire.
>>
>> There haven't been any recent changes to NFS that should affect NFSv3
>> mounts or to the kernel rpc, so I doubt the NFSv4.1 changes would be
>> involved.
>
> Thanks for the hints, Rick! Indeed, it was one of the changes to
> sys/dev/e1000; reverting just that commit made everything look normal
> again (CC'ing the author).

One thing I forgot to mention here: the problem is visible only if the
client side has an MTU of 1500 configured. When both sides have MTU 9000,
everything looks normal. I noticed this when my XenServer (which has an MTU
of 1500 on its management interface) wasn't able to read the ISO from the
NFS datastore.

> The NIC is:
>
> igb0@pci0:2:0:0:class=0x02 card=0x10c915d9 chip=0x10c98086 rev=0x01 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82576 Gigabit Network Connection'
>     class      = network
>     subclass   = ethernet
>
> Interface configuration (note the MTU):
>
> igb0: flags=8843metric 0 mtu 9000
>         options=e525bb
>         ether 00:25:90:72:54:22
>         inet6 fe80::225:90ff:fe72:5422%igb0 prefixlen 64 scopeid 0x1
>         inet 192.168.1.4 netmask 0xff00 broadcast 192.168.1.255
>         nd6 options=23
>         media: Ethernet autoselect (1000baseT )
>         status: active
>
> And the commit itself:
>
> commit f81cb8df32ae96299b8bbc2e948c17ad3aab59ca
> Author: shurd
> Date:   Sat Sep 23 01:33:20 2017 +
>
>     Some small packet performance improvements
>
>     If the packet is smaller than MTU, disable the TSO flags.
>     Move TCP header parsing inside the IS_TSO?() test.
>     Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need
>     to be zeroed before TX.
>
>     Reviewed by:    sbruno
>     Approved by:    sbruno (mentor)
>     Sponsored by:   Limelight Networks
>     Differential Revision:  https://reviews.freebsd.org/D12442
>
> Notes:
>     svn path=/head/; revision=323941
Re: NFSv3 issues with latest -current
In message, Rick Macklem writes:
> Yuri Pankov wrote:
> > All file operations (e.g. copying the file over NFSv3 for me) seem to be
> > stuck running the latest -current (r325100). Reverting just the kernel
> > to r323779 (arbitrarily chosen) seems to help. I noticed the "Stale file
> > handle when mounting nfs" message but I don't get the "stale file
> > handle" messages from mount, probably as I'm not running any Linux
> > clients.
> These kinds of problems are usually related to your net interface device
> driver or the TCP stack.
>
> A couple of things to try:
> - Disable TSO (look for a sysctl with "tso" in it).

The sysctl is net.inet.tcp.tso. You can also disable tso through ifconfig
for an interface.

Additionally, though not directly related to this, TSO is not generally
firewall friendly either. Our friends at pfSense document that it should be
disabled. It can cause issues with our ipfilter firewall on some interfaces
(e.g. fxp). I also recall reading a Juniper KB about path MTU discovery and
TSO not playing well together.

Not that TSO is bad, but disabling it is something to try when diagnosing
network problems.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: http://www.FreeBSD.org

The need of the many outweighs the greed of the few.
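For reference, the two ways of turning TSO off mentioned above could be
wrapped like this. It is a sketch under assumptions: `run` and `disable_tso`
are hypothetical helper names, the interface name is whatever the affected
machine uses, and the commands need root on a FreeBSD system.

```sh
#!/bin/sh
# Sketch only: turn TSO off globally and on one interface.
# Set DRYRUN=1 to print the commands instead of running them.
# run and disable_tso are hypothetical helper names.

run() {
    ${DRYRUN:+echo} "$@"
}

disable_tso() {
    ifname=$1                        # e.g. igb0
    run sysctl net.inet.tcp.tso=0    # global switch in the TCP stack
    run ifconfig "$ifname" -tso      # per-interface capability flag
}
```

With `DRYRUN=1` set, `disable_tso igb0` only prints the `sysctl` and
`ifconfig` lines, which is handy for checking them before running as root.
Setting the sysctl back to 1 and using `ifconfig igb0 tso` should re-enable
TSO after testing.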
Re: NFSv3 issues with latest -current
On Sun, 29 Oct 2017 13:13:31 +, Rick Macklem wrote:
> Yuri Pankov wrote:
>> All file operations (e.g. copying the file over NFSv3 for me) seem to be
>> stuck running the latest -current (r325100). Reverting just the kernel
>> to r323779 (arbitrarily chosen) seems to help. I noticed the "Stale file
>> handle when mounting nfs" message but I don't get the "stale file
>> handle" messages from mount, probably as I'm not running any Linux
>> clients.
>
> These kinds of problems are usually related to your net interface device
> driver or the TCP stack.
>
> A couple of things to try:
> - Disable TSO (look for a sysctl with "tso" in it).
> - Try using mount options rsize=32768,wsize=32768 to reduce the I/O size.
>   Some device drivers don't handle long chains of mbufs well, especially
>   when the size is near 64K. (These issues have been fixed in current,
>   but if a bug slips into a net driver update or ???)
> - Look at recent changes to the net device driver you are using and try
>   reverting those changes if you can do so.
> - Capture packets and look at them in wireshark (which knows NFS) and see
>   what is going on the wire.
>
> There haven't been any recent changes to NFS that should affect NFSv3
> mounts or to the kernel rpc, so I doubt the NFSv4.1 changes would be
> involved.

Thanks for the hints, Rick! Indeed, it was one of the changes to
sys/dev/e1000; reverting just that commit made everything look normal
again (CC'ing the author).

The NIC is:

igb0@pci0:2:0:0:class=0x02 card=0x10c915d9 chip=0x10c98086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82576 Gigabit Network Connection'
    class      = network
    subclass   = ethernet

Interface configuration (note the MTU):

igb0: flags=8843metric 0 mtu 9000
        options=e525bb
        ether 00:25:90:72:54:22
        inet6 fe80::225:90ff:fe72:5422%igb0 prefixlen 64 scopeid 0x1
        inet 192.168.1.4 netmask 0xff00 broadcast 192.168.1.255
        nd6 options=23
        media: Ethernet autoselect (1000baseT )
        status: active

And the commit itself:

commit f81cb8df32ae96299b8bbc2e948c17ad3aab59ca
Author: shurd
Date:   Sat Sep 23 01:33:20 2017 +

    Some small packet performance improvements

    If the packet is smaller than MTU, disable the TSO flags.
    Move TCP header parsing inside the IS_TSO?() test.
    Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need
    to be zeroed before TX.

    Reviewed by:    sbruno
    Approved by:    sbruno (mentor)
    Sponsored by:   Limelight Networks
    Differential Revision:  https://reviews.freebsd.org/D12442

Notes:
    svn path=/head/; revision=323941
Re: NFSv3 issues with latest -current
Yuri Pankov wrote:
> All file operations (e.g. copying the file over NFSv3 for me) seem to be
> stuck running the latest -current (r325100). Reverting just the kernel
> to r323779 (arbitrarily chosen) seems to help. I noticed the "Stale file
> handle when mounting nfs" message but I don't get the "stale file
> handle" messages from mount, probably as I'm not running any Linux
> clients.

These kinds of problems are usually related to your net interface device
driver or the TCP stack.

A couple of things to try:
- Disable TSO (look for a sysctl with "tso" in it).
- Try using mount options rsize=32768,wsize=32768 to reduce the I/O size.
  Some device drivers don't handle long chains of mbufs well, especially
  when the size is near 64K. (These issues have been fixed in current, but
  if a bug slips into a net driver update or ???)
- Look at recent changes to the net device driver you are using and try
  reverting those changes if you can do so.
- Capture packets and look at them in wireshark (which knows NFS) and see
  what is going on the wire.

There haven't been any recent changes to NFS that should affect NFSv3
mounts or to the kernel rpc, so I doubt the NFSv4.1 changes would be
involved.

rick
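The reduced I/O size mount and the packet capture from the list above might
look something like this on a FreeBSD client. Treat it as a sketch: the helper
names, server name, export path, mount point and capture file are all
placeholders, and the `nfsv3` and `tcp` mount options are added here as
assumptions about the setup.

```sh
#!/bin/sh
# Sketch only: mount with 32K read/write sizes and capture NFS traffic.
# Set DRYRUN=1 to print the commands instead of running them (root needed).
# run, mount_small_io and capture_nfs are hypothetical helper names.

run() {
    ${DRYRUN:+echo} "$@"
}

mount_small_io() {
    server=$1; export_path=$2; mnt=$3
    # rsize/wsize of 32768 keep each I/O RPC well below the 64K region.
    run mount -t nfs -o nfsv3,tcp,rsize=32768,wsize=32768 \
        "$server:$export_path" "$mnt"
}

capture_nfs() {
    ifname=$1; server=$2; pcap=$3
    # Full-length packets (-s 0) to or from the server on the NFS port.
    run tcpdump -i "$ifname" -s 0 -w "$pcap" host "$server" and port 2049
}
```

For example, `mount_small_io nfsserver /export /mnt` followed by
`capture_nfs igb0 nfsserver /tmp/nfs.pcap` in another terminal; the resulting
file can then be opened in wireshark. If transfers work with the smaller sizes
but stall with the defaults, that points toward the large-segment path in the
driver rather than NFS itself.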
NFSv3 issues with latest -current
Hi,

All file operations (e.g. copying the file over NFSv3 for me) seem to be
stuck running the latest -current (r325100). Reverting just the kernel
to r323779 (arbitrarily chosen) seems to help. I noticed the "Stale file
handle when mounting nfs" message but I don't get the "stale file
handle" messages from mount, probably as I'm not running any Linux clients.

Is anyone else seeing the same? Is there any other information I need to
provide here?