Re: NFSv3 issues with latest -current

2017-10-31 Thread Yuri Pankov

On Wed, 1 Nov 2017 00:27:50 +, Rick Macklem wrote:

Rodney W. Grimes wrote:
[stuff snipped]

I wrote:

Btw, NFS often causes this because...
- Typically TSO is limited to a 64K packet (including TCP/IP and MAC headers).
- When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC headers
   for an RPC (or a multiple of 64K like 128K).
--> This results in tcp_output() generating a 64K TSO segment followed by a
  small TCP segment (since another RPC message doesn't usually end up
  queued quickly enough to fill in the rest of the second TCP segment).
- Also, at the end of file, you can get an RPC which is just under 64K including
   NFS and TCP/IP headers. (The drivers often broke when adding the MAC
   header bumped this case to > 64K.)

Thanks go to Yuri for diagnosing this, rick


Just a thought, not asking anyone to write one :-)

It would be handy to have some sh(1) scripts that can exercise this bug
case and have it readily available to network driver authors for testing
the tso (or other large segment) code.

You can't easily reproduce this from userland. It depends on the way NFS fills in
the mbuf chain for I/O RPCs. (iSCSI does something similar.)

However, if your shell script does an NFS mount and then writes/reads a
file just under 64K in size on the mount...


Yes, I should be able to test this; it's not a production system in any case.
And just in case: it's not related to NFS (sorry for jumping to guesses,
Rick). scp behaves the same, giving a fair transfer rate of 10kbps, versus
10MBps with that change backed out.


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: NFSv3 issues with latest -current

2017-10-31 Thread Rick Macklem
Rodney W. Grimes wrote:
[stuff snipped]
> I wrote:
>> Btw, NFS often causes this because...
>> - Typically TSO is limited to a 64K packet (including TCP/IP and MAC headers).
>> - When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC headers
>>   for an RPC (or a multiple of 64K like 128K).
>> --> This results in tcp_output() generating a 64K TSO segment followed by a
>>  small TCP segment (since another RPC message doesn't usually end up
>>  queued quickly enough to fill in the rest of the second TCP segment).
>> - Also, at the end of file, you can get an RPC which is just under 64K including
>>   NFS and TCP/IP headers. (The drivers often broke when adding the MAC
>>   header bumped this case to > 64K.)
>>
>> Thanks go to Yuri for diagnosing this, rick
>
> Just a thought, not asking anyone to write one :-)
>
> It would be handy to have some sh(1) scripts that can exercise this bug
> case and have it readily available to network driver authors for testing
> the tso (or other large segment) code.
You can't easily reproduce this from userland. It depends on the way NFS fills in
the mbuf chain for I/O RPCs. (iSCSI does something similar.)

However, if your shell script does an NFS mount and then writes/reads a
file just under 64K in size on the mount...
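
A rough sketch of the kind of script suggested here, assuming the NFS mount already exists. The function name `exercise_dir` and the list of sizes are illustrative choices, not an established test, and pointing it at a local directory only checks the script itself, not a driver.

```sh
#!/bin/sh
# Sketch: write and read back files whose sizes straddle 64K.
# Assumption: $1 is a directory on an existing NFS mount; the sizes below
# are illustrative guesses at "just under 64K", not values from the thread.

exercise_dir() {
    dir=$1
    src=$(mktemp) || return 1
    rc=0
    for size in 65000 65400 65535 65536 131072; do
        head -c "$size" /dev/urandom > "$src"
        # Write through the mount, then read back and compare.
        cp "$src" "$dir/tso_test.$size" || rc=1
        if cmp -s "$src" "$dir/tso_test.$size"; then
            echo "size $size: ok"
        else
            echo "size $size: MISMATCH"
            rc=1
        fi
        rm -f "$dir/tso_test.$size"
    done
    rm -f "$src"
    return $rc
}

# Run only when a target directory is given on the command line.
if [ $# -ge 1 ]; then
    exercise_dir "$1"
fi
```

Two caveats: the read-back may be served from the client's cache rather than the wire, so unmounting and remounting between the write and the compare is probably needed to exercise the read path, and a stall like the one reported in this thread would show up as the script hanging rather than as a mismatch.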

rick


Re: NFSv3 issues with latest -current

2017-10-31 Thread Rodney W. Grimes
> Cy Schubert wrote:
> [stuff snipped]
> >The sysctl is net.inet.tcp.tso. You can also disable tso through ifconfig
> >for an interface.
> >
> For testing this case, I'd recommend using the sysctl. Since the net device
> driver is often the culprit, that device driver might not handle the "ifconfig"
> correctly either.
> 
> Btw, NFS often causes this because...
> - Typically TSO is limited to a 64K packet (including TCP/IP and MAC headers).
> - When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC headers
>   for an RPC (or a multiple of 64K like 128K).
> --> This results in tcp_output() generating a 64K TSO segment followed by a
>  small TCP segment (since another RPC message doesn't usually end up
>  queued quickly enough to fill in the rest of the second TCP segment).
> - Also, at the end of file, you can get an RPC which is just under 64K including
>   NFS and TCP/IP headers. (The drivers often broke when adding the MAC
>   header bumped this case to > 64K.)
> 
> Thanks go to Yuri for diagnosing this, rick

Just a thought, not asking anyone to write one :-)

It would be handy to have some sh(1) scripts that can exercise this bug
case and have it readily available to network driver authors for testing
the tso (or other large segment) code.

-- 
Rod Grimes rgri...@freebsd.org


Re: NFSv3 issues with latest -current

2017-10-31 Thread Rick Macklem
Cy Schubert wrote:
[stuff snipped]
>The sysctl is net.inet.tcp.tso. You can also disable tso through ifconfig
>for an interface.
>
For testing this case, I'd recommend using the sysctl. Since the net device
driver is often the culprit, that device driver might not handle the "ifconfig"
correctly either.

Btw, NFS often causes this because...
- Typically TSO is limited to a 64K packet (including TCP/IP and MAC headers).
- When NFS does reading/writing, it will do 64K + NFS, TCP/IP and MAC headers
  for an RPC (or a multiple of 64K like 128K).
--> This results in tcp_output() generating a 64K TSO segment followed by a
 small TCP segment (since another RPC message doesn't usually end up
 queued quickly enough to fill in the rest of the second TCP segment).
- Also, at the end of file, you can get an RPC which is just under 64K including
  NFS and TCP/IP headers. (The drivers often broke when adding the MAC
  header bumped this case to > 64K.)

Thanks go to Yuri for diagnosing this, rick
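
The size arithmetic in the list above can be sketched as follows. The header sizes are assumptions, not figures from the thread: 40 bytes of TCP/IP headers without options, a 14-byte Ethernet header, a hypothetical 160 bytes of RPC/NFS header, and a 65535-byte limit standing in for "64K".

```sh
#!/bin/sh
# Sketch: how one NFS I/O RPC splits across TSO segments.
# Assumptions (not from the thread): 40 bytes of TCP/IP headers without
# options, a 14-byte Ethernet header, and a hypothetical RPC/NFS header
# size passed as $2; 65535 bytes stands in for the "64K" TSO limit.

tso_split() {
    data=$1       # NFS I/O size in bytes
    rpc_hdr=$2    # RPC + NFS header bytes (hypothetical)
    tso_max=65535
    hdrs=$((40 + 14))
    payload_max=$((tso_max - hdrs))    # TCP payload that fits in one TSO segment
    total=$((data + rpc_hdr))          # bytes queued on the socket for one RPC
    full=$((total / payload_max))      # number of full-size TSO segments
    rest=$((total % payload_max))      # payload left for the trailing segment
    echo "full=$full rest=$rest"
}

tso_split 65536 160    # full 64K write: prints full=1 rest=215
tso_split 65330 160    # end-of-file case: prints full=1 rest=9
```

In the second call the RPC plus TCP/IP headers comes to 65530 bytes, under the limit, but adding the 14-byte MAC header pushes it to 65544, which is the boundary case described in the last bullet.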




Re: NFSv3 issues with latest -current

2017-10-29 Thread Yuri Pankov

On Sun, 29 Oct 2017 18:11:41 +0300, Yuri Pankov wrote:

On Sun, 29 Oct 2017 13:13:31 +, Rick Macklem wrote:

Yuri Pankov wrote:

All file operations (e.g. copying the file over NFSv3 for me) seem to be
stuck running the latest -current (r325100).  Reverting just the kernel
to r323779 (arbitrarily chosen) seems to help.  I noticed the "Stale file
handle when mounting nfs" message but I don't get the "stale file
handle" messages from mount, probably as I'm not running any linux clients.

These kinds of problems are usually related to your net interface device
driver or the TCP stack.

A couple of things to try:
- Disable TSO (look for a sysctl with "tso" in it).
- Try using mount options rsize=32768,wsize=32768 to reduce the I/O
size. Some device drivers don't handle long chains of mbufs well,
especially when the size is near 64K.
(These issues have been fixed in current, but if a bug slips into a net driver
   update or ???)
- Look at recent changes to the net device driver you are using and try reverting
those changes if you can do so.
- Capture packets and look at them in wireshark (which knows NFS) and see
what is going on the wire.

There haven't been any recent changes to NFS that should affect NFSv3 mounts
or to the kernel rpc, so I doubt the NFSv4.1 changes would be involved.


Thanks for the hints, Rick!

Indeed, it was one of the changes to sys/dev/e1000, reverting just the
commit made everything look normal again (CC'ing the author).


One thing I forgot to mention here, the problem is visible only if the 
client side has MTU of 1500 configured; when both sides have MTU 9000, 
everything looks to be normal -- noticed this when my XenServer (having 
MTU of 1500 on management interface) wasn't able to read the ISO from 
NFS datastore.
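
For a sense of how differently the two MTU settings segment the same burst, here is a minimal sketch. It assumes IPv4 and TCP headers without options, so MSS = MTU - 40, and it does not by itself explain why the regression shows only at MTU 1500.

```sh
#!/bin/sh
# Sketch: wire segments produced from one 64K burst at a given MTU.
# Assumption: IPv4 and TCP headers without options (40 bytes), so
# MSS = MTU - 40; TCP timestamps would lower it by another 12 bytes.

segments_for() {
    mtu=$1
    bytes=$2
    mss=$((mtu - 40))
    # Round up: a partial trailing segment still costs one packet.
    echo $(( (bytes + mss - 1) / mss ))
}

echo "mtu 1500: $(segments_for 1500 65536) segments"   # 45
echo "mtu 9000: $(segments_for 9000 65536) segments"   # 8
```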



The NIC is:

igb0@pci0:2:0:0:class=0x02 card=0x10c915d9 chip=0x10c98086
rev=0x01 hdr=0x00
  vendor = 'Intel Corporation'
  device = '82576 Gigabit Network Connection'
  class  = network
  subclass   = ethernet

Interface configuration (note the MTU):

igb0: flags=8843 metric 0 mtu 9000
options=e525bb
  ether 00:25:90:72:54:22
  inet6 fe80::225:90ff:fe72:5422%igb0 prefixlen 64 scopeid 0x1
  inet 192.168.1.4 netmask 0xff00 broadcast 192.168.1.255
  nd6 options=23
  media: Ethernet autoselect (1000baseT )
  status: active

And the commit itself:

commit f81cb8df32ae96299b8bbc2e948c17ad3aab59ca
Author: shurd 
Date:   Sat Sep 23 01:33:20 2017 +

  Some small packet performance improvements

  If the packet is smaller than MTU, disable the TSO flags.
  Move TCP header parsing inside the IS_TSO?() test.
  Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need
to be zeroed before TX.

  Reviewed by:sbruno
  Approved by:sbruno (mentor)
  Sponsored by:   Limelight Networks
  Differential Revision:  https://reviews.freebsd.org/D12442

Notes:
  svn path=/head/; revision=323941






Re: NFSv3 issues with latest -current

2017-10-29 Thread Cy Schubert
In message , Rick Macklem writes:
> Yuri Pankov wrote:
> > All file operations (e.g. copying the file over NFSv3 for me) seem to be
> > stuck running the latest -current (r325100).  Reverting just the kernel
> > to r323779 (arbitrarily chosen) seems to help.  I noticed the "Stale file
> > handle when mounting nfs" message but I don't get the "stale file
> > handle" messages from mount, probably as I'm not running any linux clients.
> These kinds of problems are usually related to your net interface device
> driver or the TCP stack.
> 
> A couple of things to try:
> - Disable TSO (look for a sysctl with "tso" in it).

The sysctl is net.inet.tcp.tso. You can also disable tso through ifconfig 
for an interface.

Additionally, though not directly related to this, TSO is not generally 
firewall friendly either. Our friends at pfsense document that it should be disabled.
It can cause issues with our ipfilter firewall on some interfaces (e.g. 
fxp). I also recall reading a Juniper KB about path MTU discovery and TSO 
not playing well together.

Not that TSO is bad but disabling it is something to try when diagnosing 
network problems.
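
Both ways of turning TSO off can be sketched as below, assuming a FreeBSD host and root privileges. The interface name igb0 is simply the one from this thread, and the `run` wrapper with its DRY_RUN default is an illustrative addition so the commands are printed rather than executed.

```sh
#!/bin/sh
# Sketch: turn TSO off for a test run, globally or per interface.
# Assumptions: FreeBSD, run as root; "igb0" is only the interface from this
# thread. With DRY_RUN=1 (the default here) commands are printed, not run.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_tso_global() {
    # Stack-wide switch: TCP stops handing TSO-sized segments to any driver.
    run sysctl net.inet.tcp.tso=0
}

disable_tso_ifc() {
    # Per-interface switch: relies on the driver honouring the capability change.
    run ifconfig "$1" -tso
}

disable_tso_global
disable_tso_ifc igb0
```

Setting DRY_RUN=0 in the environment makes the same script apply the changes; `sysctl net.inet.tcp.tso=1` and `ifconfig igb0 tso` undo them.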



-- 
Cheers,
Cy Schubert 
FreeBSD UNIX:     Web:  http://www.FreeBSD.org

The need of the many outweighs the greed of the few.





Re: NFSv3 issues with latest -current

2017-10-29 Thread Yuri Pankov

On Sun, 29 Oct 2017 13:13:31 +, Rick Macklem wrote:

Yuri Pankov wrote:

All file operations (e.g. copying the file over NFSv3 for me) seem to be
stuck running the latest -current (r325100).  Reverting just the kernel
to r323779 (arbitrarily chosen) seems to help.  I noticed the "Stale file
handle when mounting nfs" message but I don't get the "stale file
handle" messages from mount, probably as I'm not running any linux clients.

These kinds of problems are usually related to your net interface device
driver or the TCP stack.

A couple of things to try:
- Disable TSO (look for a sysctl with "tso" in it).
- Try using mount options rsize=32768,wsize=32768 to reduce the I/O
   size. Some device drivers don't handle long chains of mbufs well,
   especially when the size is near 64K.
(These issues have been fixed in current, but if a bug slips into a net driver
  update or ???)
- Look at recent changes to the net device driver you are using and try reverting
   those changes if you can do so.
- Capture packets and look at them in wireshark (which knows NFS) and see
   what is going on the wire.

There haven't been any recent changes to NFS that should affect NFSv3 mounts
or to the kernel rpc, so I doubt the NFSv4.1 changes would be involved.


Thanks for the hints, Rick!

Indeed, it was one of the changes to sys/dev/e1000, reverting just the 
commit made everything look normal again (CC'ing the author).


The NIC is:

igb0@pci0:2:0:0:class=0x02 card=0x10c915d9 chip=0x10c98086 
rev=0x01 hdr=0x00

vendor = 'Intel Corporation'
device = '82576 Gigabit Network Connection'
class  = network
subclass   = ethernet

Interface configuration (note the MTU):

igb0: flags=8843 metric 0 mtu 9000
options=e525bb
ether 00:25:90:72:54:22
inet6 fe80::225:90ff:fe72:5422%igb0 prefixlen 64 scopeid 0x1
inet 192.168.1.4 netmask 0xff00 broadcast 192.168.1.255
nd6 options=23
media: Ethernet autoselect (1000baseT )
status: active

And the commit itself:

commit f81cb8df32ae96299b8bbc2e948c17ad3aab59ca
Author: shurd 
Date:   Sat Sep 23 01:33:20 2017 +

Some small packet performance improvements

If the packet is smaller than MTU, disable the TSO flags.
Move TCP header parsing inside the IS_TSO?() test.
Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need 
to be zeroed before TX.


Reviewed by:sbruno
Approved by:sbruno (mentor)
Sponsored by:   Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D12442

Notes:
svn path=/head/; revision=323941


Re: NFSv3 issues with latest -current

2017-10-29 Thread Rick Macklem
Yuri Pankov wrote:
> All file operations (e.g. copying the file over NFSv3 for me) seem to be
> stuck running the latest -current (r325100).  Reverting just the kernel
> to r323779 (arbitrarily chosen) seems to help.  I noticed the "Stale file
> handle when mounting nfs" message but I don't get the "stale file
> handle" messages from mount, probably as I'm not running any linux clients.
These kinds of problems are usually related to your net interface device
driver or the TCP stack.

A couple of things to try:
- Disable TSO (look for a sysctl with "tso" in it).
- Try using mount options rsize=32768,wsize=32768 to reduce the I/O
  size. Some device drivers don't handle long chains of mbufs well,
  especially when the size is near 64K.
(These issues have been fixed in current, but if a bug slips into a net driver
 update or ???)
- Look at recent changes to the net device driver you are using and try reverting
  those changes if you can do so.
- Capture packets and look at them in wireshark (which knows NFS) and see
  what is going on the wire.
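
The mount-option and packet-capture suggestions above might look like this on a FreeBSD client. The export "server:/export", the mount point "/mnt", the interface "igb0" and the capture file name are placeholders, and the helper functions are hypothetical; they only print the command lines, which would then be run as root.

```sh
#!/bin/sh
# Sketch: remount with smaller I/O sizes and capture the NFS traffic.
# Assumptions: FreeBSD client; "server:/export", "/mnt" and "igb0" are
# placeholders; NFS over TCP on the standard port 2049.
# These helpers only print the command lines; run the output as root.

nfs_mount_cmd() {
    # $1 = remote export, $2 = mount point, $3 = rsize/wsize in bytes
    echo "mount -t nfs -o nfsv3,tcp,rsize=$3,wsize=$3 $1 $2"
}

nfs_capture_cmd() {
    # $1 = interface, $2 = server host, $3 = output pcap file
    # -s 0 keeps whole packets so wireshark can decode the NFS payload.
    echo "tcpdump -i $1 -s 0 -w $3 host $2 and port 2049"
}

nfs_mount_cmd server:/export /mnt 32768
nfs_capture_cmd igb0 server nfs.pcap
```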

There haven't been any recent changes to NFS that should affect NFSv3 mounts
or to the kernel rpc, so I doubt the NFSv4.1 changes would be involved.

rick


NFSv3 issues with latest -current

2017-10-29 Thread Yuri Pankov

Hi,

All file operations (e.g. copying the file over NFSv3 for me) seem to be 
stuck running the latest -current (r325100).  Reverting just the kernel 
to r323779 (arbitrarily chosen) seems to help.  I noticed the "Stale file 
handle when mounting nfs" message but I don't get the "stale file 
handle" messages from mount, probably as I'm not running any linux clients.


Is anyone else seeing the same?  Is there any other information I need 
to provide here?
