[Bug 1887607] Re: NFSv4.1: Interrupted connections cause high bandwidth RPC ping-pong between client and server

Matthew Ruffell Tue, 28 Jul 2020 20:56:23 -0700

** Summary changed:

- NFS4.2: Cutting and Pasting files from NFS sec=sys to NFS sec=krb5p causes NFS to hang
+ NFSv4.1: Interrupted connections cause high bandwidth RPC ping-pong between client and server


** Description changed:

  BugLink: https://bugs.launchpad.net/bugs/1887607
  
  [Impact]
  
- If you have a desktop system, with two NFS v4.2 mounts: 
- - One that uses the baseline IP based security, aka sec=sys, 
- - and the other that uses Kerberos sec=krb5p based security,
+ There is a bug in NFS v4.1 that causes a large number of RPC calls
+ between a client and server when a previous RPC call is interrupted.
+ This uses a large amount of bandwidth and can saturate the network.
  
- If you try and cut a file from the normal NFS mount, and paste it to a
- directory on the kerberos krb5p mount (using Nautilus), the NFS
- subsystem will lock up, Nautilus will hang, and the file won't be moved.
+ The symptoms are as follows:
  
- The problem only reproduces if you cut and paste. Copying and pasting
- does not trigger any problems. Using mv in terminal doesn't reproduce
- either, you need to use Nautilus.
+ * On NFS clients:
+ Attempts to access mounted NFS shares associated with the affected
+ server block indefinitely.
  
- The issue was introduced into 4.15.0-60-generic, by the commit:
+ * On the network:
+ A storm of repeated RPCs between NFS client and server uses a lot of
+ bandwidth. Each RPC is acknowledged by the server with an
+ NFS4ERR_SEQ_MISORDERED error.
  
- commit 594d1644cd59447f4fceb592448d5cd09eb09b5e
- Author: Chris Perl <[email protected]>
- Date: Mon Dec 17 10:56:38 2018 -0500
- Subject: NFS: nfs_compare_mount_options always compare auth flavors.
- Link: 
https://github.com/torvalds/linux/commit/594d1644cd59447f4fceb592448d5cd09eb09b5e
 
+ * Other NFS clients connected to the same NFS server:
+ Performance drops dramatically.
  
- It was backported to 4.15.0-60-generic from upstream -stable, and landed
- in 4.4.175, 4.14.99 and 4.19.21. The commit itself does not actually
- cause the problem, it simply removes a check for NFS server security
- settings, which simply reveals a broken codepath which the testcase
- triggers.
+ This occurs during a "false retry", when a client attempts to make a new
+ RPC call using a slot+sequence number that references an older, cached
+ call. This happens when a user process interrupts an RPC call that is in
+ progress.
  
- Xenial 4.4.0-185-generic is not affected, only Bionic 4.15.0-60-generic
- onward.
+ I had previously fixed this for Disco in bug 1828978, and now a customer
+ has run into the issue in Bionic. A reproducer is supplied in the
+ testcase section, which was something missing from bug 1828978, since we
+ never determined how the issue actually occurred back then.
  
  [Fix]
  
- The fix landed in 5.1-rc1, in the following commit:
+ This was fixed in 5.1 upstream with the below commit:
  
- commit 02ef04e432babf8fc703104212314e54112ecd2d
- Author: Chuck Lever <[email protected]>
- Date: Mon Feb 11 11:25:25 2019 -0500
- Subject: NFS: Account for XDR pad of buf->pages
- Link: 
https://github.com/torvalds/linux/commit/02ef04e432babf8fc703104212314e54112ecd2d
 
+ commit 3453d5708b33efe76f40eca1c0ed60923094b971
+ Author: Trond Myklebust <[email protected]>
+ Date: Wed Jun 20 17:53:34 2018 -0400
+ Subject: NFSv4.1: Avoid false retries when RPC calls are interrupted
  
- The above commit more or less relies on the below commit as a
- dependency, and is included in the SRU:
+ The fix is to pre-emptively increment the sequence number if an RPC call
+ is interrupted. To address corner cases, the NFS4ERR_SEQ_MISORDERED
+ error is interpreted as a sign that we need to locate an appropriate
+ sequence number between the value we sent and the last successfully
+ acked SEQUENCE call.
  
- commit cf500bac8fd48b57f38ece890235923d4ed5ee91
- Author: Chuck Lever <[email protected]>
- Date: Mon Feb 11 11:25:20 2019 -0500
- Subject: SUNRPC: Introduce rpc_prepare_reply_pages()
- Link: 
https://github.com/torvalds/linux/commit/cf500bac8fd48b57f38ece890235923d4ed5ee91
+ The commit also requires two fixup commits, which landed in 5.5 and
+ 5.8-rc6 respectively:
  
- Also, as Andrea Righi pointed out, we need the following fixes commit:
+ commit 5c441544f045e679afd6c3c6d9f7aaf5fa5f37b0
+ Author: Trond Myklebust <[email protected]>
+ Date:   Wed Nov 13 08:34:00 2019 +0100
+ Subject: NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process()
  
- commit 29e7ca715f2a0b6c0a99b1aec1b0956d1f271955
- Author: Chuck Lever <[email protected]>
- Date:   Tue Apr 9 10:44:16 2019 -0400
- Subject: NFS: Fix handling of reply page vector
- Link: 
https://github.com/torvalds/linux/commit/29e7ca715f2a0b6c0a99b1aec1b0956d1f271955
+ commit 913fadc5b105c3619d9e8d0fe8899ff1593cc737
+ Author: Anna Schumaker <[email protected]>
+ Date:   Wed Jul 8 10:33:40 2020 -0400
+ Subject: NFS: Fix interrupted slots by sending a solo SEQUENCE operation
  
- It appears that some NFS calls return a NFS payload which is not a
- multiple of 4 bytes, but any payload sent over the network needs to be
- padded to an exact multiple of 4 bytes. It seems cutting and pasting
- from Nautilus triggers one such payload which is missing a byte, and it
- causes the NFS subsystem to hang during packet transmission. The fix
- ensures that all payloads use correct padding.
+ Commits 3453d5708b33efe76f40eca1c0ed60923094b971 and 
913fadc5b105c3619d9e8d0fe8899ff1593cc737 require small backports to bionic, as 
struct rpc_cred changed to 
+ const struct cred in 5.0, and the backports swap them back to struct rpc_cred 
since that is how 4.15 works.
  
  [Testcase]
  
  You will need four machines. The first is a Kerberos KDC. Set up
  Kerberos correctly and create new service principals for the NFS server
  and for the client. I used: nfs/nfskerb.mydomain.com and
  nfs/client.mydomain.com.
  
  The second machine will be an NFS server with the krb5p share. Add the
  NFS server Kerberos keys to the system's keytab, and set up an NFS
  server that exports a directory with sec=krb5p. Example export:
  /mnt/secretfolder *.mydomain.com(rw,sync,no_subtree_check,sec=krb5p)
  
  The third machine is a regular NFS server. Export a directory with
  normal sec=sys security. Example export:
  /mnt/sharedfolder *.mydomain.com(rw,sync)
  
  The fourth is a desktop machine. Add the client Kerberos keys to the
  system's keytab. Mount both NFS shares, making sure to use the NFS v4.2
  protocol. I used the commands:
  mount -t nfs4 nfskerb.mydomain.com:/mnt/secretfolder /mnt/secretfolder_client/
  mount -t nfs4 nfs.mydomain.com:/mnt/sharedfolder /mnt/sharedfolder_client
  
  Check "mount -l" to ensure that NFS v4.2 is used:
  nfskerb.mydomain.com:/mnt/secretfolder on /mnt/secretfolder_client type nfs4 (rw,relatime,vers=4.2,<...>,sec=krb5p,<...>)
  nfs.mydomain.com:/mnt/sharedfolder on /mnt/sharedfolder_client type nfs4 (rw,relatime,vers=4.2,<...>,sec=sys,<...>)
  
  Generate some files full of random data. I found 20MB from /dev/random
  works great.
  
  Open each NFS share up in tabs in Nautilus. Copy the random data files
  to the sec=sys NFS share. When they are done, one at a time cut and then
  paste the file into the sec=krb5p NFS share. The bug will trigger either
  on the first or a subsequent try, and usually fewer than 10 tries are
  needed.
  
  There is a test kernel available in the following PPA:
  https://launchpad.net/~mruffell/+archive/ubuntu/sf285439-test
  
  If you install the test kernel, files will cut and paste correctly, and
  NFS will work as expected.
  
  [Regression Potential]
  
- If a regression were to occur, it would impact users of the NFS
- subsystem, since the changes modify how padding is applied to all NFS
- packets, and a regression would affect all versions of NFS.
+ The changes are localised to NFS v4.1 and 4.2 only, and other versions
+ of NFS are not affected. If a regression occurs, users can downgrade NFS
+ versions to v4.0 or v3.x until a fix is made.
  
- If a regression were to occur, users would need to downgrade their
- kernel while awaiting a fix.
+ The changed code paths are only exercised when connections are
+ interrupted, and would not be invoked under typical blue sky scenarios.
+ 
+ There have been no fixup commits or commits near the requested commit in
+ newer kernels, which suggests that this commit fixes the issue and has
+ been adopted by the community.
+ 
+ [Other Info]
+ 
+ When I first submitted this fix for SRU, I believed that the fix was:
+ 
+ commit 02ef04e432babf8fc703104212314e54112ecd2d
+ Author: Chuck Lever <[email protected]>
+ Date: Mon Feb 11 11:25:25 2019 -0500
+ Subject: NFS: Account for XDR pad of buf->pages
+ Link: 
https://github.com/torvalds/linux/commit/02ef04e432babf8fc703104212314e54112ecd2d
+ 
+ This is not the case. This was a false positive fix. What it did was
+ break NFSv4 GETACL and FS_LOCATIONS requests. When you tried to
+ reproduce, the calls were never made since they were broken, and thus
+ could not be interrupted, and cutting and pasting files worked fine.
+ 
+ When the fixup commit 29e7ca715f2a0b6c0a99b1aec1b0956d1f271955 was
+ applied to fix NFSv4 GETACL and FS_LOCATIONS requests, the problem
+ returned, as GETACL and FS_LOCATIONS calls were again free to be
+ interrupted and start a high bandwidth ping-pong.

** Description changed:

  BugLink: https://bugs.launchpad.net/bugs/1887607
  
  [Impact]
  
  There is a bug in NFS v4.1 that causes a large number of RPC calls
  between a client and server when a previous RPC call is interrupted.
  This uses a large amount of bandwidth and can saturate the network.
  
  The symptoms are as follows:
  
  * On NFS clients:
  Attempts to access mounted NFS shares associated with the affected
  server block indefinitely.
  
  * On the network:
  A storm of repeated RPCs between NFS client and server uses a lot of
  bandwidth. Each RPC is acknowledged by the server with an
  NFS4ERR_SEQ_MISORDERED error.
  
  * Other NFS clients connected to the same NFS server:
  Performance drops dramatically.
  
  This occurs during a "false retry", when a client attempts to make a new
  RPC call using a slot+sequence number that references an older, cached
  call. This happens when a user process interrupts an RPC call that is in
  progress.
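The slot rule behind a false retry can be illustrated with a toy model. This is a minimal sketch in Python, assuming the slot sequencing rules of NFSv4.1 sessions as described in RFC 5661. The names `ServerSlot`, `handle` and `demo` are hypothetical, and this is not the kernel or server code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerSlot:
    """Toy model of one session slot on the server (hypothetical names)."""
    seq: int = 0                      # sequence number of the last executed call
    cached_op: Optional[str] = None   # the call whose reply is cached

    def handle(self, seq: int, op: str) -> str:
        if seq == self.seq + 1:
            # Next number in order: execute the call and cache its reply.
            self.seq, self.cached_op = seq, op
            return "NFS4_OK"
        if seq == self.seq:
            # Same number as the cached call: a genuine retry replays the
            # cached reply, while a different call is a false retry.
            if op == self.cached_op:
                return "REPLAY"
            return "NFS4ERR_SEQ_FALSE_RETRY"
        # Any other number is out of order for this slot.
        return "NFS4ERR_SEQ_MISORDERED"


def demo() -> list:
    slot = ServerSlot()
    results = []
    # The client sends GETACL with sequence number 1; the server executes it.
    results.append(slot.handle(1, "GETACL"))
    # The user process interrupts the call, so the client never sees the
    # reply and reuses sequence number 1 for a new, different call.
    results.append(slot.handle(1, "READ"))
    # A client that has lost track of the slot and sends a number that is
    # neither the cached one nor the next one is rejected as misordered.
    results.append(slot.handle(5, "READ"))
    return results


print(demo())
```

In this model the second call is the false retry: it carries a slot and sequence number that the server already associates with the cached GETACL reply.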
  
  I had previously fixed this for Disco in bug 1828978, and now a customer
  has run into the issue in Bionic. A reproducer is supplied in the
  testcase section, which was something missing from bug 1828978, since we
  never determined how the issue actually occurred back then.
  
  [Fix]
  
  This was fixed in 5.1 upstream with the below commit:
  
  commit 3453d5708b33efe76f40eca1c0ed60923094b971
  Author: Trond Myklebust <[email protected]>
  Date: Wed Jun 20 17:53:34 2018 -0400
  Subject: NFSv4.1: Avoid false retries when RPC calls are interrupted
  
  The fix is to pre-emptively increment the sequence number if an RPC call
  is interrupted. To address corner cases, the NFS4ERR_SEQ_MISORDERED
  error is interpreted as a sign that we need to locate an appropriate
  sequence number between the value we sent and the last successfully
  acked SEQUENCE call.
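The recovery idea described above can be sketched as a small search. This is only an illustration in Python under simplifying assumptions: the server is modelled as accepting exactly the next number in order, and the names `server_accepts` and `resync` are hypothetical. It is not a translation of the kernel patch.

```python
from typing import List, Optional, Tuple


def server_accepts(server_seq: int, sent_seq: int) -> bool:
    """Toy server rule: only the next number in order is accepted."""
    return sent_seq == server_seq + 1


def resync(last_acked: int, next_seq: int,
           server_seq: int) -> Tuple[Optional[int], List[int]]:
    """Search for a sequence number the server accepts.

    last_acked: last sequence number the server successfully acked
    next_seq:   number the client would send now, after pre-emptively
                incrementing once per interrupted call
    server_seq: number the server really has for the slot (unknown to a
                real client, used here only to drive the toy server)
    """
    tried = []
    seq = next_seq
    while True:
        tried.append(seq)
        if server_accepts(server_seq, seq):
            return seq, tried
        # Treated as NFS4ERR_SEQ_MISORDERED: an interrupted call may never
        # have reached the server, so our number may be too high. Step down
        # while there is still room above the last acked value.
        if seq - last_acked > 1:
            seq -= 1
        else:
            return None, tried


# Last acked call was 5. Two calls (6 and 7) were interrupted, so the client
# is now at 8, but only call 6 actually reached the server.
print(resync(last_acked=5, next_seq=8, server_seq=6))
```

With these numbers the client first tries 8, is told the sequence is misordered, and then succeeds with 7, which is the value between what it sent and what was last acked.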
  
  The commit also requires two fixup commits, which landed in 5.5 and
  5.8-rc6 respectively:
  
  commit 5c441544f045e679afd6c3c6d9f7aaf5fa5f37b0
  Author: Trond Myklebust <[email protected]>
  Date:   Wed Nov 13 08:34:00 2019 +0100
  Subject: NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process()
  
  commit 913fadc5b105c3619d9e8d0fe8899ff1593cc737
  Author: Anna Schumaker <[email protected]>
  Date:   Wed Jul 8 10:33:40 2020 -0400
  Subject: NFS: Fix interrupted slots by sending a solo SEQUENCE operation
  
- Commits 3453d5708b33efe76f40eca1c0ed60923094b971 and 
913fadc5b105c3619d9e8d0fe8899ff1593cc737 require small backports to bionic, as 
struct rpc_cred changed to 
- const struct cred in 5.0, and the backports swap them back to struct rpc_cred 
since that is how 4.15 works.
+ Commits 3453d5708b33efe76f40eca1c0ed60923094b971 and
+ 913fadc5b105c3619d9e8d0fe8899ff1593cc737 require small backports to
+ bionic, as struct rpc_cred changed to const struct cred in 5.0, and the
+ backports swap them back to struct rpc_cred since that is how 4.15
+ works.
  
  [Testcase]
  
  You will need four machines. The first is a Kerberos KDC. Set up
  Kerberos correctly and create new service principals for the NFS server
  and for the client. I used: nfs/nfskerb.mydomain.com and
  nfs/client.mydomain.com.
  
  The second machine will be an NFS server with the krb5p share. Add the
  NFS server Kerberos keys to the system's keytab, and set up an NFS
  server that exports a directory with sec=krb5p. Example export:
  /mnt/secretfolder *.mydomain.com(rw,sync,no_subtree_check,sec=krb5p)
  
  The third machine is a regular NFS server. Export a directory with
  normal sec=sys security. Example export:
  /mnt/sharedfolder *.mydomain.com(rw,sync)
  
  The fourth is a desktop machine. Add the client Kerberos keys to the
  system's keytab. Mount both NFS shares, making sure to use the NFS v4.2
  protocol. I used the commands:
  mount -t nfs4 nfskerb.mydomain.com:/mnt/secretfolder /mnt/secretfolder_client/
  mount -t nfs4 nfs.mydomain.com:/mnt/sharedfolder /mnt/sharedfolder_client
  
  Check "mount -l" to ensure that NFS v4.2 is used:
  nfskerb.mydomain.com:/mnt/secretfolder on /mnt/secretfolder_client type nfs4 (rw,relatime,vers=4.2,<...>,sec=krb5p,<...>)
  nfs.mydomain.com:/mnt/sharedfolder on /mnt/sharedfolder_client type nfs4 (rw,relatime,vers=4.2,<...>,sec=sys,<...>)
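The check of the "mount -l" output can also be scripted. The following is a rough sketch in Python, assuming lines of the form shown above; the helper names `parse_options` and `uses` are hypothetical, and the sample line keeps only the options visible in the output above (the elided ones are left out, not guessed).

```python
import re

# Matches "<source> on <target> type <fstype> (<options>)".
MOUNT_LINE = re.compile(
    r"^(?P<source>\S+) on (?P<target>\S+) type (?P<fstype>\S+) "
    r"\((?P<opts>[^)]*)\)"
)


def parse_options(line: str) -> dict:
    """Return the mount options of one 'mount -l' line as a dict."""
    match = MOUNT_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a recognised mount line: %r" % line)
    options = {}
    for item in match.group("opts").split(","):
        # "vers=4.2" becomes {"vers": "4.2"}; a bare flag such as "rw"
        # becomes {"rw": ""}.
        key, _, value = item.partition("=")
        options[key] = value
    return options


def uses(line: str, vers: str, sec: str) -> bool:
    """True if the mount line shows the given NFS version and security."""
    options = parse_options(line)
    return options.get("vers") == vers and options.get("sec") == sec


# Shortened sample line; real output contains more options.
SAMPLE = ("nfskerb.mydomain.com:/mnt/secretfolder on /mnt/secretfolder_client "
          "type nfs4 (rw,relatime,vers=4.2,sec=krb5p)")

print(uses(SAMPLE, vers="4.2", sec="krb5p"))
```

The same function can be applied to each line of real "mount -l" output, for example one read through `subprocess.run(["mount", "-l"], capture_output=True, text=True)`.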
  
  Generate some files full of random data. I found 20MB from /dev/random
  works great.
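One possible way to generate the test files is sketched below in Python. It is an assumption-laden example: it reads from `os.urandom` rather than /dev/random directly, and the file names, the file count and the helper names `write_random_file` and `make_test_files` are hypothetical.

```python
import os


def write_random_file(path: str, size_bytes: int,
                      chunk: int = 1024 * 1024) -> int:
    """Write size_bytes of random data to path and return the file size."""
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            # Write in chunks so a large file is never held in memory at once.
            count = min(chunk, remaining)
            handle.write(os.urandom(count))
            remaining -= count
    return os.path.getsize(path)


def make_test_files(directory: str, count: int = 5,
                    size_bytes: int = 20 * 1024 * 1024) -> list:
    """Create count random files of size_bytes each (20MB by default)."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index in range(count):
        path = os.path.join(directory, "random_%d.bin" % index)
        write_random_file(path, size_bytes)
        paths.append(path)
    return paths
```

Calling `make_test_files("/tmp/nfs-test")` (a placeholder directory) would produce five 20MB files that can then be copied to the sec=sys share.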
  
  Open each NFS share up in tabs in Nautilus. Copy the random data files
  to the sec=sys NFS share. When they are done, one at a time cut and then
  paste the file into the sec=krb5p NFS share. The bug will trigger either
  on the first or a subsequent try, and usually fewer than 10 tries are
  needed.
  
  There is a test kernel available in the following PPA:
  https://launchpad.net/~mruffell/+archive/ubuntu/sf285439-test
  
  If you install the test kernel, files will cut and paste correctly, and
  NFS will work as expected.
  
  [Regression Potential]
  
- The changes are localised to NFS v4.1 and 4.2 only, and other versions
+ The changes are localised to NFS v4.1 and v4.2 only, and other versions
  of NFS are not affected. If a regression occurs, users can downgrade NFS
  versions to v4.0 or v3.x until a fix is made.
  
  The changed code paths are only exercised when connections are
  interrupted, and would not be invoked under typical blue sky scenarios.
  
  There have been no fixup commits or commits near the requested commit in
  newer kernels, which suggests that this commit fixes the issue and has
  been adopted by the community.
  
  [Other Info]
  
  When I first submitted this fix for SRU, I believed that the fix was:
  
  commit 02ef04e432babf8fc703104212314e54112ecd2d
  Author: Chuck Lever <[email protected]>
  Date: Mon Feb 11 11:25:25 2019 -0500
  Subject: NFS: Account for XDR pad of buf->pages
- Link: 
https://github.com/torvalds/linux/commit/02ef04e432babf8fc703104212314e54112ecd2d
  
  This is not the case. This was a false positive fix. What it did was
  break NFSv4 GETACL and FS_LOCATIONS requests. When you tried to
  reproduce, the calls were never made since they were broken, and thus
  could not be interrupted, and cutting and pasting files worked fine.
  
  When the fixup commit 29e7ca715f2a0b6c0a99b1aec1b0956d1f271955 was
  applied to fix NFSv4 GETACL and FS_LOCATIONS requests, the problem
  returned, as GETACL and FS_LOCATIONS calls were again free to be
  interrupted and start a high bandwidth ping-pong.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1887607

Title:
  NFSv4.1: Interrupted connections cause high bandwidth RPC ping-pong
  between client and server

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1887607/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
