** Description changed:

  Hi,
  
- We are seeing an NFSv4.1 client hang on Linux kernel 5.15 (Ubuntu
+ We are seeing an NFSv4.0 client hang on Linux kernel 5.15 (Ubuntu
  22.04).
  
  The issue starts when the server returns NFS4ERR_EXPIRED. The client
  then enters recovery, but reclaim never completes.
  
  The state manager thread is stuck with the following stack:
  
+ ```
  rpc_wait_bit_killable
  __rpc_wait_for_completion_task
  nfs4_run_open_task
  nfs4_open_recover_helper
  nfs4_open_recover
  nfs4_do_open_expired
  nfs40_open_expired
  __nfs4_reclaim_open_state
  nfs4_reclaim_open_state
  nfs4_do_reclaim
  nfs4_state_manager
+ ```
  
  Meanwhile:
  - The server repeatedly returns NFS4ERR_EXPIRED
  - The client does not successfully reclaim state
  - IO continues and repeatedly fails
  
  RPC stats show:
  - ~30M calls
  - very low retransmissions (94)
  
  This suggests the issue is unlikely to be caused by network loss or
  server unresponsiveness.
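
As a rough check of the figures above, the retransmission ratio can be worked out directly. This is only a sketch: the function name is hypothetical, and "~30M" is assumed to mean exactly 30,000,000 calls for the sake of the arithmetic.

```python
def retransmission_ratio(calls: int, retransmissions: int) -> float:
    """Return retransmissions as a fraction of total RPC calls."""
    if calls <= 0:
        raise ValueError("calls must be positive")
    return retransmissions / calls


# Figures quoted in the report; "~30M" is assumed to be exactly 30,000,000,
# which is an approximation made only for illustration.
ratio = retransmission_ratio(30_000_000, 94)
print(f"{ratio:.2e} ({ratio * 100:.5f}% of calls)")
```

That works out to roughly three retransmissions per million calls, which is consistent with the reading that packet loss is not the main factor.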
  
  Additionally, we have verified that:
  - Network connectivity is stable
  - The NFS server is operating normally (no restart or failover observed)
  
  Importantly:
  - We do observe that RENEW/SEQUENCE-related traffic is being sent from the client
  - However, the client still ends up with an expired lease (NFS4ERR_EXPIRED)
  
  This raises the question of whether the lease renewal is being properly
  processed or completed on the client side.
  
  Given that we are using NFSv4.1 (where lease renewal is implicit via
  SEQUENCE), we would like to understand:
  
  1. Under what conditions could the client still hit NFS4ERR_EXPIRED despite ongoing renew/SEQUENCE activity and a healthy server/network?
  2. Is it possible that RPC completion, session slot handling, or sequence handling issues could prevent the lease from being effectively renewed?
  3. Could this be a known issue in the NFSv4.1 recovery or session handling path in 5.15?
  
  It appears the client is stuck in the OPEN reclaim path waiting for RPC
  completion, and recovery cannot make forward progress.
  
  Are there known fixes or patches in newer kernels (e.g., 5.19 or 6.x)
  that address this class of issue?
  
  Any pointers or suggestions would be greatly appreciated.
  
  Thanks

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2146310

Title:
  NFSv4 client hang in OPEN reclaim path waiting for RPC completion

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2146310/+subscriptions


-- 
ubuntu-bugs mailing list
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
