Public bug reported:
While trying to upgrade some of my systems to Ubuntu 12.04 "Precise", I'm
seeing strange hangs of various processes working with files on an
nfs4-mounted /home. KDE sessions in particular hang very often, either on
startup or after a short period of use.
In the hung state, all processes accessing the NFS-mounted /home enter
uninterruptible sleep (state D). Sometimes, after a long wait (around
10-15 minutes), some of these processes wake up and continue, but
realistically a reboot is the only way to bring the machine back online,
and then only for a brief period before the next hang. After a hang, dmesg
shows a number of kernel stack traces of the form "process XXX blocked for
more than YYY seconds", with "ktime_get_ts" and "rpc_make_runnable" at the
top of the call stack. It happens with both TCP and UDP transports.
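To make the "D state" observation easier to repeat, here is a minimal sketch that lists processes currently in uninterruptible sleep by reading /proc/<pid>/stat. The function names are made up for illustration, and it only reports the state; it does not tell you whether a process is blocked on NFS specifically.

```python
import os


def parse_stat_line(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx].strip())
    comm = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """Return [(pid, comm), ...] for processes whose state is 'D'."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs here (not Linux, or not mounted)
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or stat was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Running it from a root console during a hang should show the stuck KDE and shell processes, assuming the console login itself does not touch /home.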
The hang happens only when the network is loaded. When the client is
connected directly to the NFS server (running Ubuntu Lucid with a
backported Oneiric kernel) via a separate Ethernet switch, NFS works
perfectly. But if there is network congestion, NFS accesses hang at
random.
It is also possible to reproduce the hang by running a large rsync file
transfer to the client while accessing the NFS-mounted /home. In this
case the processes reading from NFS hang almost instantly, even when
logged in via the console.
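The NFS-reading side of this reproduction can be sketched as a small probe that reads a file in a worker thread and reports when the read does not finish within a timeout. This is only an illustration: the function name, the timeout and the path are hypothetical, the rsync transfer has to be started separately, and the file should be one that is not already in the client's page cache, otherwise the read may never reach the server.

```python
import threading
import time


def timed_read(path, timeout_s=10.0, chunk=1 << 20):
    """Read `path` to the end in a worker thread.

    Returns the elapsed time in seconds, or None if the read was still
    blocked after `timeout_s` seconds (for example, stuck in D state).
    """
    result = {}

    def worker():
        start = time.monotonic()
        try:
            with open(path, "rb") as fh:
                while fh.read(chunk):
                    pass
            result["elapsed"] = time.monotonic() - start
        except OSError as exc:
            result["error"] = exc

    # Daemon thread: the main thread can still report even if the read
    # never returns.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        return None
    if "error" in result:
        raise result["error"]
    return result["elapsed"]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass a large file under the NFS-mounted /home.
    if len(sys.argv) > 1:
        elapsed = timed_read(sys.argv[1])
        print("blocked" if elapsed is None else "read in %.2f s" % elapsed)
```

Calling it in a loop while the rsync transfer is running would give a rough timestamp for the moment the reads stop completing.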
By all symptoms this hang resembles the one fixed by "SUNRPC: Fix a UDP
transport regression" in the 3.2.0-32.51 Ubuntu kernel (exactly the kernel
I'm using and seeing hangs on). RPC traces show a number of hung
requests in the "q:xprt_sending" state, like this:
Nov 2 20:22:51 XXX kernel: [15060.853376] -pid- flgs status -client- --rqstp- -timeout ---ops--
Nov 2 20:22:51 XXX kernel: [15060.853393] 9903 0821 -11 f243f000 f256d700 0 f870d0f4 nfsv4 READ a:call_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853401] 9904 0821 -11 f243f000 f256d600 0 f870d0f4 nfsv4 READ a:call_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853408] 9916 0080 -11 f243f000 f256d500 0 f86c1b18 nfsv4 STATFS a:call_connect_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853415] 9917 0080 -11 f243f000 f256d200 0 f86c1b18 nfsv4 ACCESS a:call_connect_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853423] 9914 0281 -11 f256d800 f256d300 0 f870d8ec nfsv4 RENEW a:call_status q:xprt_sending
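For longer dumps, a rough sketch like the following can tally the RPC tasks by wait queue. It assumes each task is on a single (unwrapped) log line with the column layout shown in the header above; the regular expression and the function names are my own and are not taken from the kernel sources.

```python
import re
from collections import Counter

# Columns: pid flgs status client rqstp timeout ops prog proc a:<action> q:<queue>
TASK_RE = re.compile(
    r"\]\s+(?P<pid>\d+)\s+(?P<flags>[0-9a-f]{4})\s+(?P<status>-?\d+)\s+"
    r"(?P<client>[0-9a-f]+)\s+(?P<rqstp>[0-9a-f]+)\s+(?P<timeout>\d+)\s+"
    r"(?P<ops>[0-9a-f]+)\s+(?P<prog>\S+)\s+(?P<proc>\S+)\s+"
    r"a:(?P<action>\S+)\s+q:(?P<queue>\S+)"
)


def parse_tasks(lines):
    """Return one dict per RPC task line; header and other lines are skipped."""
    tasks = []
    for line in lines:
        match = TASK_RE.search(line)
        if match:
            tasks.append(match.groupdict())
    return tasks


def count_by_queue(tasks):
    """Count tasks per wait queue (the q: column)."""
    return Counter(task["queue"] for task in tasks)


# Sample lines copied from the trace in this report.
SAMPLE = """\
Nov 2 20:22:51 XXX kernel: [15060.853376] -pid- flgs status -client- --rqstp- -timeout ---ops--
Nov 2 20:22:51 XXX kernel: [15060.853393] 9903 0821 -11 f243f000 f256d700 0 f870d0f4 nfsv4 READ a:call_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853401] 9904 0821 -11 f243f000 f256d600 0 f870d0f4 nfsv4 READ a:call_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853408] 9916 0080 -11 f243f000 f256d500 0 f86c1b18 nfsv4 STATFS a:call_connect_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853415] 9917 0080 -11 f243f000 f256d200 0 f86c1b18 nfsv4 ACCESS a:call_connect_status q:xprt_sending
Nov 2 20:22:51 XXX kernel: [15060.853423] 9914 0281 -11 f256d800 f256d300 0 f870d8ec nfsv4 RENEW a:call_status q:xprt_sending
""".splitlines()

if __name__ == "__main__":
    for queue, count in count_by_queue(parse_tasks(SAMPLE)).items():
        print(queue, count)
```

On the excerpt above, all five tasks fall into the xprt_sending queue, each with status -11 (EAGAIN on Linux).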
The problem may be similar to the one fixed by "SUNRPC: Fix a UDP
transport regression", but affecting NFSv4.
I'm ready to provide more information on my configuration if necessary.
** Affects: nfs-utils (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1074470
Title:
NFSv4 client hang under network load