[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** No longer affects: linux (Ubuntu)

** No longer affects: linux (Ubuntu Trusty)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1509120

Title:
  Process accounting deadlock with idmapd callout when writing to NFSv4
  mount

Status in nfs-utils package in Ubuntu:
  Fix Released
Status in nfs-utils source package in Trusty:
  Fix Committed

Bug description:
  [Impact]

   * Programs accessing nfsv4 mounts will hang on the request_key
  interface with nfs4 + sec=sys with old nfsv4 hosts. The kernel is
  waiting on a usermodehelper provided by keyutils.

   * INFO: task ls:2101 blocked for more than 120 seconds.
         Not tainted 3.13.0-66-generic #108-Ubuntu
   "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
   ls D 88007fd13180 0 2101 1215 0x0004
    88007b14d630 0086 8800374e6000 88007b14dfd8
    00013180 00013180 8800374e6000 88007b14d6b0
    88007ffd1460 0002 812d0ce0 88007b14d6a0
   Call Trace:
    [] ? umh_keys_init+0x20/0x20
    [] schedule+0x29/0x70
    [] key_wait_bit+0xe/0x20
    [] __wait_on_bit+0x62/0x90
    [] ? umh_keys_init+0x20/0x20
    [] out_of_line_wait_on_bit+0x77/0x90
    [] ? autoremove_wake_function+0x40/0x40
    [] wait_for_key_construction+0x6e/0x80
    [] request_key+0x5c/0xa0
    [] nfs_idmap_get_key+0xaf/0x1c0 [nfsv4]
    [] nfs_map_name_to_uid+0xef/0x150 [nfsv4]
    [] decode_getfattr_attrs+0xe47/0x14b0 [nfsv4]
    [] ? sched_clock+0x9/0x10
    [] decode_getfattr_generic.constprop.102+0x8c/0xf0 [nfsv4]
    [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4]
    [] nfs4_xdr_dec_getattr+0x70/0x80 [nfsv4]
    [] rpcauth_unwrap_resp+0x86/0xd0 [sunrpc]
    [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4]
    [] call_decode+0x1df/0x870 [sunrpc]
    [] ? call_refreshresult+0x170/0x170 [sunrpc]
    [] ? call_refreshresult+0x170/0x170 [sunrpc]
    [] __rpc_execute+0x84/0x400 [sunrpc]
    [] rpc_execute+0x5e/0xa0 [sunrpc]
    [] rpc_run_task+0x70/0x90 [sunrpc]
    [] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
    [] _nfs4_proc_getattr+0xbe/0xd0 [nfsv4]
    [] nfs4_proc_getattr+0x5a/0xd0 [nfsv4]
    [] __nfs_revalidate_inode+0xbf/0x310 [nfs]
    [] nfs_opendir+0xe3/0x100 [nfs]
    [] do_dentry_open+0x233/0x2e0
    [] ? nfs_readdir_clear_array+0x70/0x70 [nfs]
    [] vfs_open+0x49/0x50
    [] do_last+0x564/0x1240
    [] ? link_path_walk+0x256/0x880
    [] ? apparmor_file_alloc_security+0x5b/0x180
    [] ? security_file_alloc+0x16/0x20
    [] path_openat+0xbb/0x650
    [] do_filp_open+0x3a/0x90
    [] ? do_mmap_pgoff+0x34e/0x3d0
    [] ? __alloc_fd+0xa7/0x130
    [] do_sys_open+0x129/0x280
    [] ? do_page_fault+0x1a/0x70
    [] SyS_openat+0x14/0x20
    [] system_call_fastpath+0x1a/0x1f

  [Test Case]

  1) Install an nfs server that does not support sec=sys, such as centos 6 or others that are old.
  2) echo '/export *(rw,sync,no_root_squash,no_subtree_check,fsid=299)' > /etc/exports
  3) sudo exportfs -a && sudo service nfs-kernel-server restart

  Client:
  1) sudo apt-get install nfs-common acct
  2) sudo mkdir /account
  3) sudo mount -t nfs4 -o rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=2049,timeo=600,retrans=2,sec=sys,minorversion=0,local_lock=none 10.245.80.76:/export /account
  4) sudo touch /account/pacct
  5) sudo accton /account/pacct
  6) cd /account/
  7) run ls and other operations. *(dpkg operations seemed to do it for me)
  8) wait
  9) Terminal will become unresponsive within a few minutes, and new logins will not be possible

  [Regression Potential]

   * There are a total of 28 other references to request_key in the
  kernel. It is possible that previous failures of request_key may now
  succeed, which may result in alternate code paths being taken. The
  kernel subsystems that don't already have an explicit, obvious
  dependency on keyutils are:
    * drivers/staging/lustre
    * fs/afs
    * fs/fscache
    * fs/nfs
    * lib/digsig.c
    * net/ceph
    * net/dns_resolver
    * net/rxrpc
    * security/integrity
    * security/keys

   * Minimal; this was applied to vivid+wily via bug 1449074.

  [Other Info]

   * Solution is to add keyutils as Required to nfs-common.

  Original Description
  __
  ---Problem Description---
  System hang when process accounting to a NFSv4 mount.

  ---uname output---
  Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  ---Problem Details---
  We have a customer that is experiencing intermittent system hangs on their system. After a bit of debugging, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

  During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process account
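The hung-task report quoted in the bug description can be picked out of a kernel log with a simple filter. The following is a minimal sketch, not part of the bug report: the function name is hypothetical, and the pattern is only modelled on the "INFO: task ls:2101 blocked for more than 120 seconds." line above.

```sh
#!/bin/sh
# Print hung-task reports from a kernel log read on stdin.
# The pattern is modelled on the report quoted in the bug description;
# find_hung_tasks is a hypothetical helper name.
find_hung_tasks() {
    grep -E 'INFO: task [^ ]+:[0-9]+ blocked for more than [0-9]+ seconds'
}

# Sample input for illustration only; on a live system the input would
# typically come from `dmesg`.
printf '%s\n' \
    'INFO: task ls:2101 blocked for more than 120 seconds.' \
    'some unrelated kernel message' | find_hung_tasks
```

A match alone does not prove this bug; the call trace still has to show the task waiting in request_key via nfs_idmap_get_key.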
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
Thanks ruddk, it was a pleasure. We will let this bake in -proposed for
a few weeks, and then barring any unforeseen issues it will get promoted
to -updates.

Status in linux package in Ubuntu:
  Invalid
Status in nfs-utils package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Invalid
Status in nfs-utils source package in Trusty:
  Fix Committed
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
--- Comment From ru...@us.ibm.com 2015-11-07 00:59 EDT---
Thanks. The updated 1:1.2.8-6ubuntu1.2 version of nfs-common did indeed pull in the keyutils package. This was tested for both the ppc64el and amd64 architectures.

root@p824l:~# apt-show-versions nfs-common
nfs-common:ppc64el/trusty-updates 1:1.2.8-6ubuntu1.1 uptodate
root@p824l:~# apt-show-versions keyutils
keyutils not installed (available for: ppc64el)
root@p824l:~# ls /sbin/request-key
ls: cannot access /sbin/request-key: No such file or directory

root@p824l:/etc/apt# apt-get install nfs-common/trusty-proposed
Reading package lists... Done
Building dependency tree
Reading state information... Done
Selected version '1:1.2.8-6ubuntu1.2' (Ubuntu:14.04/trusty-proposed [ppc64el]) for 'nfs-common'
The following extra packages will be installed:
  keyutils
Suggested packages:
  open-iscsi watchdog
Recommended packages:
  python
The following NEW packages will be installed:
  keyutils
The following packages will be upgraded:
  nfs-common
1 upgraded, 1 newly installed, 0 to remove and 6 not upgraded.
...

root@p824l:/etc/apt# apt-show-versions nfs-common
nfs-common:ppc64el/trusty-proposed 1:1.2.8-6ubuntu1.2 uptodate
root@p824l:/etc/apt# apt-show-versions keyutils
keyutils:ppc64el/trusty 1.5.6-1 uptodate
root@p824l:/etc/apt# ls /sbin/request-key
/sbin/request-key

** Tags removed: verification-needed
** Tags added: verification-done

Status in linux package in Ubuntu:
  Invalid
Status in nfs-utils package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Invalid
Status in nfs-utils source package in Trusty:
  Fix Committed
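The verification above comes down to whether the /sbin/request-key helper exists after the upgrade. That check can be sketched as a small function; the function name and the messages are illustrative assumptions, and only the default path is taken from the transcript.

```sh
#!/bin/sh
# Report whether the request-key upcall helper is installed.
# The default path is the one checked in the transcript above;
# check_request_key and its messages are hypothetical.
check_request_key() {
    helper="${1:-/sbin/request-key}"
    if [ -x "$helper" ]; then
        echo "present: $helper"
    else
        echo "missing: $helper"
        return 1
    fi
}

# In the transcript the helper appears once keyutils is installed.
check_request_key || echo "helper not found; check whether keyutils is installed"
```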
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
Hello bugproxy, or anyone else affected,

Accepted nfs-utils into trusty-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/nfs-utils/1:1.2.8-6ubuntu1.2 in a
few hours, and then in the -proposed repository.

Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed. In either case, details of your testing will help
us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!

** Changed in: nfs-utils (Ubuntu Trusty)
       Status: In Progress => Fix Committed

** Tags added: verification-needed

Status in linux package in Ubuntu:
  Invalid
Status in nfs-utils package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Invalid
Status in nfs-utils source package in Trusty:
  Fix Committed
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Description changed:

  Client:
  1) sudo apt-get install nfs-common acct
  2) sudo mkdir /account
  3) sudo mount -t nfs4 -o rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=2049,timeo=600,retrans=2,sec=sys,minorversion=0,local_lock=none 10.245.80.76:/export /account
- 4) sudo touch /account/pacct
- 5) sudo accton /account/pacct
- 6) cd /account/
- 7) run ls and other operations. *(dpkg operations seemed to do it for me)
- 8) wait
- 8) Terminal will become unresponsive within a few minutes, and new logins will not be possible
+ 4) sudo touch /account/pacct
+ 5) sudo accton /account/pacct
+ 6) cd /account/
+ 7) run ls and other operations. *(dpkg operations seemed to do it for me)
+ 8) wait
+ 8) Terminal will become unresponsive within a few minutes, and new logins will not be possible

  [Regression Potential]

-  * Minimal this was applied to vivid+wily via bug 1449074.
+  * There are a total of 28 other references to request_key in the kernel. It is possible that previous failures to request_key may now be passing which may result in alternate code paths being taken. Those kernel subsystems that don't already have an explicit obvious dependency on keyutils are.
+ * drivers/staging/lustre
+ * fs/afs
+ * fs/fscache
+ * fs/nfs
+ * lib/digsig.c
+ * net/ceph
+ * net/dns_resolver
+ * net/rxrpc
+ * security/integrity
+ * security/keys
+  * Minimal this was applied to vivid+wily via bug 1449074.
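The regression analysis above rests on finding every request_key caller in the kernel. One way such a list could be gathered from a source tree is sketched below. It is an assumption about method, not the command used for the report: the function name is hypothetical, the `-r` and `--include` options assume GNU or BSD grep, and the result will vary with kernel version.

```sh
#!/bin/sh
# List C source files under a kernel tree that mention request_key as a
# whole word (so identifiers such as my_request_key_helper do not count).
# list_request_key_users is a hypothetical helper; the tree path is
# supplied by the caller.
list_request_key_users() {
    tree="${1:-.}"
    grep -rlw --include='*.c' 'request_key' "$tree" | sort
}

# Example invocation against a hypothetical checkout location:
#   list_request_key_users /usr/src/linux | wc -l
```

Grouping the output by top-level directory would give a per-subsystem view comparable to the list in the description.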
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
This dependency is already present in vivid and later.

** Also affects: nfs-utils (Ubuntu Trusty)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Trusty)
   Importance: Undecided
       Status: New

** Changed in: nfs-utils (Ubuntu Trusty)
   Importance: Undecided => Medium

** Changed in: nfs-utils (Ubuntu Trusty)
       Status: New => In Progress

** Changed in: nfs-utils (Ubuntu Trusty)
     Assignee: (unassigned) => Dave Chiluk (chiluk)

** Changed in: nfs-utils (Ubuntu)
       Status: In Progress => Fix Released

** Changed in: linux (Ubuntu Trusty)
       Status: New => Invalid

Status in linux package in Ubuntu:
  Invalid
Status in nfs-utils package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Invalid
Status in nfs-utils source package in Trusty:
  In Progress
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Description changed:

+ [Test Case]
+ 
+ 1) Install an nfs server that does not support sec=sys, such as centos 6 or others that are old.
+ 2) echo '/export *(rw,sync,no_root_squash,no_subtree_check,fsid=299)' > /etc/exports
+ 3) sudo exportfs -a && sudo service nfs-kernel-server restart
+ 
+ Client:
+ 1) sudo apt-get install nfs-common acct
+ 2) sudo mkdir /account
+ 3) sudo mount -t nfs4 -o rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=2049,timeo=600,retrans=2,sec=sys,minorversion=0,local_lock=none 10.245.80.76:/export /account
+ 4) sudo touch /acc
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Tags added: sts -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1509120 Title: Process accounting deadlock with idmapd callout when writing to NFSv4 mount Status in linux package in Ubuntu: Invalid Status in nfs-utils package in Ubuntu: In Progress Bug description: [Impact] * Programs accessing nfsv4 mounts will hang on request_key interface with nfs4 + sec=sys with old nfsv4 hosts. Kernel is waiting on usermodehelper provided by keyutils. * INFO: task ls:2101 blocked for more than 120 seconds. Not tainted 3.13.0-66-generic #108-Ubuntu "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ls D 88007fd13180 0 2101 1215 0x0004 88007b14d630 0086 8800374e6000 88007b14dfd8 00013180 00013180 8800374e6000 88007b14d6b0 88007ffd1460 0002 812d0ce0 88007b14d6a0 Call Trace: [] ? umh_keys_init+0x20/0x20 [] schedule+0x29/0x70 [] key_wait_bit+0xe/0x20 [] __wait_on_bit+0x62/0x90 [] ? umh_keys_init+0x20/0x20 [] out_of_line_wait_on_bit+0x77/0x90 [] ? autoremove_wake_function+0x40/0x40 [] wait_for_key_construction+0x6e/0x80 [] request_key+0x5c/0xa0 [] nfs_idmap_get_key+0xaf/0x1c0 [nfsv4] [] nfs_map_name_to_uid+0xef/0x150 [nfsv4] [] decode_getfattr_attrs+0xe47/0x14b0 [nfsv4] [] ? sched_clock+0x9/0x10 [] decode_getfattr_generic.constprop.102+0x8c/0xf0 [nfsv4] [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] [] nfs4_xdr_dec_getattr+0x70/0x80 [nfsv4] [] rpcauth_unwrap_resp+0x86/0xd0 [sunrpc] [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] [] call_decode+0x1df/0x870 [sunrpc] [] ? call_refreshresult+0x170/0x170 [sunrpc] [] ? 
call_refreshresult+0x170/0x170 [sunrpc]
[] __rpc_execute+0x84/0x400 [sunrpc]
[] rpc_execute+0x5e/0xa0 [sunrpc]
[] rpc_run_task+0x70/0x90 [sunrpc]
[] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
[] _nfs4_proc_getattr+0xbe/0xd0 [nfsv4]
[] nfs4_proc_getattr+0x5a/0xd0 [nfsv4]
[] __nfs_revalidate_inode+0xbf/0x310 [nfs]
[] nfs_opendir+0xe3/0x100 [nfs]
[] do_dentry_open+0x233/0x2e0
[] ? nfs_readdir_clear_array+0x70/0x70 [nfs]
[] vfs_open+0x49/0x50
[] do_last+0x564/0x1240
[] ? link_path_walk+0x256/0x880
[] ? apparmor_file_alloc_security+0x5b/0x180
[] ? security_file_alloc+0x16/0x20
[] path_openat+0xbb/0x650
[] do_filp_open+0x3a/0x90
[] ? do_mmap_pgoff+0x34e/0x3d0
[] ? __alloc_fd+0xa7/0x130
[] do_sys_open+0x129/0x280
[] ? do_page_fault+0x1a/0x70
[] SyS_openat+0x14/0x20
[] system_call_fastpath+0x1a/0x1f

[Regression Potential]

 * Minimal; this change was already applied to vivid+wily via bug 1449074.

[Other Info]

 * Solution is to add keyutils as a required dependency (Depends:) of nfs-common.

Original Description
__

---Problem Description---
System hang when process accounting to an NFSv4 mount.

---uname output---
Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

---Problem Details---
We have a customer that is experiencing intermittent system hangs on their system. After some debugging, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
2. The resulting NFS write needs idmapd information and calls out to idmapd.
3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls" #0 [c01fd274a950] __switch_to at c0015934 #1 [c01fd274ab20] __switch_to at c0015934 #2 [c01fd274ab80] __schedule at c0a11de8 #3 [c01fd274ada0] schedule_timeout at c0a16284 #4 [c01fd274ae90] wait_for_common at c0a1360c #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38 #6 [c01fd274af70] call_sbin_request_key at c0429258 #7 [c01fd274b100] request_key_and_link at c042983c #8 [c01fd274b200] request_key at c0429978 #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4] #10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4] #11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4] #12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4] #13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4] #14 [c0
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
Yeah, I think so, if you are "creative" enough to put your accounting on your nfs mount, and then have your nfs service fail, you should expect your machine to stop functioning. Otherwise it could be a potential accounting/security hole. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1509120 Title: Process accounting deadlock with idmapd callout when writing to NFSv4 mount Status in linux package in Ubuntu: Invalid Status in nfs-utils package in Ubuntu: In Progress Bug description: [Impact] * Programs accessing nfsv4 mounts will hang on request_key interface with nfs4 + sec=sys with old nfsv4 hosts. Kernel is waiting on usermodehelper provided by keyutils. * INFO: task ls:2101 blocked for more than 120 seconds. Not tainted 3.13.0-66-generic #108-Ubuntu "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ls D 88007fd13180 0 2101 1215 0x0004 88007b14d630 0086 8800374e6000 88007b14dfd8 00013180 00013180 8800374e6000 88007b14d6b0 88007ffd1460 0002 812d0ce0 88007b14d6a0 Call Trace: [] ? umh_keys_init+0x20/0x20 [] schedule+0x29/0x70 [] key_wait_bit+0xe/0x20 [] __wait_on_bit+0x62/0x90 [] ? umh_keys_init+0x20/0x20 [] out_of_line_wait_on_bit+0x77/0x90 [] ? autoremove_wake_function+0x40/0x40 [] wait_for_key_construction+0x6e/0x80 [] request_key+0x5c/0xa0 [] nfs_idmap_get_key+0xaf/0x1c0 [nfsv4] [] nfs_map_name_to_uid+0xef/0x150 [nfsv4] [] decode_getfattr_attrs+0xe47/0x14b0 [nfsv4] [] ? sched_clock+0x9/0x10 [] decode_getfattr_generic.constprop.102+0x8c/0xf0 [nfsv4] [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] [] nfs4_xdr_dec_getattr+0x70/0x80 [nfsv4] [] rpcauth_unwrap_resp+0x86/0xd0 [sunrpc] [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] [] call_decode+0x1df/0x870 [sunrpc] [] ? call_refreshresult+0x170/0x170 [sunrpc] [] ? 
call_refreshresult+0x170/0x170 [sunrpc]
[] __rpc_execute+0x84/0x400 [sunrpc]
[] rpc_execute+0x5e/0xa0 [sunrpc]
[] rpc_run_task+0x70/0x90 [sunrpc]
[] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
[] _nfs4_proc_getattr+0xbe/0xd0 [nfsv4]
[] nfs4_proc_getattr+0x5a/0xd0 [nfsv4]
[] __nfs_revalidate_inode+0xbf/0x310 [nfs]
[] nfs_opendir+0xe3/0x100 [nfs]
[] do_dentry_open+0x233/0x2e0
[] ? nfs_readdir_clear_array+0x70/0x70 [nfs]
[] vfs_open+0x49/0x50
[] do_last+0x564/0x1240
[] ? link_path_walk+0x256/0x880
[] ? apparmor_file_alloc_security+0x5b/0x180
[] ? security_file_alloc+0x16/0x20
[] path_openat+0xbb/0x650
[] do_filp_open+0x3a/0x90
[] ? do_mmap_pgoff+0x34e/0x3d0
[] ? __alloc_fd+0xa7/0x130
[] do_sys_open+0x129/0x280
[] ? do_page_fault+0x1a/0x70
[] SyS_openat+0x14/0x20
[] system_call_fastpath+0x1a/0x1f

[Regression Potential]

 * Minimal; this change was already applied to vivid+wily via bug 1449074.

[Other Info]

 * Solution is to add keyutils as a required dependency (Depends:) of nfs-common.

Original Description
__

---Problem Description---
System hang when process accounting to an NFSv4 mount.

---uname output---
Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

---Problem Details---
We have a customer that is experiencing intermittent system hangs on their system. After some debugging, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
2. The resulting NFS write needs idmapd information and calls out to idmapd.
3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls" #0 [c01fd274a950] __switch_to at c0015934 #1 [c01fd274ab20] __switch_to at c0015934 #2 [c01fd274ab80] __schedule at c0a11de8 #3 [c01fd274ada0] schedule_timeout at c0a16284 #4 [c01fd274ae90] wait_for_common at c0a1360c #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38 #6 [c01fd274af70] call_sbin_request_key at c0429258 #7 [c01fd274b100] request_key_and_link at c042983c #8 [c01fd274b200] request_key at c0429978 #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4] #10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4] #11 [c01fd274b320] decode
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Description changed: + [Impact] + + * Programs accessing nfsv4 mounts will hang on request_key interface + with nfs4 + sec=sys with old nfsv4 hosts. Kernel is waiting on + usermodehelper provided by keyutils. + + * INFO: task ls:2101 blocked for more than 120 seconds. + Not tainted 3.13.0-66-generic #108-Ubuntu + "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. + ls D 88007fd13180 0 2101 1215 0x0004 + 88007b14d630 0086 8800374e6000 88007b14dfd8 + 00013180 00013180 8800374e6000 88007b14d6b0 + 88007ffd1460 0002 812d0ce0 88007b14d6a0 + Call Trace: + [] ? umh_keys_init+0x20/0x20 + [] schedule+0x29/0x70 + [] key_wait_bit+0xe/0x20 + [] __wait_on_bit+0x62/0x90 + [] ? umh_keys_init+0x20/0x20 + [] out_of_line_wait_on_bit+0x77/0x90 + [] ? autoremove_wake_function+0x40/0x40 + [] wait_for_key_construction+0x6e/0x80 + [] request_key+0x5c/0xa0 + [] nfs_idmap_get_key+0xaf/0x1c0 [nfsv4] + [] nfs_map_name_to_uid+0xef/0x150 [nfsv4] + [] decode_getfattr_attrs+0xe47/0x14b0 [nfsv4] + [] ? sched_clock+0x9/0x10 + [] decode_getfattr_generic.constprop.102+0x8c/0xf0 [nfsv4] + [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] + [] nfs4_xdr_dec_getattr+0x70/0x80 [nfsv4] + [] rpcauth_unwrap_resp+0x86/0xd0 [sunrpc] + [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] + [] call_decode+0x1df/0x870 [sunrpc] + [] ? call_refreshresult+0x170/0x170 [sunrpc] + [] ? call_refreshresult+0x170/0x170 [sunrpc] + [] __rpc_execute+0x84/0x400 [sunrpc] + [] rpc_execute+0x5e/0xa0 [sunrpc] + [] rpc_run_task+0x70/0x90 [sunrpc] + [] nfs4_call_sync_sequence+0x56/0x80 [nfsv4] + [] _nfs4_proc_getattr+0xbe/0xd0 [nfsv4] + [] nfs4_proc_getattr+0x5a/0xd0 [nfsv4] + [] __nfs_revalidate_inode+0xbf/0x310 [nfs] + [] nfs_opendir+0xe3/0x100 [nfs] + [] do_dentry_open+0x233/0x2e0 + [] ? nfs_readdir_clear_array+0x70/0x70 [nfs] + [] vfs_open+0x49/0x50 + [] do_last+0x564/0x1240 + [] ? link_path_walk+0x256/0x880 + [] ? apparmor_file_alloc_security+0x5b/0x180 + [] ? 
security_file_alloc+0x16/0x20
+ [] path_openat+0xbb/0x650
+ [] do_filp_open+0x3a/0x90
+ [] ? do_mmap_pgoff+0x34e/0x3d0
+ [] ? __alloc_fd+0xa7/0x130
+ [] do_sys_open+0x129/0x280
+ [] ? do_page_fault+0x1a/0x70
+ [] SyS_openat+0x14/0x20
+ [] system_call_fastpath+0x1a/0x1f
+
+ [Regression Potential]
+
+ * Minimal; this change was already applied to vivid+wily via bug 1449074.
+
+ [Other Info]
+
+ * Solution is to add keyutils as a required dependency (Depends:) of nfs-common.
+
+ Original Description
+ __
+
  ---Problem Description---
  System hang when process accounting to a NFSv4 mount.

  ---uname output---
  Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  ---Problem Details---
  We have a customer that is experiencing intermittent system hangs on their system. After a bit of debug, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

  During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

  1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
  2. The resulting NFS write needs idmapd information and calls out to idmapd
  3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls" - #0 [c01fd274a950] __switch_to at c0015934 - #1 [c01fd274ab20] __switch_to at c0015934 - #2 [c01fd274ab80] __schedule at c0a11de8 - #3 [c01fd274ada0] schedule_timeout at c0a16284 - #4 [c01fd274ae90] wait_for_common at c0a1360c - #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38 - #6 [c01fd274af70] call_sbin_request_key at c0429258 - #7 [c01fd274b100] request_key_and_link at c042983c - #8 [c01fd274b200] request_key at c0429978 - #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4] + #0 [c01fd274a950] __switch_to at c0015934 + #1 [c01fd274ab20] __switch_to at c0015934 + #2 [c01fd274ab80] __schedule at c0a11de8 + #3 [c01fd274ada0] schedule_timeout at c0a16284 + #4 [c01fd274ae90] wait_for_common at c0a1360c + #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38 + #6 [c01fd274af70] call_sbin_request_key at c0429258 + #7 [c01fd274b100] request_key_and_link at c042983c + #8 [c01fd274b200] request_key at c0429978 + #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4] #10 [c01fd274b2b0] nfs_map_name_to
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
Add keyutils to Depends: for nfs-common ** Patch added: "lp1509120.trusty.debdiff" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1509120/+attachment/4512348/+files/lp1509120.trusty.debdiff ** Changed in: nfs-utils (Ubuntu) Importance: Undecided => Medium ** Changed in: linux (Ubuntu) Importance: High => Medium ** Changed in: linux (Ubuntu) Status: In Progress => Invalid ** Changed in: nfs-utils (Ubuntu) Status: New => In Progress -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1509120 Title: Process accounting deadlock with idmapd callout when writing to NFSv4 mount Status in linux package in Ubuntu: Invalid Status in nfs-utils package in Ubuntu: In Progress Bug description: [Impact] * Programs accessing nfsv4 mounts will hang on request_key interface with nfs4 + sec=sys with old nfsv4 hosts. Kernel is waiting on usermodehelper provided by keyutils. * INFO: task ls:2101 blocked for more than 120 seconds. Not tainted 3.13.0-66-generic #108-Ubuntu "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ls D 88007fd13180 0 2101 1215 0x0004 88007b14d630 0086 8800374e6000 88007b14dfd8 00013180 00013180 8800374e6000 88007b14d6b0 88007ffd1460 0002 812d0ce0 88007b14d6a0 Call Trace: [] ? umh_keys_init+0x20/0x20 [] schedule+0x29/0x70 [] key_wait_bit+0xe/0x20 [] __wait_on_bit+0x62/0x90 [] ? umh_keys_init+0x20/0x20 [] out_of_line_wait_on_bit+0x77/0x90 [] ? autoremove_wake_function+0x40/0x40 [] wait_for_key_construction+0x6e/0x80 [] request_key+0x5c/0xa0 [] nfs_idmap_get_key+0xaf/0x1c0 [nfsv4] [] nfs_map_name_to_uid+0xef/0x150 [nfsv4] [] decode_getfattr_attrs+0xe47/0x14b0 [nfsv4] [] ? sched_clock+0x9/0x10 [] decode_getfattr_generic.constprop.102+0x8c/0xf0 [nfsv4] [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] [] nfs4_xdr_dec_getattr+0x70/0x80 [nfsv4] [] rpcauth_unwrap_resp+0x86/0xd0 [sunrpc] [] ? 
nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4]
[] call_decode+0x1df/0x870 [sunrpc]
[] ? call_refreshresult+0x170/0x170 [sunrpc]
[] ? call_refreshresult+0x170/0x170 [sunrpc]
[] __rpc_execute+0x84/0x400 [sunrpc]
[] rpc_execute+0x5e/0xa0 [sunrpc]
[] rpc_run_task+0x70/0x90 [sunrpc]
[] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
[] _nfs4_proc_getattr+0xbe/0xd0 [nfsv4]
[] nfs4_proc_getattr+0x5a/0xd0 [nfsv4]
[] __nfs_revalidate_inode+0xbf/0x310 [nfs]
[] nfs_opendir+0xe3/0x100 [nfs]
[] do_dentry_open+0x233/0x2e0
[] ? nfs_readdir_clear_array+0x70/0x70 [nfs]
[] vfs_open+0x49/0x50
[] do_last+0x564/0x1240
[] ? link_path_walk+0x256/0x880
[] ? apparmor_file_alloc_security+0x5b/0x180
[] ? security_file_alloc+0x16/0x20
[] path_openat+0xbb/0x650
[] do_filp_open+0x3a/0x90
[] ? do_mmap_pgoff+0x34e/0x3d0
[] ? __alloc_fd+0xa7/0x130
[] do_sys_open+0x129/0x280
[] ? do_page_fault+0x1a/0x70
[] SyS_openat+0x14/0x20
[] system_call_fastpath+0x1a/0x1f

[Regression Potential]

 * Minimal; this change was already applied to vivid+wily via bug 1449074.

[Other Info]

 * Solution is to add keyutils as a required dependency (Depends:) of nfs-common.

Original Description
__

---Problem Description---
System hang when process accounting to an NFSv4 mount.

---uname output---
Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

---Problem Details---
We have a customer that is experiencing intermittent system hangs on their system. After some debugging, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
2. The resulting NFS write needs idmapd information and calls out to idmapd.
3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.

PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls"
#0 [c01fd274a950] __switch_to at c0015934
#1 [c01fd274ab20] __switch_to at c0015934
#2 [c01fd274ab80] __schedule at c0a11de8
#3 [c01fd274ada0] schedule_timeout at c0a16284
#4 [c01fd274ae90] wait_for_common at c0a1360c
#5 [c01fd274af10] call_usermodehelper_exec at c00ccd38
#6 [c01fd274af70] call_sbin_request_key at c0429258
#7 [c01fd274b100] request_key_and_link at c042983c
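For anyone checking a Trusty client by hand: the call_sbin_request_key frame in the backtrace above is where the kernel hands the id-mapping request to a userspace helper. The following is only a minimal sketch, assuming the helper lives at /sbin/request-key and is shipped by keyutils on Ubuntu (both are assumptions, not something stated in this thread); find_request_key_helper is a hypothetical name used for illustration.

```python
import os
import shutil

# Assumption: the request_key upcall runs /sbin/request-key, and on Ubuntu
# that binary is provided by the keyutils package.
DEFAULT_HELPER = "/sbin/request-key"


def find_request_key_helper(path=DEFAULT_HELPER):
    """Return the path of an executable request-key helper, or None."""
    if os.path.isfile(path) and os.access(path, os.X_OK):
        return path
    # PATH lookup is a diagnostic fallback only; it just tells us whether
    # the binary exists somewhere else on this system.
    return shutil.which("request-key")


if __name__ == "__main__":
    helper = find_request_key_helper()
    if helper is None:
        print("request-key helper not found (keyutils may not be installed)")
    else:
        print("request-key helper found at", helper)
```

This only reports whether the helper binary is present; it does not verify that id mapping itself is configured correctly.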
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
I verified that keyutils is already included in vivid+ in order to fix 1449074. So this shouldn't be too much of an issue to fix in trusty as well. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1509120 Title: Process accounting deadlock with idmapd callout when writing to NFSv4 mount Status in linux package in Ubuntu: In Progress Status in nfs-utils package in Ubuntu: New Bug description: ---Problem Description--- System hang when process accounting to a NFSv4 mount. ---uname output--- Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux ---Problem Details--- We have a customer that is experiencing intermittent system hangs on their system. After a bit of debug, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount. During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following: 1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file. 2. The resulting NFS write needs idmapd information and calls out to idmapd 3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898. 
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls"
#0 [c01fd274a950] __switch_to at c0015934
#1 [c01fd274ab20] __switch_to at c0015934
#2 [c01fd274ab80] __schedule at c0a11de8
#3 [c01fd274ada0] schedule_timeout at c0a16284
#4 [c01fd274ae90] wait_for_common at c0a1360c
#5 [c01fd274af10] call_usermodehelper_exec at c00ccd38
#6 [c01fd274af70] call_sbin_request_key at c0429258
#7 [c01fd274b100] request_key_and_link at c042983c
#8 [c01fd274b200] request_key at c0429978
#9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4]
#10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4]
#11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4]
#12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4]
#13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4]
#14 [c01fd274b530] rpcauth_unwrap_resp at d0001fe67180 [sunrpc]
#15 [c01fd274b600] call_decode at d0001fe527c8 [sunrpc]
#16 [c01fd274b6b0] __rpc_execute at d0001fe64260 [sunrpc]
#17 [c01fd274b790] rpc_run_task at d0001fe54a78 [sunrpc]
#18 [c01fd274b7c0] nfs4_call_sync_sequence at d0002ca60960 [nfsv4]
#19 [c01fd274b860] _nfs4_proc_getattr at d0002ca6217c [nfsv4]
#20 [c01fd274b930] nfs4_proc_getattr at d0002ca6f494 [nfsv4]
#21 [c01fd274b9a0] __nfs_revalidate_inode at d000202cf614 [nfs]
#22 [c01fd274ba30] nfs_revalidate_file_size at d000202c9618 [nfs]
#23 [c01fd274ba70] nfs_file_write at d000202cabdc [nfs]
#24 [c01fd274bb00] new_sync_write at c02b3d9c
#25 [c01fd274bbd0] __kernel_write at c02b3fec
#26 [c01fd274bc20] do_acct_process at c0166b78
#27 [c01fd274bcc0] acct_process at c016748c
#28 [c01fd274bcf0] do_exit at c00b3660
#29 [c01fd274bdc0] do_group_exit at c00b3b14
#30 [c01fd274be00] sys_exit_group at c00b3bdc
#31 [c01fd274be30] system_call at c0009258

PID: 4900 TASK: c03c9946c180 CPU: 16 COMMAND: "kworker/u320:2"
#0 [c03c994fb790] __switch_to at c0015934
#1 [c03c994fb960] __switch_to at c0015934
#2 [c03c994fb9c0] __schedule at c0a11de8
#3 [c03c994fbbe0] schedule_preempt_disabled at c0a12980
#4 [c03c994fbc00] __mutex_lock_slowpath at c0a14aec
#5 [c03c994fbc80] mutex_lock at c0a14c4c
#6 [c03c994fbcb0] acct_get at c01663ec
#7 [c03c994fbcf0] acct_process at c0167480
#8 [c03c994fbd20] do_exit at c00b3660
#9 [c03c994fbdf0] call_usermodehelper at c00ccaf4
#10 [c03c994fbe30] ret_from_kernel_thread at c000956c

Historical bug data:

from customer:
am uploading a crash dump file from a lock up event that I just had. I was reminded on a status update call this morning that I never tried running process accounting since opening the ticket and updating the kernel. So, I tried that this morning. The first time I turned it on with the default output location and didn't have any problems. Then I tried turning it on with the output going to our shared disk space, which is where I was originally sending it. I ran a couple commands that returned j
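The circular wait shown by the two backtraces above (PID 4898 and PID 4900) can be modeled in userspace. This is a simplified sketch, not kernel code: acct_mutex, helper_done, and the function names are hypothetical stand-ins, and timeouts are used so the model reports the stall instead of hanging the way the real system does.

```python
import threading

acct_mutex = threading.Lock()    # stand-in for the process accounting mutex
helper_done = threading.Event()  # stand-in for the usermodehelper completion
events = []


def helper_thread():
    # Models PID 4900: the helper exits, and exit-time accounting needs
    # the mutex that the writer is still holding.
    got = acct_mutex.acquire(timeout=0.1)
    events.append(("helper_got_mutex", got))
    if got:
        acct_mutex.release()
        helper_done.set()
    # On failure the completion is never signalled, as in the real hang.


def writer_thread():
    # Models PID 4898: accounting takes the mutex, the NFS write needs an
    # id mapping, so it launches the helper and waits for it to complete.
    with acct_mutex:
        t = threading.Thread(target=helper_thread)
        t.start()
        finished = helper_done.wait(timeout=0.5)
        events.append(("writer_saw_helper_finish", finished))
    t.join()


def run_model():
    events.clear()
    helper_done.clear()
    w = threading.Thread(target=writer_thread)
    w.start()
    w.join()
    return dict(events)


if __name__ == "__main__":
    print(run_model())
```

Both flags come back False: the helper never gets the mutex, and the writer never sees the helper complete. In the kernel neither wait has a timeout, so both tasks stay blocked.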
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Changed in: nfs-utils (Ubuntu) Assignee: (unassigned) => Dave Chiluk (chiluk) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1509120 Title: Process accounting deadlock with idmapd callout when writing to NFSv4 mount Status in linux package in Ubuntu: In Progress Status in nfs-utils package in Ubuntu: New Bug description: ---Problem Description--- System hang when process accounting to a NFSv4 mount. ---uname output--- Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux ---Problem Details--- We have a customer that is experiencing intermittent system hangs on their system. After a bit of debug, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount. During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following: 1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file. 2. The resulting NFS write needs idmapd information and calls out to idmapd 3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898. 
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls"
#0 [c01fd274a950] __switch_to at c0015934
#1 [c01fd274ab20] __switch_to at c0015934
#2 [c01fd274ab80] __schedule at c0a11de8
#3 [c01fd274ada0] schedule_timeout at c0a16284
#4 [c01fd274ae90] wait_for_common at c0a1360c
#5 [c01fd274af10] call_usermodehelper_exec at c00ccd38
#6 [c01fd274af70] call_sbin_request_key at c0429258
#7 [c01fd274b100] request_key_and_link at c042983c
#8 [c01fd274b200] request_key at c0429978
#9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4]
#10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4]
#11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4]
#12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4]
#13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4]
#14 [c01fd274b530] rpcauth_unwrap_resp at d0001fe67180 [sunrpc]
#15 [c01fd274b600] call_decode at d0001fe527c8 [sunrpc]
#16 [c01fd274b6b0] __rpc_execute at d0001fe64260 [sunrpc]
#17 [c01fd274b790] rpc_run_task at d0001fe54a78 [sunrpc]
#18 [c01fd274b7c0] nfs4_call_sync_sequence at d0002ca60960 [nfsv4]
#19 [c01fd274b860] _nfs4_proc_getattr at d0002ca6217c [nfsv4]
#20 [c01fd274b930] nfs4_proc_getattr at d0002ca6f494 [nfsv4]
#21 [c01fd274b9a0] __nfs_revalidate_inode at d000202cf614 [nfs]
#22 [c01fd274ba30] nfs_revalidate_file_size at d000202c9618 [nfs]
#23 [c01fd274ba70] nfs_file_write at d000202cabdc [nfs]
#24 [c01fd274bb00] new_sync_write at c02b3d9c
#25 [c01fd274bbd0] __kernel_write at c02b3fec
#26 [c01fd274bc20] do_acct_process at c0166b78
#27 [c01fd274bcc0] acct_process at c016748c
#28 [c01fd274bcf0] do_exit at c00b3660
#29 [c01fd274bdc0] do_group_exit at c00b3b14
#30 [c01fd274be00] sys_exit_group at c00b3bdc
#31 [c01fd274be30] system_call at c0009258

PID: 4900 TASK: c03c9946c180 CPU: 16 COMMAND: "kworker/u320:2"
#0 [c03c994fb790] __switch_to at c0015934
#1 [c03c994fb960] __switch_to at c0015934
#2 [c03c994fb9c0] __schedule at c0a11de8
#3 [c03c994fbbe0] schedule_preempt_disabled at c0a12980
#4 [c03c994fbc00] __mutex_lock_slowpath at c0a14aec
#5 [c03c994fbc80] mutex_lock at c0a14c4c
#6 [c03c994fbcb0] acct_get at c01663ec
#7 [c03c994fbcf0] acct_process at c0167480
#8 [c03c994fbd20] do_exit at c00b3660
#9 [c03c994fbdf0] call_usermodehelper at c00ccaf4
#10 [c03c994fbe30] ret_from_kernel_thread at c000956c

Historical bug data:

from customer:
am uploading a crash dump file from a lock up event that I just had. I was reminded on a status update call this morning that I never tried running process accounting since opening the ticket and updating the kernel. So, I tried that this morning. The first time I turned it on with the default output location and didn't have any problems. Then I tried turning it on with the output going to our shared disk space, which is where I was originally sending it. I ran a couple commands that returned just fine, then when I ran a CUDA test program it didn't retur
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
I created a centos 6 nfsv4 server, and went through the above recreate procedure with a trusty guest, and could successfully mount, and do file operations with accton. However after a few minutes, the console hung, and most tasks reported the following stack traces in /var/log/kern.log INFO: task ls:2101 blocked for more than 120 seconds. Not tainted 3.13.0-66-generic #108-Ubuntu "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ls D 88007fd13180 0 2101 1215 0x0004 88007b14d630 0086 8800374e6000 88007b14dfd8 00013180 00013180 8800374e6000 88007b14d6b0 88007ffd1460 0002 812d0ce0 88007b14d6a0 Call Trace: [] ? umh_keys_init+0x20/0x20 [] schedule+0x29/0x70 [] key_wait_bit+0xe/0x20 [] __wait_on_bit+0x62/0x90 [] ? umh_keys_init+0x20/0x20 [] out_of_line_wait_on_bit+0x77/0x90 [] ? autoremove_wake_function+0x40/0x40 [] wait_for_key_construction+0x6e/0x80 [] request_key+0x5c/0xa0 [] nfs_idmap_get_key+0xaf/0x1c0 [nfsv4] [] nfs_map_name_to_uid+0xef/0x150 [nfsv4] [] decode_getfattr_attrs+0xe47/0x14b0 [nfsv4] [] ? sched_clock+0x9/0x10 [] decode_getfattr_generic.constprop.102+0x8c/0xf0 [nfsv4] [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] [] nfs4_xdr_dec_getattr+0x70/0x80 [nfsv4] [] rpcauth_unwrap_resp+0x86/0xd0 [sunrpc] [] ? nfs4_xdr_dec_access+0xa0/0xa0 [nfsv4] [] call_decode+0x1df/0x870 [sunrpc] [] ? call_refreshresult+0x170/0x170 [sunrpc] [] ? call_refreshresult+0x170/0x170 [sunrpc] [] __rpc_execute+0x84/0x400 [sunrpc] [] rpc_execute+0x5e/0xa0 [sunrpc] [] rpc_run_task+0x70/0x90 [sunrpc] [] nfs4_call_sync_sequence+0x56/0x80 [nfsv4] [] _nfs4_proc_getattr+0xbe/0xd0 [nfsv4] [] nfs4_proc_getattr+0x5a/0xd0 [nfsv4] [] __nfs_revalidate_inode+0xbf/0x310 [nfs] [] nfs_opendir+0xe3/0x100 [nfs] [] do_dentry_open+0x233/0x2e0 [] ? nfs_readdir_clear_array+0x70/0x70 [nfs] [] vfs_open+0x49/0x50 [] do_last+0x564/0x1240 [] ? link_path_walk+0x256/0x880 [] ? apparmor_file_alloc_security+0x5b/0x180 [] ? 
security_file_alloc+0x16/0x20 [] path_openat+0xbb/0x650 [] do_filp_open+0x3a/0x90 [] ? do_mmap_pgoff+0x34e/0x3d0 [] ? __alloc_fd+0xa7/0x130 [] do_sys_open+0x129/0x280 [] ? do_page_fault+0x1a/0x70 [] SyS_openat+0x14/0x20 [] system_call_fastpath+0x1a/0x1f I then went and installed keyutils and retested, and could not reproduce the hung tasks. At first look this looks to be the endpoint for request_key_and_link out of the kernel. I will verify this tomorrow, and I will look at fixing the dependency tomorrow. Thank you, ** Also affects: nfs-utils (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1509120 Title: Process accounting deadlock with idmapd callout when writing to NFSv4 mount Status in linux package in Ubuntu: In Progress Status in nfs-utils package in Ubuntu: New Bug description: ---Problem Description--- System hang when process accounting to a NFSv4 mount. ---uname output--- Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux ---Problem Details--- We have a customer that is experiencing intermittent system hangs on their system. After a bit of debug, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount. During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following: 1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file. 2. The resulting NFS write needs idmapd information and calls out to idmapd 3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898. 
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls" #0 [c01fd274a950] __switch_to at c0015934 #1 [c01fd274ab20] __switch_to at c0015934 #2 [c01fd274ab80] __schedule at c0a11de8 #3 [c01fd274ada0] schedule_timeout at c0a16284 #4 [c01fd274ae90] wait_for_common at c0a1360c #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38 #6 [c01fd274af70] call_sbin_request_key at c0429258 #7 [c01fd274b100] request_key_and_link at c042983c #8 [c01fd274b200] request_key at c0429978 #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4] #10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4] #11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4] #12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4] #13 [c00
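When repeating the reproduction described in this message, the hung-task reports in /var/log/kern.log can be picked out mechanically. A small sketch, assuming the report line has the same shape as the one quoted above ("INFO: task ls:2101 blocked for more than 120 seconds."); find_hung_tasks is a hypothetical helper name, and other kernel versions may word the line differently.

```python
import re

# Matches lines such as:
#   INFO: task ls:2101 blocked for more than 120 seconds.
# The command name may itself contain colons (e.g. kworker/u320:2), so the
# greedy match lets the PID be the last colon-separated number.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Yield (command, pid, seconds) for each hung-task report in lines."""
    for line in lines:
        m = HUNG_TASK_RE.search(line)
        if m:
            yield m.group("comm"), int(m.group("pid")), int(m.group("secs"))


if __name__ == "__main__":
    sample = [
        "INFO: task ls:2101 blocked for more than 120 seconds.",
        "Not tainted 3.13.0-66-generic #108-Ubuntu",
    ]
    for comm, pid, secs in find_hung_tasks(sample):
        print(comm, pid, secs)
```

Feeding it an open log file works the same way, since file objects iterate line by line.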
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Changed in: linux (Ubuntu)
     Assignee: Chris J Arges (arges) => Dave Chiluk (chiluk)

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1509120

Title:
  Process accounting deadlock with idmapd callout when writing to NFSv4 mount

Status in linux package in Ubuntu:
  In Progress

Bug description:
  ---Problem Description---
  System hang when process accounting to a NFSv4 mount.

  ---uname output---
  Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  ---Problem Details---
  We have a customer that is experiencing intermittent system hangs on their system. After a bit of debug, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

  During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

  1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
  2. The resulting NFS write needs idmapd information and calls out to idmapd
  3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.

  PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls"
   #0 [c01fd274a950] __switch_to at c0015934
   #1 [c01fd274ab20] __switch_to at c0015934
   #2 [c01fd274ab80] __schedule at c0a11de8
   #3 [c01fd274ada0] schedule_timeout at c0a16284
   #4 [c01fd274ae90] wait_for_common at c0a1360c
   #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38
   #6 [c01fd274af70] call_sbin_request_key at c0429258
   #7 [c01fd274b100] request_key_and_link at c042983c
   #8 [c01fd274b200] request_key at c0429978
   #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4]
  #10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4]
  #11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4]
  #12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4]
  #13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4]
  #14 [c01fd274b530] rpcauth_unwrap_resp at d0001fe67180 [sunrpc]
  #15 [c01fd274b600] call_decode at d0001fe527c8 [sunrpc]
  #16 [c01fd274b6b0] __rpc_execute at d0001fe64260 [sunrpc]
  #17 [c01fd274b790] rpc_run_task at d0001fe54a78 [sunrpc]
  #18 [c01fd274b7c0] nfs4_call_sync_sequence at d0002ca60960 [nfsv4]
  #19 [c01fd274b860] _nfs4_proc_getattr at d0002ca6217c [nfsv4]
  #20 [c01fd274b930] nfs4_proc_getattr at d0002ca6f494 [nfsv4]
  #21 [c01fd274b9a0] __nfs_revalidate_inode at d000202cf614 [nfs]
  #22 [c01fd274ba30] nfs_revalidate_file_size at d000202c9618 [nfs]
  #23 [c01fd274ba70] nfs_file_write at d000202cabdc [nfs]
  #24 [c01fd274bb00] new_sync_write at c02b3d9c
  #25 [c01fd274bbd0] __kernel_write at c02b3fec
  #26 [c01fd274bc20] do_acct_process at c0166b78
  #27 [c01fd274bcc0] acct_process at c016748c
  #28 [c01fd274bcf0] do_exit at c00b3660
  #29 [c01fd274bdc0] do_group_exit at c00b3b14
  #30 [c01fd274be00] sys_exit_group at c00b3bdc
  #31 [c01fd274be30] system_call at c0009258

  PID: 4900 TASK: c03c9946c180 CPU: 16 COMMAND: "kworker/u320:2"
   #0 [c03c994fb790] __switch_to at c0015934
   #1 [c03c994fb960] __switch_to at c0015934
   #2 [c03c994fb9c0] __schedule at c0a11de8
   #3 [c03c994fbbe0] schedule_preempt_disabled at c0a12980
   #4 [c03c994fbc00] __mutex_lock_slowpath at c0a14aec
   #5 [c03c994fbc80] mutex_lock at c0a14c4c
   #6 [c03c994fbcb0] acct_get at c01663ec
   #7 [c03c994fbcf0] acct_process at c0167480
   #8 [c03c994fbd20] do_exit at c00b3660
   #9 [c03c994fbdf0] call_usermodehelper at c00ccaf4
  #10 [c03c994fbe30] ret_from_kernel_thread at c000956c

  Historical bug data:
  from customer:
  am uploading a crash dump file from a lock up event that I just had. I was reminded on a status update call this morning that I never tried running process accounting since opening the ticket and updating the kernel. So, I tried that this morning. The first time I turned it on with the default output location and didn't have any problems. Then I tried turning it on with the output going to our shared disk space, which is where I was originally sending it. I ran a couple commands that returned just fine, then when I ran a CUDA test program it didn't return and I verified that I was no longer ab
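The three-step scenario in the problem details can be pictured as a wait-for graph with a cycle in it. The following is only a minimal sketch of that idea in Python: the `WAITS_FOR` table and the `find_cycle` helper are hypothetical names made up for illustration, and the edges are read off the two backtraces in the report rather than taken from any kernel data structure.

```python
from typing import Dict, List, Optional

# Hypothetical wait-for graph: each task maps to the task it is blocked on.
# Both edges are an interpretation of the backtraces in the bug description.
WAITS_FOR: Dict[str, str] = {
    # "ls" is inside do_acct_process() and waits in call_usermodehelper_exec()
    # for the request_key helper to finish.
    "ls (PID 4898)": "kworker/u320:2 (PID 4900)",
    # The helper thread exits via acct_process() and blocks in mutex_lock()
    # on the accounting mutex that, per the report, PID 4898 holds.
    "kworker/u320:2 (PID 4900)": "ls (PID 4898)",
}


def find_cycle(graph: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow wait-for edges from start; return the cycle found, or None."""
    seen: List[str] = []
    node: Optional[str] = start
    while node is not None and node not in seen:
        seen.append(node)
        node = graph.get(node)
    if node is None:
        return None  # the chain ended at a task that is not waiting
    # node was visited before, so everything from its first visit is the cycle
    return seen[seen.index(node):] + [node]


if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR, "ls (PID 4898)")
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Each task here waits on at most one other task, which is enough for the two-party situation described in the report; a general lock-dependency checker would need a richer graph.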
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
So I did some more searching on this, including running a tcpdump in my environment, and it appears that idmapd is not making any requests for me. That is why we aren't hitting the deadlock. Did you do anything specific to get idmapd up and configured?
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
I also completed the same test as Chris above, without failure. I was using Ubuntu Trusty + 3.13 as the server and Trusty + 3.19 as the guest. At the moment I don't see any reason to assume that there is anything POWER-specific in this issue, so I attempted it on purely x86_64 hardware.

Also, can we get the output of the mount command from the client so we can verify that all /etc/fstab options were respected? It's possible that mount options were passed to the server but were overridden by server capability bits.
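One way to answer the question about whether the requested mount options were respected is to compare the option string passed to mount with the one the client reports afterwards. The sketch below is a rough illustration only: `parse_mount_options` and `diff_options` are hypothetical helpers, and the "effective" string in the example is made up rather than taken from this bug.

```python
from typing import Dict, Tuple, Union

OptionValue = Union[str, bool]


def parse_mount_options(opts: str) -> Dict[str, OptionValue]:
    """Split a comma-separated option string into a dict.

    Flags such as "rw" map to True; "key=value" items map to the value string.
    """
    result: Dict[str, OptionValue] = {}
    for item in opts.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


def diff_options(requested: str, effective: str) -> Dict[str, Tuple]:
    """Return {option: (requested, effective)} for every requested option
    that is missing or has a different value in the effective set."""
    req = parse_mount_options(requested)
    eff = parse_mount_options(effective)
    changed = {}
    for key, want in req.items():
        got = eff.get(key)  # None if the option is not reported at all
        if got != want:
            changed[key] = (want, got)
    return changed


if __name__ == "__main__":
    requested = "rsize=1048576,wsize=1048576,sec=sys"
    # Made-up example of what a client might report after negotiation.
    effective = "rsize=524288,wsize=1048576,sec=sys"
    print(diff_options(requested, effective))
```

A plain string comparison like this will also flag options that the client merely reports under a different spelling, so any difference it finds still needs a human look.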
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
I'm having an issue reproducing this problem; perhaps I'm missing something so I'll explain my reproduction steps in detail.

Server:
1) sudo apt-get install nfs-kernel-server
2) echo '/export *(rw,sync,no_root_squash,no_subtree_check,fsid=299)' > /etc/exports
3) sudo exportfs -a && sudo service nfs-kernel-server restart

Client (power8 machine):
1) sudo apt-get install nfs-common acct
2) sudo mkdir /account
3) sudo mount -t nfs4 -o rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=2049,timeo=600,retrans=2,sec=sys,minorversion=0,local_lock=none 10.245.80.76:/export /account
4) sudo touch /account/pacct
5) sudo accton /account/pacct
6) cd /account/
7) ls
8) cat pacct

Any suggestions would be welcome.
Thanks
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
My apologies, I just noticed the deadlock on the mutex comment.

Is this reproducible with the upstream kernel on the client? We provide mainline kernel builds for testing of this nature using our mainline build repositories.
http://kernel.ubuntu.com/~kernel-ppa/mainline/

Also keep in mind, that we are working on reproducing this in-house as well.
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Changed in: linux (Ubuntu)
     Assignee: Canonical Kernel Team (canonical-kernel-team) => Chris J Arges (arges)

** Changed in: linux (Ubuntu)
       Status: Triaged => In Progress
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
Is this testcase reproducible using an nfs server with a recent kernel? Lots of improvements have been made to NFS in the many years since 2.6.32, including some concurrency improvements.

Thanks,
Dave.
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Changed in: linux (Ubuntu)
       Status: New => Triaged

** Changed in: linux (Ubuntu)
     Assignee: Taco Screen team (taco-screen-team) => Canonical Kernel Team (canonical-kernel-team)

Status in linux package in Ubuntu:
  Triaged
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1509120

Title:
  Process accounting deadlock with idmapd callout when writing to NFSv4 mount

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  System hang when process accounting to an NFSv4 mount.

  ---uname output---
  Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  ---Problem Details---
  We have a customer that is experiencing intermittent system hangs on their system. After some debugging, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

  During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

  1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
  2. The resulting NFS write needs idmapd information and calls out to idmapd.
  3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.
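The three steps above can be pictured with a small user-space sketch. This is only an illustration in Python, not kernel code: `acct_mutex`, `helper_exit`, and `simulate_acct_deadlock` are hypothetical names, and the timeouts are an assumption added so the demonstration terminates, whereas the waits seen in the vmcore have no such bound.

```python
import threading


def simulate_acct_deadlock(timeout=0.2):
    """Model the lock cycle with timeouts so the demo terminates."""
    # Hypothetical stand-in for the process accounting mutex.
    acct_mutex = threading.Lock()
    result = {"task_stalled": False, "helper_blocked": False}

    def helper_exit():
        # Step 3: the helper exits and wants to write its own accounting
        # record, which needs the mutex the exiting task still holds.
        if acct_mutex.acquire(timeout=timeout):
            acct_mutex.release()
        else:
            result["helper_blocked"] = True

    # Step 1: the exiting task takes the mutex to write its record.
    with acct_mutex:
        # Step 2: the write needs an ID mapping, so a helper is started
        # and the task waits for it while still holding the mutex.
        helper = threading.Thread(target=helper_exit)
        helper.start()
        helper.join(timeout / 2)
        result["task_stalled"] = helper.is_alive()
        # The real wait would not end here; the sketch lets the helper's
        # bounded acquire expire so that both threads can finish.
        helper.join()

    return result


if __name__ == "__main__":
    print(simulate_acct_deadlock())
```

Both flags in the returned dictionary report whether each side was found waiting on the other, which is the same cycle the two backtraces show: one task waiting for the helper to complete, and the helper waiting for the mutex.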
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls"
 #0 [c01fd274a950] __switch_to at c0015934
 #1 [c01fd274ab20] __switch_to at c0015934
 #2 [c01fd274ab80] __schedule at c0a11de8
 #3 [c01fd274ada0] schedule_timeout at c0a16284
 #4 [c01fd274ae90] wait_for_common at c0a1360c
 #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38
 #6 [c01fd274af70] call_sbin_request_key at c0429258
 #7 [c01fd274b100] request_key_and_link at c042983c
 #8 [c01fd274b200] request_key at c0429978
 #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4]
#10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4]
#11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4]
#12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4]
#13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4]
#14 [c01fd274b530] rpcauth_unwrap_resp at d0001fe67180 [sunrpc]
#15 [c01fd274b600] call_decode at d0001fe527c8 [sunrpc]
#16 [c01fd274b6b0] __rpc_execute at d0001fe64260 [sunrpc]
#17 [c01fd274b790] rpc_run_task at d0001fe54a78 [sunrpc]
#18 [c01fd274b7c0] nfs4_call_sync_sequence at d0002ca60960 [nfsv4]
#19 [c01fd274b860] _nfs4_proc_getattr at d0002ca6217c [nfsv4]
#20 [c01fd274b930] nfs4_proc_getattr at d0002ca6f494 [nfsv4]
#21 [c01fd274b9a0] __nfs_revalidate_inode at d000202cf614 [nfs]
#22 [c01fd274ba30] nfs_revalidate_file_size at d000202c9618 [nfs]
#23 [c01fd274ba70] nfs_file_write at d000202cabdc [nfs]
#24 [c01fd274bb00] new_sync_write at c02b3d9c
#25 [c01fd274bbd0] __kernel_write at c02b3fec
#26 [c01fd274bc20] do_acct_process at c0166b78
#27 [c01fd274bcc0] acct_process at c016748c
#28 [c01fd274bcf0] do_exit at c00b3660
#29 [c01fd274bdc0] do_group_exit at c00b3b14
#30 [c01fd274be00] sys_exit_group at c00b3bdc
#31 [c01fd274be30] system_call at c0009258

PID: 4900 TASK: c03c9946c180 CPU: 16 COMMAND: "kworker/u320:2"
 #0 [c03c994fb790] __switch_to at c0015934
 #1 [c03c994fb960] __switch_to at c0015934
 #2 [c03c994fb9c0] __schedule at c0a11de8
 #3 [c03c994fbbe0] schedule_preempt_disabled at c0a12980
 #4 [c03c994fbc00] __mutex_lock_slowpath at c0a14aec
 #5 [c03c994fbc80] mutex_lock at c0a14c4c
 #6 [c03c994fbcb0] acct_get at c01663ec
 #7 [c03c994fbcf0] acct_process at c0167480
 #8 [c03c994fbd20] do_exit at c00b3660
 #9 [c03c994fbdf0] call_usermodehelper at c00ccaf4
#10 [c03c994fbe30] ret_from_kernel_thread at c000956c

Historical bug data:

from customer:
am uploading a crash dump file from a lock up event that I just had. I was reminded on a status update call this morning that I never tried running process accounting since opening the ticket and updating the kernel. So, I tried that this morning. The first time I turned it on with the default output location and didn't have any problems. Then I tried turning it on with the output going to our shared disk space, which is where I was originally sending it. I ran a couple commands that returned just fine, then when I ran a CUDA test program it didn't return and I verified that I was no longer able
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Changed in: linux (Ubuntu)
   Importance: Undecided => High

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1509120

Title:
  Process accounting deadlock with idmapd callout when writing to NFSv4 mount

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  System hang when process accounting to an NFSv4 mount.

  ---uname output---
  Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  ---Problem Details---
  We have a customer that is experiencing intermittent system hangs on their system. After some debugging, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

  During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

  1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
  2. The resulting NFS write needs idmapd information and calls out to idmapd.
  3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls"
 #0 [c01fd274a950] __switch_to at c0015934
 #1 [c01fd274ab20] __switch_to at c0015934
 #2 [c01fd274ab80] __schedule at c0a11de8
 #3 [c01fd274ada0] schedule_timeout at c0a16284
 #4 [c01fd274ae90] wait_for_common at c0a1360c
 #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38
 #6 [c01fd274af70] call_sbin_request_key at c0429258
 #7 [c01fd274b100] request_key_and_link at c042983c
 #8 [c01fd274b200] request_key at c0429978
 #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4]
#10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4]
#11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4]
#12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4]
#13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4]
#14 [c01fd274b530] rpcauth_unwrap_resp at d0001fe67180 [sunrpc]
#15 [c01fd274b600] call_decode at d0001fe527c8 [sunrpc]
#16 [c01fd274b6b0] __rpc_execute at d0001fe64260 [sunrpc]
#17 [c01fd274b790] rpc_run_task at d0001fe54a78 [sunrpc]
#18 [c01fd274b7c0] nfs4_call_sync_sequence at d0002ca60960 [nfsv4]
#19 [c01fd274b860] _nfs4_proc_getattr at d0002ca6217c [nfsv4]
#20 [c01fd274b930] nfs4_proc_getattr at d0002ca6f494 [nfsv4]
#21 [c01fd274b9a0] __nfs_revalidate_inode at d000202cf614 [nfs]
#22 [c01fd274ba30] nfs_revalidate_file_size at d000202c9618 [nfs]
#23 [c01fd274ba70] nfs_file_write at d000202cabdc [nfs]
#24 [c01fd274bb00] new_sync_write at c02b3d9c
#25 [c01fd274bbd0] __kernel_write at c02b3fec
#26 [c01fd274bc20] do_acct_process at c0166b78
#27 [c01fd274bcc0] acct_process at c016748c
#28 [c01fd274bcf0] do_exit at c00b3660
#29 [c01fd274bdc0] do_group_exit at c00b3b14
#30 [c01fd274be00] sys_exit_group at c00b3bdc
#31 [c01fd274be30] system_call at c0009258

PID: 4900 TASK: c03c9946c180 CPU: 16 COMMAND: "kworker/u320:2"
 #0 [c03c994fb790] __switch_to at c0015934
 #1 [c03c994fb960] __switch_to at c0015934
 #2 [c03c994fb9c0] __schedule at c0a11de8
 #3 [c03c994fbbe0] schedule_preempt_disabled at c0a12980
 #4 [c03c994fbc00] __mutex_lock_slowpath at c0a14aec
 #5 [c03c994fbc80] mutex_lock at c0a14c4c
 #6 [c03c994fbcb0] acct_get at c01663ec
 #7 [c03c994fbcf0] acct_process at c0167480
 #8 [c03c994fbd20] do_exit at c00b3660
 #9 [c03c994fbdf0] call_usermodehelper at c00ccaf4
#10 [c03c994fbe30] ret_from_kernel_thread at c000956c

Historical bug data:

from customer:
am uploading a crash dump file from a lock up event that I just had. I was reminded on a status update call this morning that I never tried running process accounting since opening the ticket and updating the kernel. So, I tried that this morning. The first time I turned it on with the default output location and didn't have any problems. Then I tried turning it on with the output going to our shared disk space, which is where I was originally sending it. I ran a couple commands that returned just fine, then when I ran a CUDA test program it didn't return and I verified that I was no longer able to login to the node. I let it
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1509120

Title:
  Process accounting deadlock with idmapd callout when writing to NFSv4 mount

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  System hang when process accounting to an NFSv4 mount.

  ---uname output---
  Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  ---Problem Details---
  We have a customer that is experiencing intermittent system hangs on their system. After some debugging, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount.

  During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following:

  1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file.
  2. The resulting NFS write needs idmapd information and calls out to idmapd.
  3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898.
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls"
 #0 [c01fd274a950] __switch_to at c0015934
 #1 [c01fd274ab20] __switch_to at c0015934
 #2 [c01fd274ab80] __schedule at c0a11de8
 #3 [c01fd274ada0] schedule_timeout at c0a16284
 #4 [c01fd274ae90] wait_for_common at c0a1360c
 #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38
 #6 [c01fd274af70] call_sbin_request_key at c0429258
 #7 [c01fd274b100] request_key_and_link at c042983c
 #8 [c01fd274b200] request_key at c0429978
 #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4]
#10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4]
#11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4]
#12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4]
#13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4]
#14 [c01fd274b530] rpcauth_unwrap_resp at d0001fe67180 [sunrpc]
#15 [c01fd274b600] call_decode at d0001fe527c8 [sunrpc]
#16 [c01fd274b6b0] __rpc_execute at d0001fe64260 [sunrpc]
#17 [c01fd274b790] rpc_run_task at d0001fe54a78 [sunrpc]
#18 [c01fd274b7c0] nfs4_call_sync_sequence at d0002ca60960 [nfsv4]
#19 [c01fd274b860] _nfs4_proc_getattr at d0002ca6217c [nfsv4]
#20 [c01fd274b930] nfs4_proc_getattr at d0002ca6f494 [nfsv4]
#21 [c01fd274b9a0] __nfs_revalidate_inode at d000202cf614 [nfs]
#22 [c01fd274ba30] nfs_revalidate_file_size at d000202c9618 [nfs]
#23 [c01fd274ba70] nfs_file_write at d000202cabdc [nfs]
#24 [c01fd274bb00] new_sync_write at c02b3d9c
#25 [c01fd274bbd0] __kernel_write at c02b3fec
#26 [c01fd274bc20] do_acct_process at c0166b78
#27 [c01fd274bcc0] acct_process at c016748c
#28 [c01fd274bcf0] do_exit at c00b3660
#29 [c01fd274bdc0] do_group_exit at c00b3b14
#30 [c01fd274be00] sys_exit_group at c00b3bdc
#31 [c01fd274be30] system_call at c0009258

PID: 4900 TASK: c03c9946c180 CPU: 16 COMMAND: "kworker/u320:2"
 #0 [c03c994fb790] __switch_to at c0015934
 #1 [c03c994fb960] __switch_to at c0015934
 #2 [c03c994fb9c0] __schedule at c0a11de8
 #3 [c03c994fbbe0] schedule_preempt_disabled at c0a12980
 #4 [c03c994fbc00] __mutex_lock_slowpath at c0a14aec
 #5 [c03c994fbc80] mutex_lock at c0a14c4c
 #6 [c03c994fbcb0] acct_get at c01663ec
 #7 [c03c994fbcf0] acct_process at c0167480
 #8 [c03c994fbd20] do_exit at c00b3660
 #9 [c03c994fbdf0] call_usermodehelper at c00ccaf4
#10 [c03c994fbe30] ret_from_kernel_thread at c000956c

Historical bug data:

from customer:
am uploading a crash dump file from a lock up event that I just had. I was reminded on a status update call this morning that I never tried running process accounting since opening the ticket and updating the kernel. So, I tried that this morning. The first time I turned it on with the default output location and didn't have any problems. Then I tried turning it on with the output going to our shared disk space, which is where I was originally sending it. I ran a couple commands that returned just fine, then when I ran a CUDA test program it didn't return and I verified that I was no longer able to login to the node. I let it sit for awhile and