[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
The suggested workaround is to use gssproxy. Unfortunately, that also has a bug:
https://bugs.launchpad.net/ubuntu/+source/gssproxy/+bug/1788459

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1466654

Title:
  kernel soft lockup on nfs server when using a kerberos mount

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1466654/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Manually compiling and installing the latest version of gssproxy did solve the issue for me.
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Why not use gss-proxy instead of rpc.svcgssd for NFS in Ubuntu as well, like other distros do?
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
@urraca: I don't believe they're directly related. There were no BUGs reported on that bug, and the issues on that bug are already fixed.
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
@sforshee: before I open another bug, could you have a quick look at 1650336 (comment #10) and check if that might be related (it does smell like it big time really...)?
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
N.B.: according to the changelog of the 4.4.0-70 kernel package, the patch was only applied in -67 (thus effectively in -70)! We'll have to retry and see whether that one makes our environment stable... :/
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
@urraca: I suspect that what you're seeing is not the same problem as the original report. Please file a new bug against the linux package and include the data from comment #8.
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Well, we were hoping that 4.4.0-66 would fix the issue, as https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650336 looked pretty similar (and would have made sense). Annoyingly, even patched and rebooted servers still crash on us randomly. This is getting urgent, as 12.04 is running out of support really soon!
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Any update on this? We have reduced the number of assigned groups per user to a sane level (~7) and still see random crashes (4 since last Saturday).
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Not 100% sure if we're seeing the same bug (we use plain Kerberos, no AD or Samba involved). However, since we started rolling out 14.04 and 16.04 in bigger numbers, we get literally hundreds of BUG messages (see below) and a few kernel panics a week (console output showing, among others, "oops_end" and "rpc_pipe_read", so we're pretty sure there's a direct connection between what we have in the logs and the panics).

The kernel BUGs _all_ look like this:

NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [stat:59963]

NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [stat:59963]
Modules linked in: cpuid 8021q garp mrp stp llc cts nfsv4 ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_conntrack nf_conntrack xt_multiport iptable_filter ip_tables x_tables autofs4 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif irqbypass ipmi_devintf crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac dcdbas ipmi_si wmi mei_me ipmi_msghandler edac_core mei shpchp 8250_fintek lpc_ich acpi_power_meter mac_hid rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl lp nfs parport lockd grace sunrpc fscache tg3 ahci megaraid_sas ptp libahci pps_core fjes
CPU: 5 PID: 59963 Comm: stat Tainted: G D W 4.4.0-53-generic #74~14.04.1-Ubuntu
Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.2.10 03/09/2015
task: 880b2b0b6040 ti: 8809e1d0 task.ti: 8809e1d0
RIP: 0010:[] [] native_queued_spin_lock_slowpath+0x160/0x170
RSP: 0018:8809e1d03960 EFLAGS: 0202
RAX: 0101 RBX: 8808d6319600 RCX: 0001
RDX: 0101 RSI: 0001 RDI: 881056f679c8
RBP: 8809e1d03960 R08: 0101 R09:
R10: R11: ea00415b4a00 R12: 881056f67900
R13: 881056f679c8 R14: 8808d631976b R15: 8808d6319600
FS: 7fdcb0f39840() GS:88105e48() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7fdcb0604160 CR3: 0008dd211000 CR4: 001406e0
Stack:
 8809e1d03970 81180e47 8809e1d03980 817fe4d0
 8809e1d039d0 c0114d50 8808d6319740 c0129060
 8810560eaf00 0001 81ef6bc0
Call Trace:
 [] queued_spin_lock_slowpath+0xb/0xf
 [] _raw_spin_lock+0x20/0x30
 [] gss_setup_upcall+0x160/0x390 [auth_rpcgss]
 [] gss_cred_init+0xce/0x350 [auth_rpcgss]
 [] ? prepare_to_wait_event+0xf0/0xf0
 [] rpcauth_lookup_credcache+0x1e3/0x280 [sunrpc]
 [] gss_lookup_cred+0xe/0x10 [auth_rpcgss]
 [] rpcauth_lookupcred+0x7c/0xb0 [sunrpc]
 [] rpcauth_refreshcred+0x12a/0x1a0 [sunrpc]
 [] ? call_bc_transmit+0x1a0/0x1a0 [sunrpc]
 [] ? call_bc_transmit+0x1a0/0x1a0 [sunrpc]
 [] ? call_retry_reserve+0x60/0x60 [sunrpc]
 [] ? call_retry_reserve+0x60/0x60 [sunrpc]
 [] call_refresh+0x3c/0x70 [sunrpc]
 [] __rpc_execute+0x86/0x440 [sunrpc]
 [] rpc_execute+0x5e/0xb0 [sunrpc]
 [] rpc_run_task+0x70/0x90 [sunrpc]
 [] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
 [] _nfs4_proc_statfs+0xb8/0xd0 [nfsv4]
 [] nfs4_proc_statfs+0x49/0x70 [nfsv4]
 [] nfs_statfs+0x59/0x170 [nfs]
 [] statfs_by_dentry+0x9b/0x120
 [] vfs_statfs+0x1b/0xb0
 [] user_statfs+0x49/0x80
 [] SYSC_statfs+0x15/0x30
 [] SyS_statfs+0xe/0x10
 [] entry_SYSCALL_64_fastpath+0x16/0x75
Code: 8b 01 48 85 c0 75 0a f3 90 48 8b 01 48 85 c0 74 f6 c7 40 08 01 00 00 00 e9 61 ff ff ff 83 fa 01 75 07 e9 c2 fe ff ff f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 5d c3 66 90 0f 1f 44 00 00

This is from yesterday, Sunday. Nobody was logged in to the machine (according to wtmp & wtmp.1, not since at least Feb 01, but those could of course have been borked by the panic). Nevertheless, there are 676 "gss_setup_upcall" lines between 2017-02-19T09:06:46.535855+01:00 and 2017-02-19T09:41:02.900460+01:00. And sure enough, after the last occurrence the server PANIC'd.

In this particular case it is 14.04.5, kernel 4.4.0-53-generic, but we've seen this with basically every 4.4.0-* flavour on 14.04 and 16.04.

This config has basically been in use (on 12.04) for a few years:

* /etc/krb5.conf (excerpt):
  [libdefaults]
  dns_lookup_realm = true
  dns_lookup_kdc = true
  kdc_timesync = 1
  ccache_type = 4
  forwardable = true
  proxiable = true

* autofs used for NFS directories, with options
  -fstype=nfs,intr,hard,fg,rsize=16384,wsize=16384,proto=tcp,timeo=600,retrans=3,port=2049,nfsvers=4,sec=krb5p,nodev,nosuid

* # grep -v '^#' /etc/default/autofs
  MASTER_MAP_NAME=/etc/auto.master
  TIMEOUT=300
  BROWSE_MODE=yes
  LOGGING=none
  USE_MISC_DEVICE=yes

* # grep -v '^#' /etc/default/nfs-common
  NEED_STATD=no
  STATDOPTS=
  NEED_GSSD=yes
  NEED_IDMAPD=yes

* /etc/idmapd.conf:
  [General]
  Verbosity = 0
  Pipefs-Directory
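
For anyone who wants to reproduce a count like the 676 "gss_setup_upcall" lines quoted above on their own logs, here is a minimal Python sketch. It assumes each log line starts with an ISO 8601 timestamp followed by a space, as in the timestamps quoted in this comment. The function name `count_in_window`, the host name and the sample lines are made up for illustration and are not taken from the reporter's system.

```python
from datetime import datetime


def count_in_window(lines, needle, start, end):
    """Count lines containing `needle` whose leading ISO 8601
    timestamp lies within [start, end] (both inclusive)."""
    lo = datetime.fromisoformat(start)
    hi = datetime.fromisoformat(end)
    count = 0
    for line in lines:
        # Assumption: the timestamp is the first space-separated field.
        stamp, _, rest = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # not a timestamped line, skip it
        if lo <= when <= hi and needle in rest:
            count += 1
    return count


# Hypothetical sample lines, for illustration only.
sample = [
    "2017-02-19T09:06:46.535855+01:00 host kernel: gss_setup_upcall+0x160/0x390 [auth_rpcgss]",
    "2017-02-19T09:41:02.900460+01:00 host kernel: gss_setup_upcall+0x160/0x390 [auth_rpcgss]",
    "2017-02-19T09:50:00.000000+01:00 host kernel: gss_setup_upcall+0x160/0x390 [auth_rpcgss]",
    "2017-02-19T09:10:00.000000+01:00 host kernel: unrelated message",
]

print(count_in_window(
    sample,
    "gss_setup_upcall",
    "2017-02-19T09:06:46.535855+01:00",
    "2017-02-19T09:41:02.900460+01:00",
))  # prints 2
```

To run it against a real file, pass an open file object instead of `sample`; the path and the timestamp format of your syslog may differ from what is assumed here.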
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
** Changed in: nfs-utils (Ubuntu)
   Importance: Undecided => Medium
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
This problem is still present in Ubuntu 16.04:

# Linux version 4.4.0-36-generic (buildd@lcy01-01) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2) ) #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016

The workaround from comment #2 (disable PAC) still works.

dmesg output:

[349643.292059] [] ? entry_SYSCALL_64_fastpath+0x16/0x71
[349668.156063] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [rpc.svcgssd:1342]
[349668.156063] Modules linked in: cts rpcsec_gss_krb5 intel_rapl x86_pkg_temp_thermal coretemp nfsd nfs_acl lockd auth_rpcgss grace sunrpc xenfs xen_privcmd autofs4
[349668.156063] CPU: 1 PID: 1342 Comm: rpc.svcgssd Tainted: G L 4.4.0-36-generic #55-Ubuntu
[349668.156063] task: 880006ce2580 ti: 88007bb78000 task.ti: 88007bb78000
[349668.156063] RIP: e030:[] [] qword_addhex+0x6c/0xe0 [sunrpc]
[349668.156063] RSP: e02b:88007bb7bdc8 EFLAGS: 0202
[349668.156063] RAX: 8800074089c3 RBX: 88003ed81980 RCX: 033f
[349668.156063] RDX: 88000740f4de RSI: 88007bb7be34 RDI: 88007bb7be38
[349668.156063] RBP: 88007bb7bdc8 R08: R09: 063d
[349668.156063] R10: 0001 R11: 0246 R12: 88007bb7be38
[349668.156063] R13: 88007bb7be34 R14: 8800794356a8 R15: fff5
[349668.156063] FS: 7fa8c93ed740() GS:88007d30() knlGS:
[349668.156063] CS: e033 DS: ES: CR0: 8005003b
[349668.156063] CR2: 7f261bf81000 CR3: 7813a000 CR4: 2660
[349668.156063] Stack:
[349668.156063] 88007bb7bdf0 c009045b 1000 8800793ce940
[349668.156063] 88007b8c3320 88007bb7be70 c0039101 8c6f5a6d
[349668.156063] 0001 88007b1d7e10 880007235bb0 880079435600
[349668.156063] Call Trace:
[349668.156063] [] rsi_request+0x3b/0x50 [auth_rpcgss]
[349668.156063] [] cache_read.isra.19+0x2b1/0x400 [sunrpc]
[349668.156063] [] cache_read_procfs+0x31/0x40 [sunrpc]
[349668.156063] [] proc_reg_read+0x42/0x70
[349668.156063] [] __vfs_read+0x18/0x40
[349668.156063] [] vfs_read+0x86/0x130
[349668.156063] [] SyS_read+0x55/0xc0
[349668.156063] [] entry_SYSCALL_64_fastpath+0x16/0x71
[349668.156063] Code: 00 00 84 c0 0f 84 84 00 00 00 4c 89 d0 eb 05 45 84 c0 74 5d 44 0f b6 02 48 83 c0 02 41 83 e9 02 45 89 c2 41 83 e0 0f 41 c0 ea 04 <45> 0f b6 80 80 4a a5 81 41 83 e2 0f 45 0f b6 92 80 4a a5 81 44

** Tags added: xenial
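
When comparing reports like the ones in this thread, it can help to see which process the watchdog blames in each soft lockup line (rpc.svcgssd on the server here, stat on an NFS client in a later comment). The following is a small Python sketch that tallies them from dmesg or syslog text. The function name `lockups_by_process` is invented for this example, and the regular expression only assumes the "BUG: soft lockup - CPU#N stuck for Ns! [comm:pid]" wording seen in the traces quoted in this bug.

```python
import re
from collections import Counter

# Matches e.g. "BUG: soft lockup - CPU#1 stuck for 22s! [rpc.svcgssd:1342]".
# The greedy ".+" lets command names that themselves contain ':' still match,
# because the pid is taken from the last ':' before the closing bracket.
LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)


def lockups_by_process(lines):
    """Return a Counter mapping process name -> number of soft lockup reports."""
    counts = Counter()
    for line in lines:
        match = LOCKUP.search(line)
        if match:
            counts[match.group("comm")] += 1
    return counts


# Sample lines copied from the traces quoted in this bug report.
sample = [
    "[349668.156063] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [rpc.svcgssd:1342]",
    "NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [stat:59963]",
    "[349668.156063] Call Trace:",
]

for comm, n in lockups_by_process(sample).items():
    print(comm, n)
```

Feeding it the output of dmesg (for example via `sys.stdin`) gives a quick per-process summary; it does not say anything about the root cause, only where the watchdog fired.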
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Hello,

For me, the problem was solved by downgrading to the 12.04 "standard" kernel (Linux xxx 3.2.0-99-generic #139-Ubuntu) and removing the HWE kernel. The problem has not reappeared since I made the change 3 weeks ago, with 700 users.

Best regards
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Hello,

Similar bug here with:

Linux xxx 3.19.0-51-generic #58~14.04.1-Ubuntu SMP Fri Feb 26 22:02:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Best regards
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Hello,

I have a server running 12.04.5 LTS with kernel 3.13.0-74 (HWE) and the packages nfs-common and nfs-kernel-server, both in version 1:1.2.5-3ubuntu3.2.

I have the same problem. I tested today with the new kernel version 3.13.0-76, and I still have the problem of a stuck CPU and the rpc.svcgssd service.
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: nfs-utils (Ubuntu)
   Status: New => Confirmed
[Bug 1466654] Re: kernel soft lockup on nfs server when using a kerberos mount
I worked around this by setting NO_AUTH_DATA_REQUIRED on the userAccountControl attribute in LDAP for the server account, to prevent the PAC from being added to the Kerberos ticket. I guess maybe when svcgssd gets a Kerberos ticket that is too large, it gets unhappy and stuck in a loop?

A few references:
https://lists.samba.org/archive/samba/2013-June/174045.html
http://blog.evad.io/2014/11/04/kerberos-protected-nfs-with-active-directory-and-the-pac/
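
For readers unsure what "setting NO_AUTH_DATA_REQUIRED" means in practice: userAccountControl is a bit field, and the flag is the bit 0x2000000 (33554432) in Active Directory. The Python sketch below only shows the arithmetic for deriving the new attribute value. The starting value 0x1000 (WORKSTATION_TRUST_ACCOUNT) is a hypothetical example, the helper names are made up, and writing the value back to the directory is left to whatever LDAP tool you already use.

```python
# Bit value of the NO_AUTH_DATA_REQUIRED flag in userAccountControl.
NO_AUTH_DATA_REQUIRED = 0x2000000  # 33554432


def with_no_auth_data_required(uac: int) -> int:
    """Return the userAccountControl value with the flag set,
    leaving all other bits untouched."""
    return uac | NO_AUTH_DATA_REQUIRED


def has_no_auth_data_required(uac: int) -> bool:
    """Check whether the flag is already set."""
    return bool(uac & NO_AUTH_DATA_REQUIRED)


# Hypothetical current value of the server account: 0x1000
# (WORKSTATION_TRUST_ACCOUNT). Read the real value from your directory.
current = 0x1000
new = with_no_auth_data_required(current)

print(current, "->", new)              # prints: 4096 -> 33558528
print(has_no_auth_data_required(new))  # prints: True
```

OR-ing the bit in, rather than overwriting the attribute with a fixed number, keeps whatever other flags the account already has.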