[Kernel-packages] [Bug 1886277] Re: Regression on NFS: unable to handle page fault in mempool_alloc_slab

2020-09-28 Thread Andrew Conway
5.4.0-48-generic seems to have fixed this problem for me, thanks!
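
For anyone checking their own machines, the kernel builds mentioned in this thread can be collected into a small lookup. This is only a sketch: the table holds just the three builds that reporters here described (5.4.0-39, 5.4.0-40, 5.4.0-48), says nothing about builds in between, and the names `REPORTED`, `abi_number` and `classify` are made up for illustration.

```python
import re

# Observations taken from the reports in this thread only; this is not an
# authoritative list of affected Ubuntu kernel builds.
REPORTED = {
    39: "reported working",
    40: "reported crashing",
    48: "reported fixed",
}

def abi_number(release):
    """Return the ABI number from a release string such as '5.4.0-40-generic'."""
    match = re.fullmatch(r"5\.4\.0-(\d+)(?:-[a-z0-9]+)?", release)
    return int(match.group(1)) if match else None

def classify(release):
    """Map a kernel release string to what this thread says about it."""
    abi = abi_number(release)
    if abi is None:
        return "not a 5.4.0 Ubuntu kernel"
    return REPORTED.get(abi, "not mentioned in this thread")
```

Passing the output of `platform.release()` (or `uname -r`) to `classify` gives the status for the running kernel.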

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1886277

Title:
  Regression on NFS: unable to handle page fault in mempool_alloc_slab

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Released

Bug description:
  On kernel 5.4.0-40-generic in focal I'm getting errors like this on
  several machines with different hardware in the first hour after boot:

  Jul 04 16:58:32 hostname kernel: BUG: unable to handle page fault for address: 9083e222e632
  Jul 04 16:58:32 hostname kernel: #PF: supervisor read access in kernel mode
  Jul 04 16:58:32 hostname kernel: #PF: error_code(0x) - not-present page
  Jul 04 16:58:32 hostname kernel: PGD 3ac205067 P4D 3ac205067 PUD 0
  Jul 04 16:58:32 hostname kernel: Oops:  [#1] SMP NOPTI
  Jul 04 16:58:32 hostname kernel: CPU: 4 PID: 289 Comm: kworker/u16:4 Tainted: G   OE 5.4.0-40-generic #44-Ubuntu
  Jul 04 16:58:32 hostname kernel: Hardware name: LENOVO 20N2CTO1WW/20N2CTO1WW, BIOS N2IET88W (1.66 ) 04/22/2020
  Jul 04 16:58:32 hostname kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
  Jul 04 16:58:32 hostname kernel: RIP: 0010:kmem_cache_alloc+0x7e/0x230
  Jul 04 16:58:32 hostname kernel: Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 56 44 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
  Jul 04 16:58:32 hostname kernel: RSP: 0018:bc38c046fcc8 EFLAGS: 00010282
  Jul 04 16:58:32 hostname kernel: RAX: 9083e222e632 RBX:  RCX: 0002
  Jul 04 16:58:32 hostname kernel: RDX: 0009 RSI: 00092800 RDI: 00031fb0
  Jul 04 16:58:32 hostname kernel: RBP: bc38c046fcf8 R08: 90836c331fb0 R09: c1436a94
  Jul 04 16:58:32 hostname kernel: R10: 908368178d2c R11: 0018 R12: 9083e222e632
  Jul 04 16:58:32 hostname kernel: R13: 00092800 R14: 908367ca6140 R15: 908367ca6140
  Jul 04 16:58:32 hostname kernel: FS:  () GS:90836c30() knlGS:
  Jul 04 16:58:32 hostname kernel: CS:  0010 DS:  ES:  CR0: 80050033
  Jul 04 16:58:32 hostname kernel: CR2: 9083e222e632 CR3: 0003ab80a003 CR4: 003606e0
  Jul 04 16:58:32 hostname kernel: Call Trace:
  Jul 04 16:58:32 hostname kernel:  ? mempool_alloc_slab+0x17/0x20
  Jul 04 16:58:32 hostname kernel:  mempool_alloc_slab+0x17/0x20
  Jul 04 16:58:32 hostname kernel:  mempool_alloc+0x64/0x180
  Jul 04 16:58:32 hostname kernel:  rpc_malloc+0xa1/0xb0 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  call_allocate+0xd1/0x1b0 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  ? call_refreshresult+0x100/0x100 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  __rpc_execute+0x8c/0x3a0 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  rpc_async_schedule+0x30/0x50 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  process_one_work+0x1eb/0x3b0
  Jul 04 16:58:32 hostname kernel:  worker_thread+0x4d/0x400
  Jul 04 16:58:32 hostname kernel:  kthread+0x104/0x140
  Jul 04 16:58:32 hostname kernel:  ? process_one_work+0x3b0/0x3b0
  Jul 04 16:58:32 hostname kernel:  ? kthread_park+0x90/0x90
  Jul 04 16:58:32 hostname kernel:  ret_from_fork+0x35/0x40
  Jul 04 16:58:32 hostname kernel: Modules linked in: rfcomm rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) msr ccm cmac algif_hash algif_skcipher af_alg aufs bnep overlay nls_iso8859_1 mei_hdcp intel_rapl_msr snd_s>
  Jul 04 16:58:32 hostname kernel:  nvram ledtrig_audio mei_me cfg80211 mei processor_thermal_device snd_seq ucsi_acpi typec_ucsi intel_rapl_common intel_soc_dts_iosf snd_seq_device typec intel_pch_thermal snd_timer snd int3403_thermal soundcore int340x_thermal_zone i>
  Jul 04 16:58:32 hostname kernel:  pinctrl_cannonlake video pinctrl_intel
  Jul 04 16:58:32 hostname kernel: CR2: 9083e222e632
  Jul 04 16:58:32 hostname kernel: ---[ end trace cbbaed921eb439ce ]---
  Jul 04 16:58:32 hostname kernel: RIP: 0010:kmem_cache_alloc+0x7e/0x230
  Jul 04 16:58:32 hostname kernel: Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 56 44 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
  Jul 04 16:58:32 hostname kernel: RSP: 0018:bc38c046fcc8 EFLAGS: 00010282
  Jul 04 16:58:32 hostname kernel: RAX: 9083e222e632 RBX:  RCX: 0002
  Jul 04 16:58:32 hostname kernel: RDX: 0009 RSI: 00092800 RDI: 00031fb0
  Jul 04 16:58:32 hostname kernel: RBP: bc38c046fcf8 R08: 90836c331fb0 R09: c1436a94
  Jul 04 16:58:32 hostname kernel: R10: 908368178d2c R11: 0018 R12: 9083e222e6

[Kernel-packages] [Bug 1886277] Re: Regression on NFS: unable to handle page fault in mempool_alloc_slab

2020-07-24 Thread Andrew Conway
For what it is worth, I also have the same encryption type,
aes256-cts-hmac-sha1-96 (and the same problem). The tickets come from
MIT Kerberos on Ubuntu 18.04; the NFS servers run Ubuntu 18.04 and use
the krb5p security option.
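
To check whether a client has mounts that use this kind of security flavor, the `sec=` option can be read from the mount table. The following is a rough sketch that parses text in the `/proc/mounts` layout; the function name `krb5_nfs_mounts` and the sample server and path are invented for illustration.

```python
def krb5_nfs_mounts(mounts_text):
    """Return (mount_point, sec_flavor) pairs for NFS mounts using a krb5* flavor.

    Expects text in the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        for option in fields[3].split(","):
            if option.startswith("sec=krb5"):
                found.append((fields[1], option[len("sec="):]))
    return found

# Hypothetical sample; real input would come from reading /proc/mounts.
SAMPLE = (
    "server.example:/home/alice /home/alice nfs4 rw,relatime,vers=4.2,sec=krb5p 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)
```

On a live system the input would be `open("/proc/mounts").read()`.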

Title:
  Regression on NFS: unable to handle page fault in mempool_alloc_slab

Status in linux package in Ubuntu:
  Confirmed

[Kernel-packages] [Bug 1886775] Re: kernel 5.4.0-40 hangs system when using nfs home directories

2020-07-08 Thread Andrew Conway
#3: Agreed, this is a duplicate of LP: #1886277. Sorry, I looked for
similar bugs but evidently did a poor job. I have just added a comment
to that effect on #1886277.

#2: I believe the apport files are attached in comment #1, though this
is the first time I have used apport, so I may be mistaken.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1886775

Title:
  kernel 5.4.0-40 hangs system when using nfs home directories

Status in linux package in Ubuntu:
  Confirmed

[Kernel-packages] [Bug 1886277] Re: unable to handle page fault in mempool_alloc_slab

2020-07-08 Thread Andrew Conway
I also have this problem. I reported it as a new bug, 1886775, which is
probably just a duplicate of this one. It is the same issue: 5.4.0-40
dies when using NFS, with a similar stack trace and similar timing,
5.4.0-39 is fine, and the problem is identical on several different
machines.
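
One way to make "similar stack trace" concrete is to pull the frames out of each call trace and see which kernel modules they share. The sketch below uses a few frames copied from the two reports; the regular expression and the helper name `reliable_frames` are illustrative assumptions, and frames marked with `?` (which the kernel flags as unreliable) are skipped.

```python
import re

# Matches a call-trace frame such as "rpc_malloc+0xa1/0xb0 [sunrpc]" at the
# end of a kernel log line, with an optional "[  350.264005]" timestamp and
# an optional "? " marker in front of the symbol.
FRAME = re.compile(
    r"kernel:\s+(?:\[\s*\d+\.\d+\]\s+)?(\?\s+)?"
    r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)

def reliable_frames(log_lines):
    """Return (symbol, module) for each frame not marked with '?'."""
    frames = []
    for line in log_lines:
        match = FRAME.search(line)
        if match and not match.group(1):
            frames.append((match.group(2), match.group(3)))
    return frames

# A few frames from the trace in bug 1886277.
TRACE_A = [
    "Jul 04 16:58:32 hostname kernel:  ? mempool_alloc_slab+0x17/0x20",
    "Jul 04 16:58:32 hostname kernel:  mempool_alloc_slab+0x17/0x20",
    "Jul 04 16:58:32 hostname kernel:  mempool_alloc+0x64/0x180",
    "Jul 04 16:58:32 hostname kernel:  rpc_malloc+0xa1/0xb0 [sunrpc]",
]

# A few frames from the trace in bug 1886775.
TRACE_B = [
    "Jul  4 16:23:37 emu kernel: [  350.264005]  mempool_free_slab+0x17/0x20",
    "Jul  4 16:23:37 emu kernel: [  350.264007]  mempool_free+0x2f/0x80",
    "Jul  4 16:23:37 emu kernel: [  350.264018]  rpc_free+0x47/0x60 [sunrpc]",
]

modules_a = {mod for _, mod in reliable_frames(TRACE_A) if mod}
modules_b = {mod for _, mod in reliable_frames(TRACE_B) if mod}
shared_modules = modules_a & modules_b
```

Both traces pass through the mempool helpers into `sunrpc`, one on the allocation path and one on the free path, which is consistent with the two reports describing the same underlying problem.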

Title:
  unable to handle page fault in mempool_alloc_slab

Status in linux package in Ubuntu:
  Incomplete

[Kernel-packages] [Bug 1886775] [NEW] kernel 5.4.0-40 hangs system when using nfs home directories

2020-07-08 Thread Andrew Conway
Public bug reported:

We use NFS-mounted (via autofs), Kerberos-authenticated home
directories for most users.

Booting with kernel 5.4.0-40, users with NFS-mounted home directories
find that the system freezes shortly after they start using it, somewhat
randomly. Powering off is then the only option. Some specific actions
that triggered crashes: opening a second tab in Firefox; opening a
terminal and running "cat" on log files; and running "ubuntu-bug linux"
to try to generate this report :-(

Sometimes just one window freezes before the crash, while the rest of
the GUI stays responsive. A full freeze usually follows within several
seconds.

No such crashes were observed using an account without an NFS-mounted
home directory (and the output from "ubuntu-bug linux" for one of these
working users is at the end of this report).

After reverting to 5.4.0-39, everything works again.

Exactly the same behaviour is observed on a modern AMD Zen2 processor
with a graphics card, and on a several-year-old Intel processor with
integrated graphics.

Looking at /var/log/syslog, there are several suspicious messages like
the one below. A general protection fault always occurs just before
the freeze, and occasionally also some time earlier.

Jul  4 16:23:37 emu kernel: [  350.263903] [ cut here ]
Jul  4 16:23:37 emu kernel: [  350.263904] virt_to_cache: Object is not a Slab page!
Jul  4 16:23:37 emu kernel: [  350.263917] WARNING: CPU: 13 PID: 4009 at mm/slab.h:473 kmem_cache_free+0x237/0x2b0
Jul  4 16:23:37 emu kernel: [  350.263917] Modules linked in: rfcomm rpcsec_gss_krb5 nfsv4 nfs fscache vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) edac_mce_amd kvm_amd xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter bpfilter cmac algif_hash algif_skcipher af_alg bnep snd_hda_codec_hdmi binfmt_misc nvidia_uvm(OE) kvm nvidia_drm(POE) nvidia_modeset(POE) iwlmvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec nls_iso8859_1 snd_hda_core snd_hwdep snd_pcm btusb btrtl btbcm btintel snd_seq_midi mac80211 bluetooth snd_seq_midi_event crct10dif_pclmul snd_rawmidi bridge ecdh_generic stp ghash_clmulni_intel llc libarc4 input_leds joydev ecc nvidia(POE) snd_seq iwlwifi aesni_intel crypto_simd cryptd glue_helper drm_kms_helper snd_seq_device cfg80211 snd_timer ipmi_devintf
Jul  4 16:23:37 emu kernel: [  350.263952]  wmi_bmof ipmi_msghandler snd fb_sys_fops syscopyarea sysfillrect sysimgblt soundcore k10temp ccp mac_hid sch_fq_codel parport_pc ppdev lp parport drm nfsd nfs_acl auth_rpcgss lockd grace sunrpc ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igb i2c_piix4 ahci i2c_algo_bit nvme libahci dca nvme_core wmi
Jul  4 16:23:37 emu kernel: [  350.263971] CPU: 13 PID: 4009 Comm: kworker/u64:4 Tainted: P   OE 5.4.0-40-generic #44-Ubuntu
Jul  4 16:23:37 emu kernel: [  350.263972] Hardware name: Gigabyte Technology Co., Ltd. X570 I AORUS PRO WIFI/X570 I AORUS PRO WIFI, BIOS F4h 07/17/2019
Jul  4 16:23:37 emu kernel: [  350.263986] Workqueue: rpciod rpc_async_schedule [sunrpc]
Jul  4 16:23:37 emu kernel: [  350.263989] RIP: 0010:kmem_cache_free+0x237/0x2b0
Jul  4 16:23:37 emu kernel: [  350.263990] Code: ff ff ff 80 3d 16 4f 56 01 00 0f 85 39 ff ff ff 48 c7 c6 20 44 67 86 48 c7 c7 08 25 98 86 c6 05 fb 4e 56 01 01 e8 64 8a df ff <0f> 0b e9 18 ff ff ff 48 8b 57 58 49 8b 4f 58 48 c7 c6 30 44 67 86
Jul  4 16:23:37 emu kernel: [  350.263991] RSP: 0018:c1ebc3077d20 EFLAGS: 00010282
Jul  4 16:23:37 emu kernel: [  350.263993] RAX:  RBX: a040c01358e2 RCX: 0006
Jul  4 16:23:37 emu kernel: [  350.263993] RDX: 0007 RSI: 0092 RDI: a040beb578c0
Jul  4 16:23:37 emu kernel: [  350.263994] RBP: c1ebc3077d48 R08: 0506 R09: 0004
Jul  4 16:23:37 emu kernel: [  350.263995] R10:  R11: 0001 R12: a041401358e2
Jul  4 16:23:37 emu kernel: [  350.263995] R13:  R14: a040a7e47600 R15: a04065a99cb0
Jul  4 16:23:37 emu kernel: [  350.263997] FS:  () GS:a040beb4() knlGS:
Jul  4 16:23:37 emu kernel: [  350.263997] CS:  0010 DS:  ES:  CR0: 80050033
Jul  4 16:23:37 emu kernel: [  350.263998] CR2: 7fe66802dfe0 CR3: 000717722000 CR4: 00340ee0
Jul  4 16:23:37 emu kernel: [  350.263999] Call Trace:
Jul  4 16:23:37 emu kernel: [  350.264005]  mempool_free_slab+0x17/0x20
Jul  4 16:23:37 emu kernel: [  350.264007]  mempool_free+0x2f/0x80
Jul  4 16:23:37 emu kernel: [  350.264018]  rpc_free+0x47/0x60 [sunrpc]
Jul  4 16:23:37 emu kernel: [  350.264028]  xprt_release+0x91/0x1a0 [sunrpc]
Jul  4 16:23:37 emu kernel: [  350.264037]  rpc_release_resources_task+0x13/0x50 [sunrp
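
The messages quoted in these reports can be searched for in a saved log with a short script. This is only a sketch: the signature strings are taken from the log excerpts and the description in this thread, the name `count_signatures` is made up, and the sample lines are shown unwrapped, as they would appear in /var/log/syslog.

```python
from collections import Counter

# Strings seen in (or described by) the reports in this thread.
SIGNATURES = (
    "BUG: unable to handle page fault",
    "virt_to_cache: Object is not a Slab page!",
    "general protection fault",
)

def count_signatures(lines):
    """Count how many log lines contain each known signature string."""
    counts = Counter()
    for line in lines:
        for signature in SIGNATURES:
            if signature in line:
                counts[signature] += 1
    return counts

SAMPLE = [
    "Jul  4 16:23:37 emu kernel: [  350.263904] virt_to_cache: Object is not a Slab page!",
    "Jul  4 16:23:37 emu kernel: [  350.263999] Call Trace:",
    "Jul 04 16:58:32 hostname kernel: BUG: unable to handle page fault for address: 9083e222e632",
]
sample_counts = count_signatures(SAMPLE)
```

For a real log, iterate over the open file instead of `SAMPLE`, for example `count_signatures(open("/var/log/syslog", errors="replace"))`.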