[Kernel-packages] [Bug 1886277] Re: Regression on NFS: unable to handle page fault in mempool_alloc_slab

2020-09-28 Thread Marc Kolly
mainline 5.4.60-050460-generic #202008210836 did indeed fix the setup
for me, I just forgot to get back to you guys here.

Hello Martin,
It looks like the fix had not been included in the 5.4.0-47 release, but only 
in the following 5.4.0-48 release.

Have you had time to update to the newer kernel included with
Ubuntu 20.04 and check whether it fixes the problem for you?
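
For anyone who wants to check a client quickly, the comparison against the 5.4.0-48 release can be sketched in Python. This is only a rough sketch: the function names are made up for illustration, and the check is only meaningful for Ubuntu 5.4.0-N style release strings, with 5.4.0-48 taken from this thread as the first release carrying the fix.

```python
import platform
import re

# First Ubuntu Focal release said in this thread to include the fix.
FIXED = (5, 4, 0, 48)


def parse_release(release):
    """Turn '5.4.0-48-generic' into (5, 4, 0, 48); None if it does not match."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    return tuple(int(g) for g in match.groups()) if match else None


def has_fix(release):
    """True if the release string compares at or above FIXED."""
    parsed = parse_release(release)
    return parsed is not None and parsed >= FIXED


if __name__ == "__main__":
    rel = platform.release()  # same string as `uname -r`
    print(rel, "includes the fix" if has_fix(rel) else "may still be affected")
```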

On a possibly connected note, may I ask how you are mounting the NFS share 
on the client?
My server: Linux storage 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 
(2020-06-07) x86_64 GNU/Linux with nfs-kernel-server 1:1.3.4-2.5+deb10u1.

I switched from mounting the drives via fstab to a systemd mount after 14.04 
with the following options: _netdev,sec=krb5p,vers=4.1,auto
I have to force NFSv4.1 with vers=4.1, because whenever I omit it, the client 
still freezes when a file is being copied or cut (Ctrl+C/X).
Do you run into the same issue, and has it been fixed for you on 5.4.0-48 as 
well?
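
To see which NFS mounts on a client are actually pinned to vers=4.1, the mount table can be scanned with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical, and it assumes the kernel reports the negotiated version as a vers=4.1 entry in the options column of /proc/mounts, which may differ on older kernels.

```python
def nfs_mounts_without_vers(lines, wanted="4.1"):
    """Return mount points of NFS entries whose options lack vers=<wanted>.

    Each line is expected in /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    missing = []
    for line in lines:
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS client mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        if f"vers={wanted}" not in options:
            missing.append(fields[1])
    return missing


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for mount_point in nfs_mounts_without_vers(fh):
            print("not pinned to NFSv4.1:", mount_point)
```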

Kind regards,
Marc

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1886277

Title:
  Regression on NFS: unable to handle page fault in mempool_alloc_slab

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Released

Bug description:
  On kernel 5.4.0-40-generic in focal I'm getting errors like this on
  several machines with different hardware in the first hour after boot:

  Jul 04 16:58:32 hostname kernel: BUG: unable to handle page fault for address: 9083e222e632
  Jul 04 16:58:32 hostname kernel: #PF: supervisor read access in kernel mode
  Jul 04 16:58:32 hostname kernel: #PF: error_code(0x) - not-present page
  Jul 04 16:58:32 hostname kernel: PGD 3ac205067 P4D 3ac205067 PUD 0
  Jul 04 16:58:32 hostname kernel: Oops:  [#1] SMP NOPTI
  Jul 04 16:58:32 hostname kernel: CPU: 4 PID: 289 Comm: kworker/u16:4 Tainted: G   OE 5.4.0-40-generic #44-Ubuntu
  Jul 04 16:58:32 hostname kernel: Hardware name: LENOVO 20N2CTO1WW/20N2CTO1WW, BIOS N2IET88W (1.66 ) 04/22/2020
  Jul 04 16:58:32 hostname kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
  Jul 04 16:58:32 hostname kernel: RIP: 0010:kmem_cache_alloc+0x7e/0x230
  Jul 04 16:58:32 hostname kernel: Code: 99 01 00 00 4d 8b 07 65 49 8b 50 08 65 4c 03 05 40 9d 56 44 4d 8b 20 4d 85 e4 0f 84 85 01 00 00 41 8b 47 20 49 8b 3f 4c 01 e0 <48> 8b 18 48 89 c1 49 33 9f 70 01 00 00 4c 89 e0 48 0f c9 48 31 cb
  Jul 04 16:58:32 hostname kernel: RSP: 0018:bc38c046fcc8 EFLAGS: 00010282
  Jul 04 16:58:32 hostname kernel: RAX: 9083e222e632 RBX:  RCX: 0002
  Jul 04 16:58:32 hostname kernel: RDX: 0009 RSI: 00092800 RDI: 00031fb0
  Jul 04 16:58:32 hostname kernel: RBP: bc38c046fcf8 R08: 90836c331fb0 R09: c1436a94
  Jul 04 16:58:32 hostname kernel: R10: 908368178d2c R11: 0018 R12: 9083e222e632
  Jul 04 16:58:32 hostname kernel: R13: 00092800 R14: 908367ca6140 R15: 908367ca6140
  Jul 04 16:58:32 hostname kernel: FS:  () GS:90836c30() knlGS:
  Jul 04 16:58:32 hostname kernel: CS:  0010 DS:  ES:  CR0: 80050033
  Jul 04 16:58:32 hostname kernel: CR2: 9083e222e632 CR3: 0003ab80a003 CR4: 003606e0
  Jul 04 16:58:32 hostname kernel: Call Trace:
  Jul 04 16:58:32 hostname kernel:  ? mempool_alloc_slab+0x17/0x20
  Jul 04 16:58:32 hostname kernel:  mempool_alloc_slab+0x17/0x20
  Jul 04 16:58:32 hostname kernel:  mempool_alloc+0x64/0x180
  Jul 04 16:58:32 hostname kernel:  rpc_malloc+0xa1/0xb0 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  call_allocate+0xd1/0x1b0 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  ? call_refreshresult+0x100/0x100 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  __rpc_execute+0x8c/0x3a0 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  rpc_async_schedule+0x30/0x50 [sunrpc]
  Jul 04 16:58:32 hostname kernel:  process_one_work+0x1eb/0x3b0
  Jul 04 16:58:32 hostname kernel:  worker_thread+0x4d/0x400
  Jul 04 16:58:32 hostname kernel:  kthread+0x104/0x140
  Jul 04 16:58:32 hostname kernel:  ? process_one_work+0x3b0/0x3b0
  Jul 04 16:58:32 hostname kernel:  ? kthread_park+0x90/0x90
  Jul 04 16:58:32 hostname kernel:  ret_from_fork+0x35/0x40
  Jul 04 16:58:32 hostname kernel: Modules linked in: rfcomm rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) msr ccm cmac algif_hash algif_skcipher af_alg aufs bnep overlay nls_iso8859_1 mei_hdcp intel_rapl_msr snd_s>
  Jul 04 16:58:32 hostname kernel:  nvram ledtrig_audio mei_me cfg80211 mei processor_thermal_device snd_seq ucsi_acpi typec_ucsi intel_rapl_common intel_soc_dts_iosf snd_seq_device typec intel_pch_thermal snd_timer snd int3403_thermal soundcore int340x_thermal_zone i>
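  Jul 04 16:58:32 hostname kernel:  pinctrl_cannonlake video pinctrl_intel
  Jul 04 16:58:32 hostname kernel: CR2: 9083e222e632
  Jul 04 16:58:32 hostname kernel: ---[ end trace cbbaed921eb439ce ]---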

[Kernel-packages] [Bug 1886277] Re: Regression on NFS: unable to handle page fault in mempool_alloc_slab

2020-08-23 Thread Marc Kolly
That is great to hear @marianrh!

After the bug was introduced, I switched the entire workforce to the 5.8
kernel on those machines that are still on 18.04 and have HWE installed.
I have now set up two machines with Ubuntu 20.04 for testing purposes,
and the problem reappeared quite quickly, so I put them on the 5.8
kernel as well.

I will try 5.4.60-050460-generic #202008210836 now and see if it works
as well.

Status in linux package in Ubuntu:
  Confirmed
