[Kernel-packages] [Bug 1972898] Re: Kernel Bug: 22.04, EXT4, samba (smbd) on MDADM raid6: Copying large volume of files.

2022-05-10 Thread Mathew Moore
I cannot run "apport-collect 1972898": the command needs browser
authorization, and this server has no GUI.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1972898

Title:
  Kernel Bug: 22.04,EXT4, samba (smbd)  on MDADM raid6: Copying large
  volume of files.

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  60-drive MDADM RAID 6 array, ext4, Ubuntu 22.04. Issue reproduced on
  both Supermicro SSG-6048R and HP ProLiant DL380 servers.

  The system was stable on Ubuntu 20.04 and became unstable following
  the upgrade to 22.04 (kernel version 5.15). To reproduce the kernel
  error, copy thousands of files (~1 TB of data) to the Samba share from
  any Windows computer. After some time (seconds to minutes), a kernel
  error is thrown, the smbd process becomes unresponsive and cannot be
  killed, the file transfer stops, the mounted drive freezes (directory
  operations including ls, mv and cp on the mount are not possible), and
  the system needs to be hard-rebooted. Quite an unhappy outcome :)

  I then moved the 60 drives to an external enclosure, and connected to
  a new computer (HP ProLiant DL380). After assembling the raid drive,
  with a fresh install of Ubuntu 22.04 on the new system, the kernel
  error was reproduced. I cannot reproduce the error copying via nfs or
  copying files on the drive itself. Single files or small transfers
  proceed without error. Filesystem passes fsck.
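
  The reproduction above depends on copying a large number of files and
  directories over SMB. A minimal sketch of a generator for such a test
  tree follows; it is meant to be run on the Windows client, with the
  resulting tree then copied to the Samba share. The function name
  make_tree, the directory and file naming, and the default sizes are
  all hypothetical and not taken from the report. The report does not
  say what shape of dataset triggers the error, so this is only one
  plausible workload.

```python
import os
import tempfile


def make_tree(root, n_dirs=100, files_per_dir=20, file_size=4096):
    """Create n_dirs subdirectories under root, each with files_per_dir files.

    All names and sizes are illustrative assumptions; the bug report only
    says "thousands of files (~1 TB of data)".
    Returns the number of files written.
    """
    payload = os.urandom(file_size)  # same random payload reused for every file
    created = 0
    for d in range(n_dirs):
        subdir = os.path.join(root, f"dir_{d:05d}")
        os.makedirs(subdir, exist_ok=True)
        for f in range(files_per_dir):
            path = os.path.join(subdir, f"file_{f:05d}.bin")
            with open(path, "wb") as fh:
                fh.write(payload)
            created += 1
    return created


# Example use: build a small tree in a temporary directory, then copy that
# directory to the share with the usual Windows tools.
# root = tempfile.mkdtemp(prefix="smb_repro_")
# print(make_tree(root), "files under", root)
```

  Raising n_dirs matters as much as raising the data volume, since the
  call trace in this report goes through mkdir rather than a file write.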

  Happy to assist in troubleshooting in any way.

  Kernel error message from both systems follows.

  **New System (HP ProLiant DL380) Kernel Error**
  May 10 01:32:49 nas3 kernel: [ 1463.900175] [ cut here ]
  May 10 01:32:49 nas3 kernel: [ 1463.900179] kernel BUG at fs/ext4/xattr.c:2071!
  May 10 01:32:49 nas3 kernel: [ 1463.900214] invalid opcode:  [#1] SMP PTI
  May 10 01:32:49 nas3 kernel: [ 1463.900233] CPU: 0 PID: 5989 Comm: smbd Not tainted 5.15.0-27-generic #28-Ubuntu
  May 10 01:32:49 nas3 kernel: [ 1463.900939] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 04/25/2017
  May 10 01:32:49 nas3 kernel: [ 1463.901560] RIP: 0010:ext4_xattr_block_set+0xbba/0xbd0
  May 10 01:32:49 nas3 kernel: [ 1463.902190] Code: c7 45 8c f4 ff ff ff eb b4 48 8b 7d 90 48 c7 c1 7f 12 61 a8 ba 2d 08 00 00 48 c7 c6 d0 3c 25 a8 e8 9b 6f ff ff e9 a5 fe ff ff <0>
  May 10 01:32:49 nas3 kernel: [ 1463.903445] RSP: 0018:a59e0b51f9c0 EFLAGS: 00010206
  May 10 01:32:49 nas3 kernel: [ 1463.904080] RAX: 0003 RBX: 97aa0490b680 RCX: a860a8e7
  May 10 01:32:49 nas3 kernel: [ 1463.904727] RDX: 0261 RSI:  RDI: 0003cca0
  May 10 01:32:49 nas3 kernel: [ 1463.905384] RBP: a59e0b51fa70 R08: 97aa21824138 R09:
  May 10 01:32:49 nas3 kernel: [ 1463.906051] R10: 97aa0f6e87e0 R11: 97aae9073ff0 R12:
  May 10 01:32:49 nas3 kernel: [ 1463.906738] R13: 97ada77feac0 R14: 0003165b R15:
  May 10 01:32:49 nas3 kernel: [ 1463.907411] FS:  7f06ceb61a40() GS:97b93f80() knlGS:
  May 10 01:32:49 nas3 kernel: [ 1463.908049] CS:  0010 DS:  ES:  CR0: 80050033
  May 10 01:32:49 nas3 kernel: [ 1463.908697] CR2: 55c076d0d4f8 CR3: 00029e6fe003 CR4: 001706f0
  May 10 01:32:49 nas3 kernel: [ 1463.909349] Call Trace:
  May 10 01:32:49 nas3 kernel: [ 1463.909989]
  May 10 01:32:49 nas3 kernel: [ 1463.910624]  ? jbd2_journal_get_write_access+0x43/0x90
  May 10 01:32:49 nas3 kernel: [ 1463.911360]  ext4_xattr_set_handle+0x487/0x620
  May 10 01:32:49 nas3 kernel: [ 1463.912032]  __ext4_set_acl+0xc1/0x130
  May 10 01:32:49 nas3 kernel: [ 1463.912689]  ext4_init_acl+0xe8/0x160
  May 10 01:32:49 nas3 kernel: [ 1463.913327]  __ext4_new_inode+0xf60/0x14e0
  May 10 01:32:49 nas3 kernel: [ 1463.913962]  ? path_parentat+0x4c/0x90
  May 10 01:32:49 nas3 kernel: [ 1463.914595]  ext4_mkdir+0x157/0x330
  May 10 01:32:49 nas3 kernel: [ 1463.915265]  vfs_mkdir+0x142/0x200
  May 10 01:32:49 nas3 kernel: [ 1463.915883]  do_mkdirat+0x120/0x140
  May 10 01:32:49 nas3 kernel: [ 1463.916501]  __x64_sys_mkdirat+0x51/0x70
  May 10 01:32:49 nas3 kernel: [ 1463.917115]  do_syscall_64+0x5c/0xc0
  May 10 01:32:49 nas3 kernel: [ 1463.917733]  ? exit_to_user_mode_prepare+0x37/0xb0
  May 10 01:32:49 nas3 kernel: [ 1463.918365]  ? syscall_exit_to_user_mode+0x27/0x50
  May 10 01:32:49 nas3 kernel: [ 1463.919035]  ? __x64_sys_newfstatat+0x1c/0x20
  May 10 01:32:49 nas3 kernel: [ 1463.919665]  ? do_syscall_64+0x69/0xc0
  May 10 01:32:49 nas3 kernel: [ 1463.920300]  ? __x64_sys_newfstatat+0x1c/0x20
  May 10 01:32:49 nas3 kernel: [ 1463.920929]  ? do_syscall_64+0x69/0xc0
  May 10 01:32:49 nas3 kernel: [ 1463.921534]  ? __x64_sys_newfstatat+0x1c/0x20
  May 10 01:32:49 nas3 kernel: [ 1463.922121]  ? 
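
  To make a trace like the one above easier to compare between systems,
  the following is a minimal sketch of a parser that pulls out the BUG
  location, the faulting function from the RIP line, and the call-trace
  frames. In a kernel call trace, frames prefixed with "?" are ones the
  unwinder could not confirm, so they are kept separately. The function
  name summarize_oops and the regular expressions are hypothetical
  helpers, not part of any kernel or Ubuntu tool, and the sketch assumes
  each log record sits on a single line (frames wrapped across lines by
  a mail client will be missed).

```python
import re

# "kernel BUG at <file>:<line>!"
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
# "RIP: <segment>:<symbol>+0x<offset>/0x<size>"
RIP_RE = re.compile(r"RIP: [0-9a-f]+:([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")
# "[ <timestamp>]  [? ]<symbol>+0x<offset>/0x<size>"
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(text):
    """Return the BUG location, faulting function, and stack frames in text."""
    bug = BUG_RE.search(text)
    rip = RIP_RE.search(text)
    reliable, unconfirmed = [], []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if not m:
            continue
        # Group 1 is set when the frame carries the "?" marker.
        (unconfirmed if m.group(1) else reliable).append(m.group(2))
    return {
        "bug_at": (bug.group(1), int(bug.group(2))) if bug else None,
        "rip": rip.group(1) if rip else None,
        "reliable": reliable,
        "unconfirmed": unconfirmed,
    }
```

  Run on the trace in this report, the confirmed frames read from
  ext4_xattr_set_handle through ext4_mkdir down to do_syscall_64, which
  shows that the BUG fires while ext4 sets the initial ACL on a
  directory that smbd is creating.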

[Kernel-packages] [Bug 1972898] Re: Kernel Bug: 22.04, EXT4, samba (smbd) on MDADM raid6: Copying large volume of files.

2022-05-10 Thread Mathew Moore
Please let me know how else I can provide the information.
