[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1795659

Title:
  kernel panic using CIFS share in smb2_push_mandatory_locks()

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1795659/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2020-01-30 Thread Guilherme G. Piccoli
The merge of the new patch is being worked on in
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1856949 (thanks
Juergh and Marcelo for this work). So in case you face this issue in the
future, please use LP #1856949 for reporting.

Thanks,


Guilherme

** Description changed:

+ NOTICE: The new patch merge is being worked on
+ https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1856949 - if you
+ face this issue, please report there!
+ 
+ 
  [Impact]
  
  * We got reports of a kernel crash in cifs module with the following
  signature:
  
  BUG: unable to handle kernel NULL pointer dereference at 0038
  IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  PGD 0 P4D 0
  RIP: 0010:smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  Call Trace:
   cifs_oplock_break+0x12f/0x3d0 [cifs]
   process_one_work+0x14d/0x410
   worker_thread+0x4b/0x460
   kthread+0x105/0x140
  [...]
  
  * Low-level analysis (decodecode script output and the objdump of the
  function) revealed that we are crashing in a NULL ptr dereference when
  trying to access "cfile->tlink"; below, a snippet of the objdump at
  function smb2_push_mandatory_locks():
  
  [...]
  mov    0x10(%r14),%r15   # %r15 = cifsFileInfo *cfile
  mov    0x18(%r14),%rbx   # %rbx = cifsLockInfo *li = (fdlocks->locks)
  lea    0x18(%r14),%r12
  mov    0x90(%r15),%rax   # %rax = struct tcon_link *tlink (cfile->tlink)
  cmp    %r12,%rbx
  mov    0x38(%rax),%rax   # <--- TRAP [trying to get cifs_tcon *tl_tcon]
  [...]
  
  * After discussing the issue with CIFS maintainers (Steve French and
  Pavel Shilovsky) they suggested commit b98749cac4a6 ("CIFS: keep
  FileInfo handle live during oplock break")
  [http://git.kernel.org/linus/b98749cac4a6] as a fix for multiple reports
  of this kind of crash.
  
  * The fix was sent to stable kernels and is present in Ubuntu kernels
  5.0 and newer. We are requesting the SRU for this patch here in order to
  fix the crashes, after reports of successful testing with the patch (see
  below section) and since the patch is restricted to the cifs module
  scope and accepted on linux stable.
  
  * Alternatively the issue is known to be avoided when oplocks are
  disabled using "cifs.enable_oplocks=N" module parameter.
  
  [Test case]
  
  * Unfortunately we cannot reproduce the issue. The patch proposed here was
  validated by us with xfstests (instructions followed from
  https://wiki.samba.org/index.php/Xfstesting-cifs) and fio. Also, we
  have a user report of test validation using LISA
  (https://github.com/LIS/LISAv2).
  
  * Using xfstests with the exclusions proposed in the link above, we
  got the same results as with a non-patched kernel, i.e., the same
  tests failed in both kernels; we didn't get worse results with the
  patch. Fio also didn't show a noticeable performance regression with
  the patch.
  
  [Regression potential]
  
  * The patch was validated by the cifs filesystem maintainers (in fact
  they suggested its inclusion in Ubuntu) and by the aforementioned tests;
  also, the scope is restricted to cifs only so the likelihood of
  regressions is considered low.
  
  * Due to the nature of the code modification (adding a new reference to
  a file handle and manipulating it in different places), I consider that
  if we have a regression it'll manifest as deadlock/blocked tasks, not
  something more serious like crashes or data corruption.
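
The fault address in the signature can be cross-checked against the objdump snippet with a little arithmetic. The following is only an illustrative sketch in Python, assuming the two offsets (0x90 and 0x38) read from the disassembly in the description; they belong to that particular kernel build, and the constant and function names are hypothetical:

```python
# Offsets taken from the objdump snippet in the description; they are
# specific to that kernel build. The constant and function names here are
# hypothetical, chosen only for this illustration.
TLINK_OFFSET_IN_CFILE = 0x90  # mov 0x90(%r15),%rax -> cfile->tlink
TCON_OFFSET_IN_TLINK = 0x38   # mov 0x38(%rax),%rax -> tlink->tl_tcon


def tlink_field_address(cfile_ptr: int) -> int:
    """Return the address of the tlink member inside a cifsFileInfo."""
    return cfile_ptr + TLINK_OFFSET_IN_CFILE


def faulting_address(tlink_ptr: int) -> int:
    """Return the address read when loading tl_tcon through tlink_ptr."""
    return tlink_ptr + TCON_OFFSET_IN_TLINK


# With cfile->tlink == NULL, the second load targets a small address just
# above zero, which is what the oops reports as the faulting address.
print(hex(faulting_address(0)))
```

A NULL tlink therefore produces a fault at offset 0x38, which agrees with the "NULL pointer dereference at 0038" line in the signature.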


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-12-30 Thread Guilherme G. Piccoli
Hi Guillaume and all involved, it seems this bug can still occur even
with the backported fix [0]. I found a new upstream fix that is quite
promising; it addresses this specific oops. In the past, the maintainers
thought that the other fix [0] could reduce the likelihood of those
crashes in smb2_push_mandatory_locks (and it may have worked, reducing
the occurrence), but the fact is that a proper fix was not available
until kernel 5.5.

The commit is 6f582b273ec2 ("CIFS: Fix NULL-pointer dereference in
smb2_push_mandatory_locks") [1]. The reasoning behind the fix is that
struct cifsFileInfo is made available for use before all of its members
are initialized, such as cfile->tlink (the one being dereferenced in
most oops reports). The maintainer then enforced full struct
initialization before it gets used.
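
To illustrate the ordering problem described above, here is a minimal single-threaded model in Python. It is not the kernel code and not the content of the commit: the class, the function names and the on_publish hook (which stands in for another thread running at that instant) are all assumptions made for the example.

```python
class FileInfo:
    """Toy stand-in for struct cifsFileInfo; only the tlink member is modelled."""

    def __init__(self):
        self.tlink = None  # not yet initialised, like a NULL cfile->tlink


def open_publish_first(shared, tlink, on_publish):
    """Problematic order: the object is reachable before tlink is set."""
    cfile = FileInfo()
    shared.append(cfile)   # other code can now find cfile
    on_publish(shared)     # models another thread running at this instant
    cfile.tlink = tlink    # too late for whoever looked already
    return cfile


def open_init_first(shared, tlink, on_publish):
    """Safe order: finish initialisation, then publish."""
    cfile = FileInfo()
    cfile.tlink = tlink
    shared.append(cfile)
    on_publish(shared)
    return cfile


def observe(order):
    """Return what a concurrent observer would see for cfile.tlink."""
    seen = []
    order([], "tlink", lambda shared: seen.extend(c.tlink for c in shared))
    return seen


print(observe(open_publish_first))  # observer sees the unset member
print(observe(open_init_first))     # observer sees the initialised member
```

In the first ordering the observer can pick up an object whose tlink is still unset, which is the situation that leads to the NULL dereference; in the second ordering that window does not exist.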

I've built a 4.15 Bionic kernel with this fix, available in the following PPA:
launchpad.net/~gpiccoli/+archive/ubuntu/test1795659

To use this kernel, one just needs to run:
sudo add-apt-repository ppa:gpiccoli/test1795659
sudo apt-get update
sudo apt install linux-image-unsigned-4.15.0-74-generic \
    linux-modules-4.15.0-74-generic linux-modules-extra-4.15.0-74-generic

Then reboot the machine and check if the right kernel is running; to
verify that, just run "uname -rv" and the output should be:
4.15.0-74-generic #84+TEST256303v20191229b1-Ubuntu <...>
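
The check can also be scripted. A small sketch, assuming Python 3 is available on the machine; the helper name is hypothetical and the expected string is simply the one quoted above:

```python
import platform

# Release and version prefix quoted in this message for the PPA test kernel.
EXPECTED_PREFIX = "4.15.0-74-generic #84+TEST256303v20191229b1-Ubuntu"


def is_test_kernel(uname_rv: str) -> bool:
    """True if a `uname -rv` style string starts with the expected prefix."""
    return uname_rv.strip().startswith(EXPECTED_PREFIX)


# On Linux, platform.release() and platform.version() correspond to
# `uname -r` and `uname -v` respectively.
running = f"{platform.release()} {platform.version()}"
print("test kernel running:", is_test_kernel(running))
```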

If anybody who can reproduce this issue is able to test the PPA kernel,
I'd greatly appreciate it.
Cheers,


Guilherme


[0] http://git.kernel.org/linus/b98749cac4a6
[1] http://git.kernel.org/linus/6f582b273ec2


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-11-04 Thread Guilherme G. Piccoli
Hi Guillaume, thanks for your report! Do you have more details on how
you reproduced it? Was it "organic", or can you trigger it using some
workload or test pattern? Also, based on the log output that you pasted,
it seems it took about 30 hours to reproduce, correct? If possible, let
me know the topology of the cifs volumes in your system (how many mount
points you have, how many targets and where they are, etc.).

Do you think you could run a debug kernel in order to collect more data?
I'll take a look at the code and see if I can instrument the cifs code
to narrow down how this race might be happening; let me know if you are
able to test that.

Thanks,


Guilherme


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-11-04 Thread Guillaume Penin
Hi,

Unfortunately, the patch does not seem to solve the problem. Running
Ubuntu 18.04 LTS with kernel 4.15.0-64, the crash still occurs with the
same signature:

Oct 14 08:00:01 uzorldsp01 kernel: [109699.415372] BUG: unable to handle kernel NULL pointer dereference at 0038
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] IP: smb2_push_mandatory_locks+0x10d/0x3c0 [cifs]
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] PGD 0 P4D 0
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] Oops:  [#1] SMP PTI
Oct 14 08:00:01 uzorldsp01 kernel: [109699.420042] Modules linked in: btrfs zstd_compress xor raid6_pq ufs qnx4 minix ntfs msdos jfs xfs dm_snapshot dm_bufio cmac arc4 md4 nls_utf8 cifs ccm fscache nf_conntrack_ipv4 nf_defrag_ipv4 xt_owner xt_conntrack nf_conntrack libcrc32c iptable_security sb_edac kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_rapl_perf input_leds serio_raw hyperv_fb hv_balloon joydev mac_hid sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 hid_generic hid_hyperv hv_utils hv_storvsc ptp hyperv_keyboard hv_netvsc hid scsi_transport_fc pps_core psmouse i2c_piix4 pata_acpi hv_vmbus floppy
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] CPU: 0 PID: 50766 Comm: kworker/0:0 Not tainted 4.15.0-64-generic #73-Ubuntu
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] Workqueue: cifsoplockd cifs_oplock_break [cifs]
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RIP: 0010:smb2_push_mandatory_locks+0x10d/0x3c0 [cifs]
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RSP: :ab360da2bdd8 EFLAGS: 00010246
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RAX:  RBX: 9887646617d8 RCX:
Oct 14 08:00:01 uzorldsp01 kernel: [109699.472261] RDX: 1000 RSI:  RDI: 98876d006b80
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] RBP: ab360da2be28 R08: 988568596000 R09: 98876d006b80
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] R10: ab360da2bd98 R11: 988568596000 R12: 00aa
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] R13: 9887646617d8 R14: 9887646617c0 R15: 988764d7f200
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] FS:  () GS:98876d60() knlGS:
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] CS:  0010 DS:  ES:  CR0: 80050033
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] CR2: 0038 CR3: 00046d80a003 CR4: 001606f0
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] Call Trace:
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  cifs_oplock_break+0x131/0x410 [cifs]
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  process_one_work+0x1de/0x420
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  worker_thread+0x32/0x410
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  kthread+0x121/0x140
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  ? process_one_work+0x420/0x420
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  ? kthread_create_worker_on_cpu+0x70/0x70
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261]  ret_from_fork+0x35/0x40
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] Code: c8 00 00 00 00 48 89 45 b0 49 39 c6 0f 84 e5 00 00 00 4d 89 fb 4d 8b 7e 10 49 8b 5e 18 4d 8d 6e 18 49 8b 87 90 00 00 00 4c 39 eb <48> 8b 40 38 48 89 45 d0 0f 84 ae 00 00 00 45 31 d2 4c 89 75 b8
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] RIP: smb2_push_mandatory_locks+0x10d/0x3c0 [cifs] RSP: ab360da2bdd8
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] CR2: 0038
Oct 14 08:00:02 uzorldsp01 kernel: [109699.472261] ---[ end trace ee6628b4e2b5174b ]---
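
When comparing a report like this one with the signature in the bug description, it can help to pull the function, offset and size out of the IP/RIP line. A rough sketch in Python; the regular expression and the helper name are assumptions made for this example, not part of any kernel tooling:

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE" with an optional " [module]" suffix,
# as seen on the IP/RIP and call trace lines of an oops.
SYMBOL_RE = re.compile(
    r"(?P<func>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_symbol(line):
    """Return (function, offset, size, module) or None if nothing matches."""
    m = SYMBOL_RE.search(line)
    if m is None:
        return None
    return (m["func"], int(m["offset"], 16), int(m["size"], 16), m["module"])


# One line from the bug description and one from the log above.
reported = parse_symbol("IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]")
observed = parse_symbol("IP: smb2_push_mandatory_locks+0x10d/0x3c0 [cifs]")
print("same function:", reported[0] == observed[0])
print("same module:", reported[3] == observed[3])
```

The function and module are the same in both reports; the offset and size differ slightly, as one would expect between different kernel builds.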


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-08-07 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-xenial' to 'verification-done-xenial'. If the
problem still exists, change the tag 'verification-needed-xenial' to
'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-xenial


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-31 Thread Guilherme G. Piccoli
I've validated the -proposed kernel for Bionic (4.15.0-56) using the
xfstests suite mentioned in the description. Interestingly, the patch
seems to fix the test generic/504, which failed with both smb2 and smb3
on kernel 4.15.0-55 only. Other than that, the same number of tests
failed in both cases, and no significant performance impact was noticed.

Cheers,


Guilherme

** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-31 Thread Guilherme G. Piccoli
** Changed in: linux-azure (Ubuntu)
   Status: Confirmed => Fix Released

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-29 Thread Launchpad Bug Tracker
This bug was fixed in the package linux-azure - 4.15.0-1052.57

---
linux-azure (4.15.0-1052.57) xenial; urgency=medium

  * xenial/linux-azure: 4.15.0-1052.57 -proposed tracker (LP: #1837632)

  * kernel panic using CIFS share in smb2_push_mandatory_locks() (LP: #1795659)
    - CIFS: keep FileInfo handle live during oplock break

 -- Marcelo Henrique Cerri  Tue, 23 Jul 2019 13:19:53 -0300

** Changed in: linux-azure (Ubuntu Xenial)
   Status: Fix Committed => Fix Released


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-25 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-bionic' to 'verification-done-bionic'. If the
problem still exists, change the tag 'verification-needed-bionic' to
'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-bionic


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-24 Thread Brad Figg
** Tags added: cscc


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-23 Thread Marcelo Cerri
** Changed in: linux-azure (Ubuntu Xenial)
   Status: New => Fix Committed


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-23 Thread Marcelo Cerri
** Also affects: linux-azure (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: linux-azure (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Xenial)
   Status: New => Invalid

** Changed in: linux-azure (Ubuntu Bionic)
   Status: New => Invalid

** Changed in: linux-azure (Ubuntu Cosmic)
   Status: New => Invalid

** Changed in: linux-azure (Ubuntu Disco)
   Status: New => Invalid

** Changed in: linux-azure (Ubuntu)
   Importance: Undecided => Critical

** Changed in: linux-azure (Ubuntu)
   Status: New => Confirmed

** Changed in: linux-azure (Ubuntu Xenial)
   Importance: Undecided => Critical


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-18 Thread Khaled El Mously
** Changed in: linux (Ubuntu Bionic)
   Status: In Progress => Fix Committed


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-17 Thread Guilherme G. Piccoli
SRU request sent to kernel-team mailing list:
https://lists.ubuntu.com/archives/kernel-team/2019-July/102354.html

Cheers,


Guilherme


[Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-17 Thread Guilherme G. Piccoli
** Summary changed:

- kernel panic using CIFS share smb2_push_mandatory_locks
+ kernel panic using CIFS share in smb2_push_mandatory_locks()

** Description changed:

- Description: Ubuntu 16.04.5 LTS
- Release: 16.04
+ [Impact]
  
- Kernel: 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC
- 2018 x86_64 x86_64 x86_64 GNU/Linux
+ * We got reports of a kernel crash in cifs module with the following
+ signature:
  
- Under load, getting a system crash when accessing files on an SMB3
- share. dmesg from crash attached. I can upload the crash dump if needed.
+ BUG: unable to handle kernel NULL pointer dereference at 0038
+ IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
+ PGD 0 P4D 0
+ RIP: 0010:smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
+ Call Trace:
+  cifs_oplock_break+0x12f/0x3d0 [cifs]
+  process_one_work+0x14d/0x410
+  worker_thread+0x4b/0x460
+  kthread+0x105/0x140
+ [...]
  
- Share is mounted with the following options:
+ Low-level analysis (decodecode script output and the objdump of the
+ function) revealed that we are crashing in a NULL ptr dereference when
+ trying to access "cfile->tlink"; below, a snippet of the objdump at
+ function smb2_push_mandatory_locks():
  
- "ro,_netdev,username=*,password=*,domain=*,vers=3.02,sec=ntlmsspi,nounix,noserverino,nobrl,cache=none"
+ [...]
+ mov    0x10(%r14),%r15   # %r15 = cifsFileInfo *cfile
+ mov    0x18(%r14),%rbx   # %rbx = cifsLockInfo *li = (fdlocks->locks)
+ lea    0x18(%r14),%r12
+ mov    0x90(%r15),%rax   # %rax = struct tcon_link *tlink (cfile->tlink)
+ cmp    %r12,%rbx
+ mov    0x38(%rax),%rax   # <--- TRAP [trying to get cifs_tcon *tl_tcon]
+ [...]
  
- Dmesg points to the cifs module
+ * After discussing the issue with CIFS maintainers (Steve French and
+ Pavel Shilovsky) they suggested commit b98749cac4a6 ("CIFS: keep
+ FileInfo handle live during oplock break")
+ [http://git.kernel.org/linus/b98749cac4a6] as a fix for multiple reports
+ of this kind of crash.
  
- [ 2192.662345] BUG: unable to handle kernel NULL pointer dereference at 0038
- [ 2192.662407] IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
+ * The fix was sent to stable kernels and is present in Ubuntu kernels
+ 5.0 and newer. We are requesting the SRU for this patch here in order to
+ fix the crashes, after reports of successful testing with the patch (see
+ below section) and since the patch is restricted to the cifs module
+ scope and accepted on linux stable.
+ 
+ * Alternatively the issue is known to be avoided when oplocks are
+ disabled using "cifs.enable_oplocks=N" module parameter.
+ 
+ [Test case]
+ 
+ * Unfortunately we cannot reproduce the issue. The patch proposed here was
+ validated by us with xfstests (instructions followed from
+ https://wiki.samba.org/index.php/Xfstesting-cifs) and fio. Also, we
+ have a user report of test validation using LISA
+ (https://github.com/LIS/LISAv2).
+ 
+ * Using xfstest with the exclusions proposed in the link above we
+ managed to get the same results as a non-patched kernel, i.e., the same
+ tests failed in both kernels, we didn't get worse results with the
+ patch. Fio also didn't show noticeable performance regression with the
+ patch.
+ 
+ [Regression potential]
+ 
+ * The patch was validated by the cifs filesystem maintainers (in fact
+ they suggested its inclusion in Ubuntu) and by the aforementioned tests;
+ also, the scope is restricted to cifs only so the likelihood of
+ regressions is considered low.
+ 
+ *Due to the nature of the code modification (add a new reference of a
+ file handler and manipulate it in different places), I consider that if
+ we have a regression it'll manifest as deadlock/blocked tasks, not
+ something more serious like crashes or data corruption.

** Description changed:

  [Impact]
  
  * We got reports of a kernel crash in cifs module with the following
  signature:
  
  BUG: unable to handle kernel NULL pointer dereference at 0038
  IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  PGD 0 P4D 0
  RIP: 0010:smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  Call Trace:
-  cifs_oplock_break+0x12f/0x3d0 [cifs]
-  process_one_work+0x14d/0x410
-  worker_thread+0x4b/0x460
-  kthread+0x105/0x140
+  cifs_oplock_break+0x12f/0x3d0 [cifs]
+  process_one_work+0x14d/0x410
+  worker_thread+0x4b/0x460
+  kthread+0x105/0x140
  [...]
  
- Low-level analysis (decodecode script output and the objdump of the
+ * Low-level analysis (decodecode script output and the objdump of the
  function) revealed that we are crashing in a NULL ptr dereference when
  trying to access "cfile->tlink"; below, a snippet of the objdump at
  function smb2_push_mandatory_locks():
  
  [...]
  mov    0x10(%r14),%r15   # %r15 = cifsFileInfo *cfile
  mov    0x18(%r14),%rbx   # %rbx = cifsLockInfo *li = (fdlocks->locks)
  lea    0x18(%r14),%r12
  mov    0x90(%r15),%rax   # %rax = struct tcon_link *tlink (cfile->tlink)
  cmp    %r12,%rbx
  mov    0x38(%rax),%rax