** Changed in: linux (Ubuntu Artful)
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1744300

Title:
  bt_iter() crash due to NULL pointer

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Artful:
  Fix Committed

Bug description:
  SRU Justification:

  
  [Impact]
  The following crash was observed in Ubuntu 16.04 running the linux-gcp
  kernel version 4.13 (specifically 4.13.0-1006.9):

  [ 10.972644] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 
  [ 10.980708] IP: bt_iter+0x31/0x50 
  [ 10.984310] PGD 0 
  [ 10.984310] P4D 0 
  [ 10.986439] 
  [ 10.990190] Oops: 0000 [#1] SMP PTI 
  [ 11.016282] Workqueue: kblockd blk_mq_timeout_work 
  [ 11.021196] task: ffff8e7c2e700000 task.stack: ffffb8d4c67a8000 
  [ 11.027234] RIP: 0010:bt_iter+0x31/0x50 
  [ 11.031187] RSP: 0018:ffffb8d4c67abda0 EFLAGS: 00010206 
  [ 11.037730] RAX: ffffb8d4c67abdd0 RBX: 0000000000000180 RCX: 0000000000000000 
  [ 11.045172] RDX: ffff8e7c34c8d280 RSI: 0000000000000000 RDI: ffff8e7c32dd8000 
  [ 11.053321] RBP: ffffb8d4c67abe20 R08: 0000000000000000 R09: 0000200000000100 
  [ 11.060582] R10: 0000000000000130 R11: 00000000fffee5bf R12: ffff8e7c3572c790 
  [ 11.068094] R13: ffff8e7c3572c780 R14: 0000000000000008 R15: ffff8e7c35e7c180 
  [ 11.075522] FS: 0000000000000000(0000) GS:ffff8e7c3a4c0000(0000) knlGS:0000000000000000 
  [ 11.083721] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
  [ 11.089593] CR2: 0000000000000030 CR3: 000000009e20a003 CR4: 00000000001606e0 
  [ 11.096871] Call Trace: 
  [ 11.099468] ? blk_mq_queue_tag_busy_iter+0xe2/0x1f0 
  [ 11.104558] ? blk_mq_rq_timed_out+0x70/0x70 
  [ 11.109130] ? blk_mq_rq_timed_out+0x70/0x70 
  [ 11.114933] blk_mq_timeout_work+0xbb/0x170 
  [ 11.119408] process_one_work+0x156/0x410 
  [ 11.123641] worker_thread+0x4b/0x460 
  [ 11.127827] kthread+0x109/0x140 
  [ 11.131186] ? process_one_work+0x410/0x410 
  [ 11.135499] ? kthread_create_on_node+0x70/0x70 
  [ 11.140408] ret_from_fork+0x1f/0x30 
  [ 11.144110] Code: 89 d0 48 8b 3a 0f b6 48 18 48 8b 97 30 01 00 00 84 c9 75 03 03 72 04 48 8b 92 80 00 00 00 89 f6 48 8b 34 f2 48 8b 97 c0 00 00 00 <48> 39 56 30 74 06 b8 01 00 00 00 c3 55 48 8b 50 10 48 89 e5 ff 
  [ 11.167573] RIP: bt_iter+0x31/0x50 RSP: ffffb8d4c67abda0 
  [ 11.173028] CR2: 0000000000000030 
  [ 11.176515] ---[ end trace 2f8e5b1cf4139fec ]--- 
  [ 11.182589] Kernel panic - not syncing: Fatal exception 

  In short, bt_iter() dereferences a NULL pointer. Since blk-mq
  scheduler support was merged into the Linux kernel, entries in the
  tags->rqs[] array are assigned dynamically, and there is a small
  window of time in which a tag's bit is already set but the
  corresponding tags->rqs[] entry has not been assigned yet. This was
  reported to happen in about 5% of test runs (more details in the
  [Testcase] section).

  
  [Fix]
  The fix is small and simple, and it is already upstream. It adds a
  NULL pointer check to the bt_iter() and bt_tags_iter() functions.

  The fix is: 7f5562d5ecc4 ("blk-mq-tag: check for NULL rq when
  iterating tags"), by Jens Axboe.
  (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f5562d5ecc4)
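
  To illustrate the idea behind the fix, here is a minimal user-space
  sketch of iterating busy tags with a NULL check. It is not the
  upstream patch: the struct layouts and the names iterate_busy() and
  count_request() are hypothetical simplifications, and a plain
  unsigned long stands in for the kernel's tag bitmap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct queue { int id; };
struct request { struct queue *q; };

struct tags {
    unsigned int nr_tags;   /* must be smaller than the bit width of bitmap */
    unsigned long bitmap;   /* bit n set => tag n has been handed out */
    struct request **rqs;   /* rqs[n] is filled in only after bit n is set */
};

typedef void (*busy_fn)(struct request *rq, void *data);

/*
 * Call fn for every in-flight request that belongs to queue q and
 * return how many requests were visited.
 */
static unsigned int iterate_busy(struct tags *tags, struct queue *q,
                                 busy_fn fn, void *data)
{
    unsigned int visited = 0;

    for (unsigned int bit = 0; bit < tags->nr_tags; bit++) {
        if (!(tags->bitmap & (1UL << bit)))
            continue;

        struct request *rq = tags->rqs[bit];

        /*
         * The bit may be set before rqs[bit] is assigned, so rq can
         * still be NULL here. Without the "rq &&" test, reading rq->q
         * is the kind of NULL dereference seen in the oops above.
         */
        if (rq && rq->q == q) {
            fn(rq, data);
            visited++;
        }
    }
    return visited;
}

/* Example callback: counts the requests it is given. */
static void count_request(struct request *rq, void *data)
{
    (void)rq;
    (*(unsigned int *)data)++;
}
```

  In this sketch, a tag whose bit is set but whose rqs[] slot is still
  NULL is simply skipped for the current pass instead of crashing the
  iterator.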

  
  [Testcase] 
  Since the problem manifests in a small, non-deterministic time window,
  there is no easy way to reproduce it. In our case, it was observed
  while testing with a large number of CPUs and attached disks (>200
  disks, >150 cores), trying to exercise all CPUs and disks (the disks
  with quick dd commands). In this test scenario, as already mentioned,
  the issue occurred in about 5% of the runs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1744300/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
