** Changed in: linux (Ubuntu Bionic)
       Status: Confirmed => Fix Committed

** Changed in: linux (Ubuntu Cosmic)
       Status: Confirmed => Fix Committed

You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.

  ldisc crash on reopened tty

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Trusty:
  Won't Fix
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Fix Committed

Bug description:

  * Line discipline code is racy when we have buffer being flush while
  the tty is being initialized or reinitialized. For the first problem,
  we have an upstream patch since January 2018: b027e2298bd5 ("tty: fix
  data race between tty_init_dev and flush of buf") - although it is not
  in Ubuntu kernel 4.4, only in kernels 4.15 and subsequent ones.

  * For the race between the buffer flush while tty is being reopened,
  we have a patch that addresses this issue recently merged for 5.0-rc1:
  83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()"). No
  Ubuntu kernel currently contains this patch, hence we're hereby
  submitting the SRU request. The upstream complete patch series for
  this is in [0].

  * The approach of both patches are similar - they rely in locking/semaphore 
to prevent race conditions. Some additional patches are
  necessary to prevent correlated issues, like preventing a potential deadlock 
due to bad prioritization in servicing I/O over releasing
  tty_ldisc_lock() - refer to c96cf923a98d ("tty: Don't block on IO when ldisc 
change is pending"). All the necessary fixes are grouped here in this SRU 

  * The symptom of the race condition between the buffer flush and the
  tty reopen routine is a kernel crash with the following trace:

  BUG: unable to handle kernel paging request at 0000000000002268
  IP: [<addr>] n_tty_receive_buf_common+0x6a/0xae0
  Call Trace:
  [<addr>] ? kvm_sched_clock_read+0x1e/0x30
  [<addr>] n_tty_receive_buf2+0x14/0x20
  [<addr>] flush_to_ldisc+0xd5/0x120
  [<addr>] process_one_work+0x156/0x400
  [<addr>] worker_thread+0x11a/0x480

  * A kernel crash was collected from an user, analysis is present in
  comment #4 in this LP.

  [Test Case]

  * It is not trivial to trigger this fault, but the usual recipe is to
  keep accessing a machine through SSH (or keep killing getty when in
  IPMI serial console) and in some way run commands before the terminal
  is ready in that machine (like hacking some echo into ttySx or pts in
  an infinite loop).

  * We have reports of users that could reproduce this issue in their
  production environment, and with the patches present in this SRU
  request the problem was fixed.

  [Regression Potential]

  * tty subsystem is highly central and patches in that area are always
  delicate. For example, the upstream series [0] is a re-spin (V6) due
  to a hard to reproduce issue reported in the PA-RISC architecture,
  which was found in the V5 iteration [1] but was fixed by the patch
  c96cf923a98d, present in this SRU request.

  * The patchset [0] is present in tty-next tree since mid-November, and
  the patch b027e2298bd5 is available upstream since January/2018 (it's
  available in both Ubuntu kernels 4.15 and 4.18), so the overall
  likelihood of regressions is low.

  * These patches were sniff-tested for the 3 versions (4.4, 4.15 and
  4.18) and didn't show any issues.

  [0] https://marc.info/?l=linux-kernel&m=154103190111795
  [1] https://marc.info/?l=linux-kernel&m=153737852618183

To manage notifications about this bug go to:

Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to