Kernel core dump analysis:

crash> set 
PID: 23697 
COMMAND: "kworker/u82:0" 
TASK: 88370bcfaa80 [THREAD_INFO: 883708104000] 
CPU: 33 
STATE: TASK_RUNNING (PANIC) 

crash> bt 
PID: 23697 TASK: 88370bcfaa80 CPU: 33 COMMAND: "kworker/u82:0" 
[...] 
#3 [883708107b78] __bad_area_nosemaphore at 8106a889 
#4 [883708107bc0] bad_area_nosemaphore at 8106a9a3 
#5 [883708107bd0] __do_page_fault at 8106aff2 
#6 [883708107c30] trace_do_page_fault at 8106b3f7 
#7 [883708107c60] do_async_page_fault at 81063e19 
#8 [883708107c70] async_page_fault at 81822958 
#9 [883708107cf8] n_tty_receive_buf_common at 814e676a 
#10 [883708107dc8] n_tty_receive_buf2 at 814e71f4 
#11 [883708107dd8] flush_to_ldisc at 814e9be5 
#12 [883708107e20] process_one_work at 8109b0b6 
#13 [883708107e68] worker_thread at 8109ba9a 
[...] 

crash> bt -f 
[...] 
#9 [883708107cf8] n_tty_receive_buf_common at 814e676a 
[...] 
883708107d80: 881bc78e4c20 0000000000000000 
883708107d90: 0000000000000014 881bc78e4c00 
883708107da0: 881bc78e7800 881cba173d80 
883708107db0: 883706cae828 883706cae808 
883708107dc0: 883708107dd0 814e71f4 
#10 [883708107dc8] n_tty_receive_buf2 at 814e71f4 
[...] 

From the stack frame, we can infer that "struct tty_struct" is at
0x881bc78e7800:

crash> tty_struct -x 881bc78e7800 | grep name 
name = "pts3\000... 

Also, from the stack frame we see a value 0x14 there, which represents 
the count value in the function: 

static int n_tty_receive_buf2(struct tty_struct *tty, const unsigned char *cp, 
char *fp, int count) 
{ 
return n_tty_receive_buf_common(tty, cp, fp, count, 1); 
} 
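To make the stack-slot reading above easier to re-check, here is a minimal Python sketch that indexes the raw words from the "bt -f" output by slot address. The helper name parse_stack_words and the 8-byte slot size are assumptions of this sketch (x86_64); the addresses are used exactly as crash printed them.

```python
# Stack words copied verbatim from the "bt -f" output for frame #9.
STACK_DUMP = """\
883708107d80: 881bc78e4c20 0000000000000000
883708107d90: 0000000000000014 881bc78e4c00
883708107da0: 881bc78e7800 881cba173d80
883708107db0: 883706cae828 883706cae808
883708107dc0: 883708107dd0 814e71f4
"""

WORD_SIZE = 8  # assumption: 8-byte stack slots (x86_64)


def parse_stack_words(dump):
    """Map each stack slot address to the value stored in it."""
    words = {}
    for line in dump.splitlines():
        addr_text, _, values = line.partition(":")
        base = int(addr_text, 16)
        for i, value in enumerate(values.split()):
            words[base + i * WORD_SIZE] = int(value, 16)
    return words


words = parse_stack_words(STACK_DUMP)
for addr in sorted(words):
    print(f"{addr:x}: {words[addr]:x}")
```

With this, words[0x883708107da0] is the tty_struct pointer and words[0x883708107d90] is the 0x14 count.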

Since 0x14 means 20 in decimal, I'd expect a 20-byte string, which turns
out to be the case: 19 visible characters, presumably followed by a line
terminator that "rd -a" does not print (char *cp is at 881bc78e4c20):

crash> rd -a 881bc78e4c20 
881bc78e4c20: source /root/openrc 

Something is issuing the command "source /root/openrc" to PTS/3.
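The length check can be reproduced in a few lines of Python. This is only a sanity check: the trailing newline is an assumption, since the "rd -a" output shows only the printable characters.

```python
count = 0x14                   # value found in the stack frame
visible = "source /root/openrc"  # string printed by "rd -a"
assumed_terminator = "\n"      # assumption: not visible in the rd output

total = len(visible + assumed_terminator)
print(count, len(visible), total)
```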

Checking which processes have /dev/pts/3 open with the "files" command, we get:

crash> foreach files -R dev/pts/3 
PID: 2288 TASK: 883786e2ea40 CPU: 29 COMMAND: "sshd" 
ROOT: / CWD: / 
FD FILE DENTRY INODE TYPE PATH 
9 8839b4ce9a00 881a0ba4da40 8838f71fcf88 CHR /dev/pts/3 

And checking ssh processes:

crash> ps|grep ssh 
2236 7180 18 8836f9dd2a80 IN 0.0 149480 8636 sshd 
2275 2274 37 883706e01540 IN 0.0 37836 4940 ssh 
2288 2236 29 883786e2ea40 UN 0.0 149480 1372 sshd 
7180 1 17 881cb99b6a40 IN 0.0 57204 5240 sshd 
14319 7180 2 8836dfd91540 IN 0.0 149480 8460 sshd 
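The state column in this listing can also be filtered mechanically. A small Python sketch follows; the column names are assumed to follow crash's usual ps header (PID PPID CPU TASK ST %MEM VSZ RSS COMM), which the grep above filtered out, and parse_ps is a name made up for this example.

```python
# Rows copied from the "ps | grep ssh" output.
PS_OUTPUT = """\
2236 7180 18 8836f9dd2a80 IN 0.0 149480 8636 sshd
2275 2274 37 883706e01540 IN 0.0 37836 4940 ssh
2288 2236 29 883786e2ea40 UN 0.0 149480 1372 sshd
7180 1 17 881cb99b6a40 IN 0.0 57204 5240 sshd
14319 7180 2 8836dfd91540 IN 0.0 149480 8460 sshd
"""

# Assumed column order, taken from crash's usual ps header.
COLUMNS = ("pid", "ppid", "cpu", "task", "st", "mem", "vsz", "rss", "comm")


def parse_ps(text):
    """Turn each non-empty ps row into a dict keyed by column name."""
    return [dict(zip(COLUMNS, line.split()))
            for line in text.splitlines() if line.strip()]


tasks = parse_ps(PS_OUTPUT)
# UN is crash's abbreviation for TASK_UNINTERRUPTIBLE.
uninterruptible = [t for t in tasks if t["st"] == "UN"]
for t in uninterruptible:
    print(t["pid"], t["comm"], "on CPU", t["cpu"])
```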

All except 2288 are sleeping (state IN) after being scheduled out from a
select() syscall. PID 2288, in state UN, looks interesting:

crash> bt 2288 
PID: 2288 TASK: 883786e2ea40 CPU: 29 COMMAND: "sshd" 
[...] 
#4 [88373e0bfb48] down_write at 8181e42d 
#5 [88373e0bfb60] tty_unthrottle at 814e75be 
#6 [88373e0bfb80] n_tty_open at 814e4fb9 
#7 [88373e0bfba0] tty_ldisc_open at 814e8bd5 
#8 [88373e0bfbc0] tty_ldisc_reinit at 814e9112 
#9 [88373e0bfbf0] tty_reopen at 814df50a 
#10 [88373e0bfc08] tty_open at 814e318e 
#11 [88373e0bfc70] chrdev_open at 8120d5c4 
#12 [88373e0bfcb0] do_dentry_open at 81206a6a 
#13 [88373e0bfcf8] vfs_open at 81207b35 
#14 [88373e0bfd20] path_openat at 81215fed 
[...] 

PID 2288 seems to be the task racing with the flush_to_ldisc work on CPU 33
that led to the crash.
Since tty_reopen() is in its call trace, it is clearly re-opening an
existing tty after it was closed; the line discipline was being
reinitialized (tty_ldisc_reinit) while the flush work was still running,
and that is when the crash happened.

https://bugs.launchpad.net/bugs/1791758

Title:
  ldisc crash on reopened tty

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Trusty:
  Confirmed
Status in linux source package in Xenial:
  Confirmed
Status in linux source package in Bionic:
  Confirmed

Bug description:
  [Impact]

  A user reported the following Oops:

  [684766.666639] BUG: unable to handle kernel paging request at 0000000000002268
  [684766.667642] IP: [<ffffffff814e2a5a>] n_tty_receive_buf_common+0x6a/0xae0
  [684766.668487] PGD 80000019574fe067 PUD 19574ff067 PMD 0
  [684766.669194] Oops: 0000 [#1] SMP
  [684766.669687] Modules linked in: xt_nat dccp_diag dccp tcp_diag udp_diag 
inet_diag unix_diag xt_connmark ipt_REJECT nf_reject_ipv4 nf_conntrack_netlink 
nfnetlink veth ip6table_filter ip6_tables xt_tcpmss xt_multiport xt_conntrack 
iptable_filter xt_CHECKSUM xt_tcpudp iptable_mangle xt_CT iptable_raw 
ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment iptable_nat ip_tables x_tables 
target_core_mod configfs softdog scini(POE) ib_iser rdma_cm iw_cm ib_cm ib_sa 
ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 
openvswitch(OE) nf_nat_ipv6 nf_nat_ipv4 nf_nat gre kvm_intel kvm irqbypass ttm 
crct10dif_pclmul drm_kms_helper crc32_pclmul ghash_clmulni_intel drm 
aesni_intel aes_x86_64 i2c_piix4 lrw gf128mul fb_sys_fops syscopyarea 
glue_helper sysfillrect ablk_helper cryptd sysimgblt joydev
  [684766.679406]  input_leds mac_hid serio_raw 8250_fintek br_netfilter bridge 
stp llc nf_conntrack_proto_gre nf_conntrack_ipv6 nf_defrag_ipv6 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack xfs raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 psmouse multipath floppy pata_acpi linear dm_multipath
  [684766.683585] CPU: 15 PID: 7470 Comm: kworker/u40:1 Tainted: P           OE   4.4.0-124-generic #148~14.04.1-Ubuntu
  [684766.684967] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  [684766.686062] Workqueue: events_unbound flush_to_ldisc
  [684766.686703] task: ffff88165e5d8000 ti: ffff88170dc2c000 task.ti: ffff88170dc2c000
  [684766.687670] RIP: 0010:[<ffffffff814e2a5a>]  [<ffffffff814e2a5a>] n_tty_receive_buf_common+0x6a/0xae0
  [684766.688870] RSP: 0018:ffff88170dc2fd28  EFLAGS: 00010202
  [684766.689521] RAX: 0000000000000000 RBX: ffff88162c895000 RCX: 0000000000000001
  [684766.690488] RDX: 0000000000000000 RSI: ffff88162c895020 RDI: ffff8819c2d3d4d8
  [684766.691518] RBP: ffff88170dc2fdc0 R08: 0000000000000001 R09: ffffffff81ec2ba0
  [684766.692480] R10: 0000000000000004 R11: 0000000000000000 R12: ffff8819c2d3d400
  [684766.693423] R13: ffff8819c45b2670 R14: ffff8816a358c028 R15: ffff8819c2d3d400
  [684766.694390] FS:  0000000000000000(0000) GS:ffff8819d73c0000(0000) knlGS:0000000000000000
  [684766.695484] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [684766.696182] CR2: 0000000000002268 CR3: 0000001957520000 CR4: 0000000000360670
  [684766.697141] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [684766.698114] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [684766.699079] Stack:
  [684766.699412]  0000000000000000 ffff8819c2d3d4d8 0000000000000000 ffff8819c2d3d648
  [684766.700467]  ffff8819c2d3d620 ffff8819c9c10400 ffff88170dc2fd68 ffffffff8106312e
  [684766.701501]  ffff88170dc2fd78 0000000000000001 0000000000000000 ffff88162c895020
  [684766.702534] Call Trace:
  [684766.702905]  [<ffffffff8106312e>] ? kvm_sched_clock_read+0x1e/0x30
  [684766.703685]  [<ffffffff814e34e4>] n_tty_receive_buf2+0x14/0x20
  [684766.704505]  [<ffffffff814e5f05>] flush_to_ldisc+0xd5/0x120
  [684766.705269]  [<ffffffff81099506>] process_one_work+0x156/0x400
  [684766.706008]  [<ffffffff81099eea>] worker_thread+0x11a/0x480
  [684766.706686]  [<ffffffff81099dd0>] ? rescuer_thread+0x310/0x310
  [684766.707386]  [<ffffffff8109f3b8>] kthread+0xd8/0xf0
  [684766.707993]  [<ffffffff8109f2e0>] ? kthread_park+0x60/0x60
  [684766.708664]  [<ffffffff8181a9b5>] ret_from_fork+0x55/0x80
  [684766.709335]  [<ffffffff8109f2e0>] ? kthread_park+0x60/0x60
  [684766.709998] Code: 85 70 ff ff ff e8 97 5f 33 00 49 8d 87 20 02 00 00 c7 45 b4 00 00 00 00 48 89 45 88 49 8d 87 48 02 00 00 48 89 45 80 48 8b 45 b8 <48> 8b b0 68 22 00 00 48 8b 08 89 f0 29 c8 41 f6 87 30 01 00 00
  [684766.713290] RIP  [<ffffffff814e2a5a>] n_tty_receive_buf_common+0x6a/0xae0
  [684766.714105]  RSP <ffff88170dc2fd28>
  [684766.714609] CR2: 0000000000002268
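  The faulting address can be cross-checked against the "Code:" line. The
  bytes at the <..> marker decode as a 64-bit MOV with a 32-bit
  displacement off a base register; the Python sketch below decodes only
  that one form by hand and adds the displacement to the RAX value from
  the register dump. The function name decode_faulting_mov is an
  assumption of this sketch, and it does not establish which structure or
  field lives at that offset.

```python
# Excerpt of the "Code:" line around the <..> marker, plus RAX and CR2
# as printed in the Oops.
CODE = "48 8b 45 b8 <48> 8b b0 68 22 00 00 48 8b 08"
RAX = 0x0000000000000000
CR2 = 0x0000000000002268

REG64 = ("rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi")


def decode_faulting_mov(code):
    """Decode 'REX.W 8B /r' with a [base + disp32] operand at the marker."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    insn = bytes(int(t.strip("<>"), 16) for t in tokens[start:start + 7])
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("only REX.W MOV r64, r/m64 is handled")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only the [base + disp32] form is handled")
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return REG64[reg], REG64[rm], disp


dest, base, disp = decode_faulting_mov(CODE)
fault_address = RAX + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> faulting address {fault_address:#x}")
```

  The result is a load from 0x2268(%rax) with RAX equal to 0, which
  matches CR2 and suggests a NULL base pointer being dereferenced at
  offset 0x2268.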

  The issue happened in a VM. KDUMP was configured, so a full kernel
  crash dump was created.

  The user runs Ubuntu Trusty with kernel 4.4.0-124 on the VM.

  [Test Case]

  * Deploy a Trusty KVM instance with an LTS Xenial kernel (v4.4 series).
  * SSH in frequently while the system is under load, and send commands
    before the prompt has returned.
  ----

  Check comment #5 for a summary about the upstream proposals to resolve
  this issue.
