[Kernel-packages] [Bug 1748342] Re: cgroup: remove cgroup directory leading kernel crash in kill_css

2018-02-13 Thread Joseph Salisbury
** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu)
   Status: Incomplete => Triaged

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Xenial)
   Status: New => Triaged

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1748342

Title:
  cgroup: remove cgroup directory leading kernel crash in kill_css

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Triaged

Bug description:
  We received feedback from a customer that a CVM (cloud virtual machine)
crashed while kubelet was updating a container service on Ubuntu Xenial. The
logs are shown below.
  We found a patch (commit 33c35aa4817864e056fd772230b0c6b552e36ea2) in Linux
mainline which does indeed fix this bug, but ubuntu-xenial.git has not merged
it yet.

  Do you guys have a plan for merging?

  --panic log-
  [2018-02-02 10:21:48][4397731.721563] BUG: unable to handle kernel paging request at 0001005c
  [2018-02-02 10:40:50][4397731.722666] IP: css_clear_dir+0x5/0x70
  [2018-02-02 10:40:50][4397731.723261] PGD a12b067 
  [2018-02-02 10:40:50][4397731.723261] PUD 0 
  [2018-02-02 10:40:50][4397731.723628] 
  [2018-02-02 10:40:50][4397731.724004] Oops:  [#1] SMP
  [2018-02-02 10:40:50][4397731.724004] Modules linked in: xt_statistic
nf_conntrack_netlink ebt_ip ebtable_filter ebtables veth xt_set ip_set_hash_net
ip_set nfnetlink xt_nat xt_recent xt_mark ipt_REJECT nf_reject_ipv4 xt_tcpudp
xt_comment ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype
iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack
br_netfilter bridge stp llc aufs ppdev sb_edac edac_core crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel joydev input_leds serio_raw parport_pc
parport i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs
iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor
raid6_pq libcrc32c raid1 raid0 multipath
  [2018-02-02 10:40:50][4397731.724004]  linear cirrus ttm drm_kms_helper
syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops aes_x86_64
crypto_simd cryptd glue_helper psmouse virtio_blk virtio_net drm pata_acpi
floppy
  [2018-02-02 10:40:50][4397731.724004] CPU: 0 PID: 23347 Comm: kubelet Not tainted 4.10.0-32-generic #36~16.04.1-Ubuntu
  [2018-02-02 10:40:50][4397731.724004] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  [2018-02-02 10:40:50][4397731.724004] task: 92abde59 task.stack: baa94165c000
  [2018-02-02 10:40:50][4397731.724004] RIP: 0010:css_clear_dir+0x5/0x70
  [2018-02-02 10:40:50][4397731.724004] RSP: 0018:baa94165fe10 EFLAGS: 00010206
  [2018-02-02 10:40:50][4397731.724004] RAX: 47fd40005d7b RBX: ffe8 RCX: 92abffc0fcec
  [2018-02-02 10:40:50][4397731.724004] RDX: 9b070800 RSI: 0206 RDI: ffe8
  [2018-02-02 10:40:50][4397731.724004] RBP: baa94165fe20 R08: c8b18701 R09: 000180220017
  [2018-02-02 10:40:50][4397731.724004] R10: 92abc8b187f8 R11: 92abf7751d00 R12: 92abd5601000
  [2018-02-02 10:40:50][4397731.724004] R13:  R14: 92abd5601150 R15:
  [2018-02-02 10:40:50][4397731.724004] FS:  7f6f92ffd700() GS:92abffc0() knlGS:
  [2018-02-02 10:40:50][4397731.724004] CS:  0010 DS:  ES:  CR0: 80050033
  [2018-02-02 10:40:50][4397731.724004] CR2: 0001005c CR3: 280cb000 CR4: 000406f0
  [2018-02-02 10:40:50][4397731.724004] Call Trace:
  [2018-02-02 10:40:50][4397731.724004]  ? kill_css+0x12/0x60
  [2018-02-02 10:40:50][4397731.724004]  cgroup_destroy_locked+0xa5/0xf0
  [2018-02-02 10:40:50][4397731.724004]  cgroup_rmdir+0x2c/0x90
  [2018-02-02 10:40:50][4397731.724004]  kernfs_iop_rmdir+0x4d/0x80
  [2018-02-02 10:40:50][4397731.724004]  vfs_rmdir+0xb4/0x130
  [2018-02-02 10:40:50][4397731.724004]  do_rmdir+0x1c7/0x1e0
  [2018-02-02 10:40:50][4397731.724004]  SyS_unlinkat+0x22/0x30
  [2018-02-02 10:40:50][4397731.724004]  entry_SYSCALL_64_fastpath+0x1e/0xad
  [2018-02-02 10:40:50][4397731.724004] RIP: 0033:0x481bd4
  [2018-02-02 10:40:50][4397731.724004] RSP: 002b:00c422893af0 EFLAGS: 0246 ORIG_RAX: 0107
  [2018-02-02 10:40:50][4397731.724004] RAX: ffda RBX:  RCX: 00481bd4
  [2018-02-02 10:40:50][4397731.724004] RDX: 0200 RSI: 00c421c7ef00 RDI: ff9c
  [2018-02-02 
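
For anyone triaging the trace above, the path from the unlinkat syscall
down to the faulting function can be pulled out mechanically. What follows
is only a minimal sketch and not part of the original report: the name
parse_call_trace and the regular expression are assumptions made for
illustration, and the sample lines are adapted from the log above with the
wall-clock prefix removed.

```python
import re

# Matches "symbol+0xoff/0xsize" entries as they appear in the oops above.
# A leading "? " marks an entry the stack unwinder considers unreliable.
FRAME_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_call_trace(lines):
    """Return (symbol, reliable) pairs for the frames after 'Call Trace:'."""
    frames = []
    in_trace = False
    for line in lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line ends the trace
        frames.append((match.group(2), match.group(1) is None))
    return frames


# Hypothetical sample input, adapted from the log in this report.
SAMPLE = """\
[4397731.724004] Call Trace:
[4397731.724004]  ? kill_css+0x12/0x60
[4397731.724004]  cgroup_destroy_locked+0xa5/0xf0
[4397731.724004]  cgroup_rmdir+0x2c/0x90
[4397731.724004]  kernfs_iop_rmdir+0x4d/0x80
[4397731.724004]  vfs_rmdir+0xb4/0x130
[4397731.724004]  do_rmdir+0x1c7/0x1e0
[4397731.724004]  SyS_unlinkat+0x22/0x30
[4397731.724004]  entry_SYSCALL_64_fastpath+0x1e/0xad
[4397731.724004] RIP: 0033:0x481bd4
""".splitlines()

if __name__ == "__main__":
    for symbol, reliable in parse_call_trace(SAMPLE):
        print(("  " if reliable else "? ") + symbol)
```

Read bottom to top, the frames show an rmdir on a cgroup directory reaching
cgroup_destroy_locked, with kill_css flagged as an unreliable entry just
before the fault in css_clear_dir.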

[Kernel-packages] [Bug 1748342] Re: cgroup: remove cgroup directory leading kernel crash in kill_css

2018-02-11 Thread Daniel Axtens
Hi,

I'm happy to submit this patch to the kernel team, but I wanted to talk
about the kernel process and ask a question first.

The way this process usually works is:
 - patch submitted to kernel team
 - kernel team checks patch and if they are happy with it, applies it to the 
kernel
 - this is built into a "proposed" kernel.
 - the bug is updated with the proposed kernel.
 - someone - usually the bug reporter - must verify that the proposed kernel 
fixes the bug. There is usually a 5 working day window to do this.
 - if the verification is done, the released kernel includes the fix. If 
verification is not done, the patch is not included in the released kernel.

I am not able to do the verification. If the kernel team provides a
proposed kernel, are you or your customer able to verify it?

Regards,
Daniel

Status in linux package in Ubuntu:
  Incomplete
