[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-08-23 Thread bugproxy
--- Comment From hbath...@in.ibm.com 2019-08-23 07:42 EDT---
The issue is not seen anymore with 4.15.0-59.66~16.04.1

** Tags removed: verification-needed-xenial
** Tags added: verification-done-xenial

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1833716

Title:
  System crashes on hot adding a core with drmgr command
  (4.15.0-48-generic)

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  [Impact]
  On Bionic GA kernel (4.15.0), hot add of cpu with drmgr causes the kernel to
  crash. The patches identified to fix this issue disable changing the NUMA
  associations for CPUs and memory at runtime by default.

  [Test]
  # drmgr -c cpu -r -q 1
  # drmgr -c cpu -a -q 1
  Test kernel available in ppa:ubuntu-power-triage/lp1833716
  Please see comment #2 for before and after results with the patches applied.
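  A before/after comparison like the one mentioned above can be partly automated by scanning captured dmesg output for the failure signatures quoted in this report. The following is only a minimal sketch: the pattern list is taken from the trace in this bug and is illustrative, and the function name find_hotplug_failures is hypothetical.

```python
import re

# Signatures copied from the trace in this report (illustrative, not exhaustive).
FAILURE_PATTERNS = [
    re.compile(r"BUG: arch topology borken"),
    re.compile(r"Unable to handle kernel paging request"),
    re.compile(r"Oops: Kernel access of bad area"),
]


def find_hotplug_failures(dmesg_text):
    """Return the dmesg lines that match any known failure signature."""
    hits = []
    for line in dmesg_text.splitlines():
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS):
            hits.append(line.strip())
    return hits


# Sample input assembled from lines quoted in this thread.
sample = """\
[  218.555493] BUG: arch topology borken
[  218.555686] Oops: Kernel access of bad area, sig: 11 [#1]
[  476.574556] cpu 120 (hwid 120) Ready to die...
"""

for hit in find_hotplug_failures(sample):
    print(hit)
```

  An empty result after the remove/add cycle would be consistent with the fix being effective, though it does not prove the absence of other problems.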

  [Fix]
  558f86493df0 powerpc/numa: document topology_updates_enabled, disable by default
  2d4d9b308f8f powerpc/numa: improve control of topology updates
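  Since the fix changes the default for topology updates, it can help to check whether a system overrides that default at boot. The sketch below assumes the boot option is spelled topology_updates= on the kernel command line. This thread only names the internal topology_updates_enabled flag, so the option name is an assumption to verify against the kernel-parameters documentation. The function name is hypothetical.

```python
def topology_updates_setting(cmdline):
    """Return the value given to topology_updates= on a kernel command line.

    Returns None when the option is absent, meaning the kernel default applies.
    The option name is an assumption; it is not quoted in this bug report.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("topology_updates="):
            # If the option appears more than once, keep the last value.
            value = token.split("=", 1)[1]
    return value


# Hypothetical command lines, for illustration only.
print(topology_updates_setting("root=/dev/sda2 quiet topology_updates=off"))
print(topology_updates_setting("root=/dev/sda2 quiet"))
```

  On a live system the input would typically come from /proc/cmdline.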

  [Regression Potential]
  The two patches relate to powerpc/numa and do not impact other
  architectures or platform code. Regression potential is low.

  [Other Information]
  == Comment: #0 - Hari Krishna Bathini - 2019-05-07 13:18:35 ==
  ---Problem Description---
  On 4.15.0-48-generic kernel, hot adding a cpu with drmgr crashes the kernel
  with the traces below:

  ---
  root@ubuntu:~# drmgr -c cpu -r -q 1
  Validating CPU DLPAR capability...yes.
  CPU 9
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~# drmgr -c cpu -a -q 1
  Validating CPU DLPAR capability...yes.
  [  218.555493] BUG: arch topology borken
  [  218.03]  the DIE domain not a subset of the NODE domain
  [  218.12] BUG: arch topology borken
  [  218.16]  the DIE domain not a subset of the NODE domain
  [  218.23] BUG: arch topology borken
  [  218.28]  the DIE domain not a subset of the NODE domain
  [  218.35] BUG: arch topology borken
  [  218.39]  the DIE domain not a subset of the NODE domain
  [  218.45] BUG: arch topology borken
  [  218.50]  the DIE domain not a subset of the NODE domain
  [  218.56] BUG: arch topology borken
  [  218.60]  the DIE domain not a subset of the NODE domain
  [  218.67] BUG: arch topology borken
  [  218.71]  the DIE domain not a subset of the NODE domain
  [  218.77] BUG: arch topology borken
  [  218.81]  the DIE domain not a subset of the NODE domain
  [  218.555672] Unable to handle kernel paging request for data at address 0x9332ae80f961139f
  [  218.555679] Faulting instruction address: 0xc01768cc
  [  218.555686] Oops: Kernel access of bad area, sig: 11 [#1]
  [  218.555691] LE SMP NR_CPUS=2048 NUMA pSeries
  [  218.555699] Modules linked in: vmx_crypto crct10dif_vpmsum sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ibmvscsi ibmveth crc32c_vpmsum
  [  218.555745] CPU: 8 PID: 276 Comm: kworker/8:1 Not tainted 4.15.0-48-generic #51-Ubuntu
  [  218.555757] Workqueue: events cpuset_hotplug_workfn
  [  218.555763] NIP:  c01768cc LR: c01769a8 CTR:
  [  218.555770] REGS: c001f5f1f530 TRAP: 0380   Not tainted  (4.15.0-48-generic)
  [  218.555776] MSR:  80009033   CR: 22824228  XER: 0004
  [  218.555789] CFAR: c0176920 SOFTE: 1
  [  218.555789] GPR00: c01769a8 c001f5f1f7b0 c16eb400 c001f7bfd200
  [  218.555789] GPR04: 0001  0008 0010
  [  218.555789] GPR08: 0018  c001f7bfd408
  [  218.555789] GPR12: 8000 c7a35800 0007 c001f549d900
  [  218.555789] GPR16: 0040 c1722494 c001f0f29400 0001
  [  218.555789] GPR20: c001ffb68580 0008 c11d8580 c171dd78
  [  218.555789] GPR24:  e830 ec30 12af
  [  218.555789] GPR28: 102f c001f7bfd200 9332ae80f961139f 9332ae80f961139f
  [  218.555859] NIP [c01768cc] free_sched_groups.part.2+0x4c/0xf0
  [  218.555866] LR [c01769a8] destroy_sched_domain+0x38/0xc0
  [  218.555871] Call Trace:
  [  218.555875] [c001f5f1f7b0] [ec30] 0xec30 (unreliable)
  [  218.555884] [c001f5f1f7f0] [c01769a8] destroy_sched_domain+0x38/0xc0
  [  218.555892] [c001f5f1f820] [c0176eb0] cpu_attach_domain+0xf0/0x870
  [  218.555900] [c001f5f1f960] [c0178884] build_sched_domains+0x1254/0x12f0
  [  218.555908] [c001f5f1fa90] [c0179a70]

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-08-22 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-xenial' to 'verification-done-xenial'. If the
problem still exists, change the tag 'verification-needed-xenial' to
'verification-failed-xenial'.

If verification is not done within 5 working days from today, this fix
will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-xenial

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-08-05 Thread bugproxy
** Tags removed: targetmilestone-inin---
** Tags added: targetmilestone-inin18042

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-08-05 Thread Frank Heimes
** Changed in: ubuntu-power-systems
   Status: Fix Committed => Fix Released

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-07-24 Thread Brad Figg
** Tags added: cscc

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-07-22 Thread Manoj Iyer
** Changed in: linux (Ubuntu)
   Status: In Progress => Fix Released

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-07-22 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 4.15.0-55.60

---
linux (4.15.0-55.60) bionic; urgency=medium

  * linux: 4.15.0-55.60 -proposed tracker (LP: #1834954)

  * Request backport of ceph commits into bionic (LP: #1834235)
    - ceph: use atomic_t for ceph_inode_info::i_shared_gen
    - ceph: define argument structure for handle_cap_grant
    - ceph: flush pending works before shutdown super
    - ceph: send cap releases more aggressively
    - ceph: single workqueue for inode related works
    - ceph: avoid dereferencing invalid pointer during cached readdir
    - ceph: quota: add initial infrastructure to support cephfs quotas
    - ceph: quota: support for ceph.quota.max_files
    - ceph: quota: don't allow cross-quota renames
    - ceph: fix root quota realm check
    - ceph: quota: support for ceph.quota.max_bytes
    - ceph: quota: update MDS when max_bytes is approaching
    - ceph: quota: add counter for snaprealms with quota
    - ceph: avoid iput_final() while holding mutex or in dispatch thread

  * QCA9377 isn't being recognized sometimes (LP: #1757218)
    - SAUCE: USB: Disable USB2 LPM at shutdown

  * hns: fix ICMP6 neighbor solicitation messages discard problem (LP: #1833140)
    - net: hns: fix ICMP6 neighbor solicitation messages discard problem
    - net: hns: fix unsigned comparison to less than zero

  * Fix occasional boot time crash in hns driver (LP: #1833138)
    - net: hns: Fix probabilistic memory overwrite when HNS driver initialized

  * use-after-free in hns_nic_net_xmit_hw (LP: #1833136)
    - net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw()

  * hns: attempt to restart autoneg when disabled should report error
    (LP: #1833147)
    - net: hns: Restart autoneg need return failed when autoneg off

  * systemd 237-3ubuntu10.14 ADT test failure on Bionic ppc64el (test-seccomp)
    (LP: #1821625)
    - powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
    - powerpc: sys_pkey_mprotect() system call

  * [UBUNTU] pkey: Indicate old mkvp only if old and curr. mkvp are different
    (LP: #1832625)
    - pkey: Indicate old mkvp only if old and current mkvp are different

  * [UBUNTU] kernel: Fix gcm-aes-s390 wrong scatter-gather list processing
    (LP: #1832623)
    - s390/crypto: fix gcm-aes-s390 selftest failures

  * System crashes on hot adding a core with drmgr command (4.15.0-48-generic)
    (LP: #1833716)
    - powerpc/numa: improve control of topology updates
    - powerpc/numa: document topology_updates_enabled, disable by default

  * Kernel modules generated incorrectly when system is localized to a
    non-English language (LP: #1828084)
    - scripts: override locale from environment when running recordmcount.pl

  * [UBUNTU] kernel: Fix wrong dispatching for control domain CPRBs
    (LP: #1832624)
    - s390/zcrypt: Fix wrong dispatching for control domain CPRBs

  * CVE-2019-11815
    - net: rds: force to destroy connection if t_sock is NULL in
      rds_tcp_kill_sock().

  * Sound device not detected after resume from hibernate (LP: #1826868)
    - drm/i915: Force 2*96 MHz cdclk on glk/cnl when audio power is enabled
    - drm/i915: Save the old CDCLK atomic state
    - drm/i915: Remove redundant store of logical CDCLK state
    - drm/i915: Skip modeset for cdclk changes if possible

  * Handle overflow in proc_get_long of sysctl (LP: #1833935)
    - sysctl: handle overflow in proc_get_long

  * Dell XPS 13 (9370) defaults to s2idle sleep/suspend instead of deep, NVMe
    drains lots of power under s2idle (LP: #1808957)
    - Revert "UBUNTU: SAUCE: pci/nvme: prevent WDC PC SN720 NVMe from entering D3
      and being disabled"
    - Revert "UBUNTU: SAUCE: nvme: add quirk to not call disable function when
      suspending"
    - Revert "UBUNTU: SAUCE: pci: prevent Intel NVMe SSDPEKKF from entering D3"
    - Revert "SAUCE: nvme: add quirk to not call disable function when suspending"
    - Revert "SAUCE: pci: prevent sk hynix nvme from entering D3"
    - PCI: PM: Avoid possible suspend-to-idle issue
    - PCI: PM: Skip devices in D0 for suspend-to-idle
    - nvme-pci: Sync queues on reset
    - nvme: Export get and set features
    - nvme-pci: Use host managed power state for suspend

  * linux v4.15 ftbfs on a newer host kernel (e.g. hwe) (LP: #1823429)
    - selinux: use kernel linux/socket.h for genheaders and mdp

  * 32-bit x86 kernel 4.15.0-50 crash in vmalloc_sync_all (LP: #1830433)
    - x86/mm/pat: Disable preemption around __flush_tlb_all()
    - x86/mm: Drop usage of __flush_tlb_all() in kernel_physical_mapping_init()
    - x86/mm: Disable ioremap free page handling on x86-PAE
    - ioremap: Update pgtable free interfaces with addr
    - x86/mm: Add TLB purge to free pmd/pte page interfaces
    - x86/init: fix build with CONFIG_SWAP=n
    - x86/mm: provide pmdp_establish() helper
    - x86/mm: Use WRITE_ONCE() when setting PTEs

  * hinic: fix oops due to race in set_rx_mode (LP: #1832048)
    - hinic: fix a bug in

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-07-08 Thread Manoj Iyer
ubuntu@P8lpar4:~$ uname -a 
Linux P8lpar4 4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:21:40 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux
ubuntu@P8lpar4:~$
ubuntu@P8lpar4:~$ apt policy linux-image-generic
linux-image-generic:
  Installed: 4.15.0.55.57
  Candidate: 4.15.0.55.57
  Version table:
 *** 4.15.0.55.57 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic-proposed/main ppc64el Packages
        100 /var/lib/dpkg/status
     4.15.0.54.56 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic-updates/main ppc64el Packages
        500 http://ports.ubuntu.com/ubuntu-ports bionic-security/main ppc64el Packages
     4.15.0.20.23 500
        500 http://ports.ubuntu.com/ubuntu-ports bionic/main ppc64el Packages
ubuntu@P8lpar4:~$
ubuntu@P8lpar4:~$ sudo su
root@P8lpar4:/home/ubuntu#  drmgr -c cpu -r -q 1
Validating CPU DLPAR capability...yes.
CPU 121
root@P8lpar4:/home/ubuntu# drmgr -c cpu -a -q 1
Validating CPU DLPAR capability...yes.
CPU 121
root@P8lpar4:/home/ubuntu#

-- dmesg --
[  476.574556] cpu 120 (hwid 120) Ready to die...
[  476.647003] cpu 121 (hwid 121) Ready to die...
[  476.710155] cpu 122 (hwid 122) Ready to die...
[  476.766678] cpu 123 (hwid 123) Ready to die...
[  476.829791] cpu 124 (hwid 124) Ready to die...
[  476.883594] cpu 125 (hwid 125) Ready to die...
[  476.933738] cpu 126 (hwid 126) Ready to die...
[  476.986045] cpu 127 (hwid 127) Ready to die...
root@P8lpar4:/home/ubuntu# 
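
The eight "Ready to die" lines above cover logical CPUs 120 through 127, which would be consistent with the hardware threads of a single core going offline for `-q 1`. The following is a minimal sketch for pulling those CPU numbers out of a saved dmesg capture. The helper name `offlined_cpus` is hypothetical, and the regular expression is derived only from the lines shown in this message, so it may need adjusting for other kernels.

```python
import re

# Matches lines such as "[  476.574556] cpu 120 (hwid 120) Ready to die..."
# The pattern is an assumption based solely on the excerpt in this message.
READY_TO_DIE = re.compile(r"cpu (\d+) \(hwid (\d+)\) Ready to die")


def offlined_cpus(dmesg_text):
    """Return the logical CPU numbers reported as going offline, in log order."""
    return [int(match.group(1)) for match in READY_TO_DIE.finditer(dmesg_text)]


sample = (
    "[  476.574556] cpu 120 (hwid 120) Ready to die...\n"
    "[  476.647003] cpu 121 (hwid 121) Ready to die...\n"
)
print(offlined_cpus(sample))
```

Counting the returned list gives a quick check that the number of threads taken offline matches what is expected for the requested quantity.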

** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1833716

Title:
  System crashes on hot adding a core with drmgr command
  (4.15.0-48-generic)

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  [Impact]
  On the Bionic GA kernel (4.15.0), hot-adding a CPU with drmgr causes the
  kernel to crash. The patches identified to fix this issue disable changing
  the NUMA associations for CPUs and memory at runtime by default.

  [Test]
  # drmgr -c cpu -r -q 1
  # drmgr -c cpu -a -q 1
  Test kernel available in ppa:ubuntu-power-triage/lp1833716
  Please see comment #2 for before and after results with the patches applied.
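
One way to turn the remove/add test above into a pass/fail check is to scan the kernel log captured after the hot add for the failure messages quoted in this report. The sketch below is illustrative only: the function names are hypothetical, and the signature strings are copied from the trace in this bug rather than from any official list of failure messages.

```python
# Substrings taken from the crash trace quoted in this bug report.
# Treating these three as the failure signatures is an assumption.
CRASH_SIGNATURES = (
    "BUG: arch topology borken",
    "Unable to handle kernel paging request",
    "Oops:",
)


def find_crash_signatures(log_text):
    """Return the known signatures that occur anywhere in log_text."""
    return [sig for sig in CRASH_SIGNATURES if sig in log_text]


def hotplug_test_passed(log_text):
    """True when none of the known crash signatures appear in the log."""
    return not find_crash_signatures(log_text)


good_log = "[  476.574556] cpu 120 (hwid 120) Ready to die...\n"
bad_log = "[  218.555493] BUG: arch topology borken\n"
print(hotplug_test_passed(good_log), hotplug_test_passed(bad_log))
```

A clean result only says that these particular messages are absent. It does not replace checking that the system is still responsive after `drmgr -c cpu -a -q 1`.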

  [Fix]
  558f86493df0 powerpc/numa: document topology_updates_enabled, disable by default
  2d4d9b308f8f powerpc/numa: improve control of topology updates

  [Regression Potential]
  The two patches relate to powerpc/numa and do not impact other
  architectures or platform code. Regression potential is low.

  [Other Information]
  == Comment: #0 - Hari Krishna Bathini - 2019-05-07 13:18:35 ==
  ---Problem Description---
  On the 4.15.0-48-generic kernel, hot adding a CPU with drmgr crashes the
  kernel with the traces below:

  ---
  root@ubuntu:~# drmgr -c cpu -r -q 1
  Validating CPU DLPAR capability...yes.
  CPU 9
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~# drmgr -c cpu -a -q 1
  Validating CPU DLPAR capability...yes.
  [  218.555493] BUG: arch topology borken
  [  218.03]  the DIE domain not a subset of the NODE domain
  [  218.12] BUG: arch topology borken
  [  218.16]  the DIE domain not a subset of the NODE domain
  [  218.23] BUG: arch topology borken
  [  218.28]  the DIE domain not a subset of the NODE domain
  [  218.35] BUG: arch topology borken
  [  218.39]  the DIE domain not a subset of the NODE domain
  [  218.45] BUG: arch topology borken
  [  218.50]  the DIE domain not a subset of the NODE domain
  [  218.56] BUG: arch topology borken
  [  218.60]  the DIE domain not a subset of the NODE domain
  [  218.67] BUG: arch topology borken
  [  218.71]  the DIE domain not a subset of the NODE domain
  [  218.77] BUG: arch topology borken
  [  218.81]  the DIE domain not a subset of the NODE domain
  [  218.555672] Unable to handle kernel paging request for data at address 0x9332ae80f961139f
  [  218.555679] Faulting instruction address: 0xc01768cc
  [  218.555686] Oops: Kernel access of bad area, sig: 11 [#1]
  [  218.555691] LE SMP NR_CPUS=2048 NUMA pSeries
  [  218.555699] Modules linked in: vmx_crypto crct10dif_vpmsum sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ibmvscsi ibmveth crc32c_vpmsum
  [  218.555745] CPU: 8 PID: 276 Comm: kworker/8:1 Not tainted 4.15.0-48-generic #51-Ubuntu
  [  218.555757] Workqueue: events cpuset_hotplug_workfn
  [  218.555763] NIP:  c01768cc LR: c01769a8 CTR: 

  [  218.555770] REGS: c001f5f1f530 

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-07-08 Thread Manoj Iyer
** Changed in: ubuntu-power-systems
   Status: In Progress => Fix Committed

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Manoj Iyer (manjo)

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed


[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-07-03 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-bionic' to 'verification-done-bionic'. If the
problem still exists, change the tag 'verification-needed-bionic' to
'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-bionic
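
The tag change requested above can be sketched as a small function. This is only an illustration of the naming scheme described in this message: `update_verification_tags` is a hypothetical helper, it operates on a plain list of tag strings, and it does not talk to Launchpad.

```python
def update_verification_tags(tags, series, passed):
    """Swap 'verification-needed-<series>' for the done/failed tag.

    Returns a new list; tags unrelated to the given series are kept as-is.
    If the 'needed' tag is absent, the list is returned unchanged.
    """
    needed = f"verification-needed-{series}"
    outcome = f"verification-{'done' if passed else 'failed'}-{series}"
    return [outcome if tag == needed else tag for tag in tags]


print(update_verification_tags(["verification-needed-bionic"], "bionic", True))
```

Keeping the series as a parameter means the same rule covers other series tags that follow the same pattern, such as the xenial tags used elsewhere in this bug.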

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed


[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-07-01 Thread Kleber Sacilotto de Souza
** Changed in: linux (Ubuntu Bionic)
   Status: New => Fix Committed

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  [Impact]
  On the Bionic GA kernel (4.15.0), hot-adding a CPU with drmgr causes the
  kernel to crash. The patches identified to fix this issue disable changing
  the NUMA associations for CPUs and memory at runtime by default.

  [Test]
  # drmgr -c cpu -r -q 1
  # drmgr -c cpu -a -q 1
  Test kernel available in ppa:ubuntu-power-triage/lp1833716
  Please see comment #2 for before and after results with the patches applied.

  [Fix]
  558f86493df0 powerpc/numa: document topology_updates_enabled, disable by default
  2d4d9b308f8f powerpc/numa: improve control of topology updates

  [Regression Potential]
  The two patches relate to powerpc/numa and do not impact other
  architectures or platform code. Regression potential is low.

  [Other Information]
  == Comment: #0 - Hari Krishna Bathini - 2019-05-07 13:18:35 ==
  ---Problem Description---
  On the 4.15.0-48-generic kernel, hot adding a CPU with drmgr crashes the
  kernel with the traces below:

  ---
  root@ubuntu:~# drmgr -c cpu -r -q 1
  Validating CPU DLPAR capability...yes.
  CPU 9
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~#
  root@ubuntu:~# drmgr -c cpu -a -q 1
  Validating CPU DLPAR capability...yes.
  [  218.555493] BUG: arch topology borken
  [  218.03]  the DIE domain not a subset of the NODE domain
  [  218.12] BUG: arch topology borken
  [  218.16]  the DIE domain not a subset of the NODE domain
  [  218.23] BUG: arch topology borken
  [  218.28]  the DIE domain not a subset of the NODE domain
  [  218.35] BUG: arch topology borken
  [  218.39]  the DIE domain not a subset of the NODE domain
  [  218.45] BUG: arch topology borken
  [  218.50]  the DIE domain not a subset of the NODE domain
  [  218.56] BUG: arch topology borken
  [  218.60]  the DIE domain not a subset of the NODE domain
  [  218.67] BUG: arch topology borken
  [  218.71]  the DIE domain not a subset of the NODE domain
  [  218.77] BUG: arch topology borken
  [  218.81]  the DIE domain not a subset of the NODE domain
  [  218.555672] Unable to handle kernel paging request for data at address 0x9332ae80f961139f
  [  218.555679] Faulting instruction address: 0xc01768cc
  [  218.555686] Oops: Kernel access of bad area, sig: 11 [#1]
  [  218.555691] LE SMP NR_CPUS=2048 NUMA pSeries
  [  218.555699] Modules linked in: vmx_crypto crct10dif_vpmsum sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ibmvscsi ibmveth crc32c_vpmsum
  [  218.555745] CPU: 8 PID: 276 Comm: kworker/8:1 Not tainted 4.15.0-48-generic #51-Ubuntu
  [  218.555757] Workqueue: events cpuset_hotplug_workfn
  [  218.555763] NIP:  c01768cc LR: c01769a8 CTR:
  [  218.555770] REGS: c001f5f1f530 TRAP: 0380   Not tainted  (4.15.0-48-generic)
  [  218.555776] MSR:  80009033   CR: 22824228  XER: 0004
  [  218.555789] CFAR: c0176920 SOFTE: 1
  [  218.555789] GPR00: c01769a8 c001f5f1f7b0 c16eb400 c001f7bfd200
  [  218.555789] GPR04: 0001  0008 0010
  [  218.555789] GPR08: 0018  c001f7bfd408
  [  218.555789] GPR12: 8000 c7a35800 0007 c001f549d900
  [  218.555789] GPR16: 0040 c1722494 c001f0f29400 0001
  [  218.555789] GPR20: c001ffb68580 0008 c11d8580 c171dd78
  [  218.555789] GPR24:  e830 ec30 12af
  [  218.555789] GPR28: 102f c001f7bfd200 9332ae80f961139f 9332ae80f961139f
  [  218.555859] NIP [c01768cc] free_sched_groups.part.2+0x4c/0xf0
  [  218.555866] LR [c01769a8] destroy_sched_domain+0x38/0xc0
  [  218.555871] Call Trace:
  [  218.555875] [c001f5f1f7b0] [ec30] 0xec30 (unreliable)
  [  218.555884] [c001f5f1f7f0] [c01769a8] destroy_sched_domain+0x38/0xc0
  [  218.555892] [c001f5f1f820] [c0176eb0] cpu_attach_domain+0xf0/0x870
  [  218.555900] [c001f5f1f960] [c0178884] build_sched_domains+0x1254/0x12f0
  [  218.555908] 

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-28 Thread Stefan Bader
** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  New


[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-25 Thread Manoj Iyer
The issue does not reproduce on Disco.
root@P8lpar4:/home/ubuntu# uname -a 
Linux P8lpar4 5.0.0-19-generic #20-Ubuntu SMP Wed Jun 19 21:50:53 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux
root@P8lpar4:/home/ubuntu# 

root@P8lpar4:/home/ubuntu# drmgr -c cpu -r -q 1
Validating CPU DLPAR capability...yes.
CPU 121
root@P8lpar4:/home/ubuntu# drmgr -c cpu -a -q 1
Validating CPU DLPAR capability...yes.
CPU 121
root@P8lpar4:/home/ubuntu#

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress


[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-24 Thread Manoj Iyer
** Changed in: ubuntu-power-systems
   Importance: High => Critical

** Changed in: linux (Ubuntu)
   Importance: High => Critical

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress


[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-24 Thread Manoj Iyer
SRU submitted: https://lists.ubuntu.com/archives/kernel-
team/2019-June/101539.html


Title:
  System crashes on hot adding a core with drmgr command
  (4.15.0-48-generic)

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress


[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-24 Thread Manoj Iyer
== Bionic GA 4.15.0 kernel ==

ubuntu@P8lpar4:~$ sudo su
root@P8lpar4:/home/ubuntu# drmgr -c cpu -r -q 1
Validating CPU DLPAR capability...yes.
CPU 121
root@P8lpar4:/home/ubuntu# drmgr -c cpu -a -q 1

[262984.440091] BUG: arch topology borken
[262984.440110]  the DIE domain not a subset of the NODE domain
[262984.440114] BUG: arch topology borken
[262984.440116]  the DIE domain not a subset of the NODE domain
[262984.440120] BUG: arch topology borken
[262984.440122]  the DIE domain not a subset of the NODE domain
[262984.443241] Unable to handle kernel paging request for data at address 
0x50dbee4a
[262984.443261] Faulting instruction address: 0xc017982c
[262984.443281] LE SMP NR_CPUS=2048 NUMA pSeries
m iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 
ip_tables
ync_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 
ibmvscs
[262984.443323] CPU: 120 PID: 1467 Comm: kworker/120:2 Not tainted 
4.15.0-52-generic #56-Ubuntu
[262984.443331] Workqueue: events cpuset_hotplug_workfn
[262984.443334] NIP:  c017982c LR: c0179908 CTR: 

[262984.443341] MSR:  80009033   CR: 22824228  
XER: 000
5
[262984.443349] GPR00: c0179908 c00fcec0b7b0 c16eb800 
c00f7c4
[262984.443349] GPR04: 0001  0008 
000
[262984.443349] GPR08: 0018  c00fe22f5808 
000
1c600
[262984.443349] GPR16: 03c0 c1722494 c00fd249d800 
000
1
[262984.443349] GPR20: c00ff7068580 0008 c11d8580 
c17
[262984.443349] GPR24:  e830 ec30 
000
0102f
65d75
[262984.443389] LR [c0179908] destroy_sched_domain+0x38/0xc0
[262984.443392] Call Trace:
[262984.443395] [c00fcec0b7b0] [ec30] 0xec30 
(unreliable)
[262984.443400] [c00fcec0b7f0] [c0179908] 
destroy_sched_domain+0x38/0xc0
[262984.443404] [c00fcec0b820] [c0179e10] 
cpu_attach_domain+0xf0/0x870
[262984.443408] [c00fcec0b960] [c017b7e4] 
build_sched_domains+0x1254/0x12f0
[262984.443412] [c00fcec0ba90] [c017c9d0] 
partition_sched_domains+0x2d0/0x410
[262984.443416] [c00fcec0bb20] [c0202b20] 
rebuild_sched_domains_locked+0x60/0x80
[262984.443420] [c00fcec0bb50] [c0205e28] 
rebuild_sched_domains+0x38/0x60
[262984.443424] [c00fcec0bb80] [c0205f88] 
cpuset_hotplug_workfn+0x138/0xb60
[262984.443429] [c00fcec0bc90] [c0138798] 
process_one_work+0x298/0x5a0
[262984.443433] [c00fcec0bd20] [c0138b38] worker_thread+0x98/0x630
[262984.443436] [c00fcec0bdc0] [c0141728] kthread+0x1a8/0x1b0
[262984.451901] [c00fcec0be30] [c000b65c] 
ret_from_kernel_thread+0x5c/0x80
[262984.451904] Instruction dump:
[262984.451906] 7d908026 fbe1fff8 91810008 f8010010 f821ffc1 7c7d1b78 2e24 
7c7f1b78
[262984.451912] 4810 7fbee840 7fdff378 419e0074  4192002c 
7c0004ac e95f0010
[262984.451919] ---[ end trace b6da6a7114e365f9 ]---
[262984.454656]

== After the patches are applied 4.15.0 ==
ubuntu@P8lpar4:~$ uname -a
Linux P8lpar4 4.15.0-53-generic #57~lp1833716+build.1-Ubuntu SMP Fri Jun 21 
15:18:40 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux
ubuntu@P8lpar4:~$

ubuntu@P8lpar4:~$ sudo su
root@P8lpar4:/home/ubuntu# drmgr -c cpu -r -q 1
Validating CPU DLPAR capability...yes.
CPU 121
root@P8lpar4:/home/ubuntu# drmgr -c cpu -a -q 1
Validating CPU DLPAR capability...yes.
CPU 121
root@P8lpar4:/home/ubuntu#

 Ubuntu 18.04.2 LTS P8lpar4 hvc0


P8lpar4 login: [   25.687520] cloud-init[3439]: Cloud-init v. 
19.1-1-gbaa47854-0ubunt
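
The before and after runs above differ in whether the kernel log shows the "arch topology borken" warnings and the paging-request fault. A minimal sketch of how that comparison could be scripted is below. The helper name check_hotplug_log is hypothetical and not part of this bug report; it only counts the two message strings quoted in this thread in whatever log text it is given on stdin.

```shell
# Hypothetical helper (an assumption, not from the bug report): reads kernel
# log text on stdin and reports how often the two crash signatures quoted in
# this thread appear. Returns 0 only when neither signature is present.
check_hotplug_log() {
    log=$(cat)
    # Scheduler-topology warnings that precede the oops in the traces above.
    topo=$(printf '%s\n' "$log" | grep -c 'BUG: arch topology borken' || true)
    # Paging-request faults that mark the crash itself.
    oops=$(printf '%s\n' "$log" | grep -c 'Unable to handle kernel paging request' || true)
    echo "topology_warnings=$topo paging_faults=$oops"
    [ "$topo" -eq 0 ] && [ "$oops" -eq 0 ]
}
```

One possible use, assuming the two drmgr commands from the [Test] section have just been run as root, is `dmesg | check_hotplug_log`. A non-zero exit status would then suggest the unpatched behaviour.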

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-24 Thread Manoj Iyer
** Description changed:

+ [Impact]
+ On Bionic GA kernel (4.15.0), hot add of cpu with drmgr causes the kernel to 
crash.
+ 
+ [Test]
+ # drmgr -c cpu -r -q 1
+ # drmgr -c cpu -a -q 1
+ 
+ [Fix]
+ 558f86493df0 powerpc/numa: document topology_updates_enabled, disable by 
default
+ 
+ 2d4d9b308f8f powerpc/numa: improve control of topology updates
+ 
+ 
+ [Regression Potential]
+ The two patches 
+ 
+ 
  == Comment: #0 - Hari Krishna Bathini  - 2019-05-07 
13:18:35 ==
  ---Problem Description---
  On 4.15.0-48-generic kernel, hot adding a cpu with drmgr is crashing the 
kernel
  with below traces:
  
  ---
  root@ubuntu:~# drmgr -c cpu -r -q 1
  Validating CPU DLPAR capability...yes.
  CPU 9
- root@ubuntu:~# 
- root@ubuntu:~# 
- root@ubuntu:~# 
- root@ubuntu:~# 
- root@ubuntu:~# 
+ root@ubuntu:~#
+ root@ubuntu:~#
+ root@ubuntu:~#
+ root@ubuntu:~#
+ root@ubuntu:~#
  root@ubuntu:~# drmgr -c cpu -a -q 1
  Validating CPU DLPAR capability...yes.
  [  218.555493] BUG: arch topology borken
  [  218.03]  the DIE domain not a subset of the NODE domain
  [  218.12] BUG: arch topology borken
  [  218.16]  the DIE domain not a subset of the NODE domain
  [  218.23] BUG: arch topology borken
  [  218.28]  the DIE domain not a subset of the NODE domain
  [  218.35] BUG: arch topology borken
  [  218.39]  the DIE domain not a subset of the NODE domain
  [  218.45] BUG: arch topology borken
  [  218.50]  the DIE domain not a subset of the NODE domain
  [  218.56] BUG: arch topology borken
  [  218.60]  the DIE domain not a subset of the NODE domain
  [  218.67] BUG: arch topology borken
  [  218.71]  the DIE domain not a subset of the NODE domain
  [  218.77] BUG: arch topology borken
  [  218.81]  the DIE domain not a subset of the NODE domain
  [  218.555672] Unable to handle kernel paging request for data at address 
0x9332ae80f961139f
  [  218.555679] Faulting instruction address: 0xc01768cc
  [  218.555686] Oops: Kernel access of bad area, sig: 11 [#1]
  [  218.555691] LE SMP NR_CPUS=2048 NUMA pSeries
  [  218.555699] Modules linked in: vmx_crypto crct10dif_vpmsum sch_fq_codel 
ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 multipath linear ibmvscsi ibmveth crc32c_vpmsum
  [  218.555745] CPU: 8 PID: 276 Comm: kworker/8:1 Not tainted 
4.15.0-48-generic #51-Ubuntu
  [  218.555757] Workqueue: events cpuset_hotplug_workfn
  [  218.555763] NIP:  c01768cc LR: c01769a8 CTR: 

  [  218.555770] REGS: c001f5f1f530 TRAP: 0380   Not tainted  
(4.15.0-48-generic)
  [  218.555776] MSR:  80009033   CR: 22824228  
XER: 0004
- [  218.555789] CFAR: c0176920 SOFTE: 1 
- [  218.555789] GPR00: c01769a8 c001f5f1f7b0 c16eb400 
c001f7bfd200 
- [  218.555789] GPR04: 0001  0008 
0010 
- [  218.555789] GPR08: 0018  c001f7bfd408 
 
- [  218.555789] GPR12: 8000 c7a35800 0007 
c001f549d900 
- [  218.555789] GPR16: 0040 c1722494 c001f0f29400 
0001 
- [  218.555789] GPR20: c001ffb68580 0008 c11d8580 
c171dd78 
- [  218.555789] GPR24:  e830 ec30 
12af 
- [  218.555789] GPR28: 102f c001f7bfd200 9332ae80f961139f 
9332ae80f961139f 
+ [  218.555789] CFAR: c0176920 SOFTE: 1
+ [  218.555789] GPR00: c01769a8 c001f5f1f7b0 c16eb400 
c001f7bfd200
+ [  218.555789] GPR04: 0001  0008 
0010
+ [  218.555789] GPR08: 0018  c001f7bfd408 

+ [  218.555789] GPR12: 8000 c7a35800 0007 
c001f549d900
+ [  218.555789] GPR16: 0040 c1722494 c001f0f29400 
0001
+ [  218.555789] GPR20: c001ffb68580 0008 c11d8580 
c171dd78
+ [  218.555789] GPR24:  e830 ec30 
12af
+ [  218.555789] GPR28: 102f c001f7bfd200 9332ae80f961139f 
9332ae80f961139f
  [  218.555859] NIP [c01768cc] free_sched_groups.part.2+0x4c/0xf0
  [  218.555866] LR [c01769a8] destroy_sched_domain+0x38/0xc0
  [  218.555871] Call Trace:
  [  218.555875] [c001f5f1f7b0] [ec30] 0xec30 
(unreliable)
  [  218.555884] [c001f5f1f7f0] [c01769a8] 
destroy_sched_domain+0x38/0xc0
  [  218.555892] [c001f5f1f820] [c0176eb0] 
cpu_attach_domain+0xf0/0x870
  [  218.555900] [c001f5f1f960] [c0178884] 

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-24 Thread Andrew Cloke
** Changed in: ubuntu-power-systems
   Status: New => In Progress

** Changed in: linux (Ubuntu)
   Status: New => In Progress


Title:
  System crashes on hot adding a core with drmgr command
  (4.15.0-48-generic)

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress

Bug description:
  == Comment: #0 - Hari Krishna Bathini  - 2019-05-07 
13:18:35 ==
  ---Problem Description---
  On 4.15.0-48-generic kernel, hot adding a cpu with drmgr is crashing the 
kernel
  with below traces:

  ---
  root@ubuntu:~# drmgr -c cpu -r -q 1
  Validating CPU DLPAR capability...yes.
  CPU 9
  root@ubuntu:~# 
  root@ubuntu:~# 
  root@ubuntu:~# 
  root@ubuntu:~# 
  root@ubuntu:~# 
  root@ubuntu:~# drmgr -c cpu -a -q 1
  Validating CPU DLPAR capability...yes.
  [  218.555493] BUG: arch topology borken
  [  218.03]  the DIE domain not a subset of the NODE domain
  [  218.12] BUG: arch topology borken
  [  218.16]  the DIE domain not a subset of the NODE domain
  [  218.23] BUG: arch topology borken
  [  218.28]  the DIE domain not a subset of the NODE domain
  [  218.35] BUG: arch topology borken
  [  218.39]  the DIE domain not a subset of the NODE domain
  [  218.45] BUG: arch topology borken
  [  218.50]  the DIE domain not a subset of the NODE domain
  [  218.56] BUG: arch topology borken
  [  218.60]  the DIE domain not a subset of the NODE domain
  [  218.67] BUG: arch topology borken
  [  218.71]  the DIE domain not a subset of the NODE domain
  [  218.77] BUG: arch topology borken
  [  218.81]  the DIE domain not a subset of the NODE domain
  [  218.555672] Unable to handle kernel paging request for data at address 
0x9332ae80f961139f
  [  218.555679] Faulting instruction address: 0xc01768cc
  [  218.555686] Oops: Kernel access of bad area, sig: 11 [#1]
  [  218.555691] LE SMP NR_CPUS=2048 NUMA pSeries
  [  218.555699] Modules linked in: vmx_crypto crct10dif_vpmsum sch_fq_codel 
ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 multipath linear ibmvscsi ibmveth crc32c_vpmsum
  [  218.555745] CPU: 8 PID: 276 Comm: kworker/8:1 Not tainted 
4.15.0-48-generic #51-Ubuntu
  [  218.555757] Workqueue: events cpuset_hotplug_workfn
  [  218.555763] NIP:  c01768cc LR: c01769a8 CTR: 

  [  218.555770] REGS: c001f5f1f530 TRAP: 0380   Not tainted  
(4.15.0-48-generic)
  [  218.555776] MSR:  80009033   CR: 22824228  
XER: 0004
  [  218.555789] CFAR: c0176920 SOFTE: 1 
  [  218.555789] GPR00: c01769a8 c001f5f1f7b0 c16eb400 
c001f7bfd200 
  [  218.555789] GPR04: 0001  0008 
0010 
  [  218.555789] GPR08: 0018  c001f7bfd408 
 
  [  218.555789] GPR12: 8000 c7a35800 0007 
c001f549d900 
  [  218.555789] GPR16: 0040 c1722494 c001f0f29400 
0001 
  [  218.555789] GPR20: c001ffb68580 0008 c11d8580 
c171dd78 
  [  218.555789] GPR24:  e830 ec30 
12af 
  [  218.555789] GPR28: 102f c001f7bfd200 9332ae80f961139f 
9332ae80f961139f 
  [  218.555859] NIP [c01768cc] free_sched_groups.part.2+0x4c/0xf0
  [  218.555866] LR [c01769a8] destroy_sched_domain+0x38/0xc0
  [  218.555871] Call Trace:
  [  218.555875] [c001f5f1f7b0] [ec30] 0xec30 
(unreliable)
  [  218.555884] [c001f5f1f7f0] [c01769a8] 
destroy_sched_domain+0x38/0xc0
  [  218.555892] [c001f5f1f820] [c0176eb0] 
cpu_attach_domain+0xf0/0x870
  [  218.555900] [c001f5f1f960] [c0178884] 
build_sched_domains+0x1254/0x12f0
  [  218.555908] [c001f5f1fa90] [c0179a70] 
partition_sched_domains+0x2d0/0x410
  [  218.555916] [c001f5f1fb20] [c01ffb60] 
rebuild_sched_domains_locked+0x60/0x80
  [  218.555924] [c001f5f1fb50] [c0202e68] 
rebuild_sched_domains+0x38/0x60
  [  218.555932] [c001f5f1fb80] [c0202fc8] 
cpuset_hotplug_workfn+0x138/0xb60
  [  218.555941] [c001f5f1fc90] [c0135858] 
process_one_work+0x298/0x5a0
  [  218.555949] [c001f5f1fd20] [c0135bf8] worker_thread+0x98/0x630
  [  218.555956] [c001f5f1fdc0] [c013e7e8] kthread+0x1a8/0x1b0
  [  218.555964] [c001f5f1fe30] [c000b658] 
ret_from_kernel_thread+0x5c/0x84
  [  218.555971] Instruction dump:
  [  218.555975] 7d908026 
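
The call trace above is wrapped by the mailer, which makes the call chain from cpuset_hotplug_workfn down to free_sched_groups hard to follow. A rough sketch for pulling out just the function names is below. The helper name trace_functions is hypothetical; it assumes each symbol appears in the usual name+0xOFFSET/0xSIZE form on a single line, as it does in this thread.

```shell
# Hypothetical helper (an assumption, not from the bug report): prints the
# symbol names found in an oops trace read on stdin, in the order they appear
# (innermost frame first), with the +0xOFFSET/0xSIZE suffix removed.
trace_functions() {
    grep -oE '[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+' |
        sed -E 's/\+0x.*$//'
}
```

Because the match is per symbol rather than per log line, it tolerates entries where the address and the symbol were split across two lines. The NIP and LR lines are matched as well, so a function can be listed more than once.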

[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-21 Thread Manoj Iyer
** Changed in: linux (Ubuntu)
 Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => 
Canonical Kernel Team (canonical-kernel-team)

** Changed in: ubuntu-power-systems
 Assignee: (unassigned) => Canonical Kernel Team (canonical-kernel-team)

** Changed in: linux (Ubuntu)
 Assignee: Canonical Kernel Team (canonical-kernel-team) => Manoj Iyer 
(manjo)

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: ubuntu-power-systems
 Assignee: Canonical Kernel Team (canonical-kernel-team) => Manoj Iyer 
(manjo)


Title:
  System crashes on hot adding a core with drmgr command
  (4.15.0-48-generic)

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1833716] Re: System crashes on hot adding a core with drmgr command (4.15.0-48-generic)

2019-06-21 Thread Frank Heimes
** Package changed: kernel-package (Ubuntu) => linux (Ubuntu)

** Also affects: ubuntu-power-systems
   Importance: Undecided
   Status: New

** Tags added: powervm

** Changed in: ubuntu-power-systems
   Importance: Undecided => High


Title:
  System crashes on hot adding a core with drmgr command
  (4.15.0-48-generic)

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New
