[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-10-11 Thread bugproxy

--- Comment From heinz-werner_se...@de.ibm.com 2017-10-11 03:36 EDT---
IBM Bugzilla status -> closed, Fix Released within Xenial

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu-Xenial on z for development purposes. The test system
  is installed in an LPAR with one FCP-LUN which is accessible by 4 paths (all
  paths are configured).
  The system hangs regularly when I build packages with "pdebuild" using the
  pbuilder packaging suite.
  The local Linux development team helped me out with a pre-analysis that I can
  post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
    Dump format........: elf
    Version............: 1
    UTS node name......: mclint
    UTS kernel release.: 4.4.0-65-generic
    UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
    System arch........: s390x (64 bit)
    CPU count (online).: 2
    Dump memory range..: 8192 MB
  Memory map:
    0000000000000000 - 00000001b831afff (7043 MB)
    00000001b831b000 - 00000001ffffffff (1149 MB)

  Things look similar with the HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

        KERNEL: vmlinux.full
      DUMPFILE: dump.0
          CPUS: 2
          DATE: Fri Mar  3 14:31:07 2017
        UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
         TASKS: 411
      NODENAME: mclint
       RELEASE: 4.4.0-65-generic
       VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
       MACHINE: s390x  (unknown Mhz)
        MEMORY: 7.8 GB
         PANIC: ""
           PID: 0
       COMMAND: "swapper/0"
          TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
           CPU: 0
         STATE: TASK_RUNNING (ACTIVE)
          INFO: no panic task found

  crash> dev -d
  MAJOR GENDISK    NAME  REQUEST_QUEUE  TOTAL  ASYNC  SYNC        DRV
  ...
      8 1e1d6d800  sda   1e1d51210      0      23151  4294944145  N/A(MQ)
      8 1e4e06800  sdc   2081b180              23148  4294944148  N/A(MQ)
      8 1f07800    sdb   20c75680              23195  4294944101  N/A(MQ)
      8 1e4e06000  sdd   1e4e31210      0      23099  4294944197  N/A(MQ)
    252 1e1d6c800  dm-0  1e1d51b18      9      1      8           N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large SYNC values for the sd devices look strange. They appear to be the
  ASYNC values multiplied by -1 and printed as unsigned 32-bit integers.
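
The observation about the SYNC column can be checked with a few lines of
Python. This is only a sketch: the 32-bit counter width is an assumption
inferred from the size of the numbers, and the helper name as_signed32 is
made up for illustration. The (ASYNC, SYNC) pairs are copied from the
"dev -d" output above.

```python
# Sketch: reinterpret the SYNC column of the `crash> dev -d` output as a
# signed 32-bit value. The 32-bit width is an assumption inferred from the
# magnitude of the numbers; it is not stated in the dump itself.

def as_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit integer as a two's-complement signed one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

# (device, ASYNC, SYNC) as reported by `dev -d` for the four sd devices.
rows = [
    ("sda", 23151, 4294944145),
    ("sdc", 23148, 4294944148),
    ("sdb", 23195, 4294944101),
    ("sdd", 23099, 4294944197),
]

for name, async_count, sync_count in rows:
    signed = as_signed32(sync_count)
    print(f"{name}: async={async_count} sync={sync_count} -> signed {signed}")
```

For every sd device the signed reading of SYNC is exactly the negated ASYNC
value, which supports the reading that the counter went negative and was
printed as unsigned.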

  [    0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [    0.798262] setup: Linux is running natively in 64-bit mode
  [    0.798290] setup: Max memory size: 8192MB
  [    0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel (System RAM: 7996MB)

  [    0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 0x
  [ 5281.179444]        0001e931c230 001a6964 0001e6f9b958 0001e6f9b9d8
                        0001e15795f0 0001e6f9b988 00ce8c00 0001ea805c70
                        0001ea805c00 00ba5ed0 0001e931c1d0 0001e1579b20
                        0001ea805c00 0001e15795f0 0001ea805c00
                        007d3978 007bc9f8 0001e6f9b9d8 0001e6f9ba40
  [ 5281.179454] Call Trace:
  [ 5281.179461] ([<007bc9f8>] __schedule+0x300/0x810)
  [ 5281.179462]  [<007bcf52>] schedule+0x4a/0xb0
  [ 5281.179465]  [<007c02aa>] schedule_timeout+0x232/0x2a8
  [ 5281.179466]  [<007bde50>] wait_for_common+0x110/0x1c8
  [ 5281.179472]  [<0017b602>] flush_work+0x42/0x58
  [ 5281.179564]  [<03ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
  [ 5281.179589]  [<03ff805dee82>] _xfs_log_force+0x9a/0x2e8 [xfs]
  [ 5281.179615]  [<03ff805df114>] xfs_log_force+0x44/0x100 [xfs]
  [ 5281.179640]

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-09-26 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-09-26 07:41 EDT---
Ok, just checked kernel 4.4.0-97 ... that looks much better:

root@x:~# uname -a
Linux mclint 4.4.0-97-generic #120-Ubuntu SMP Tue Sep 19 17:27:01 UTC 2017 s390x s390x s390x GNU/Linux
root@x:~# systool -v -m scsi_mod
Module = "scsi_mod"

  Attributes:
    uevent              =

  Parameters:
    default_dev_flags   = "0"
    eh_deadline         = "-1"
    inq_timeout         = "20"
    max_luns            = "512"
    scan                = "async"
    scsi_logging_level  = "0"
    use_blk_mq          = "N"
root@x:~# systool -v -m dm_mod
Module = "dm_mod"

  Attributes:
    uevent              =

  Parameters:
    reserved_bio_based_ios= "16"
    reserved_rq_based_ios= "256"
    stats_current_allocated_bytes= "0"
    use_blk_mq          = "N"

blk-mq is now turned off by default, and that was our main concern! On
top of that, I started two "big" pdebuild processes (firefox) that so far
had the potential to drive the system right into the hang scenario that
was the original reason for opening this ticket. The build did not succeed,
but the point is that the system did not run into the typical hang, so
I think you can consider this problem solved!
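
For anyone who wants to repeat the check without systool, the same values
can usually be read from sysfs. The following is a minimal sketch, assuming
the kernel exposes the parameters under /sys/module/<module>/parameters/ (as
the systool output above suggests for this kernel); the function names are
made up for illustration, and kernels that do not have the use_blk_mq
parameter will simply report it as unavailable.

```python
from pathlib import Path

def read_module_param(module: str, param: str, sysfs_root: str = "/sys/module"):
    """Return a module parameter as a stripped string, or None if unreadable."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, parameter not present, or no permission.
        return None

def blk_mq_status(sysfs_root: str = "/sys/module") -> dict:
    """Collect use_blk_mq for the two modules inspected in this comment."""
    return {
        module: read_module_param(module, "use_blk_mq", sysfs_root)
        for module in ("scsi_mod", "dm_mod")
    }

if __name__ == "__main__":
    for module, value in blk_mq_status().items():
        shown = value if value is not None else "not available"
        print(f"{module}.use_blk_mq = {shown}")
```

The sysfs_root argument exists only so the helper can be pointed at a test
directory instead of the live system.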

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-09-22 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-09-22 04:36 EDT---
Just checked kernel 4.4.0-96. blk-mq is still enabled by default:

root@x:~# systool -v -m dm_mod
Module = "dm_mod"

  Attributes:
    uevent              =

  Parameters:
    reserved_bio_based_ios= "16"
    reserved_rq_based_ios= "256"
    stats_current_allocated_bytes= "0"
    use_blk_mq          = "Y"
root@x:~# systool -v -m scsi_mod
Module = "scsi_mod"

  Attributes:
    uevent              =

  Parameters:
    default_dev_flags   = "0"
    eh_deadline         = "-1"
    inq_timeout         = "20"
    max_luns            = "512"
    scan                = "async"
    scsi_logging_level  = "0"
    use_blk_mq          = "Y"

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-09-11 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-09-11 10:46 EDT---
Ok, I think the discussion is then back at the point I raised in my comment
from June 6th (comment 36 or 38)! If the multi-queue feature doesn't work with
kernel 4.4 yet, then you must not deliver it (that's exactly why SUSE did not
activate that feature in their 4.4 kernel package)!
With kernel 4.11 or 4.12 you can turn that option on again for those kernel
packages, as the feature finally seems to be mature there and can be rolled
out.

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-09-11 Thread bugproxy

--- Comment From heinz-werner_se...@de.ibm.com 2017-09-11 09:52 EDT---
No specific requirement is known on s390 for enabling these config options.
They can be set to N.
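
To see how a given kernel package was built, the options can be looked up
in the kernel config file. The sketch below makes two assumptions that are
not stated in this comment: that the options meant here are
CONFIG_SCSI_MQ_DEFAULT and CONFIG_DM_MQ_DEFAULT, and that the config text
comes from a file such as /boot/config-<release>. The helper name
parse_kconfig is made up for illustration.

```python
import re

# Assumption: these are the two options under discussion; the comment above
# does not name them explicitly.
ASSUMED_OPTIONS = ("CONFIG_SCSI_MQ_DEFAULT", "CONFIG_DM_MQ_DEFAULT")

def parse_kconfig(text: str, options=ASSUMED_OPTIONS) -> dict:
    """Map each option to its value, "n" if explicitly unset, None if absent."""
    result = {name: None for name in options}
    for raw in text.splitlines():
        line = raw.strip()
        match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if match and match.group(1) in result:
            result[match.group(1)] = match.group(2)
            continue
        # Kconfig writes disabled options as a comment line.
        match = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if match and match.group(1) in result:
            result[match.group(1)] = "n"
    return result

# Hypothetical config fragment, for illustration only.
sample = "CONFIG_SCSI_MQ_DEFAULT=y\n# CONFIG_DM_MQ_DEFAULT is not set\n"
print(parse_kconfig(sample))
```

On a real system the text would be read from the config file of the kernel
in question instead of the sample string.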

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-09-04 Thread bugproxy

--- Comment From heinz-werner_se...@de.ibm.com 2017-09-04 04:52 EDT---
No specific CPU hot(un)plug tests were executed.

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-08-30 Thread bugproxy

--- Comment From heinz-werner_se...@de.ibm.com 2017-08-30 05:15 EDT---
This function worked with the SLES 12 SP2 kernel 4.4.74.

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-08-22 Thread bugproxy

--- Comment From heinz-werner_se...@de.ibm.com 2017-08-22 03:42 EDT---
@Juerg. Now you should have access to the box folder...

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu-Xenial on z for development purposes, the test system 
is installed in an LPAR with one FCP-LUN which is accessable by 4 pathes (all 
pathes are configured).
  The system hangs regularly when I make packages with "pdebuild" using the 
pbuilder packaging suit.
  The local Linux development team helped me out with a pre-analysis that I can 
post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
Dump format: elf
Version: 1
UTS node name..: mclint
UTS kernel release.: 4.4.0-65-generic
UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
System arch: s390x (64 bit)
CPU count (online).: 2
Dump memory range..: 8192 MB
  Memory map:
 - 0001b831afff (7043 MB)
0001b831b000 - 0001 (1149 MB)

  Things look similar with HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

KERNEL: vmlinux.full
  DUMPFILE: dump.0
  CPUS: 2
  DATE: Fri Mar  3 14:31:07 2017
UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
 TASKS: 411
  NODENAME: mclint
   RELEASE: 4.4.0-65-generic
   VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
   MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
 PANIC: ""
   PID: 0
   COMMAND: "swapper/0"
  TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
   CPU: 0
 STATE: TASK_RUNNING (ACTIVE)
  INFO: no panic task found

  crash> dev -d
  MAJOR GENDISK    NAME  REQUEST_QUEUE  TOTAL ASYNC  SYNC   DRV
  ...
  8 1e1d6d800  sda 1e1d51210  0 23151 4294944145 N/A(MQ)
  8 1e4e06800  sdc 2081b180 23148 4294944148 N/A(MQ)
  8 1f07800 sdb 20c75680 23195 4294944101 N/A(MQ)
  8 1e4e06000  sdd 1e4e31210  0 23099 4294944197 N/A(MQ)
  252 1e1d6c800  dm-0   1e1d51b18  9 1 8 N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large SYNC values for the sd devices look strange; they appear to be the
  ASYNC values multiplied by -1 and printed as unsigned 32-bit integers.
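
The arithmetic behind that observation can be checked with a short sketch.
It assumes the counters are 32 bits wide (the thread does not say so
explicitly) and takes the last two numeric columns of each sd row as ASYNC
and SYNC, exactly as they appear in the dump.

```python
# Sketch: reinterpret the SYNC column of "crash> dev -d" as a signed value.
# Assumption (not stated in the thread): the counters are 32 bits wide.

def as_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

# (disk, ASYNC, SYNC) copied from the dump output above
rows = [
    ("sda", 23151, 4294944145),
    ("sdc", 23148, 4294944148),
    ("sdb", 23195, 4294944101),
    ("sdd", 23099, 4294944197),
]

for disk, async_count, sync_count in rows:
    # For every sd device the signed SYNC value is the negated ASYNC value,
    # so the two wrap around to zero when added as 32-bit numbers.
    wrapped_sum = (async_count + sync_count) & 0xFFFFFFFF
    print(f"{disk}: async={async_count} "
          f"sync(signed)={as_signed32(sync_count)} sum={wrapped_sum}")
```

For example, 2^32 - 23151 = 4294944145, which is the SYNC value shown for
sda.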

  [0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [0.798262] setup: Linux is running natively in 64-bit mode
  [0.798290] setup: Max memory size: 8192MB
  [0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel (System RAM: 7996MB)

  [0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 
0x
  [ 5281.179444]0001e931c230 001a6964 0001e6f9b958 
0001e6f9b9d8
0001e15795f0 0001e6f9b988 00ce8c00 
0001ea805c70
0001ea805c00 00ba5ed0 0001e931c1d0 
0001e1579b20
0001ea805c00 0001e15795f0 0001ea805c00 

007d3978 007bc9f8 0001e6f9b9d8 
0001e6f9ba40
  [ 5281.179454] Call Trace:
  [ 5281.179461] ([<007bc9f8>] __schedule+0x300/0x810)
  [ 5281.179462]  [<007bcf52>] schedule+0x4a/0xb0
  [ 5281.179465]  [<007c02aa>] schedule_timeout+0x232/0x2a8
  [ 5281.179466]  [<007bde50>] wait_for_common+0x110/0x1c8
  [ 5281.179472]  [<0017b602>] flush_work+0x42/0x58
  [ 5281.179564]  [<03ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
  [ 5281.179589]  [<03ff805dee82>] _xfs_log_force+0x9a/0x2e8 [xfs]
  [ 5281.179615]  [<03ff805df114>] xfs_log_force+0x44/0x100 [xfs]
  [ 5281.179640]

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-07-17 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-07-17 12:10 EDT---
@jsalisbury:

To make sure I get your point: if 4.11rc8 is a "good kernel" and 4.11rc7 the
last "bad kernel", you could identify the diff that fixes this problem and
apply it to the stable 4.4 kernel line. I agree that's a feasible approach.
However, we must first make sure that 4.11rc8 really is a "good kernel", and
therefore someone must look into the dumps I uploaded.
Also, the arch-specific linux-headers-x.x.x-x-generic packages must be fixed
for s390x. Without them I can't build my DKMS packages.

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-07-13 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-07-13 10:26 EDT---
I just wanted to try the 4.11rc7 kernel, and my DKMS OpenAFS packages don't
build on it. I already experienced this once with a package from the
kernel-ppa repository:

root@mclint:~# file /usr/src/linux-headers-4.11.0-041100rc7-generic/arch/s390/tools/gen_facilities
/usr/src/linux-headers-4.11.0-041100rc7-generic/arch/s390/tools/gen_facilities: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=4590a42ae8b1bf9b2bd8d05e332e59bc7f47aa93, not stripped

You put x86 ELF binaries in your s390x linux-headers packages. I need
appropriate linux-headers packages to reproduce the problem.
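
The same check that "file" performs above can be sketched in a few lines of
Python, for example to scan a headers package for helper binaries built for
the wrong architecture. The function names are hypothetical; the only facts
used are the ELF header layout (EI_DATA at byte 5, e_machine at offset 18)
and the machine codes 22 (s390) and 62 (x86-64).

```python
import struct

# e_machine codes from the ELF specification (only the two relevant here)
EM_NAMES = {22: "s390", 62: "x86-64"}

def elf_machine(header: bytes) -> str:
    """Return the target machine named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian
    if header[5] == 1:
        endian = "<"
    elif header[5] == 2:
        endian = ">"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_machine is a 16-bit field at offset 18
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return EM_NAMES.get(machine, f"unknown ({machine})")

def elf_machine_of_file(path: str) -> str:
    """Hypothetical helper: read just the header bytes from a file on disk."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

On an s390x system a correctly built gen_facilities should report "s390";
the binary shown above would report "x86-64".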

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-07-12 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-07-12 10:45 EDT---
@jsalisbury:

Sorry to object, but I don't see why I should try another 4.11-rc kernel, or
what you mean by a "bad kernel" and a "good kernel".
The 4.11rc8 kernel seemed to be fine; I only got into a hang when I shut the
system down. I just want to know whether that hang is related to blk-mq or
not, and one glance into the crash dump I uploaded to Box will answer that
question. I could also look into the crash dump myself, but I don't have the
debug symbols for kernel 4.11rc8-s390x.

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-07-10 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-07-10 04:34 EDT---
@jsalisbury:

I tried the kernel and didn't experience a hang while the system was running,
but I ran into a hang when I shut the system down after my test. I uploaded
the dump of that hang to our Box account, and Frank Heimes downloaded it to
forward it to the kernel team.
Before we close this item, I'd like to have the kernel team's statement on
whether they see a relation to blk-mq in the dump!

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-06-08 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-06-08 10:28 EDT---
I shut down the test system I used for the repros here yesterday. During the
shutdown process the system suddenly ran into a hang again. I don't know
whether that hang is related to the problem addressed here (blk-mq), but I
took a dump and uploaded it to Box as a precaution ->

mclint_20170607_kernel_4_11_0-041100rc8_without_openafs.dump.bz2

General dump info:
Dump format: s390mv
Version: 5
Dump created...: Wed, 07 Jun 2017 16:56:28 +0200
Dump ended.: Wed, 07 Jun 2017 16:57:06 +0200
Dump CPU ID: 9efc729648000
UTS node name..: mclint
UTS kernel release.: 4.11.0-041100rc8-generic
UTS kernel version.: #201704232131 SMP Mon Apr 24 02:10:15 UTC 2017
Build arch.: s390x (64 bit)
System arch: s390x (64 bit)
CPU count (online).: 2
CPU count (real)...: 4
Dump memory range..: 8192 MB
Real memory range..: 8192 MB

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-06-07 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-06-07 07:59 EDT---
Hello Benjamin,

you're right, so let me rephrase in more detail:

The multiqueue feature is what is in question here, but it has an impact on
multipathing, because Ubuntu Xenial boots with the multiqueue feature turned
on by default(!!). You can run "multipathing", but you have to turn off the
multiqueue feature explicitly if you don't want to risk running into the hang
scenario described here.
My recommendation would be to turn the "multiqueue" feature off by default
for all kernel versions prior to 4.11; that's what I wanted to express with
my previous post.
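
Whether multiqueue is active on a running system can be read from sysfs. The
sketch below is only an illustration: the thread does not name the switches,
so the module parameters scsi_mod.use_blk_mq and dm_mod.use_blk_mq are an
assumption based on kernels of the 4.x series, and the function name is
hypothetical.

```python
from pathlib import Path

# Assumption: these module parameters control scsi-mq and dm-mq on the
# kernels discussed here; they are not named anywhere in this thread.
PARAMS = {
    "scsi_mod": "use_blk_mq",
    "dm_mod": "use_blk_mq",
}

def blk_mq_status(sysfs_root: str = "/sys") -> dict:
    """Return the raw parameter value per module, or None if unreadable."""
    status = {}
    for module, param in PARAMS.items():
        path = Path(sysfs_root) / "module" / module / "parameters" / param
        try:
            status[module] = path.read_text().strip()
        except OSError:
            # Module not loaded, parameter absent, or no permission
            status[module] = None
    return status

if __name__ == "__main__":
    for module, value in blk_mq_status().items():
        print(f"{module}.use_blk_mq = {value if value is not None else 'n/a'}")
```

Under the same assumption, turning the feature off would mean adding
scsi_mod.use_blk_mq=0 and dm_mod.use_blk_mq=0 to the kernel command line and
rebooting.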

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-06-06 Thread bugproxy

--- Comment From bbl...@de.ibm.com 2017-06-06 12:46 EDT---
@jac...@de.ibm.com

Just to prevent confusion: it's probably just a typo, but I think you
mean multi-queue (blk-mq and scsi-mq), which is a different feature.
Multipathing (dm-multipath) should certainly work regardless.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu-Xenial on z for development purposes, the test system 
is installed in an LPAR with one FCP-LUN which is accessable by 4 pathes (all 
pathes are configured).
  The system hangs regularly when I make packages with "pdebuild" using the 
pbuilder packaging suit.
  The local Linux development team helped me out with a pre-analysis that I can 
post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
Dump format: elf
Version: 1
UTS node name..: mclint
UTS kernel release.: 4.4.0-65-generic
UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
System arch: s390x (64 bit)
CPU count (online).: 2
Dump memory range..: 8192 MB
  Memory map:
 - 0001b831afff (7043 MB)
0001b831b000 - 0001 (1149 MB)

  Things look similarly with HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

KERNEL: vmlinux.full
  DUMPFILE: dump.0
  CPUS: 2
  DATE: Fri Mar  3 14:31:07 2017
UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
 TASKS: 411
  NODENAME: mclint
   RELEASE: 4.4.0-65-generic
   VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
   MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
 PANIC: ""
   PID: 0
   COMMAND: "swapper/0"
  TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
   CPU: 0
 STATE: TASK_RUNNING (ACTIVE)
  INFO: no panic task found

  crash> dev -d
  MAJOR GENDISKNAME   REQUEST_QUEUE  TOTAL ASYNC  SYNC   DRV
  ...
  8 1e1d6d800  sda1e1d51210  0 23151 4294944145 
N/A(MQ)
  8 1e4e06800  sdc2081b180 23148 4294944148 
N/A(MQ)
  8 1f07800sdb20c75680 23195 4294944101 
N/A(MQ)
  8 1e4e06000  sdd1e4e31210  0 23099 4294944197 
N/A(MQ)
252 1e1d6c800  dm-0   1e1d51b18  9 1 8 
N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large numbers of sd look strange and seem to be the unsigned formatting 
of the values shown for async multiplied by -1.

  [0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [0.798262] setup: Linux is running natively in 64-bit mode
  [0.798290] setup: Max memory size: 8192MB
  [0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel (System RAM: 7996MB)

  [0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 0x
  [ 5281.179444]        0001e931c230 001a6964 0001e6f9b958 0001e6f9b9d8
                        0001e15795f0 0001e6f9b988 00ce8c00 0001ea805c70
                        0001ea805c00 00ba5ed0 0001e931c1d0 0001e1579b20
                        0001ea805c00 0001e15795f0 0001ea805c00
                        007d3978 007bc9f8 0001e6f9b9d8 0001e6f9ba40
  [ 5281.179454] Call Trace:
  [ 5281.179461] ([<007bc9f8>] __schedule+0x300/0x810)
  [ 5281.179462]  [<007bcf52>] schedule+0x4a/0xb0
  [ 5281.179465]  [<007c02aa>] schedule_timeout+0x232/0x2a8
  [ 5281.179466]  [<007bde50>] wait_for_common+0x110/0x1c8
  [ 5281.179472]  [<0017b602>] flush_work+0x42/0x58
  [ 5281.179564]  [<03ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
  [ 5281.179589]

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-06-06 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-06-06 12:31 EDT---
@jsalisbury

Sorry for the late late late answer, but finally I've found time and
resources to test the 4.11-rc8 Kernel. And yes, this one looks promising
:-)

I was able to build OpenAFS with the pbuilder environment, and trying to 
build the firefox packages with pbuilder also did not drive me into the hang 
scenario I usually face.
I'll do some more tests, but this looks promising.
Hmm, if the solution for this bug is kernel 4.11, then we may have to issue 
a warning to anyone using multipathing on earlier kernels.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  Triaged

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-04-06 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-04-06 11:53 EDT---
Same procedure as every kernel, /var/log/syslog:

Apr  6 16:17:29 mclint multipathd[881]: mpatha: sda - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:0: reinstated
Apr  6 16:17:29 mclint multipathd[881]: mpatha: sdb - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:16: reinstated
Apr  6 16:17:29 mclint multipathd[881]: mpatha: sdd - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:48: reinstated
Apr  6 16:17:29 mclint multipathd[881]: mpatha: sdc - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:32: reinstated
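
To see at a glance which paths reported a checker timeout, the syslog lines above can be tallied with a short script. This is a minimal sketch; the regular expression is an assumption derived only from the lines quoted here, and `timed_out_paths` is a hypothetical helper name.

```python
import re

# Matches e.g. "multipathd[881]: mpatha: sda - tur checker timed out".
TUR_TIMEOUT = re.compile(
    r"multipathd\[\d+\]: (?P<map>\S+): (?P<path>\S+) - tur checker timed out"
)

SAMPLE = """\
Apr  6 16:17:29 mclint multipathd[881]: mpatha: sda - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:0: reinstated
Apr  6 16:17:29 mclint multipathd[881]: mpatha: sdb - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:16: reinstated
Apr  6 16:17:29 mclint multipathd[881]: mpatha: sdd - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:48: reinstated
Apr  6 16:17:29 mclint multipathd[881]: mpatha: sdc - tur checker timed out
Apr  6 16:17:29 mclint multipathd[881]: 8:32: reinstated
"""


def timed_out_paths(syslog_text: str) -> list:
    """Return the sorted, de-duplicated path devices with a TUR checker timeout."""
    return sorted({m.group("path") for m in TUR_TIMEOUT.finditer(syslog_text)})


if __name__ == "__main__":
    print(timed_out_paths(SAMPLE))
```

In the excerpt, all four paths of mpatha time out within the same second, which fits a stall below the multipath layer rather than a single failing path.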

/dev/sclp_line0:

[  361.418628] INFO: task kworker/1:4:860 blocked for more than 120 seconds.
[  361.418635]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  361.418637] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  361.418722] INFO: task cpuplugd:2310 blocked for more than 120 seconds.
[  361.418723]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  361.418723] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  361.418766] INFO: task irqbalance:2416 blocked for more than 120 seconds.
[  361.418767]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  361.418768] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  361.418806] INFO: task kworker/0:2H:3403 blocked for more than 120 seconds.
[  361.418807]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  361.418808] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  361.418990] INFO: task kworker/0:9:4449 blocked for more than 120 seconds.
[  361.418991]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  361.418992] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.420013] INFO: task kworker/1:4:860 blocked for more than 120 seconds.
[  481.420020]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  481.420021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.420091] INFO: task systemd-timesyn:1766 blocked for more than 120 seconds.
[  481.420093]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  481.420093] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.420136] INFO: task rs:main Q:Reg:2023 blocked for more than 120 seconds.
[  481.420137]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  481.420138] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.420250] INFO: task cpuplugd:2310 blocked for more than 120 seconds.
[  481.420251]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  481.420252] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.420291] INFO: task irqbalance:2416 blocked for more than 120 seconds.
[  481.420292]   Not tainted 4.4.0-72-generic #93-Ubuntu
[  481.420293] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Dump:

KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-72-generic
DUMPFILE: mclint_20170406_kernel_4_4_0-72_without_openafs.dump
CPUS: 1
DATE: Thu Apr  6 16:43:18 2017
UPTIME: 00:29:55
LOAD AVERAGE: 9.99, 9.56, 7.51
TASKS: 403
NODENAME: mclint
RELEASE: 4.4.0-72-generic
VERSION: #93-Ubuntu SMP Fri Mar 31 14:06:48 UTC 2017
MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: bad528  [THREAD_INFO: b78000]
CPU: 0
STATE: TASK_RUNNING
INFO: no panic task found

The bz2-compressed dump is already uploaded to Box.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-03-30 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-03-30 08:50 EDT---
New Kernel, new hang ...

[  961.242228] INFO: task kworker/1:1:38 blocked for more than 120 seconds.
[  961.242235]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.242236] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.242480] INFO: task xfsaild/dm-11:1742 blocked for more than 120 seconds.
[  961.242481]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.242482] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.242933] INFO: task rs:main Q:Reg:2043 blocked for more than 120 seconds.
[  961.242934]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.242935] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.243407] INFO: task cpuplugd:2355 blocked for more than 120 seconds.
[  961.243409]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.243410] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.243544] INFO: task irqbalance:2447 blocked for more than 120 seconds.
[  961.243546]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.243546] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.243617] INFO: task kworker/0:2H:3385 blocked for more than 120 seconds.
[  961.243618]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.243619] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.243911] INFO: task kworker/0:1H:6035 blocked for more than 120 seconds.
[  961.243912]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.243913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.244405] INFO: task kworker/0:2:22440 blocked for more than 120 seconds.
[  961.244406]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.244407] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.244543] INFO: task kworker/0:4:22938 blocked for more than 120 seconds.
[  961.244545]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.244545] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.245404] INFO: task dpkg:24617 blocked for more than 120 seconds.
[  961.245405]   Not tainted 4.4.0-71-generic #92-Ubuntu
[  961.245406] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
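
For comparing hangs across kernel versions it can help to reduce such console output to a list of blocked tasks. The following is a minimal sketch, assuming only the "INFO: task NAME:PID blocked for more than N seconds" wording seen above; `count_hung_tasks` is a hypothetical helper, and the sample omits the timestamp prefix.

```python
import re
from collections import Counter

# Task names may contain colons and spaces (e.g. "rs:main Q:Reg"), so the name
# is matched greedily and the PID is taken as the digits after the last colon.
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

SAMPLE = """\
INFO: task kworker/1:1:38 blocked for more than 120 seconds.
INFO: task xfsaild/dm-11:1742 blocked for more than 120 seconds.
INFO: task rs:main Q:Reg:2043 blocked for more than 120 seconds.
INFO: task cpuplugd:2355 blocked for more than 120 seconds.
INFO: task cpuplugd:2355 blocked for more than 120 seconds.
"""


def count_hung_tasks(log_text: str) -> Counter:
    """Count hung-task reports per (task name, PID) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            counts[(match.group("name"), int(match.group("pid")))] += 1
    return counts


if __name__ == "__main__":
    for (name, pid), hits in sorted(count_hung_tasks(SAMPLE).items()):
        print(f"{name} (pid {pid}): reported {hits} time(s)")
```

A task that shows up in every run (cpuplugd and irqbalance do in the logs posted so far) is a natural starting point when looking at the dump.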

And the dump:

KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-71-generic
DUMPFILE: mclint_20170330_kernel_4_4_0-71_without_openafs.dump
CPUS: 1
DATE: Thu Mar 30 12:14:27 2017
UPTIME: 00:24:42
LOAD AVERAGE: 14.32, 12.51, 7.23
TASKS: 407
NODENAME: mclint
RELEASE: 4.4.0-71-generic
VERSION: #92-Ubuntu SMP Fri Mar 24 13:03:47 UTC 2017
MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: bad528  [THREAD_INFO: b78000]
CPU: 0
STATE: TASK_RUNNING
INFO: no panic task found

The dump is already uploaded to Box

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-03-28 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-03-28 12:23 EDT---
Just tried Kernel 4.4.0-70 ->

/var/log/syslog:

Mar 28 18:07:46 mclint multipathd[888]: mpatha: sda - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:0: reinstated
Mar 28 18:07:46 mclint multipathd[888]: mpatha: sdb - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:16: reinstated
Mar 28 18:07:46 mclint multipathd[888]: mpatha: sdc - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:32: reinstated
Mar 28 18:07:46 mclint multipathd[888]: mpatha: sdd - tur checker timed out
Mar 28 18:07:46 mclint multipathd[888]: 8:48: reinstated

/dev/sclp_line0:

[  459.779353] BTRFS error (device dm-1): bdev /dev/dm-1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
[  459.779503] BTRFS error (device dm-1): bdev /dev/dm-1 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
[  481.287452] INFO: task xfsaild/dm-11:1727 blocked for more than 120 seconds.
[  481.287459]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  481.287461] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.287647] INFO: task cpuplugd:2402 blocked for more than 120 seconds.
[  481.287648]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  481.287649] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.287696] INFO: task irqbalance:2508 blocked for more than 120 seconds.
[  481.287697]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  481.287698] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.287740] INFO: task kworker/0:19:22353 blocked for more than 120 seconds.
[  481.287741]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  481.287742] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.287769] INFO: task kworker/0:21:22355 blocked for more than 120 seconds.
[  481.287770]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  481.287771] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  481.287875] INFO: task tar:22484 blocked for more than 120 seconds.
[  481.287876]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  481.287877] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  601.288111] INFO: task systemd:1 blocked for more than 120 seconds.
[  601.288118]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  601.288119] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  601.288207] INFO: task xfsaild/dm-11:1727 blocked for more than 120 seconds.
[  601.288208]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  601.288209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  601.288372] INFO: task rs:main Q:Reg:2002 blocked for more than 120 seconds.
[  601.288374]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  601.288374] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  601.288496] INFO: task cpuplugd:2402 blocked for more than 120 seconds.
[  601.288497]   Not tainted 4.4.0-70-generic #91-Ubuntu
[  601.288497] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

crashdump-info:

KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-70-generic
DUMPFILE: mclint_20170328_kernel_4_4_0-70_without_openafs.dump
CPUS: 3
DATE: Tue Mar 28 18:15:25 2017
UPTIME: 00:14:23
LOAD AVERAGE: 13.22, 10.22, 5.28
TASKS: 408
NODENAME: mclint
RELEASE: 4.4.0-70-generic
VERSION: #91-Ubuntu SMP Wed Mar 22 12:48:02 UTC 2017
MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: bad528  (1 of 3)  [THREAD_INFO: b78000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
INFO: no panic task found

I will upload the compressed dump to Box soon ...

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-03-14 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-03-14 12:07 EDT---
I was now able to run into the hang without OpenAFS as well; /var/log/syslog:

Mar 14 15:10:46 mclint multipathd[887]: mpatha: sda - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:0: reinstated
Mar 14 15:10:46 mclint multipathd[887]: mpatha: sdb - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:16: reinstated
Mar 14 15:10:46 mclint multipathd[887]: mpatha: sdc - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:32: reinstated
Mar 14 15:10:46 mclint multipathd[887]: mpatha: sdd - tur checker timed out
Mar 14 15:10:46 mclint multipathd[887]: 8:48: reinstated

On the sclp_line console:

[ 9841.149452] INFO: task btrfs-transacti:634 blocked for more than 120 seconds.
[ 9841.149459]   Not tainted 4.4.0-67-generic #88
[ 9841.149461] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9841.149627] INFO: task cpuplugd:2409 blocked for more than 120 seconds.
[ 9841.149628]   Not tainted 4.4.0-67-generic #88
[ 9841.149629] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9841.149674] INFO: task irqbalance:2515 blocked for more than 120 seconds.
[ 9841.149675]   Not tainted 4.4.0-67-generic #88
[ 9841.149676] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9841.149715] INFO: task kworker/0:2:3661 blocked for more than 120 seconds.
[ 9841.149716]   Not tainted 4.4.0-67-generic #88
[ 9841.149717] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9841.149752] INFO: task tar:16648 blocked for more than 120 seconds.
[ 9841.149753]   Not tainted 4.4.0-67-generic #88
[ 9841.149754] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9961.149482] INFO: task btrfs-transacti:634 blocked for more than 120 seconds.
[ 9961.149489]   Not tainted 4.4.0-67-generic #88
[ 9961.149490] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9961.149640] INFO: task rs:main Q:Reg:1995 blocked for more than 120 seconds.
[ 9961.149642]   Not tainted 4.4.0-67-generic #88
[ 9961.149642] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9961.149727] INFO: task cpuplugd:2409 blocked for more than 120 seconds.
[ 9961.149729]   Not tainted 4.4.0-67-generic #88
[ 9961.149729] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9961.149772] INFO: task irqbalance:2515 blocked for more than 120 seconds.
[ 9961.149773]   Not tainted 4.4.0-67-generic #88
[ 9961.149773] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9961.149811] INFO: task kworker/0:2:3661 blocked for more than 120 seconds.
[ 9961.149812]   Not tainted 4.4.0-67-generic #88
[ 9961.149812] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

I just made a new dump ->
General dump info:
Dump format: s390mv
Version: 5
Dump created...: Tue, 14 Mar 2017 15:45:29 +0100
Dump ended.: Tue, 14 Mar 2017 15:46:07 +0100
Dump CPU ID: 9efc729648000
UTS node name..: mclint
UTS kernel release.: 4.4.0-67-generic
UTS kernel version.: #88 SMP Wed Mar 8 14:48:51 UTC 2017
Build arch.: s390x (64 bit)
System arch: s390x (64 bit)
CPU count (online).: 3
CPU count (real)...: 4
Dump memory range..: 8192 MB
Real memory range..: 8192 MB

The dump is currently uploaded to the Box-Folder ->
mclint_20170314_kernel_4_4_0-67_without_openafs.dump.bz2

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-03-13 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-03-13 09:57 EDT---
Next iteration: I was able to get OpenAFS built for the proposed kernel, so 
I could start the next pdebuild job, and again I ran into a hang 
scenario:

Mar 13 14:30:32 mclint multipathd[881]: mpatha: sda - tur checker timed out
Mar 13 14:30:32 mclint multipathd[881]: 8:0: reinstated
Mar 13 14:30:32 mclint multipathd[881]: mpatha: sdc - tur checker timed out
Mar 13 14:30:32 mclint multipathd[881]: 8:32: reinstated
Mar 13 14:30:33 mclint multipathd[881]: mpatha: sdb - tur checker timed out
Mar 13 14:30:33 mclint multipathd[881]: 8:16: reinstated
Mar 13 14:30:36 mclint multipathd[881]: mpatha: sdd - tur checker timed out
Mar 13 14:30:36 mclint rsyslogd-2007: action 'action 10' suspended, next retry is Mon Mar 13 14:32:06 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 13 14:30:36 mclint multipathd[881]: 8:48: reinstated

I just pulled a dump and am compressing it; eventually I'll upload it to Box.
I'll go on and try to reproduce this without OpenAFS and I'm very
confident that this hang is not related to AFS at all ...

General dump info:
Dump format: s390mv
Version: 5
Dump created...: Mon, 13 Mar 2017 14:42:39 +0100
Dump ended.: Mon, 13 Mar 2017 14:43:17 +0100
Dump CPU ID: 9efc729648000
UTS node name..: mclint
UTS kernel release.: 4.4.0-67-generic
UTS kernel version.: #88 SMP Wed Mar 8 14:48:51 UTC 2017
Build arch.: s390x (64 bit)
System arch: s390x (64 bit)
CPU count (online).: 2
CPU count (real)...: 4
Dump memory range..: 8192 MB
Real memory range..: 8192 MB

Memory map:
 - 0001b831afff (7043 MB)
0001b831b000 - 0001 (1149 MB)

Dump device info:
Volume 0: 0.0.8409 (online/active)
Volume 1: 0.0.840a (online/active)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-03-09 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-03-09 11:28 EDT---
Ok, I just tested the kernel from http://kernel.ubuntu.com/~rtg/lp1670634/ and 
so far this looks good! I was able to build the OpenAFS Ubuntu package three 
times with pdebuild without running into a hang, and this was a very good 
candidate for running into the hang scenario.
I'd like to activate OpenAFS on that system so that I can again run jobs as 
users other than root and build more packages for Ubuntu; at the moment there 
is only one subtle obstacle:

root@mclint:/var/lib/dkms/openafs/1.6.20.1/build# file /usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep
/usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=4f746ae15cb57aa0f264c965a8061844f5f21fa2, not stripped
root@mclint:/var/lib/dkms/openafs/1.6.20.1/build# dpkg -S /usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep
linux-headers-4.4.0-67-generic: /usr/src/linux-headers-4.4.0-67-generic/scripts/basic/fixdep

Somehow, the x86_64 version of "fixdep" ended up in the linux-headers
package in the ~rtg/lp1670634 folder. I checked the same file in the
other linux-headers packages on my system; there it is present and built
for the s390x architecture. I'm a little puzzled here: as the linux-headers
package is an "_all.deb", I thought those packages should be free of
binary files for a specific architecture. I'll try to replace the
fixdep program and carry on ...
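
The same architecture check that "file" performs can be scripted, for example to scan a whole headers tree for foreign binaries. This is a minimal sketch rather than a replacement for "file": it only knows the two e_machine values relevant here (22 for s390, 62 for x86-64, as defined in the ELF specification), and the function names are made up for illustration.

```python
import struct

# e_machine values from the ELF specification (only the two relevant here).
EM_S390 = 22
EM_X86_64 = 62
MACHINE_NAMES = {EM_S390: "s390", EM_X86_64: "x86-64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    byte_order = {1: "<", 2: ">"}.get(header[5])
    if byte_order is None:
        raise ValueError("unknown ELF data encoding")
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return MACHINE_NAMES.get(machine, f"unknown ({machine})")


def elf_machine_of_file(path: str) -> str:
    """Read just enough of the file at `path` to identify its architecture."""
    with open(path, "rb") as handle:
        return elf_machine(handle.read(20))
```

On the system above, calling `elf_machine_of_file` on the fixdep path shown in the "file" output would be expected to report x86-64 for the test package and s390 for the regular linux-headers packages.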

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-03-08 Thread bugproxy

--- Comment From jac...@de.ibm.com 2017-03-08 05:56 EDT---
I just tried the newest kernel, 4.4.0-66, and I'm still running into the hang. 
Here are the final messages in /var/log/syslog (the lines that never make it 
to disk):

Mar  8 11:26:31 mclint multipathd[955]: mpatha: sdb - tur checker timed out
Mar  8 11:26:31 mclint multipathd[955]: 8:16: reinstated
Mar  8 11:26:31 mclint multipathd[955]: mpatha: sdd - tur checker timed out
Mar  8 11:26:31 mclint rsyslogd-2007: action 'action 10' suspended, next retry is Wed Mar  8 11:27:01 2017 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar  8 11:26:31 mclint multipathd[955]: 8:48: reinstated
Mar  8 11:26:31 mclint multipathd[955]: mpatha: sdc - tur checker timed out
Mar  8 11:26:31 mclint multipathd[955]: 8:32: reinstated
Mar  8 11:26:32 mclint multipathd[955]: mpatha: sda - tur checker timed out
Mar  8 11:26:32 mclint multipathd[955]: 8:0: reinstated

And this here shows up on the sclp_line console:

[  961.419327] INFO: task cpuplugd:2604 blocked for more than 120 seconds.
[  961.419337]   Not tainted 4.4.0-66-generic #87-Ubuntu
[  961.419338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.419404] INFO: task irqbalance:2651 blocked for more than 120 seconds.
[  961.419406]   Not tainted 4.4.0-66-generic #87-Ubuntu
[  961.419407] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.419450] INFO: task kworker/0:4:3801 blocked for more than 120 seconds.
[  961.419451]   Not tainted 4.4.0-66-generic #87-Ubuntu
[  961.419452] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.419494] INFO: task kworker/1:1:4548 blocked for more than 120 seconds.
[  961.419495]   Not tainted 4.4.0-66-generic #87-Ubuntu
[  961.419496] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.419539] INFO: task kworker/0:0H:20302 blocked for more than 120 seconds.
[  961.419540]   Not tainted 4.4.0-66-generic #87-Ubuntu
[  961.419541] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.419764] INFO: task kworker/0:0:66641 blocked for more than 120 seconds.
[  961.419766]   Not tainted 4.4.0-66-generic #87-Ubuntu
[  961.419767] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  961.419895] INFO: task rm:81710 blocked for more than 120 seconds.
[  961.419896]   Not tainted 4.4.0-66-generic #87-Ubuntu
[  961.419897] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1081.419024] INFO: task systemd:1 blocked for more than 120 seconds.
[ 1081.419033]   Not tainted 4.4.0-66-generic #87-Ubuntu
[ 1081.419035] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1081.419148] INFO: task cpuplugd:2604 blocked for more than 120 seconds.
[ 1081.419150]   Not tainted 4.4.0-66-generic #87-Ubuntu
[ 1081.419151] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1081.419186] INFO: task irqbalance:2651 blocked for more than 120 seconds.
[ 1081.419187]   Not tainted 4.4.0-66-generic #87-Ubuntu
[ 1081.419188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
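
To see at a glance which tasks are stuck and how often each one is reported, the hung-task lines in a console capture like this can be summarized with a small script. This is only an illustrative sketch: the function name and the sample text are assumptions, and the regular expression matches just the "INFO: task ... blocked for more than ..." wording visible above, whatever timestamp prefix precedes it.

```python
import re
from collections import Counter

# The command name may itself contain colons (e.g. "kworker/0:4"), so the
# greedy first group backtracks to the last ":<pid>" before " blocked".
HUNG_TASK = re.compile(r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds")


def blocked_tasks(log_text: str) -> Counter:
    """Count hung-task reports per (command, pid) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Sample lines taken from the console output, without the timestamp prefix.
sample = """\
INFO: task cpuplugd:2604 blocked for more than 120 seconds.
INFO: task kworker/0:4:3801 blocked for more than 120 seconds.
INFO: task cpuplugd:2604 blocked for more than 120 seconds.
"""

for (command, pid), count in blocked_tasks(sample).most_common():
    print(f"{command} (pid {pid}): reported {count} time(s)")
```

Tasks that reappear in every 120-second reporting round, such as cpuplugd here, are the ones that have been blocked the longest.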

I pulled a DASD-Dump from the system:

KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-66-generic
DUMPFILE: mclint_20170308_kernel_4_4_0-66_without_openafs.dump
CPUS: 3
DATE: Wed Mar  8 11:37:56 2017
UPTIME: 00:25:30
LOAD AVERAGE: 12.99, 11.25, 6.55
TASKS: 422
NODENAME: mclint
RELEASE: 4.4.0-66-generic
VERSION: #87-Ubuntu SMP Fri Mar 3 15:32:53 UTC 2017
MACHINE: s390x  (unknown Mhz)
MEMORY: 7.8 GB
PANIC: ""
PID: 0
COMMAND: "swapper/0"
TASK: bb1538  (1 of 3)  [THREAD_INFO: b7c000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
INFO: no panic task found

Again I see 10 multipathd entries in the process list, which is typical
for my hang scenario.

crash> ps | grep multipathd
  955      1   0  1e49115f0  IN   0.1  335364   8316  multipathd
  971      1   0  7e8b2be0   IN   0.1  335364   8316  multipathd
  972      1   0  7e8b6db0   IN   0.1  335364   8316  multipathd
  977      1   1  7e8b36d8   IN   0.1  335364   8316  multipathd
  978      1   0  7e8b62b8   IN   0.1  335364   8316  multipathd
  979      1   2  7e8b4cc8   IN   0.1  335364   8316  multipathd
81714      1   1  7cdc8000   IN   0.1  335364   8316  multipathd
81715      1   1  7cdc95f0   IN   0.1  335364   8316  multipathd
81716      1   1  7cdcc1d0   IN   0.1  335364   8316  multipathd
81717      1   1  1e6c595f0  IN   0.1  335364   8316  multipathd

I'll compress the dump and try to find ways to make it available to you
...

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

Bug

[Kernel-packages] [Bug 1670634] Comment bridged from LTC Bugzilla

2017-03-07 Thread bugproxy

--- Comment From heinz-werner_se...@de.ibm.com 2017-03-07 11:44 EDT---
Please upload the debug information to this IBM Box folder:
https://ibm.box.com/s/y10o4u7bcvc6nk7rgk2gfmk039ii5d1i
The folder will be deleted once debugging is finished.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1670634

Title:
  blk-mq: possible deadlock on CPU hot(un)plug

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New

Bug description:
  == Comment: #0 - Carsten Jacobi  - 2017-03-07 03:35:31 ==
  I'm evaluating Ubuntu Xenial on z for development purposes. The test system 
is installed in an LPAR with one FCP LUN that is accessible via 4 paths (all 
paths are configured).
  The system hangs regularly when I build packages with "pdebuild" from the 
pbuilder packaging suite.
  The local Linux development team helped me out with a pre-analysis that I can 
post here (thanks a lot for that):

  With the default settings and under a certain workload,
  blk_mq seems to get into a presumed "deadlock".
  Possibly this happens on CPU hot(un)plug.

  After the I/O stalled, a dump was pulled manually.
  The following information is from the crash dump pre-analysis.

  $ zgetdump -i dump.0
  General dump info:
    Dump format........: elf
    Version............: 1
    UTS node name......: mclint
    UTS kernel release.: 4.4.0-65-generic
    UTS kernel version.: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
    System arch........: s390x (64 bit)
    CPU count (online).: 2
    Dump memory range..: 8192 MB
  Memory map:
    0000000000000000 - 00000001b831afff (7043 MB)
    00000001b831b000 - 00000001ffffffff (1149 MB)

  Things look similar with the HWE kernel ubuntu16.04-4.8.0-34.36~16.04.1.

        KERNEL: vmlinux.full
      DUMPFILE: dump.0
          CPUS: 2
          DATE: Fri Mar  3 14:31:07 2017
        UPTIME: 02:11:20
  LOAD AVERAGE: 13.00, 12.92, 11.37
         TASKS: 411
      NODENAME: mclint
       RELEASE: 4.4.0-65-generic
       VERSION: #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017
       MACHINE: s390x  (unknown Mhz)
        MEMORY: 7.8 GB
         PANIC: ""
           PID: 0
       COMMAND: "swapper/0"
          TASK: bad528  (1 of 2)  [THREAD_INFO: b78000]
           CPU: 0
         STATE: TASK_RUNNING (ACTIVE)
          INFO: no panic task found

  crash> dev -d
  MAJOR GENDISK NAME REQUEST_QUEUE TOTAL ASYNC SYNC DRV
  ...
  8 1e1d6d800 sda 1e1d51210 0 23151 4294944145 N/A(MQ)
  8 1e4e06800 sdc 2081b180 23148 4294944148 N/A(MQ)
  8 1f07800 sdb 20c75680 23195 4294944101 N/A(MQ)
  8 1e4e06000 sdd 1e4e31210 0 23099 4294944197 N/A(MQ)
  252 1e1d6c800 dm-0 1e1d51b18 9 1 8 N/A(MQ)
  ...

  So both dm-mpath and sd have requests pending in their block multiqueue.
  The large sync numbers for the sd devices look strange; they appear to be 
the async values multiplied by -1 and formatted as unsigned integers.
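
The arithmetic behind that observation can be checked directly. The sketch below assumes the counters are printed as 32-bit unsigned values, which is inferred from the numbers themselves and not confirmed from the crash sources; the dictionary simply repeats the async and sync columns from the listing above.

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret an integer as an unsigned 32-bit value (two's complement)."""
    return value & 0xFFFFFFFF


# (async, sync) pairs as shown by "dev -d" for the four sd devices.
observed = {
    "sda": (23151, 4294944145),
    "sdc": (23148, 4294944148),
    "sdb": (23195, 4294944101),
    "sdd": (23099, 4294944197),
}

for name, (async_count, sync_count) in observed.items():
    # Each sync value equals -async wrapped into 32 bits ...
    assert as_unsigned32(-async_count) == sync_count
    # ... so async + sync wraps around to a total of 0.
    assert as_unsigned32(async_count + sync_count) == 0
    print(f"{name}: sync is -{async_count} as unsigned 32-bit")
```

All four pairs satisfy both checks, which is consistent with the TOTAL of 0 shown for sda and sdd.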

  [    0.798256] Linux version 4.4.0-65-generic (buildd@z13-011) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #86-Ubuntu SMP Thu Feb 23 17:54:37 UTC 2017 (Ubuntu 4.4.0-65.86-generic 4.4.49)
  [    0.798262] setup: Linux is running natively in 64-bit mode
  [    0.798290] setup: Max memory size: 8192MB
  [    0.798298] setup: Reserving 196MB of memory at 7996MB for crashkernel (System RAM: 7996MB)

  [    0.836923] Kernel command line: root=/dev/mapper/mclint_vg-root rootflags=subvol=@ crashkernel=196M BOOT_IMAGE=0

  [ 5281.179428] INFO: task xfsaild/dm-11:1604 blocked for more than 120 seconds.
  [ 5281.179437]   Not tainted 4.4.0-65-generic #86-Ubuntu
  [ 5281.179438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 5281.179440] xfsaild/dm-11   D 007bcf52 0  1604  2 
0x
  [ 5281.179444]0001e931c230 001a6964 0001e6f9b958 
0001e6f9b9d8
0001e15795f0 0001e6f9b988 00ce8c00 
0001ea805c70
0001ea805c00 00ba5ed0 0001e931c1d0 
0001e1579b20
0001ea805c00 0001e15795f0 0001ea805c00 

007d3978 007bc9f8 0001e6f9b9d8 
0001e6f9ba40
  [ 5281.179454] Call Trace:
  [ 5281.179461] ([<007bc9f8>] __schedule+0x300/0x810)
  [ 5281.179462]  [<007bcf52>] schedule+0x4a/0xb0
  [ 5281.179465]  [<007c02aa>] schedule_timeout+0x232/0x2a8
  [ 5281.179466]  [<007bde50>] wait_for_common+0x110/0x1c8
  [ 5281.179472]  [<0017b602>] flush_work+0x42/0x58
  [ 5281.179564]  [<03ff805e14ba>] xlog_cil_force_lsn+0x7a/0x238 [xfs]
  [ 5281.179589]  [<03ff805dee82>] _xfs_log_force+0x9a/0x2e8 [xfs]
  [
