[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-12-21 Thread Frank Heimes
Hello Max, glad to read that.
That's what I hoped, after the significant patch set of LP 1887124 landed.
I'm closing this bug on our side, too.
Thx

** Changed in: linux (Ubuntu)
   Status: New => Fix Released

** Changed in: ubuntu-z-systems
   Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1881109

Title:
  [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be
  triggered by scsi errors.

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released

Bug description:
  We can reproduce a crash in the block layer with lots of stress on
  lots of SCSI disks (on an XIV storage server).

  We seem to have several SCSI stalls in the logs/errors (needs to be
  analyzed further), but in the end we crash with this call trace.

  [20832.901147] Failing address: 7fe00dea8000 TEID: 7fe00dea8403
  [20832.901159] Fault in home space mode while using kernel ASCE.
  [20832.901171] AS:01d3cccf400b R2:03fd0020800b R3:03fd0020c007 S:03fc1cc78800 P:0400
  [20832.901264] Oops: 0011 ilc:2 [#1] SMP 
  [20832.901280] Modules linked in: vhost_net vhost macvtap macvlan tap xfs 
xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp 
ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack 
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ip6table_filter ip6_tables 
iptable_filter bpfilter bridge aufs overlay dm_service_time dm_multipath 
scsi_dh_rdac scsi_dh_emc scsi_dh_alua s390_trng chsc_sch eadm_sch vfio_ccw 
vfio_mdev mdev vfio_iommu_type1 vfio 8021q garp mrp stp llc sch_fq_codel drm 
drm_panel_orientation_quirks i2c_core ip_tables x_tables btrfs zstd_compress 
zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq libcrc32c raid1 raid0 linear dm_mirror dm_region_hash 
dm_log qeth_l2 pkey zcrypt crc32_vx_s390 ghash_s390 prng aes_s390 des_s390 
libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common 
zfcp scsi_transport_fc dasd_eckd_mod dasd_mod qeth qdio ccwgroup
  [20832.901516] CPU: 29 PID: 389709 Comm: CPU 0/KVM Kdump: loaded Not tainted 5.4.0-29-generic #33-Ubuntu
  [20832.901530] Hardware name: IBM 8561 T01 708 (LPAR)
  [20832.901542] Krnl PSW : 0404e0018000 01d3cbd559be (try_to_wake_up+0x4e/0x700)
  [20832.901575]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
  [20832.901744] Krnl GPRS: 03fc917cd988 7fe0 7fe0001e 0003
  [20832.901750]            004c 040003fd005a4600 0003
  [20832.901753]            0003 7fe00dea8454 7fe00dea7b00
  [20832.901754]            03f44bd9b300 01d3cc587088 7fe0021c7ae0 7fe0021c7a60
  [20832.901767] Krnl Code: 01d3cbd559b2: 41902954 la %r9,2388(%r2)
                            01d3cbd559b6: 582003ac l %r2,940
                           #01d3cbd559ba: a718 lhi %r1,0
                           >01d3cbd559be: ba129000 cs %r1,%r2,0(%r9)
                            01d3cbd559c2: a77401c9 brc 7,01d3cbd55d54
                            01d3cbd559c6: e310b0080004 lg %r1,8(%r11)
                            01d3cbd559cc: b9800018 ngr %r1,%r8
                            01d3cbd559d0: a774001f brc 7,01d3cbd55a0e
  [20832.901784] Call Trace:
  [20832.901816] ([<01d3cc57e0ac>] cleanup_critical+0x0/0x474)
  [20832.901823]  [<01d3cc1d16ba>] rq_qos_wake_function+0x8a/0xa0 
  [20832.901827]  [<01d3cbd74bde>] __wake_up_common+0x9e/0x1b0 
  [20832.901829]  [<01d3cbd750e4>] __wake_up_common_lock+0x94/0xe0 
  [20832.901830]  [<01d3cbd7515a>] __wake_up+0x2a/0x40 
  [20832.901835]  [<01d3cc1e8640>] wbt_done+0x90/0xe0 
  [20832.901837]  [<01d3cc1d17be>] __rq_qos_done+0x3e/0x60 
  [20832.901841]  [<01d3cc1bd5b0>] blk_mq_free_request+0xe0/0x140 
  [20832.901848]  [<01d3cc35fc60>] dm_softirq_done+0x140/0x230 
  [20832.901849]  [<01d3cc1bbfbc>] blk_done_softirq+0xbc/0xe0 
  [20832.901850]  [<01d3cc57e710>] __do_softirq+0x100/0x360 
  [20832.901853]  [<01d3cbd2525e>] irq_exit+0x9e/0xc0 
  [20832.901856]  [<01d3cbcb0b18>] do_IRQ+0x78/0xb0 
  [20832.901859]  [<01d3cc57dc28>] ext_int_handler+0x128/0x12c 
  [20832.901860]  [<01d3cc57d306>] sie_exit+0x0/0x46 
  [20832.901866] ([<01d3cbce944a>] __vcpu_run+0x27a/0xc30)
  [20832.901870]  [<01d3cbcf29a8>] kvm_arch_vcpu_ioctl_run+0x2d8/0x840 
  [20832.901875]  [<01d3cbcdd242>] kvm_vcpu_ioctl+0x282/0x770 
  [20832.901880]  [<01d3cbf85f66>] do_vfs_ioctl+0x376/0x690 
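  [20832.901881]  [<01d3cbf86304>] ksys_ioctl+0x84/0xb0
  [20832.901883]  [<01d3cbf8639a>] __s390x_sys_ioctl+0x2a/0x40
  [20832.901885]  [<01d3cc57d5f2>] system_call+0x2a6/0x2c8
  [20832.901885] Last Breaking-Event-Address:
  [20832.901889]  [<01d3cbd5607e>]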
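
When comparing traces from several crashes, it can help to reduce the oops text to a list of frames. The following is only a minimal sketch, assuming the frame layout seen in the trace above ("[<address>] symbol+0xoffset/0xsize"); the names FRAME_RE and parse_frames are made up for this illustration and are not part of any kernel tooling.

```python
import re

# Matches one frame such as "[<01d3cc1d16ba>] rq_qos_wake_function+0x8a/0xa0".
# Frames without a symbol (e.g. a truncated last line) are skipped.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(oops_text):
    """Return (address, symbol, offset, size) tuples in the order they appear."""
    frames = []
    for match in FRAME_RE.finditer(oops_text):
        address, symbol, offset, size = match.groups()
        frames.append((address, symbol, int(offset, 16), int(size, 16)))
    return frames


# Three frames copied from the call trace in this bug description.
sample = """\
  [20832.901823]  [<01d3cc1d16ba>] rq_qos_wake_function+0x8a/0xa0
  [20832.901827]  [<01d3cbd74bde>] __wake_up_common+0x9e/0x1b0
  [20832.901835]  [<01d3cc1e8640>] wbt_done+0x90/0xe0
"""

for address, symbol, offset, size in parse_frames(sample):
    print(f"{symbol} (+{offset:#x} of {size:#x}) at {address}")
```

Frames are returned innermost first, as printed by the kernel, so the first entry is the caller closest to the faulting try_to_wake_up.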

[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-12-08 Thread Frank Heimes
** Changed in: ubuntu-z-systems
   Status: New => Incomplete

[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-09-16 Thread Frank Heimes
In other words, it's reasonable to retry on the latest Ubuntu 20.04 kernel
(after sudo apt update && sudo apt full-upgrade and a reboot).
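
Before re-running the stress test, it may be worth confirming that the machine really booted a kernel newer than the 5.4.0-29-generic one from the oops. The following is a minimal sketch, assuming Ubuntu-style release strings; abi_tuple and is_newer are hypothetical helper names. A newer release alone does not prove that the zfcp patches are included; the kernel versions that carry them are tracked in LP 1887124.

```python
import platform
import re


def abi_tuple(release):
    """Turn a release such as '5.4.0-29-generic' into (5, 4, 0, 29)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unexpected kernel release format: {release!r}")
    return tuple(int(part) for part in match.groups())


def is_newer(running, crashed="5.4.0-29-generic"):
    """True if the running kernel is newer than the one in the oops."""
    return abi_tuple(running) > abi_tuple(crashed)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    try:
        print(running, "newer than 5.4.0-29-generic:", is_newer(running))
    except ValueError as err:
        print(err)
```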

[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-09-16 Thread Frank Heimes
Yes, the kernel(s) to which the significant set of (about 30) zFCP-related
patches was applied have already landed in focal (-updates) and in the
groovy kernel, as indicated by the Fix Released status of LP 1887124:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1887124

[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-07-14 Thread Frank Heimes
I'm wondering if it would make sense (on top of comment #7:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1881109/comments/7) to
test this again with an updated kernel that was patched to fix DIF and
DIX, where the fix effectively brings the scsi/zfcp driver up to the
kernel 5.8 level?

I have this bug in mind:
LP 1887124 "[UBUNTU 20.04] DIF and DIX support in zfcp (s390x) is broken and 
the kernel crashes unconditionally"
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1887124

A patched kernel to test is referenced here:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1887124/comments/1

[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-07-09 Thread Frank Heimes
Well, we do drive our storage subsystem to its limits from time to time,
especially when we do parallel LPAR deployments for OpenStack environments.
But that's on a z13 and a DS8k, and so far we have never seen such issues
in this environment.

Further investigation in Launchpad did not turn up any similar reports
involving SCSI and wbt (or wbt in general) on focal.

However, I found that there were wbt (blk-wbt) issues in the past with
kernels > 4.10 and < v4.19 that in some cases led to CPU hard lockups on
heavy writes (largely reported on NVMe drives).
But those bugs were only reported on bionic (and cosmic), which matches the
kernel range above, and were fixed quite some time ago.
The bionic (and cosmic) kernels were patched via backports of:
2887e41b910b - "blk-wbt: Avoid lock contention and thundering herd issue in 
wbt_wait"
38cfb5a45ee0 - "blk-wbt: improve waking of tasks"
I just double-checked that the fixes from those tickets are (still) in, and 
they are.

Having heard about this problem only in this bug, I agree that recommending
turning WBT off in general would not be good, even when preferring
stability over performance.
(I still suspect that it could be XIV related, rather than a general
block or SCSI layer issue...)

However, for now we may add a statement to the s390x section of the
release notes pointing to WBT and the udev rule for disabling it for
block devices, in case one hits such issues under high disk I/O stress.
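
For a quick test without a udev rule, WBT can also be switched off at runtime through sysfs. The following is a minimal sketch, not the udev rule referred to above: it assumes that the queue/wbt_lat_usec attribute is present for the device (writing 0 to it disables writeback throttling), that the script runs as root, and that the disks of interest are named sd*. The function name disable_wbt and its parameters are made up for this illustration.

```python
from pathlib import Path


def disable_wbt(sys_block=Path("/sys/block"), prefix="sd"):
    """Write 0 to queue/wbt_lat_usec for matching devices; return those changed."""
    changed = []
    for dev in sorted(sys_block.glob(prefix + "*")):
        knob = dev / "queue" / "wbt_lat_usec"
        if not knob.is_file():
            continue  # attribute missing, WBT not available for this device
        try:
            knob.write_text("0\n")
        except OSError:
            continue  # e.g. not running as root
        changed.append(dev.name)
    return changed


if __name__ == "__main__":
    print("WBT disabled on:", disable_wbt())
```

The setting made this way is lost on reboot or when a device is re-added, which is why a udev rule is the better fit for a permanent workaround.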

[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-07-09 Thread Frank Heimes
** Information type changed from Public Security to Public

[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-07-07 Thread Frank Heimes
@Corbin, may I ask why you changed this from Public to Public Security?

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-07-07 Thread Corbin
** Information type changed from Public to Public Security

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-05-28 Thread Frank Heimes
Hi Benjamin,
if it's an issue somewhere in the SCSI midlayer, block layer or wbt, wouldn't
it then also happen with zFCP on DS8k and on other platforms?
We have tested zFCP on DS8k (the only storage subsystem we have) as part of
release testing and server certification. On top of that, we constantly have
several zFCP systems running 20.04 (probably smaller systems and/or under less
load), and so far we haven't seen a single crash.
So I'm assuming it is more likely XIV-related, no?

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New


[Kernel-packages] [Bug 1881109] Re: [Ubuntu 20.04] LPAR crashes in block layer under high stress. Might be triggered by scsi errors.

2020-05-28 Thread Frank Heimes
** Also affects: ubuntu-z-systems
   Importance: Undecided
   Status: New

** Changed in: ubuntu-z-systems
 Assignee: (unassigned) => Skipper Bug Screeners (skipper-screen-team)

Status in Ubuntu on IBM z Systems:
  New
Status in linux package in Ubuntu:
  New
