[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
** Changed in: ubuntu-power-systems
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1902694

Title:
  Undetected Data corruption in MPI workloads that use VSX for
  reductions on POWER9 DD2.1 systems

Status in The Ubuntu-power-systems project:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Released
Status in linux source package in Groovy:
  Fix Released
Status in linux source package in Hirsute:
  Fix Released

Bug description:
  SRU Justification:

  [Impact]

  * A data integrity issue was observed on POWER 9 (DD2.1) systems.

  * It affects Ubuntu 20.04 with kernel 5.4.0-52 and Ubuntu 20.10 with
    kernel 5.8.0-26.

  * The root cause lies in how p9_hmi_special_emu() is compiled.

  * When doing a VMX store (in __get_user_atomic_128_aligned()) to a
    buffer (vbuf), the buffer is not 128-bit aligned.

  [Fix]

  * 1da4a0272c54 "powerpc: Fix undetected data corruption with P9N
    DD2.1 VSX CI load emulation"

  * d1781f237047 "selftests/powerpc: Make alignment handler test P9N
    DD2.1 vector CI load workaround"

  [Test Case]

  * A POWER 9 (DD2.1) bare metal system is needed that has either
    Ubuntu 20.04, 20.10 or 21.04 installed.

  * It is best tested with a sample application and the test case from
    "selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI
    load workaround"

  [Regression Potential]

  * The regression risk is relatively moderate, because:

  * it only happens with special VSX (vector) instructions in use,
    e.g. in p9_hmi_special_emu

  * it happens on bare metal only and only on POWER 9 (DD2.1)

  * and the changes are small and easy to review (in total one
    effective code line per patch/commit)

  * Since only p9_hmi_special_emu is touched, any regression would
    break that function, which is already broken because of this bug.

  [Other]

  * According to the reporter this affects Ubuntu 20.04 / 5.4.0-52 and
    20.10 / 5.8.0-26.

  * Since the development of Hirsute is already open, the SRU is
    requested for Hirsute, too.

  * The patches were accepted upstream in v5.10-rc1 and v5.10-rc2.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1902694/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
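
The [Impact] item about the unaligned vbuf can be illustrated with a small model. This is only a sketch, not kernel code: it relies on the architectural behaviour of stvx, which ignores the low four bits of the effective address, and the function names and addresses below are hypothetical.

```python
VECTOR_BYTES = 16  # stvx stores one 128-bit (16-byte) quadword


def stvx_effective_address(addr: int) -> int:
    """Address actually written by stvx: the low 4 bits of addr are ignored."""
    return addr & ~(VECTOR_BYTES - 1)


def bytes_written_below_buffer(buf_addr: int) -> int:
    """Number of bytes stored below the intended buffer start (0 if aligned)."""
    return buf_addr - stvx_effective_address(buf_addr)


if __name__ == "__main__":
    # Hypothetical stack addresses for a 16-byte buffer such as vbuf.
    for addr in (0x1000, 0x1008):
        written_at = stvx_effective_address(addr)
        spill = bytes_written_below_buffer(addr)
        print(f"buffer at {addr:#x}: store lands at {written_at:#x}, "
              f"{spill} byte(s) fall below the buffer")
```

With an aligned buffer the store lands where intended. With a buffer that is only 8-byte aligned, half of the vector is written below the buffer and the buffer itself is only partly filled, which is consistent with the silent corruption described above.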
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
This bug was fixed in the package linux - 5.8.0-36.40+21.04.1

---
linux (5.8.0-36.40+21.04.1) hirsute; urgency=medium

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  [ Ubuntu: 5.8.0-36.40 ]

  * debian/scripts/file-downloader does not handle positive failures
    correctly (LP: #1878897)
    - [Packaging] file-downloader not handling positive failures correctly

  [ Ubuntu: 5.8.0-35.39 ]

  * Packaging resync (LP: #1786013)
    - update dkms package versions
  * CVE-2021-1052 // CVE-2021-1053
    - [Packaging] NVIDIA -- Add the NVIDIA 460 driver

 -- Kleber Sacilotto de Souza  Thu, 07 Jan 2021 11:57:30 +0100

** Changed in: linux (Ubuntu Hirsute)
       Status: In Progress => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2021-1052

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2021-1053

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Released
Status in linux source package in Groovy:
  Fix Released
Status in linux source package in Hirsute:
  Fix Released
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
This bug was fixed in the package linux - 5.8.0-31.33

---
linux (5.8.0-31.33) groovy; urgency=medium

  * groovy/linux: 5.8.0-31.33 -proposed tracker (LP: #1905299)
  * Groovy 5.8 kernel hangs on boot on CPUs with eLLC (LP: #1903397)
    - drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu
      lockup during fbdev init
  * CVE-2020-4788
    - selftests/powerpc: rfi_flush: disable entry flush if present
    - powerpc/64s: flush L1D on kernel entry
    - powerpc/64s: flush L1D after user accesses
    - selftests/powerpc: entry flush test

linux (5.8.0-30.32) groovy; urgency=medium

  * groovy/linux: 5.8.0-30.32 -proposed tracker (LP: #1903194)
  * Update kernel packaging to support forward porting kernels
    (LP: #1902957)
    - [Debian] Update for leader included in BACKPORT_SUFFIX
  * Avoid double newline when running insertchanges (LP: #1903293)
    - [Packaging] insertchanges: avoid double newline
  * EFI: Fails when BootCurrent entry does not exist (LP: #183)
    - efivarfs: Replace invalid slashes with exclamation marks in dentries.
  * raid10: Block discard is very slow, causing severe delays for mkfs and
    fstrim operations (LP: #1896578)
    - md: add md_submit_discard_bio() for submitting discard bio
    - md/raid10: extend r10bio devs to raid disks
    - md/raid10: pull codes that wait for blocked dev into one function
    - md/raid10: improve raid10 discard request
    - md/raid10: improve discard request for far layout
    - dm raid: fix discard limits for raid1 and raid10
    - dm raid: remove unnecessary discard limits for raid10
  * Bionic: btrfs: kernel BUG at
    /build/linux-eTBZpZ/linux-4.15.0/fs/btrfs/ctree.c:3233! (LP: #1902254)
    - btrfs: extent_io: do extra check for extent buffer read write functions
    - btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent()
    - btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref()
    - btrfs: ctree: check key order before merging tree blocks
  * Tiger Lake PMC core driver fixes (LP: #1899883)
    - platform/x86: intel_pmc_core: update TGL's LPM0 reg bit map name
    - platform/x86: intel_pmc_core: fix bound check in pmc_core_mphy_pg_show()
    - platform/x86: pmc_core: Use descriptive names for LPM registers
    - platform/x86: intel_pmc_core: Fix TigerLake power gating status map
    - platform/x86: intel_pmc_core: Fix the slp_s0 counter displayed value
  * drm/i915/dp_mst - System would hang during the boot up. (LP: #1902469)
    - Revert "UBUNTU: SAUCE: drm/i915/display: Fix null deref in
      intel_psr_atomic_check()"
    - drm/i915: Fix encoder lookup during PSR atomic check
  * Undetected Data corruption in MPI workloads that use VSX for reductions
    on POWER9 DD2.1 systems (LP: #1902694)
    - powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load
      emulation
    - selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load
      workaround
  * [20.04 FEAT] Support/enhancement of NVMe IPL (LP: #1902179)
    - s390/ipl: support NVMe IPL kernel parameters
  * uvcvideo: add mapping for HEVC payloads (LP: #1895803)
    - media: uvcvideo: Add mapping for HEVC payloads
  * risc-v 5.8 kernel oops on ftrace tests (LP: #1894613)
    - stop_machine, rcu: Mark functions as notrace
  * Groovy update: v5.8.17 upstream stable release (LP: #1902137)
    - cxgb4: handle 4-tuple PEDIT to NAT mode translation
    - ibmveth: Switch order of ibmveth_helper calls.
    - ibmveth: Identify ingress large send packets.
    - ipv4: Restore flowi4_oif update before call to xfrm_lookup_route
    - mlx4: handle non-napi callers to napi_poll
    - net: dsa: microchip: fix race condition
    - net: fec: Fix phy_device lookup for phy_reset_after_clk_enable()
    - net: fec: Fix PHY init after phy_reset_after_clk_enable()
    - net: fix pos incrementment in ipv6_route_seq_next
    - net: ipa: skip suspend/resume activities if not set up
    - net: mptcp: make DACK4/DACK8 usage consistent among all subflows
    - net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
    - net/smc: fix use-after-free of delayed events
    - net/smc: fix valid DMBE buffer sizes
    - net/tls: sendfile fails with ktls offload
    - net: usb: qmi_wwan: add Cellient MPL200 card
    - tipc: fix the skb_unshare() in tipc_buf_append()
    - socket: fix option SO_TIMESTAMPING_NEW
    - socket: don't clear SOCK_TSTAMP_NEW when SO_TIMESTAMPNS is disabled
    - can: m_can_platform: don't call m_can_class_suspend in runtime suspend
    - can: j1935: j1939_tp_tx_dat_new(): fix missing initialization of skbcnt
    - net: j1939: j1939_session_fresh_new(): fix missing initialization of
      skbcnt
    - net/ipv4: always honour route mtu during forwarding
    - net_sched: remove a redundant goto chain check
    - r8169: fix data corruption issue on RTL8402
    - binder: fix UAF when releasing todo list
    - ALSA: bebob: potential info leak in hwdep_read()
    - ALSA: hda/hdmi: fix incorrect
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
This bug was fixed in the package linux - 5.4.0-56.62

---
linux (5.4.0-56.62) focal; urgency=medium

  * focal/linux: 5.4.0-56.62 -proposed tracker (LP: #1905300)
  * CVE-2020-4788
    - selftests/powerpc: rfi_flush: disable entry flush if present
    - powerpc/64s: flush L1D on kernel entry
    - powerpc/64s: flush L1D after user accesses
    - selftests/powerpc: entry flush test

linux (5.4.0-55.61) focal; urgency=medium

  * focal/linux: 5.4.0-55.61 -proposed tracker (LP: #1903175)
  * Update kernel packaging to support forward porting kernels
    (LP: #1902957)
    - [Debian] Update for leader included in BACKPORT_SUFFIX
  * Avoid double newline when running insertchanges (LP: #1903293)
    - [Packaging] insertchanges: avoid double newline
  * EFI: Fails when BootCurrent entry does not exist (LP: #183)
    - efivarfs: Replace invalid slashes with exclamation marks in dentries.
  * CVE-2020-14351
    - perf/core: Fix race in the perf_mmap_close() function
  * raid10: Block discard is very slow, causing severe delays for mkfs and
    fstrim operations (LP: #1896578)
    - md: add md_submit_discard_bio() for submitting discard bio
    - md/raid10: extend r10bio devs to raid disks
    - md/raid10: pull codes that wait for blocked dev into one function
    - md/raid10: improve raid10 discard request
    - md/raid10: improve discard request for far layout
    - dm raid: fix discard limits for raid1 and raid10
    - dm raid: remove unnecessary discard limits for raid10
  * Bionic: btrfs: kernel BUG at
    /build/linux-eTBZpZ/linux-4.15.0/fs/btrfs/ctree.c:3233! (LP: #1902254)
    - btrfs: drop unnecessary offset_in_page in extent buffer helpers
    - btrfs: extent_io: do extra check for extent buffer read write functions
    - btrfs: extent-tree: kill BUG_ON() in __btrfs_free_extent()
    - btrfs: extent-tree: kill the BUG_ON() in insert_inline_extent_backref()
    - btrfs: ctree: check key order before merging tree blocks
  * Ethernet no link lights after reboot (Intel i225-v 2.5G) (LP: #1902578)
    - igc: Add PHY power management control
  * Undetected Data corruption in MPI workloads that use VSX for reductions
    on POWER9 DD2.1 systems (LP: #1902694)
    - powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load
      emulation
    - selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load
      workaround
  * [20.04 FEAT] Support/enhancement of NVMe IPL (LP: #1902179)
    - s390: nvme ipl
    - s390: nvme reipl
    - s390/ipl: support NVMe IPL kernel parameters
  * uvcvideo: add mapping for HEVC payloads (LP: #1895803)
    - media: uvcvideo: Add mapping for HEVC payloads
  * Focal update: v5.4.73 upstream stable release (LP: #1902115)
    - ibmveth: Switch order of ibmveth_helper calls.
    - ibmveth: Identify ingress large send packets.
    - ipv4: Restore flowi4_oif update before call to xfrm_lookup_route
    - mlx4: handle non-napi callers to napi_poll
    - net: fec: Fix phy_device lookup for phy_reset_after_clk_enable()
    - net: fec: Fix PHY init after phy_reset_after_clk_enable()
    - net: fix pos incrementment in ipv6_route_seq_next
    - net/smc: fix valid DMBE buffer sizes
    - net/tls: sendfile fails with ktls offload
    - net: usb: qmi_wwan: add Cellient MPL200 card
    - tipc: fix the skb_unshare() in tipc_buf_append()
    - socket: fix option SO_TIMESTAMPING_NEW
    - can: m_can_platform: don't call m_can_class_suspend in runtime suspend
    - can: j1935: j1939_tp_tx_dat_new(): fix missing initialization of skbcnt
    - net: j1939: j1939_session_fresh_new(): fix missing initialization of
      skbcnt
    - net/ipv4: always honour route mtu during forwarding
    - net_sched: remove a redundant goto chain check
    - r8169: fix data corruption issue on RTL8402
    - cxgb4: handle 4-tuple PEDIT to NAT mode translation
    - binder: fix UAF when releasing todo list
    - ALSA: bebob: potential info leak in hwdep_read()
    - ALSA: hda/hdmi: fix incorrect locking in hdmi_pcm_close
    - nvme-pci: disable the write zeros command for Intel 600P/P3100
    - chelsio/chtls: fix socket lock
    - chelsio/chtls: correct netdevice for vlan interface
    - chelsio/chtls: correct function return and return type
    - ibmvnic: save changed mac address to adapter->mac_addr
    - net: ftgmac100: Fix Aspeed ast2600 TX hang issue
    - net: hdlc: In hdlc_rcv, check to make sure dev is an HDLC device
    - net: hdlc_raw_eth: Clear the IFF_TX_SKB_SHARING flag after calling
      ether_setup
    - net: Properly typecast int values to set sk_max_pacing_rate
    - net/sched: act_tunnel_key: fix OOB write in case of IPv6 ERSPAN tunnels
    - nexthop: Fix performance regression in nexthop deletion
    - nfc: Ensure presence of NFC_ATTR_FIRMWARE_NAME attribute in
      nfc_genl_fw_download()
    - r8169: fix operation under forced interrupt threading
    - selftests: forwarding: Add missing 'rp_filter' configuration
    - tcp: fix to update snd_wl1 in bulk receiver fast
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
Thanks Waiki!

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Fix Committed
Status in linux source package in Hirsute:
  In Progress
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
--- Comment From wa...@us.ibm.com 2020-11-17 11:58 EDT---
Verified for focal and groovy:

20.04:
waiki@ltc-wspoon3:~/DI/code/boot$ ~/mpe.py vmlinux-5.4.0-55-generic System.map-5.4.0-55-generic
stvx found using register r28:
c002cbfc:   ce e1 00 7c   stvx   v0,0,r28
addi found using offset 32:
c002cbf4:   20 00 81 3b   addi   r28,r1,32
OK - offset is aligned

20.10:
waiki@ltc-wspoon3:~/DI/code/boot$ ~/mpe.py vmlinux-5.8.0-30-generic System.map-5.8.0-30-generic
stvx found using register r9:
c0025a78:   ce 49 00 7c   stvx   v0,0,r9
addi found using offset 32:
c0025a70:   20 00 21 39   addi   r9,r1,32
OK - offset is aligned

** Tags removed: verification-needed-focal verification-needed-groovy
** Tags added: verification-done-focal verification-done-groovy

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Fix Committed
Status in linux source package in Hirsute:
  In Progress
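
The verification above checks that the register used by stvx was loaded with r1 plus an offset that is a multiple of 16. A minimal sketch of that kind of check follows. It is not the mpe.py script from the comment, whose source is not shown here. It assumes disassembly lines in the usual objdump form ("addi rN,r1,OFFSET" and "stvx vM,0,rN") and that r1 itself is 16-byte aligned, as the 64-bit PowerPC ELF ABI requires of the stack pointer. All names in the block are hypothetical.

```python
import re

# Hypothetical patterns for the two instructions of interest.
ADDI_RE = re.compile(r"\baddi\s+(r\d+),r1,(-?\d+)")
STVX_RE = re.compile(r"\bstvx\s+v\d+,0,(r\d+)")


def stvx_offset_is_aligned(disassembly: str) -> bool:
    """Return True if the stvx target register is r1 plus a multiple of 16."""
    stvx = STVX_RE.search(disassembly)
    if stvx is None:
        raise ValueError("no stvx instruction found")
    target = stvx.group(1)
    for reg, offset in ADDI_RE.findall(disassembly):
        if reg == target:
            return int(offset) % 16 == 0
    raise ValueError(f"no 'addi {target},r1,<offset>' found")


if __name__ == "__main__":
    sample = (
        "c002cbf4: 20 00 81 3b addi r28,r1,32\n"
        "c002cbfc: ce e1 00 7c stvx v0,0,r28\n"
    )
    verdict = "aligned" if stvx_offset_is_aligned(sample) else "NOT aligned"
    print(f"offset is {verdict}")
```

The sketch looks only at the stack offset, so it says nothing about code paths where the register is computed some other way; a real check would need to follow the compiler's actual instruction sequence.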
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-groovy' to 'verification-done-groovy'. If the
problem still exists, change the tag 'verification-needed-groovy' to
'verification-failed-groovy'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Fix Committed
Status in linux source package in Hirsute:
  In Progress
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-focal' to 'verification-done-focal'. If the
problem still exists, change the tag 'verification-needed-focal' to
'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how
to enable and use -proposed. Thank you!

** Tags added: verification-needed-focal

** Tags added: verification-needed-groovy

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Fix Committed
Status in linux source package in Hirsute:
  In Progress
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
** Changed in: linux (Ubuntu Focal)
       Status: In Progress => Fix Committed

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Fix Committed
Status in linux source package in Hirsute:
  In Progress
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
** Changed in: linux (Ubuntu Groovy)
       Status: In Progress => Fix Committed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1902694

Title: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems

Status in The Ubuntu-power-systems project: In Progress
Status in linux package in Ubuntu: In Progress
Status in linux source package in Focal: In Progress
Status in linux source package in Groovy: Fix Committed
Status in linux source package in Hirsute: In Progress

Bug description:

SRU Justification:

[Impact]

* A data integrity issue was observed on POWER 9 (DD2.1) systems.
* It affects Ubuntu 20.04 with kernel 5.4.0-52 and Ubuntu 20.10 with kernel 5.8.0-26.
* The root cause lies in how p9_hmi_special_emu() is compiled.
* When a VMX store (in __get_user_atomic_128_aligned()) writes to a buffer (vbuf), the buffer is not 128-bit aligned.

[Fix]

* 1da4a0272c54 "powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation"
* d1781f237047 "selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround"

[Test Case]

* A POWER 9 (DD2.1) bare metal system is needed that has Ubuntu 20.04, 20.10 or 21.04 installed.
* It is best to test this with a sample application and the test case "selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround".

[Regression Potential]

* The regression risk is relatively moderate, because:
  * it only happens when special VSX (vector) instructions are in use, e.g. in p9_hmi_special_emu
  * it happens on bare metal only, and only on POWER 9 (DD2.1)
  * the changes are very small and easy to review (in total one effective code line per patch/commit)
* Since only p9_hmi_special_emu is touched, any regression would break only that function, which this bug already shows to be broken.

[Other]

* According to the reporter this affects Ubuntu 20.04 / 5.4.0-52 and 20.10 / 5.8.0-26.
* Since development of Hirsute is already open, the SRU is requested for Hirsute, too.
* The patches were accepted upstream in v5.10-rc1 and v5.10-rc2.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1902694/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
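The root cause named in the bug description (a VMX store into a buffer that is not 128-bit aligned) can be illustrated with a small simulation. This is a toy model, not kernel code: memory is a Python bytearray, and the helper names (is_aligned, aligned_vector_store, emulate_copy) are hypothetical. It assumes the store behaves like the Power ISA stvx instruction, which ignores the low four bits of the effective address, so a store aimed at a misaligned buffer lands at the enclosing 16-byte boundary without raising any fault.

```python
# Toy model, not kernel code: memory is a bytearray, addresses are offsets into it.
VECTOR_BYTES = 16  # 128 bits


def is_aligned(addr: int, alignment: int = VECTOR_BYTES) -> bool:
    """Return True if addr is a multiple of alignment (a power of two)."""
    return addr & (alignment - 1) == 0


def aligned_vector_store(mem: bytearray, addr: int, data: bytes) -> int:
    """Store 16 bytes the way an aligned-only vector store would.

    Assumption: the low four address bits are ignored, so the write lands
    at the enclosing 16-byte boundary. Returns the address actually written.
    """
    assert len(data) == VECTOR_BYTES
    effective = addr & ~(VECTOR_BYTES - 1)
    mem[effective:effective + VECTOR_BYTES] = data
    return effective


def emulate_copy(vbuf_addr: int) -> bytes:
    """Store a known pattern at vbuf_addr, then read 16 bytes back from it."""
    mem = bytearray(64)
    pattern = bytes(range(1, 17))
    aligned_vector_store(mem, vbuf_addr, pattern)
    return bytes(mem[vbuf_addr:vbuf_addr + VECTOR_BYTES])


pattern = bytes(range(1, 17))
print(emulate_copy(32) == pattern)  # aligned buffer: data round-trips
print(emulate_copy(40) == pattern)  # misaligned buffer: data does not round-trip
```

In this model the misaligned case returns only the upper half of the pattern followed by stale bytes, and nothing signals the error, which matches the "undetected" nature of the corruption described above.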
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
** Changed in: ubuntu-power-systems
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => Patricia Domingues (patriciasd)

Status in The Ubuntu-power-systems project: In Progress
Status in linux package in Ubuntu: In Progress
Status in linux source package in Focal: In Progress
Status in linux source package in Groovy: In Progress
Status in linux source package in Hirsute: In Progress
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
SRU submission: https://lists.ubuntu.com/archives/kernel-team/2020-November/thread.html#114535

** Changed in: linux (Ubuntu Hirsute)
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) => (unassigned)

** Changed in: linux (Ubuntu Focal)
       Status: New => In Progress

** Changed in: linux (Ubuntu Groovy)
       Status: New => In Progress

** Changed in: linux (Ubuntu Hirsute)
       Status: New => In Progress

** Changed in: ubuntu-power-systems
       Status: New => In Progress

Status in The Ubuntu-power-systems project: In Progress
Status in linux package in Ubuntu: In Progress
Status in linux source package in Focal: In Progress
Status in linux source package in Groovy: In Progress
Status in linux source package in Hirsute: In Progress
[Kernel-packages] [Bug 1902694] Re: Undetected Data corruption in MPI workloads that use VSX for reductions on POWER9 DD2.1 systems
** Also affects: linux (Ubuntu Groovy)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Hirsute)
   Importance: Undecided
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
       Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
       Status: New

Status in The Ubuntu-power-systems project: New
Status in linux package in Ubuntu: New
Status in linux source package in Focal: New
Status in linux source package in Groovy: New
Status in linux source package in Hirsute: New