[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
** Changed in: linux (Ubuntu)
       Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1696165

Title:
  [SRU][Zesty] fix soft lockup on overcommited hugepages

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Zesty:
  Fix Released

Bug description:
  [Impact]
  On failing to migrate a page, soft_offline_huge_page() performs the
  necessary update to the hugepage ref-count. But when
  !hugepage_migration_supported(), unmap_and_move_hugepage() also
  decrements the page ref-count for the hugepage. The combined behaviour
  leaves the ref-count in an inconsistent state. This leads to soft
  lockups when running the overcommitted hugepage test from the mce-test
  suite.

  [Testing]
  Run mce-test/cases/function/hwpoison/run_hugepage_overcommit.sh and
  you should see a soft lockup if hugepage migration support is not
  enabled.

  [Fix]
  upstream commit:
  30809f559a0d mm/migrate: fix refcount handling when !hugepage_migration_supported()

  [Regression Potential]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1696165/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
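The [Impact] text in the bug description describes a reference that is released twice on one failure path. The following is only a toy model of that bookkeeping, written to make the imbalance visible; it is not the kernel code. The function name refs_after_failed_offline and the flag callee_also_drops are hypothetical, and the single "take a reference, then release it" step is a simplifying assumption about what the caller does.

```python
def refs_after_failed_offline(initial_refs: int, callee_also_drops: bool) -> int:
    """Toy model of one failed soft-offline attempt on a hugepage.

    initial_refs      -- reference count before the attempt (assumed value)
    callee_also_drops -- True models the behaviour described in the bug,
                         where the callee's "migration not supported" exit
                         path drops a reference as well
    Returns the reference count after the attempt.
    """
    refs = initial_refs

    # Caller side: pin the page before asking for it to be migrated.
    refs += 1

    # Callee side: migration is not supported, so it bails out early.
    if callee_also_drops:
        refs -= 1  # the extra drop that the caller does not know about

    # Caller side: migration failed, so release the reference taken above.
    refs -= 1

    return refs


# Balanced handling leaves the count where it started.
balanced = refs_after_failed_offline(initial_refs=1, callee_also_drops=False)

# The double drop leaves the count one lower than it started.
unbalanced = refs_after_failed_offline(initial_refs=1, callee_also_drops=True)
```

With one initial reference, the balanced path returns 1 and the double-drop path returns 0, which is the kind of inconsistent state the description refers to.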
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
This bug was fixed in the package linux - 4.10.0-28.32

---
linux (4.10.0-28.32) zesty; urgency=low

  * linux: 4.10.0-28.32 -proposed tracker (LP: #1701013)

  * KILLER1435-S[0489:e0a2] BT cannot search BT 4.0 device (LP: #1699651)
    - Bluetooth: btusb: Add support for 0489:e0a2 QCA_ROME device

  * aacraid driver may return uninitialized stack data to userspace
    (LP: #1700077)
    - SAUCE: scsi: aacraid: Don't copy uninitialized stack memory to userspace

  * CVE-2017-9605
    - drm/vmwgfx: Make sure backup_handle is always valid

  * CVE-2017-1000380
    - ALSA: timer: Fix race between read and ioctl
    - ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT

  * XDP eBPF programs fail to verify on Zesty ppc64el (LP: #1699627)
    - [Config] ppc64el: build for Power8 not Power7

  * AACRAID for power9 platform (LP: #1689980)
    - scripts/spelling.txt: add "therfore" pattern and fix typo instances
    - scsi: aacraid: fix PCI error recovery path
    - scsi: aacraid: pci_alloc_consistent() failures on ARM64
    - scsi: aacraid: Remove __GFP_DMA for raw srb memory
    - scsi: aacraid: Fix DMAR issues with iommu=pt
    - scsi: aacraid: Added 32 and 64 queue depth for arc natives
    - scsi: aacraid: Set correct Queue Depth for HBA1000 RAW disks
    - scsi: aacraid: Remove reset support from check_health
    - scsi: aacraid: Change wait time for fib completion
    - scsi: aacraid: Log count info of scsi cmds before reset
    - scsi: aacraid: Print ctrl status before eh reset
    - scsi: aacraid: Using single reset mask for IOP reset
    - scsi: aacraid: Rework IOP reset
    - scsi: aacraid: Add periodic checks to see IOP reset status
    - scsi: aacraid: Rework SOFT reset code
    - scsi: aacraid: Rework aac_src_restart
    - scsi: aacraid: Use correct function to get ctrl health
    - scsi: aacraid: Make sure ioctl returns on controller reset
    - scsi: aacraid: Enable ctrl reset for both hba and arc
    - scsi: aacraid: Add reset debugging statements
    - scsi: aacraid: Remove reference to Series-9
    - scsi: aacraid: Update driver version to 50834

  * arm64 kernel crashdump support (LP: #1694859)
    - memblock: add memblock_clear_nomap()
    - memblock: add memblock_cap_memory_range()
    - arm64: limit memory regions based on DT property, usable-memory-range
    - arm64: kdump: reserve memory for crash dump kernel
    - arm64: mm: add set_memory_valid()
    - arm64: mm: use phys_addr_t instead of unsigned long in __map_memblock
    - arm64: kdump: protect crash dump kernel memory
    - arm64: hibernate: preserve kdump image around hibernation
    - arm64: kdump: implement machine_crash_shutdown()
    - arm64: kdump: add VMCOREINFO's for user-space tools
    - [Config] CONFIG_CRASH_DUMP=y on arm64
    - arm64: kdump: provide /proc/vmcore file
    - Documentation: kdump: describe arm64 port
    - Documentation: dt: chosen properties for arm64 kdump
    - efi/libstub/arm*: Set default address and size cells values for an empty
      dtb

  * hibmc driver does not include "pci:" prefix in bus ID (LP: #1698700)
    - SAUCE: drm: hibmc: Use set_busid function from drm core

  * Processes in "D" state due to zap_pid_ns_processes kernel call with Ubuntu
    + Docker (LP: #1698264)
    - pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes

  * Bugfixes for hns network driver (LP: #1696031)
    - hns_enet: use cpumask_var_t for on-stack mask
    - net: hns: fix uninitialized data use
    - net: hns: avoid gcc-7.0.1 warning for uninitialized data
    - net: hns: Add ACPI support to check SFP present
    - net: hns: Fix the implementation of irq affinity function
    - net: hns: Modify GMAC init TX threshold value
    - net: hns: Optimize the code for GMAC pad and crc Config
    - net: hns: Remove redundant memset during buffer release
    - net: hns: bug fix of ethtool show the speed
    - net: hns: Optimize hns_nic_common_poll for better performance
    - net: hns: Fix to adjust buf_size of ring according to mtu
    - net: hns: Replace netif_tx_lock to ring spin lock
    - net: hns: Correct HNS RSS key set function
    - net: hns: Remove the redundant adding and deleting mac function
    - net: hns: Remove redundant mac_get_id()
    - net: hns: Remove redundant mac table operations
    - net: hns: Clean redundant code from hns_mdio.c file
    - net: hns: Optimise the code in hns_mdio_wait_ready()
    - net: hns: Simplify the exception sequence in hns_ppe_init()
    - net: hns: Adjust the SBM module buffer threshold
    - net: hns: Avoid Hip06 chip TX packet line bug
    - net: hns: Some checkpatch.pl script & warning fixes
    - net: hns: support deferred probe when can not obtain irq
    - net: hns: support deferred probe when no mdio
    - net: hns: fix ethtool_get_strings overflow in hns driver

  * CVE-2017-7346
    - drm/vmwgfx: limit the number of mip levels in
      vmw_gb_surface_define_ioctl()

  * [SRU][Zesty] qcom_emac is unable to get ip address with at803x
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
== With hwpoison-inject module ==

The test case fails as expected and does not result in soft lockups.

$ uname -r
4.10.0-28-generic
$ sudo ./run_hugepage_overcommit.sh
sudo: unable to resolve host awsdp0
hwpoison-inject module is loaded.
*** Pay attention: This test checks that hugepage soft-offlining works under overcommitting. ***
- TestCase ./thugetlb_overcommit 1
FAIL: migration failed.
Unpoisoning.
Num of Executed Test Case: 1  Num of Failed Case: 1
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
== Testing kernel in -proposed ==

The kernel installs and boots fine on the QDF2400 platform.

$ uname -a
Linux awsdp0 4.10.0-28-generic #32-Ubuntu SMP Fri Jun 30 05:33:10 UTC 2017 aarch64 aarch64 aarch64 GNU/Linux

$ apt policy linux-image-4.10.0-28-generic
linux-image-4.10.0-28-generic:
  Installed: 4.10.0-28.32
  Candidate: 4.10.0-28.32
  Version table:
 *** 4.10.0-28.32 500
        500 http://ports.ubuntu.com/ubuntu-ports zesty-proposed/main arm64 Packages
        100 /var/lib/dpkg/status

The test case needs the hwpoison-inject module to run, but that module
is not enabled in the Ubuntu configs by default.

$ sudo ./run_hugepage_overcommit.sh
sysctl: cannot stat /proc/sys/vm/memory_failure_early_kill: No such file or directory
modprobe: FATAL: Module hwpoison-inject not found in directory /lib/modules/4.10.0-28-generic
DIE: Failed to load hwpoison-inject module. Abort.
DIE: Failed to load hwpoison-inject module. Abort.

I can confirm that after rebuilding the kernel with the hwpoison-inject
module and running the test, the kernel does not hit soft lockups and
the test works as expected.

** Tags removed: verification-needed-zesty
** Tags added: verification-done-zesty
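Since the test aborts when hwpoison-inject cannot be loaded, it can help to check the kernel configuration before running it. The sketch below reads a Kconfig symbol out of kernel config text. It assumes the config is installed as /boot/config-<release> (the usual Ubuntu location) and that CONFIG_HWPOISON_INJECT and CONFIG_MEMORY_FAILURE are the symbols of interest; the helper names are hypothetical.

```python
import platform
import re
from pathlib import Path
from typing import Optional


def kconfig_value(config_text: str, symbol: str) -> Optional[str]:
    """Return the value of a Kconfig symbol ("y", "m", ...) or None.

    A line such as "# CONFIG_FOO is not set" does not match, so a
    disabled symbol and a missing symbol both yield None.
    """
    pattern = re.compile(rf"^{re.escape(symbol)}=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


def running_kernel_config_path() -> Path:
    """Assumed location of the running kernel's config on Ubuntu."""
    return Path("/boot") / f"config-{platform.release()}"


# Example with inline text instead of a real file:
sample_config = (
    "CONFIG_MEMORY_FAILURE=y\n"
    "# CONFIG_HWPOISON_INJECT is not set\n"
)
memory_failure = kconfig_value(sample_config, "CONFIG_MEMORY_FAILURE")
hwpoison_inject = kconfig_value(sample_config, "CONFIG_HWPOISON_INJECT")
```

On a real system the text would come from running_kernel_config_path().read_text(); a result of None for CONFIG_HWPOISON_INJECT would be consistent with the modprobe failure shown in the transcript.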
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-zesty' to 'verification-done-zesty'. If the problem
still exists, change the tag 'verification-needed-zesty' to
'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!

** Tags added: verification-needed-zesty
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
** Also affects: linux (Ubuntu Zesty)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Zesty)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Zesty)
       Status: New => Fix Committed
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
A test kernel is available in this PPA:
https://launchpad.net/~centriq-team/+archive/ubuntu/lp1696165/

Boot tested on Power8:

ubuntu@manjo-srutest:~$ uname -a
Linux manjo-srutest 4.10.0-22-generic #24~lp1696165+softlockup.1-Ubuntu SMP Wed Jun 14 19:58:24 UTC 20 ppc64le ppc64le ppc64le GNU/Linux

Boot tested on AMD64:

ubuntu@adib:~$ uname -a
Linux adib 4.10.0-22-generic #24~lp1696165+softlockup.1-Ubuntu SMP Wed Jun 14 20:01:20 UTC 20 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@adib:~$
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
== With patch applied ==

ubuntu@ubuntu:~/testing/mce-test/cases/function/hwpoison$ uname -a
Linux ubuntu 4.10.0-22-generic #24~lp1696165+softlockup.1 SMP Wed Jun 14 19:05:07 UTC 2017 aarch64 aarch64 aarch64 GNU/Linux

ubuntu@ubuntu:~/testing/mce-test/cases/function/hwpoison$ sudo ./run_hugepage_overcommit.sh
[sudo] password for ubuntu:
hwpoison-inject module is loaded.
*** Pay attention: This test checks that hugepage soft-offlining works under overcommitting. ***
- TestCase ./thugetlb_overcommit 1
FAIL: migration failed.
Unpoisoning.
Num of Executed Test Case: 1  Num of Failed Case: 1

The test case failure is expected because hugepage migration is not
enabled in the Ubuntu configs. Please note that we no longer see soft
lockups. The patch fixed that bug.
[Kernel-packages] [Bug 1696165] Re: [SRU][Zesty] fix soft lockup on overcommited hugepages
== Without the patch ==

ubuntu@ubuntu:~/testing/mce-test/cases/function/hwpoison$ sudo ./run_hugepage_overcommit.sh
[sudo] password for ubuntu:
*** Pay attention: This test checks that hugepage soft-offlining works under overcommitting. ***
- TestCase ./thugetlb_overcommit 1
[ 1628.254754] NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [thugetlb_overco:3154]
[ 1660.668149] INFO: rcu_sched self-detected stall on CPU
[ 1660.672210] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1660.672216] 8-...: (14998 ticks this GP) idle=72f/141/0 softirq=1348/1348 fqs=7389
[ 1660.672217] (detected by 18, t=15002 jiffies, g=3147, c=3146, q=503)
[ 1660.692986] 8-...: (14998 ticks this GP) idle=72f/141/0 softirq=1348/1348 fqs=7392
[ 1660.701752] (t=15009 jiffies g=3147 c=3146 q=503)
[ 1840.695633] INFO: rcu_sched self-detected stall on CPU
[ 1840.699810] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1840.699818] 8-...: (59995 ticks this GP) idle=72f/141/0 softirq=1348/1348 fqs=27921
[ 1840.699818] (t=60007 jiffies g=3147 c=3146 q=1101)
[ 1840.719086] 8-...: (6 ticks this GP) idle=72f/140/0 softirq=1348/1348 fqs=27921
[ 1840.727935] (detected by 1, t=60007 jiffies, g=3147, c=3146, q=1101)
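When comparing runs with and without the patch, it can be easier to scan the console log for the watchdog message than to read it by eye. The sketch below extracts soft lockup reports from log text. The regular expression is modelled only on the single "BUG: soft lockup" line quoted in this thread, so other kernels or log formats may need a different pattern; the names SoftLockup and find_soft_lockups are hypothetical.

```python
import re
from typing import List, NamedTuple


class SoftLockup(NamedTuple):
    timestamp: float    # seconds since boot, from the [ ... ] prefix
    cpu: int            # CPU number reported as stuck
    stuck_seconds: int  # duration reported by the watchdog
    task: str           # task name as printed (may be truncated)
    pid: int


# Matches lines shaped like the watchdog message quoted in this thread.
_SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s.*?BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<task>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text: str) -> List[SoftLockup]:
    """Return one SoftLockup per matching line, in log order."""
    found = []
    for line in log_text.splitlines():
        match = _SOFT_LOCKUP.search(line)
        if match:
            found.append(
                SoftLockup(
                    timestamp=float(match.group("ts")),
                    cpu=int(match.group("cpu")),
                    stuck_seconds=int(match.group("secs")),
                    task=match.group("task"),
                    pid=int(match.group("pid")),
                )
            )
    return found


# Two lines taken from the log in this thread; only the first is a lockup.
sample_log = (
    "[ 1628.254754] NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! "
    "[thugetlb_overco:3154]\n"
    "[ 1660.668149] INFO: rcu_sched self-detected stall on CPU\n"
)
lockups = find_soft_lockups(sample_log)
```

An empty result for a patched kernel and a non-empty result for an unpatched one would match the behaviour reported in this thread.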