[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
This bug was fixed in the package linux-azure - 5.4.0-1065.68

---
linux-azure (5.4.0-1065.68) focal; urgency=medium

  * focal/linux-azure: 5.4.0-1065.68 -proposed tracker (LP: #1952290)

  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
    - [Config] azure: enable CONFIG_DEBUG_INFO_BTF

  * Support builtin revoked certificates (LP: #1932029)
    - [Config] azure: set CONFIG_SYSTEM_REVOCATION_KEYS

  * Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24 (LP: #1952621)
    - PCI/sysfs: Convert "config" to static attribute

  * linux-azure: add Icelake servers support in no-HWP mode to cpufreq/intel_pstate driver (LP: #1952234)
    - cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode

  [ Ubuntu: 5.4.0-92.103 ]

  * focal/linux: 5.4.0-92.103 -proposed tracker (LP: #1952316)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync update-dkms-versions helper
    - debian/dkms-versions -- update from kernel-versions (main/2021.11.29)

  * CVE-2021-4002
    - tlb: mmu_gather: add tlb_flush_*_range APIs
    - hugetlbfs: flush TLBs correctly after huge_pmd_unshare

  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
    - [Config] Enable CONFIG_DEBUG_INFO_BTF on all arches

  * Focal linux-azure: Vm crash on Dv5/Ev5 (LP: #1950462)
    - KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
    - jump_label: Fix usage in module __init

  * Support builtin revoked certificates (LP: #1932029)
    - Revert "UBUNTU: SAUCE: (lockdown) Make get_cert_list() not complain about cert lists that aren't present."
    - integrity: Move import of MokListRT certs to a separate routine
    - integrity: Load certs from the EFI MOK config table
    - certs: Add ability to preload revocation certs
    - integrity: Load mokx variables into the blacklist keyring
    - certs: add 'x509_revocation_list' to gitignore
    - SAUCE: Dump stack when X.509 certificates cannot be loaded
    - [Packaging] build canonical-revoked-certs.pem from branch/arch certs
    - [Packaging] Revoke 2012 UEFI signing certificate as built-in
    - [Config] Configure CONFIG_SYSTEM_REVOCATION_KEYS with revoked keys

  * Support importing mokx keys into revocation list from the mok table (LP: #1928679)
    - efi: Support for MOK variable config table
    - efi: mokvar-table: fix some issues in new code
    - efi: mokvar: add missing include of asm/early_ioremap.h
    - efi/mokvar: Reserve the table only if it is in boot services data
    - SAUCE: integrity: add informational messages when revoking certs

  * Support importing mokx keys into revocation list from the mok table (LP: #1928679) // CVE-2020-26541 when certificates are revoked via MokListXRT.
    - SAUCE: integrity: Load mokx certs from the EFI MOK config table

  * Focal update: v5.4.157 upstream stable release (LP: #1951883)
    - ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
    - ARM: 9134/1: remove duplicate memcpy() definition
    - ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
    - ARM: 9141/1: only warn about XIP address when not compile testing
    - ipv6: use siphash in rt6_exception_hash()
    - ipv4: use siphash instead of Jenkins in fnhe_hashfun()
    - usbnet: sanity check for maxpacket
    - usbnet: fix error return code in usbnet_probe()
    - Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
    - ata: sata_mv: Fix the error handling of mv_chip_id()
    - nfc: port100: fix using -ERRNO as command type mask
    - net/tls: Fix flipped sign in tls_err_abort() calls
    - mmc: vub300: fix control-message timeouts
    - mmc: cqhci: clear HALT state after CQE enable
    - mmc: dw_mmc: exynos: fix the finding clock sample value
    - mmc: sdhci: Map more voltage level to SDHCI_POWER_330
    - mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
    - cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
    - net: lan78xx: fix division by zero in send path
    - tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
    - IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
    - IB/hfi1: Fix abba locking issue with sc_disable()
    - nvmet-tcp: fix data digest pointer calculation
    - nvme-tcp: fix data digest pointer calculation
    - RDMA/mlx5: Set user priority for DCT
    - arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
    - regmap: Fix possible double-free in regcache_rbtree_exit()
    - net: batman-adv: fix error handling
    - net: Prevent infinite while loop in skb_tx_hash()
    - RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
    - nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
    - net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume fails
    - net: ethernet: microchip: lan743x: Fix dma allocation failure by using dma_set_mask_and_coherent
    - net: nxp: lpc_eth.c: avoid
[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
This bug was fixed in the package linux-azure-5.4 - 5.4.0-1065.68~18.04.1

---
linux-azure-5.4 (5.4.0-1065.68~18.04.1) bionic; urgency=medium

  * bionic/linux-azure-5.4: 5.4.0-1065.68~18.04.1 -proposed tracker (LP: #1952289)

  [ Ubuntu: 5.4.0-1065.68 ]

  * focal/linux-azure: 5.4.0-1065.68 -proposed tracker (LP: #1952290)

  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
    - [Config] azure: enable CONFIG_DEBUG_INFO_BTF

  * Support builtin revoked certificates (LP: #1932029)
    - [Config] azure: set CONFIG_SYSTEM_REVOCATION_KEYS

  * Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24 (LP: #1952621)
    - PCI/sysfs: Convert "config" to static attribute

  * linux-azure: add Icelake servers support in no-HWP mode to cpufreq/intel_pstate driver (LP: #1952234)
    - cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode

  * focal/linux: 5.4.0-92.103 -proposed tracker (LP: #1952316)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync update-dkms-versions helper
    - debian/dkms-versions -- update from kernel-versions (main/2021.11.29)

  * CVE-2021-4002
    - tlb: mmu_gather: add tlb_flush_*_range APIs
    - hugetlbfs: flush TLBs correctly after huge_pmd_unshare

  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
    - [Config] Enable CONFIG_DEBUG_INFO_BTF on all arches

  * Focal linux-azure: Vm crash on Dv5/Ev5 (LP: #1950462)
    - KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
    - jump_label: Fix usage in module __init

  * Support builtin revoked certificates (LP: #1932029)
    - Revert "UBUNTU: SAUCE: (lockdown) Make get_cert_list() not complain about cert lists that aren't present."
    - integrity: Move import of MokListRT certs to a separate routine
    - integrity: Load certs from the EFI MOK config table
    - certs: Add ability to preload revocation certs
    - integrity: Load mokx variables into the blacklist keyring
    - certs: add 'x509_revocation_list' to gitignore
    - SAUCE: Dump stack when X.509 certificates cannot be loaded
    - [Packaging] build canonical-revoked-certs.pem from branch/arch certs
    - [Packaging] Revoke 2012 UEFI signing certificate as built-in
    - [Config] Configure CONFIG_SYSTEM_REVOCATION_KEYS with revoked keys

  * Support importing mokx keys into revocation list from the mok table (LP: #1928679)
    - efi: Support for MOK variable config table
    - efi: mokvar-table: fix some issues in new code
    - efi: mokvar: add missing include of asm/early_ioremap.h
    - efi/mokvar: Reserve the table only if it is in boot services data
    - SAUCE: integrity: add informational messages when revoking certs

  * Support importing mokx keys into revocation list from the mok table (LP: #1928679) // CVE-2020-26541 when certificates are revoked via MokListXRT.
    - SAUCE: integrity: Load mokx certs from the EFI MOK config table

  * Focal update: v5.4.157 upstream stable release (LP: #1951883)
    - ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
    - ARM: 9134/1: remove duplicate memcpy() definition
    - ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
    - ARM: 9141/1: only warn about XIP address when not compile testing
    - ipv6: use siphash in rt6_exception_hash()
    - ipv4: use siphash instead of Jenkins in fnhe_hashfun()
    - usbnet: sanity check for maxpacket
    - usbnet: fix error return code in usbnet_probe()
    - Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
    - ata: sata_mv: Fix the error handling of mv_chip_id()
    - nfc: port100: fix using -ERRNO as command type mask
    - net/tls: Fix flipped sign in tls_err_abort() calls
    - mmc: vub300: fix control-message timeouts
    - mmc: cqhci: clear HALT state after CQE enable
    - mmc: dw_mmc: exynos: fix the finding clock sample value
    - mmc: sdhci: Map more voltage level to SDHCI_POWER_330
    - mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
    - cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
    - net: lan78xx: fix division by zero in send path
    - tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
    - IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
    - IB/hfi1: Fix abba locking issue with sc_disable()
    - nvmet-tcp: fix data digest pointer calculation
    - nvme-tcp: fix data digest pointer calculation
    - RDMA/mlx5: Set user priority for DCT
    - arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
    - regmap: Fix possible double-free in regcache_rbtree_exit()
    - net: batman-adv: fix error handling
    - net: Prevent infinite while loop in skb_tx_hash()
    - RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
    - nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
    - net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume fails
    - net: ethernet: microchip:
[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
Verified by Microsoft.

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1952621

Title:
  Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

Status in linux-azure package in Ubuntu:
  Invalid
Status in linux-azure-5.4 package in Ubuntu:
  New
Status in linux-azure source package in Bionic:
  Invalid
Status in linux-azure-5.4 source package in Bionic:
  Fix Committed
Status in linux-azure source package in Focal:
  Fix Committed
Status in linux-azure-5.4 source package in Focal:
  Invalid

Bug description:
  SRU Justification

  [Impact]

  During large scale deployment testing, we found the call trace below when provisioning an Ubuntu 18.04 VM with size Standard_NV24. An engineer deployed the instance 10 times and encountered it once. It looks like a race condition during device probing, but in the end all devices are probed successfully.

  [ 4.938162] sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/47505500-0003--3130-444531334632/pci0003:00/0003:00:00.0/config'
  [ 4.944816] sr 5:0:0:0: [sr0] scsi3-mmc drive: 0x/0x tray
  [ 4.951818] CPU: 0 PID: 135 Comm: kworker/0:2 Not tainted 5.4.0-1061-azure #64~18.04.1-Ubuntu
  [ 4.951820] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
  [ 4.958943] cdrom: Uniform CD-ROM driver Revision: 3.20
  [ 4.955812] Workqueue: hv_pri_chan vmbus_add_channel_work
  [ 4.955812] Call Trace:
  [ 4.955812] dump_stack+0x57/0x6d
  [ 4.955812] sysfs_warn_dup+0x5b/0x70
  [ 4.955812] sysfs_add_file_mode_ns+0x158/0x180
  [ 4.955812] sysfs_create_bin_file+0x64/0x90
  [ 4.955812] pci_create_sysfs_dev_files+0x72/0x270
  [ 4.955812] pci_bus_add_device+0x30/0x80
  [ 4.955812] pci_bus_add_devices+0x31/0x70
  [ 4.955812] hv_pci_probe+0x48c/0x650
  [ 4.955812] vmbus_probe+0x3e/0x90
  [ 4.955812] really_probe+0xf5/0x440
  [ 4.955812] driver_probe_device+0x11b/0x130
  [ 4.955812] __device_attach_driver+0x7b/0xe0
  [ 4.955812] ? driver_allows_async_probing+0x60/0x60
  [ 4.955812] bus_for_each_drv+0x6e/0xb0
  [ 4.955812] __device_attach+0xe4/0x160
  [ 4.955812] device_initial_probe+0x13/0x20
  [ 4.955812] bus_probe_device+0x92/0xa0
  [ 4.955812] device_add+0x402/0x690
  [ 4.955812] device_register+0x1a/0x20
  [ 4.955812] vmbus_device_register+0x5e/0xf0
  [ 4.955812] vmbus_add_channel_work+0x2c4/0x640
  [ 4.955812] process_one_work+0x209/0x400
  [ 4.955812] worker_thread+0x34/0x400
  [ 4.955812] kthread+0x121/0x140
  [ 4.955812] ? process_one_work+0x400/0x400
  [ 4.955812] ? kthread_park+0x90/0x90
  [ 4.955812] ret_from_fork+0x35/0x40
  [ 5.043612] hv_pci 47505500-0004-0001-3130-444531334632: PCI VMBus probing: Using version 0x10002
  [ 5.260563] hv_pci 47505500-0004-0001-3130-444531334632: PCI host bridge to bus 0004:00

  Dexuan did some research and it looks like this is a longstanding race condition bug in the generic PCI subsystem (due to the timing, there can be more than 1 place where the PCI code tries to create the same ‘config’ sysfs file):
  https://patchwork.kernel.org/project/linux-pci/patch/20200716110423.xtfyb3n6tn5ixedh@pali/#23669641

  The bug was reported on 7/16/2020, and the last reply was on 6/25/2021. It looks like this has not been fixed after 1+ year…

  Business Impact

  [Test Case]

  Repeated deployment on a Standard_NV24 instance. MS reported the reproduction rate is 3/551 before the patch, and 0/838 with the patch.

  [Where things could go wrong]

  Deployments could fail for other reasons.

  [Other info]

  SF: #00321027

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1952621/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
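The [Test Case] figures in the bug description (3 failures in 551 deployments before the patch, 0 in 838 with it) can be sanity-checked with a little arithmetic. The sketch below assumes each deployment is an independent trial that hits the race with the same pre-patch probability; that assumption, and the function name, are illustrative and not part of the bug report.

```python
# Sanity check on the reproduction rates quoted in [Test Case].
# Assumption (not from the bug report): deployments are independent
# trials that each hit the race with the same probability.


def prob_zero_failures(fail_before: int, runs_before: int, runs_after: int) -> float:
    """Chance of seeing no failures in runs_after trials at the pre-patch rate."""
    rate = fail_before / runs_before
    return (1.0 - rate) ** runs_after


if __name__ == "__main__":
    p = prob_zero_failures(3, 551, 838)
    print(f"pre-patch failure rate: {3 / 551:.4%}")
    print(f"P(0 failures in 838 runs if the race were unfixed): {p:.4f}")
```

Under that assumption, a clean run of 838 deployments would be unlikely (on the order of 1%) if the race were still present, which supports the verification result without proving it.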
[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
This bug is awaiting verification that the linux-azure/5.4.0-1065.68 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-focal
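When testing a kernel from -proposed, one way to check a boot for this symptom is to scan the kernel log for the `sysfs: cannot create duplicate filename` warning quoted in the bug description. A minimal Python sketch, assuming the log text has already been captured (for example from `dmesg` output); the function name and the sample path are hypothetical and for illustration only.

```python
import re
from typing import List

# Matches the warning text from the bug description and captures the sysfs path.
DUP_RE = re.compile(r"sysfs: cannot create duplicate filename '([^']+)'")


def find_duplicate_sysfs_paths(log_text: str) -> List[str]:
    """Return every sysfs path reported as a duplicate in log_text."""
    return DUP_RE.findall(log_text)


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real boot.
    sample = (
        "[ 1.000000] sysfs: cannot create duplicate filename "
        "'/devices/example/config'\n"
        "[ 1.100000] cdrom: Uniform CD-ROM driver Revision: 3.20\n"
    )
    for path in find_duplicate_sysfs_paths(sample):
        print("duplicate sysfs entry:", path)
```

An empty result on one boot does not show the race is gone, since the report describes it as intermittent; repeated deployments are still needed.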
[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
** Changed in: linux-azure (Ubuntu Focal)
       Status: In Progress => Fix Committed

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
       Status: In Progress => Fix Committed
[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
** Also affects: linux-azure-5.4 (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: linux-azure-5.4 (Ubuntu Focal)
       Status: New => Invalid

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
       Status: New => In Progress

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
     Assignee: (unassigned) => Tim Gardner (timg-tpi)

** Changed in: linux-azure (Ubuntu Bionic)
       Status: New => Invalid

** Changed in: linux-azure (Ubuntu)
     Assignee: Tim Gardner (timg-tpi) => (unassigned)

** Changed in: linux-azure (Ubuntu)
       Status: New => Invalid

** Changed in: linux-azure (Ubuntu Focal)
       Status: New => In Progress

** Changed in: linux-azure (Ubuntu Focal)
     Assignee: (unassigned) => Tim Gardner (timg-tpi)