[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

2022-01-04 Thread Launchpad Bug Tracker
This bug was fixed in the package linux-azure - 5.4.0-1065.68

---
linux-azure (5.4.0-1065.68) focal; urgency=medium

  * focal/linux-azure: 5.4.0-1065.68 -proposed tracker (LP: #1952290)

  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
- [Config] azure: enable CONFIG_DEBUG_INFO_BTF

  * Support builtin revoked certificates (LP: #1932029)
- [Config] azure: set CONFIG_SYSTEM_REVOCATION_KEYS

  * Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
(LP: #1952621)
- PCI/sysfs: Convert "config" to static attribute

  * linux-azure: add Icelake servers support in no-HWP mode to
cpufreq/intel_pstate driver (LP: #1952234)
- cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode

  [ Ubuntu: 5.4.0-92.103 ]

  * focal/linux: 5.4.0-92.103 -proposed tracker (LP: #1952316)
  * Packaging resync (LP: #1786013)
- [Packaging] resync update-dkms-versions helper
- debian/dkms-versions -- update from kernel-versions (main/2021.11.29)
  * CVE-2021-4002
- tlb: mmu_gather: add tlb_flush_*_range APIs
- hugetlbfs: flush TLBs correctly after huge_pmd_unshare
  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
- [Config] Enable CONFIG_DEBUG_INFO_BTF on all arches
  * Focal linux-azure: Vm crash on Dv5/Ev5 (LP: #1950462)
- KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
- jump_label: Fix usage in module __init
  * Support builtin revoked certificates (LP: #1932029)
- Revert "UBUNTU: SAUCE: (lockdown) Make get_cert_list() not complain about
  cert lists that aren't present."
- integrity: Move import of MokListRT certs to a separate routine
- integrity: Load certs from the EFI MOK config table
- certs: Add ability to preload revocation certs
- integrity: Load mokx variables into the blacklist keyring
- certs: add 'x509_revocation_list' to gitignore
- SAUCE: Dump stack when X.509 certificates cannot be loaded
- [Packaging] build canonical-revoked-certs.pem from branch/arch certs
- [Packaging] Revoke 2012 UEFI signing certificate as built-in
- [Config] Configure CONFIG_SYSTEM_REVOCATION_KEYS with revoked keys
  * Support importing mokx keys into revocation list from the mok table
(LP: #1928679)
- efi: Support for MOK variable config table
- efi: mokvar-table: fix some issues in new code
- efi: mokvar: add missing include of asm/early_ioremap.h
- efi/mokvar: Reserve the table only if it is in boot services data
- SAUCE: integrity: add informational messages when revoking certs
  * Support importing mokx keys into revocation list from the mok table
(LP: #1928679) // CVE-2020-26541 when certificates are revoked via
MokListXRT.
- SAUCE: integrity: Load mokx certs from the EFI MOK config table
  * Focal update: v5.4.157 upstream stable release (LP: #1951883)
- ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
- ARM: 9134/1: remove duplicate memcpy() definition
- ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
- ARM: 9141/1: only warn about XIP address when not compile testing
- ipv6: use siphash in rt6_exception_hash()
- ipv4: use siphash instead of Jenkins in fnhe_hashfun()
- usbnet: sanity check for maxpacket
- usbnet: fix error return code in usbnet_probe()
- Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
- ata: sata_mv: Fix the error handling of mv_chip_id()
- nfc: port100: fix using -ERRNO as command type mask
- net/tls: Fix flipped sign in tls_err_abort() calls
- mmc: vub300: fix control-message timeouts
- mmc: cqhci: clear HALT state after CQE enable
- mmc: dw_mmc: exynos: fix the finding clock sample value
- mmc: sdhci: Map more voltage level to SDHCI_POWER_330
- mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning
  circuit
- cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
- net: lan78xx: fix division by zero in send path
- tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
- IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
- IB/hfi1: Fix abba locking issue with sc_disable()
- nvmet-tcp: fix data digest pointer calculation
- nvme-tcp: fix data digest pointer calculation
- RDMA/mlx5: Set user priority for DCT
- arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
- regmap: Fix possible double-free in regcache_rbtree_exit()
- net: batman-adv: fix error handling
- net: Prevent infinite while loop in skb_tx_hash()
- RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
- nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
- net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume
  fails
- net: ethernet: microchip: lan743x: Fix dma allocation failure by using
  dma_set_mask_and_coherent
- net: nxp: lpc_eth.c: avoid 

[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

2022-01-04 Thread Launchpad Bug Tracker
This bug was fixed in the package linux-azure-5.4 - 5.4.0-1065.68~18.04.1

---
linux-azure-5.4 (5.4.0-1065.68~18.04.1) bionic; urgency=medium

  * bionic/linux-azure-5.4: 5.4.0-1065.68~18.04.1 -proposed tracker
(LP: #1952289)

  [ Ubuntu: 5.4.0-1065.68 ]

  * focal/linux-azure: 5.4.0-1065.68 -proposed tracker (LP: #1952290)
  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
- [Config] azure: enable CONFIG_DEBUG_INFO_BTF
  * Support builtin revoked certificates (LP: #1932029)
- [Config] azure: set CONFIG_SYSTEM_REVOCATION_KEYS
  * Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24
(LP: #1952621)
- PCI/sysfs: Convert "config" to static attribute
  * linux-azure: add Icelake servers support in no-HWP mode to
cpufreq/intel_pstate driver (LP: #1952234)
- cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode
  * focal/linux: 5.4.0-92.103 -proposed tracker (LP: #1952316)
  * Packaging resync (LP: #1786013)
- [Packaging] resync update-dkms-versions helper
- debian/dkms-versions -- update from kernel-versions (main/2021.11.29)
  * CVE-2021-4002
- tlb: mmu_gather: add tlb_flush_*_range APIs
- hugetlbfs: flush TLBs correctly after huge_pmd_unshare
  * Re-enable DEBUG_INFO_BTF where it was disabled (LP: #1945632)
- [Config] Enable CONFIG_DEBUG_INFO_BTF on all arches
  * Focal linux-azure: Vm crash on Dv5/Ev5 (LP: #1950462)
- KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
- jump_label: Fix usage in module __init
  * Support builtin revoked certificates (LP: #1932029)
- Revert "UBUNTU: SAUCE: (lockdown) Make get_cert_list() not complain about
  cert lists that aren't present."
- integrity: Move import of MokListRT certs to a separate routine
- integrity: Load certs from the EFI MOK config table
- certs: Add ability to preload revocation certs
- integrity: Load mokx variables into the blacklist keyring
- certs: add 'x509_revocation_list' to gitignore
- SAUCE: Dump stack when X.509 certificates cannot be loaded
- [Packaging] build canonical-revoked-certs.pem from branch/arch certs
- [Packaging] Revoke 2012 UEFI signing certificate as built-in
- [Config] Configure CONFIG_SYSTEM_REVOCATION_KEYS with revoked keys
  * Support importing mokx keys into revocation list from the mok table
(LP: #1928679)
- efi: Support for MOK variable config table
- efi: mokvar-table: fix some issues in new code
- efi: mokvar: add missing include of asm/early_ioremap.h
- efi/mokvar: Reserve the table only if it is in boot services data
- SAUCE: integrity: add informational messages when revoking certs
  * Support importing mokx keys into revocation list from the mok table
(LP: #1928679) // CVE-2020-26541 when certificates are revoked via
MokListXRT.
- SAUCE: integrity: Load mokx certs from the EFI MOK config table
  * Focal update: v5.4.157 upstream stable release (LP: #1951883)
- ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
- ARM: 9134/1: remove duplicate memcpy() definition
- ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
- ARM: 9141/1: only warn about XIP address when not compile testing
- ipv6: use siphash in rt6_exception_hash()
- ipv4: use siphash instead of Jenkins in fnhe_hashfun()
- usbnet: sanity check for maxpacket
- usbnet: fix error return code in usbnet_probe()
- Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
- ata: sata_mv: Fix the error handling of mv_chip_id()
- nfc: port100: fix using -ERRNO as command type mask
- net/tls: Fix flipped sign in tls_err_abort() calls
- mmc: vub300: fix control-message timeouts
- mmc: cqhci: clear HALT state after CQE enable
- mmc: dw_mmc: exynos: fix the finding clock sample value
- mmc: sdhci: Map more voltage level to SDHCI_POWER_330
- mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning
  circuit
- cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
- net: lan78xx: fix division by zero in send path
- tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
- IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
- IB/hfi1: Fix abba locking issue with sc_disable()
- nvmet-tcp: fix data digest pointer calculation
- nvme-tcp: fix data digest pointer calculation
- RDMA/mlx5: Set user priority for DCT
- arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
- regmap: Fix possible double-free in regcache_rbtree_exit()
- net: batman-adv: fix error handling
- net: Prevent infinite while loop in skb_tx_hash()
- RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
- nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
- net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume
  fails
- net: ethernet: microchip: 

[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

2021-12-16 Thread Tim Gardner
Verified by Microsoft.

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1952621

Title:
  Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

Status in linux-azure package in Ubuntu:
  Invalid
Status in linux-azure-5.4 package in Ubuntu:
  New
Status in linux-azure source package in Bionic:
  Invalid
Status in linux-azure-5.4 source package in Bionic:
  Fix Committed
Status in linux-azure source package in Focal:
  Fix Committed
Status in linux-azure-5.4 source package in Focal:
  Invalid

Bug description:
  SRU Justification

  [Impact]
  During large-scale deployment testing, we found the call trace below when
  provisioning an Ubuntu 18.04 VM of size Standard_NV24. An engineer deployed
  the instance 10 times and hit it once.

  It looks like a race condition during device probing, but all devices
  are eventually probed.

  [ 4.938162] sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/47505500-0003--3130-444531334632/pci0003:00/0003:00:00.0/config'
  [ 4.944816] sr 5:0:0:0: [sr0] scsi3-mmc drive: 0x/0x tray
  [ 4.951818] CPU: 0 PID: 135 Comm: kworker/0:2 Not tainted 5.4.0-1061-azure #64~18.04.1-Ubuntu
  [ 4.951820] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
  [ 4.958943] cdrom: Uniform CD-ROM driver Revision: 3.20
  [ 4.955812] Workqueue: hv_pri_chan vmbus_add_channel_work
  [ 4.955812] Call Trace:
  [ 4.955812] dump_stack+0x57/0x6d
  [ 4.955812] sysfs_warn_dup+0x5b/0x70
  [ 4.955812] sysfs_add_file_mode_ns+0x158/0x180
  [ 4.955812] sysfs_create_bin_file+0x64/0x90
  [ 4.955812] pci_create_sysfs_dev_files+0x72/0x270
  [ 4.955812] pci_bus_add_device+0x30/0x80
  [ 4.955812] pci_bus_add_devices+0x31/0x70
  [ 4.955812] hv_pci_probe+0x48c/0x650
  [ 4.955812] vmbus_probe+0x3e/0x90
  [ 4.955812] really_probe+0xf5/0x440
  [ 4.955812] driver_probe_device+0x11b/0x130
  [ 4.955812] __device_attach_driver+0x7b/0xe0
  [ 4.955812] ? driver_allows_async_probing+0x60/0x60
  [ 4.955812] bus_for_each_drv+0x6e/0xb0
  [ 4.955812] __device_attach+0xe4/0x160
  [ 4.955812] device_initial_probe+0x13/0x20
  [ 4.955812] bus_probe_device+0x92/0xa0
  [ 4.955812] device_add+0x402/0x690
  [ 4.955812] device_register+0x1a/0x20
  [ 4.955812] vmbus_device_register+0x5e/0xf0
  [ 4.955812] vmbus_add_channel_work+0x2c4/0x640
  [ 4.955812] process_one_work+0x209/0x400
  [ 4.955812] worker_thread+0x34/0x400
  [ 4.955812] kthread+0x121/0x140
  [ 4.955812] ? process_one_work+0x400/0x400
  [ 4.955812] ? kthread_park+0x90/0x90
  [ 4.955812] ret_from_fork+0x35/0x40
  [ 5.043612] hv_pci 47505500-0004-0001-3130-444531334632: PCI VMBus probing: Using version 0x10002
  [ 5.260563] hv_pci 47505500-0004-0001-3130-444531334632: PCI host bridge to bus 0004:00
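
When checking many deployments for this warning, a small log scan can help. The following is a minimal sketch, not part of the original report: the function name and the shortened sample path are hypothetical, and only the warning text itself is taken from the trace above.

```python
import re

# Matches the warning text seen in the trace; the quoted sysfs path is captured.
DUP_RE = re.compile(r"sysfs: cannot create duplicate filename '([^']+)'")


def find_duplicate_sysfs_warnings(log_text):
    """Return the sysfs paths named in duplicate-filename warnings."""
    return DUP_RE.findall(log_text)


# Hypothetical, shortened sample; a real log line carries the full device path.
sample = (
    "[ 4.938162] sysfs: cannot create duplicate filename "
    "'/devices/example/pci0003:00/0003:00:00.0/config'\n"
    "[ 4.944816] sr 5:0:0:0: [sr0] scsi3-mmc drive: 0x/0x tray\n"
)

hits = find_duplicate_sysfs_warnings(sample)
print(hits)
```

The scan assumes each warning sits on one line, as the kernel emits it, so wrapped mail copies of the log would need to be rejoined first.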

  Dexuan did some research, and it looks like this is a longstanding race
  condition in the generic PCI subsystem (depending on timing, more than
  one place in the PCI code can try to create the same ‘config’ sysfs file):

  https://patchwork.kernel.org/project/linux-pci/patch/20200716110423.xtfyb3n6tn5ixedh@pali/#23669641
  The bug was reported on 7/16/2020, and the last reply was on 6/25/2021. It
  looks like this has not been fixed after 1+ year…
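
The duplicate-creation problem can be illustrated with a toy model. This is a sketch under stated assumptions, not kernel code: `SysfsDir`, `Device`, and `create_dynamic_files` are hypothetical names, and the two creation paths run one after the other here rather than truly concurrently. It only shows why declaring ‘config’ once with the device, which is what the fix's title ("Convert "config" to static attribute") suggests, removes the chance for a second creation attempt.

```python
class SysfsDir:
    """Toy stand-in for a sysfs directory: file names must be unique."""

    def __init__(self):
        self.files = set()

    def create(self, name):
        if name in self.files:
            raise FileExistsError(f"cannot create duplicate filename '{name}'")
        self.files.add(name)


class Device:
    """Toy device; static attributes are registered once, with the device."""

    def __init__(self, static_attrs=()):
        self.sysfs = SysfsDir()
        for name in static_attrs:
            self.sysfs.create(name)


def create_dynamic_files(device, names):
    """Model one run-time code path creating files; collect any errors."""
    errors = []
    for name in names:
        try:
            device.sysfs.create(name)
        except FileExistsError as exc:
            errors.append(str(exc))
    return errors


# Before: two run-time paths each try to create "config" for the same device.
racy = Device()
first = create_dynamic_files(racy, ["config"])
second = create_dynamic_files(racy, ["config"])

# After: "config" is static, so neither run-time path has to create it.
fixed = Device(static_attrs=["config"])
fixed_first = create_dynamic_files(fixed, [])
fixed_second = create_dynamic_files(fixed, [])

print(second)
print(fixed_first, fixed_second)
```

In the first case the second path reports the duplicate; in the second case there is nothing left for either path to collide on.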
  Business Impact

  [Test Case]

  Repeated deployment on a Standard_NV24 instance. MS reported the
  reproduction rate is 3/551 before the patch, and 0/838 with the patch.
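
As a rough plausibility check on those numbers, one can ask how likely 0 failures in 838 deployments would be if the pre-patch rate of 3/551 still applied. The sketch below assumes independent deployments with a constant failure probability, which the report does not state; the function name is hypothetical.

```python
def prob_zero_failures(failures, trials, new_trials):
    """Chance of seeing no failure in new_trials runs at the old failure rate."""
    rate = failures / trials
    return (1.0 - rate) ** new_trials


# Figures from the test case: 3/551 before the patch, 0/838 with it.
p = prob_zero_failures(3, 551, 838)
print(f"{p:.4f}")
```

Under those assumptions the result is on the order of 1%, so a clean run of 838 deployments would be unlikely if the patch had no effect.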

  [Where things could go wrong]

  Deployments could fail for other reasons.

  [Other info]

  SF: #00321027

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1952621/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

2021-12-03 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the linux-azure/5.4.0-1065.68
kernel in -proposed solves the problem. Please test the kernel and
update this bug with the results. If the problem is solved, change the
tag 'verification-needed-focal' to 'verification-done-focal'. If the
problem still exists, change the tag 'verification-needed-focal' to
'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!
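
Before testing, it helps to confirm that the VM is actually running a kernel at or beyond the fixed build. The following is a minimal sketch: the function name is hypothetical, and it assumes the release string has the same `5.4.0-<ABI>-azure` form as the `5.4.0-1061-azure` string in the reported trace, with the fix first present at ABI 1065 (from package version 5.4.0-1065.68).

```python
import re

FIXED_ABI = 1065  # assumption: taken from the fixed package version 5.4.0-1065.68


def has_fix(release):
    """Return True/False for 5.4.0-<ABI>-azure strings, None for anything else."""
    match = re.fullmatch(r"5\.4\.0-(\d+)-azure", release)
    if match is None:
        return None  # not a 5.4 azure kernel; this sketch cannot judge it
    return int(match.group(1)) >= FIXED_ABI


print(has_fix("5.4.0-1061-azure"))
print(has_fix("5.4.0-1065-azure"))
```

On a live system the argument would typically come from `platform.release()` or `uname -r`.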


** Tags added: verification-needed-focal



[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

2021-12-01 Thread Tim Gardner
** Changed in: linux-azure (Ubuntu Focal)
   Status: In Progress => Fix Committed

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
   Status: In Progress => Fix Committed



[Kernel-packages] [Bug 1952621] Re: Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

2021-11-29 Thread Tim Gardner
** Also affects: linux-azure-5.4 (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: linux-azure-5.4 (Ubuntu Focal)
   Status: New => Invalid

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: linux-azure-5.4 (Ubuntu Bionic)
 Assignee: (unassigned) => Tim Gardner (timg-tpi)

** Changed in: linux-azure (Ubuntu Bionic)
   Status: New => Invalid

** Changed in: linux-azure (Ubuntu)
 Assignee: Tim Gardner (timg-tpi) => (unassigned)

** Changed in: linux-azure (Ubuntu)
   Status: New => Invalid

** Changed in: linux-azure (Ubuntu Focal)
   Status: New => In Progress

** Changed in: linux-azure (Ubuntu Focal)
 Assignee: (unassigned) => Tim Gardner (timg-tpi)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1952621

Title:
  Bionic/linux-azure: Call trace on Ubuntu 18.04 VM with Standard NV24

Status in linux-azure package in Ubuntu:
  Invalid
Status in linux-azure-5.4 package in Ubuntu:
  New
Status in linux-azure source package in Bionic:
  Invalid
Status in linux-azure-5.4 source package in Bionic:
  In Progress
Status in linux-azure source package in Focal:
  In Progress
Status in linux-azure-5.4 source package in Focal:
  Invalid
