[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
This bug was fixed in the package xorg-server - 2:1.20.13-1ubuntu1~20.04.8

---
xorg-server (2:1.20.13-1ubuntu1~20.04.8) focal-security; urgency=medium

  * SECURITY UPDATE: Overlay Window Use-After-Free
    - debian/patches/CVE-2023-1393.patch: fix use-after-free of the COW in
      composite/compwindow.c.
    - CVE-2023-1393

 -- Marc Deslauriers  Wed, 29 Mar 2023 08:53:02 -0400

** Changed in: xorg-server (Ubuntu Focal)
       Status: Fix Committed => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2023-1393

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to xorg-server in Ubuntu.
https://bugs.launchpad.net/bugs/2007746

Title:
  [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure
  NV series instances w/ nvidia grid driver

Status in xorg-server package in Ubuntu:
  Fix Released
Status in xorg-server source package in Focal:
  Fix Released

Bug description:
  [ Impact ]

   * Microsoft Azure NV-series instances with NVIDIA GRID drivers started
  to experience xserver crashes while following Microsoft's official guide
  to installing NVIDIA drivers [1].

   * Root cause analysis showed that the crash occurs when a device has
  BusID "PCI:0@:0:0", where the domain id is >= 32767, while the hyperv_drm
  kernel module is loaded.

   * Either removing the BusID specification or unloading the hyperv_drm
  kernel module seems to avoid the crash.

   * The crash happens while the X server is enumerating PCI devices: it
  dereferences a NULL pointer while trying to access the PCI device info.

   * It only happens while the hyperv_drm kernel module is loaded because
  the hyperv_drm module does not expose PCI hardware information, since
  it's a virtual device.

   * The upstream patch [2] addresses the issue, and it's confirmed that
  the xserver with the patch does not experience the crash.

   * The Ubuntu Focal `xorg-server` package does not include the patch [2]
  at the moment (xserver-xorg-core=2:1.20.13-1ubuntu1~20.04.6).
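The BusID condition named in the Impact section can be checked with a small sketch like the one below. It assumes the BusID has the form "PCI:bus@domain:device:function" with a decimal domain, and that a BusID without "@domain" means domain 0; both are assumptions for illustration, not taken from the bug report. The function names `busid_domain` and `busid_in_affected_range` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: pull the PCI domain out of an xorg.conf BusID string and
# flag domains >= 32767, the range the bug description links to the crash.
# Function names are hypothetical and not part of any X.Org tooling.

# Print the domain part of a BusID, or 0 when no "@domain" is present
# (assumed default). Returns 1 for strings that are not PCI BusIDs.
busid_domain() {
    local busid=$1 rest
    case $busid in
        PCI:*@*)
            rest=${busid#*@}              # drop everything up to the "@"
            printf '%s\n' "${rest%%:*}"   # keep the text before the next ":"
            ;;
        PCI:*)
            printf '0\n'
            ;;
        *)
            return 1
            ;;
    esac
}

# Succeed when the BusID carries a numeric domain >= 32767.
busid_in_affected_range() {
    local domain
    domain=$(busid_domain "$1") || return 1
    [[ $domain =~ ^[0-9]+$ ]] || return 1   # empty or non-numeric domain
    (( 10#$domain >= 32767 ))               # force base 10 (leading zeros)
}

# Example with a made-up BusID value:
busid_in_affected_range 'PCI:0@32768:0:0' && echo 'domain in affected range'
```

A match only says the configuration falls in the range described above; per the report, the crash also requires the hyperv_drm module to be loaded.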
  [1]: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/n-series-driver-setup#install-grid-drivers-on-nv-or-nvv3-series-vms
  [2]: https://github.com/freedesktop/xorg-xserver/commit/0d93bbfa2cfacbb73741f8bed0e32fa1a656b928

  [ Test Plan ]

  Part (a) is quoted from Microsoft's official guide [1].

  Part (a):

   * Spawn a Microsoft Azure NV-series instance with an NVIDIA GRID-supported GPU
     - e.g. `NV36adms A10`
   * Install updates, required tooling, and the desktop environment:
     - sudo apt-get update
     - sudo apt-get upgrade -y
     - sudo apt-get dist-upgrade -y
     - sudo apt-get install build-essential ubuntu-desktop -y
     - sudo apt-get install linux-azure -y
   * Disable the nouveau kernel driver:
     # Create a blacklist file /etc/modprobe.d/nouveau.conf with the following contents:
     blacklist nouveau
     blacklist lbm-nouveau
   * Reboot the VM, reconnect, and then stop the X server:
     - sudo reboot
     # wait for the reboot, reconnect, and continue:
     - sudo systemctl stop lightdm.service
   * Download and install the NVIDIA GRID driver:
     - wget -O NVIDIA-Linux-x86_64-grid.run https://go.microsoft.com/fwlink/?linkid=874272
     - chmod +x NVIDIA-Linux-x86_64-grid.run
     - sudo ./NVIDIA-Linux-x86_64-grid.run
     - # When the setup asks whether you want to run the nvidia-xconfig utility to update your X configuration file, select Yes.
   * Copy /etc/nvidia/gridd.conf.template to /etc/nvidia/gridd.conf
     - sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
   * Edit /etc/nvidia/gridd.conf
     - sudo nano /etc/nvidia/gridd.conf
     # Append the following lines:
     IgnoreSP=FALSE
     EnableUI=FALSE
     # Remove this line if present:
     FeatureType=0
     # And save.
   * Reboot the VM

  Part (b):

   * Ensure that the hyperv_drm kernel module is loaded:
     - sudo modprobe hyperv_drm
   * Replace /etc/X11/xorg.conf with the attached xorg.conf file
   * Try to start the `xserver`:
     - sudo startx
   * `xserver` should crash with output similar to the following:

  X.Org X Server 1.20.13
  X Protocol Version 11, Revision 0
  Build Operating System: linux Ubuntu
  Current Operating System: Linux a10test 5.15.0-1031-azure #38~20.04.1-Ubuntu SMP Mon Jan 9 18:23:48 UTC 2023 x86_64
  Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-1031-azure root=PARTUUID=4cac852b-afba-447b-b3e7-c002155c1305 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1
  Build Date: 07 February 2023 12:48:13PM
  xorg-server 2:1.20.13-1ubuntu1~20.04.6 (For technical support please see http://www.ubuntu.com/support)
  Current version of pixman: 0.38.4
          Before reporting problems, check http://wiki.x.org
          to make sure that you have the latest version.
  Markers: (--) probed, (**) from config file, (==) default setting,
          (++) from command line,
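Before running Part (b), the hyperv_drm precondition can be confirmed with a sketch along these lines. It assumes a Linux system where /proc/modules lists loaded modules with the module name as the first field; the helper name `module_loaded` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: report whether a kernel module is currently loaded by
# scanning a /proc/modules-style listing. The helper name is hypothetical.

# Succeed when the named module is the first whitespace-separated field
# of some line in the listing (defaults to /proc/modules).
module_loaded() {
    local name=$1 listing=${2:-/proc/modules}
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$listing"
}

# Example: check the module the test plan depends on.
if module_loaded hyperv_drm; then
    echo 'hyperv_drm is loaded'
else
    echo 'hyperv_drm is not loaded'
fi
```

The optional second argument exists only so the helper can be exercised against a saved listing instead of the live system.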
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
Tested the proposed package (2:1.20.13-1ubuntu1~20.04.7) with the test plan; it no longer crashes and behaves as expected.

** Tags removed: verification-needed verification-needed-focal
** Tags added: verification-done-focal

Status in xorg-server package in Ubuntu:
  Fix Released
Status in xorg-server source package in Focal:
  Fix Committed
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
Hello Mustafa, or anyone else affected,

Accepted xorg-server into focal-proposed. The package will build now and
be available at
https://launchpad.net/ubuntu/+source/xorg-server/2:1.20.13-1ubuntu1~20.04.7
in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed. Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and what testing has been
performed on the package, and change the tag from
verification-needed-focal to verification-done-focal. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-failed-focal. In either case, without details of your
testing we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: xorg-server (Ubuntu Focal)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-focal

Status in xorg-server package in Ubuntu:
  Fix Released
Status in xorg-server source package in Focal:
  Fix Committed
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
** Tags added: se sts

Status in xorg-server package in Ubuntu:
  Fix Released
Status in xorg-server source package in Focal:
  In Progress

Bug description:
  [ Test Plan ]

  Part (b):

   * Ensure that the hyperv_drm kernel module is loaded:
     - sudo modprobe hyperv_drm
   * Replace /etc/X11/xorg.conf with the attached xorg.conf file
   * Try to start the `xserver`:
     - sudo startx
   * `xserver` should crash with output similar to the following:

  X.Org X Server 1.20.13
  X Protocol Version 11, Revision 0
  Build Operating System: linux Ubuntu
  Current Operating System: Linux a10test 5.15.0-1031-azure #38~20.04.1-Ubuntu SMP Mon Jan 9 18:23:48 UTC 2023 x86_64
  Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-1031-azure root=PARTUUID=4cac852b-afba-447b-b3e7-c002155c1305 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1
  Build Date: 07 February 2023 12:48:13PM
  xorg-server 2:1.20.13-1ubuntu1~20.04.6 (For technical support please see http://www.ubuntu.com/support)
  Current version of pixman: 0.38.4
          Before reporting problems, check http://wiki.x.org
          to make sure that you have the latest version.
  Markers: (--) probed, (**) from config file, (==) default setting,
          (++) from command line, (!!) notice, (II) informational,
          (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
  (==) Log file: "/var/log/Xorg.1.log", Time: Sat Feb 18 10:54:26 2023
  (==) Using config file: "/etc/X11/xorg.conf"
  (==) Using system config directory "/usr/share/X11/xorg.conf.d"
  (EE)
  (EE) Backtrace:
  (EE) 0: /usr/lib/xorg/Xorg (OsLookupColor+0x13c) [0x55e7787c5ecc]
  (EE) 1: /lib/x86_64-linux-gnu/libpthread.so.0 (funlockfile+0x60) [0x7f9576cac420]
  (EE) 2: /usr/lib/xorg/Xorg
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
** Tags added: se-sponsor-dgadomski

Status in xorg-server package in Ubuntu:
  Fix Released
Status in xorg-server source package in Focal:
  In Progress
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
** Tags added: focal

** Changed in: xorg-server (Ubuntu)
       Status: New => Fix Released

Status in xorg-server package in Ubuntu:
  Fix Released
Status in xorg-server source package in Focal:
  In Progress
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
** Description changed:

  [ Impact ]

- * Microsoft Azure NV-series instances with NVidia GRID drivers started
+ * Microsoft Azure NV-series instances with NVidia GRID drivers started
  to experience xserver crashes while following Microsoft's official guide
  to installing Nvidia drivers [1].

- * Root cause analysis showed that it was due to having a device with
+ * Root cause analysis showed that it was due to having a device with
  BusID "PCI:0@:0:0", where domain id is >= 32767 while the hyperv_drm
  kernel module is loaded.

- * Removing either the BusID specification or unloading the hyperv_drm
+ * Removing either the BusID specification or unloading the hyperv_drm
  kernel module seems to fix the crash.

- * The crash is happening while X.server is trying to enumerate PCI
+ * The crash is happening while X.server is trying to enumerate PCI
  devices. X.server dereferences a NULL pointer while trying to access to
  the PCI device info.

- * The reason why it only happens while the hyperv_drm kernel module is
+ * The reason why it only happens while the hyperv_drm kernel module is
  loaded is that the hyperv_drm module does not expose PCI hardware
  information since it's a virtual device.

- * The upstream patch [2] addresses the issue and it's confirmed that
+ * The upstream patch [2] addresses the issue and it's confirmed that
  the xserver with the patch does not experience the crash.

- * Ubuntu Focal `xorg-server` package does not include the patch [2] at
+ * Ubuntu Focal `xorg-server` package does not include the patch [2] at
  the moment (xserver-xorg-core=2:1.20.13-1ubuntu1~20.04.6).

- [1]: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/n-series-driver-setup#install-grid-drivers-on-nv-or-nvv3-series-vms
- [2]: https://github.com/freedesktop/xorg-xserver/commit/0d93bbfa2cfacbb73741f8bed0e32fa1a656b928
+ [1]: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/n-series-driver-setup#install-grid-drivers-on-nv-or-nvv3-series-vms
+ [2]: https://github.com/freedesktop/xorg-xserver/commit/0d93bbfa2cfacbb73741f8bed0e32fa1a656b928

  [ Test Plan ]

  Part (a) is quoted from Microsoft's official guide [1].

  Part (a):

- * Spawn a Microsoft Azure NV-series instance with an NVidia GRID-supported GPU
-- e.g. `NV36adms A10`
- * Install updates, required tooling, and the desktop environment:
-- sudo apt-get update
-- sudo apt-get upgrade -y
-- sudo apt-get dist-upgrade -y
-- sudo apt-get install build-essential ubuntu-desktop -y
-- sudo apt-get install linux-azure -y
- * Disable nouveau kernel driver:
-# Create a blacklist file /etc/modprobe.d/nouveau.conf with following contents:
-blacklist nouveau
-blacklist lbm-nouveau
- * Reboot the VM, re-connect, and then stop X server:
-- sudo reboot
-# wait for the reboot, reconnect, and continue:
-- sudo systemctl stop lightdm.service
- * Download and install the NVidia GRID driver:
-- wget -O NVIDIA-Linux-x86_64-grid.run https://go.microsoft.com/fwlink/?linkid=874272
-- chmod +x NVIDIA-Linux-x86_64-grid.run
-- sudo ./NVIDIA-Linux-x86_64-grid.run
-- # When the setup asks whether you want to run the nvidia-xconfig utility to update your X configuration file, select Yes.
- * Copy /etc/nvidia/gridd.conf.template to /etc/nvidia/gridd.conf
-- sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
- * Edit /etc/nvidia/grid.conf
-- sudo nano /etc/nvidia/grid.conf
-# Append the following lines:
-IgnoreSP=FALSE
-EnableUI=FALSE
-# Remove this line if present:
-FeatureType=0
-# And save.
- * Reboot the VM
+ * Spawn a Microsoft Azure NV-series instance with an NVidia GRID-supported GPU
+ - e.g. `NV36adms A10`
+ * Install updates, required tooling, and the desktop environment:
+ - sudo apt-get update
+ - sudo apt-get upgrade -y
+ - sudo apt-get dist-upgrade -y
+ - sudo apt-get install build-essential ubuntu-desktop -y
+ - sudo apt-get install linux-azure -y
+ * Disable nouveau kernel driver:
+ # Create a blacklist file /etc/modprobe.d/nouveau.conf with following contents:
+ blacklist nouveau
+ blacklist lbm-nouveau
+ * Reboot the VM, re-connect, and then stop X server:
+ - sudo reboot
+ # wait for the reboot, reconnect, and continue:
+ - sudo systemctl stop lightdm.service
+ * Download and install the NVidia GRID driver:
+ - wget -O NVIDIA-Linux-x86_64-grid.run https://go.microsoft.com/fwlink/?linkid=874272
+ - chmod +x NVIDIA-Linux-x86_64-grid.run
+ - sudo ./NVIDIA-Linux-x86_64-grid.run
+ - # When the setup asks whether you want to run the nvidia-xconfig utility to update your X configuration file, select Yes.
+ * Copy /etc/nvidia/gridd.conf.template to /etc/nvidia/gridd.conf
+ - sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
+ * Edit /etc/nvidia/grid.conf
+ - sudo nano /etc/nvidia/grid.conf
+ #
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
** Merge proposal linked:
   https://code.launchpad.net/~mustafakemalgilor/ubuntu/+source/xorg-server/+git/xorg-server/+merge/437541

--
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to xorg-server in Ubuntu.
https://bugs.launchpad.net/bugs/2007746

Title:
  [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure
  NV series instances w/ nvidia grid driver

Status in xorg-server package in Ubuntu:
  New
Status in xorg-server source package in Focal:
  In Progress

Bug description:
  [ Impact ]

   * Microsoft Azure NV-series instances with NVidia GRID drivers started
  to experience xserver crashes while following Microsoft's official
  guide to installing Nvidia drivers [1].

   * Root cause analysis showed that it was due to having a device with
  BusID "PCI:0@:0:0", where the domain ID is >= 32767, while the
  hyperv_drm kernel module is loaded.

   * Either removing the BusID specification or unloading the hyperv_drm
  kernel module seems to fix the crash.

   * The crash happens while the X server is enumerating PCI devices:
  the X server dereferences a NULL pointer while trying to access the
  PCI device info.

   * It only happens while the hyperv_drm kernel module is loaded
  because the hyperv_drm module does not expose PCI hardware information,
  since it is a virtual device.

   * The upstream patch [2] addresses the issue, and it is confirmed that
  the xserver with the patch does not experience the crash.

   * The Ubuntu Focal `xorg-server` package does not include the patch [2]
  at the moment (xserver-xorg-core=2:1.20.13-1ubuntu1~20.04.6).

  [1]: https://learn.microsoft.com/en-us/azure/virtual-machines/linux/n-series-driver-setup#install-grid-drivers-on-nv-or-nvv3-series-vms
  [2]: https://github.com/freedesktop/xorg-xserver/commit/0d93bbfa2cfacbb73741f8bed0e32fa1a656b928

  [ Test Plan ]

  Part (a) is quoted from Microsoft's official guide [1].

  Part (a):

   * Spawn a Microsoft Azure NV-series instance with an NVidia GRID-supported GPU
     - e.g. `NV36adms A10`
   * Install updates, required tooling, and the desktop environment:
     - sudo apt-get update
     - sudo apt-get upgrade -y
     - sudo apt-get dist-upgrade -y
     - sudo apt-get install build-essential ubuntu-desktop -y
     - sudo apt-get install linux-azure -y
   * Disable the nouveau kernel driver:
     # Create a blacklist file /etc/modprobe.d/nouveau.conf with the following contents:
     blacklist nouveau
     blacklist lbm-nouveau
   * Reboot the VM, reconnect, and then stop the X server:
     - sudo reboot
     # wait for the reboot, reconnect, and continue:
     - sudo systemctl stop lightdm.service
   * Download and install the NVidia GRID driver:
     - wget -O NVIDIA-Linux-x86_64-grid.run https://go.microsoft.com/fwlink/?linkid=874272
     - chmod +x NVIDIA-Linux-x86_64-grid.run
     - sudo ./NVIDIA-Linux-x86_64-grid.run
     - # When the setup asks whether you want to run the nvidia-xconfig utility to update your X configuration file, select Yes.
   * Copy /etc/nvidia/gridd.conf.template to /etc/nvidia/gridd.conf
     - sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
   * Edit /etc/nvidia/gridd.conf
     - sudo nano /etc/nvidia/gridd.conf
     # Append the following lines:
     IgnoreSP=FALSE
     EnableUI=FALSE
     # Remove this line if present:
     FeatureType=0
     # And save.
   * Reboot the VM

  Part (b):

   * Ensure that the hyperv_drm kernel module is loaded:
     - sudo modprobe hyperv_drm
   * Use the attached xorg.conf file to override /etc/X11/xorg.conf
   * Try to start the `xserver`:
     - sudo startx
   * `xserver` should crash with output similar to the following:

  X.Org X Server 1.20.13
  X Protocol Version 11, Revision 0
  Build Operating System: linux Ubuntu
  Current Operating System: Linux a10test 5.15.0-1031-azure #38~20.04.1-Ubuntu SMP Mon Jan 9 18:23:48 UTC 2023 x86_64
  Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-1031-azure root=PARTUUID=4cac852b-afba-447b-b3e7-c002155c1305 ro console=tty1 console=ttyS0 earlyprintk=ttyS0 panic=-1
  Build Date: 07 February 2023 12:48:13PM
  xorg-server 2:1.20.13-1ubuntu1~20.04.6 (For technical support please see http://www.ubuntu.com/support)
  Current version of pixman: 0.38.4
      Before reporting problems, check http://wiki.x.org
      to make sure that you have the latest version.
  Markers: (--) probed, (**) from config file, (==) default setting,
      (++) from command line, (!!) notice, (II) informational,
      (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
  (==) Log file: "/var/log/Xorg.1.log", Time: Sat Feb 18 10:54:26 2023
  (==) Using config file: "/etc/X11/xorg.conf"
  (==) Using system config directory "/usr/share/X11/xorg.conf.d"
  (EE)
  (EE) Backtrace:
  (EE) 0: /usr/lib/xorg/Xorg (OsLookupColor+0x13c) [0x55e7787c5ecc]
  (EE) 1:
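
The Impact section names two preconditions for the crash: the hyperv_drm module is loaded, and a PCI device has a domain ID >= 32767. Before running Part (b), it may help to confirm that both hold on the test VM. The following is only a rough sketch, not part of the bug report or the upstream patch. The function names are made up for illustration, and it assumes that the hexadecimal domain in sysfs device names (for example `0001:00:00.0`) corresponds to the domain used in the xorg.conf BusID.

```shell
#!/usr/bin/env bash
# Sketch: report whether the two crash preconditions from the Impact
# section appear to hold on this machine. All function names here are
# illustrative, not taken from any existing tool.

# Print the decimal PCI domain of a sysfs-style address such as "0001:00:00.0".
pci_domain_dec() {
    local addr="$1"
    local domain_hex="${addr%%:*}"   # text before the first colon (hexadecimal)
    printf '%d\n' "0x${domain_hex}"
}

# Succeed when the domain is >= 32767, the threshold quoted in the bug report.
pci_domain_is_large() {
    [ "$(pci_domain_dec "$1")" -ge 32767 ]
}

# Succeed when the hyperv_drm module is listed in /proc/modules.
hyperv_drm_loaded() {
    grep -q '^hyperv_drm ' /proc/modules 2>/dev/null
}

report() {
    if hyperv_drm_loaded; then
        echo "hyperv_drm: loaded"
    else
        echo "hyperv_drm: not loaded"
    fi
    local dev addr
    for dev in /sys/bus/pci/devices/*; do
        [ -e "$dev" ] || continue    # glob did not match: no PCI devices visible
        addr="${dev##*/}"
        if pci_domain_is_large "$addr"; then
            echo "PCI device with domain >= 32767: $addr"
        fi
    done
}

report
```

If the script reports hyperv_drm as loaded and lists at least one device, the VM matches the conditions described in the Impact section; otherwise the crash in Part (b) may not reproduce.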
[Desktop-packages] [Bug 2007746] Re: [SRU] xserver crashes when hyperv_drm kernel module is loaded on azure NV series instances w/ nvidia grid driver
The relevant function is absent in Bionic and Jammy is based on an upstream version that contains the fix, so I presume the only affected series is Focal right now.