Public bug reported:
A kernel NULL pointer dereference occurs when a libvirt VM is shut down
with vEGM enabled:
[ 616.355597] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 616.601430] pc : __rb_erase_color+0xc4/0x2a8
[ 616.605878] lr : interval_tree_remove+0x184/0x2e8
[ 616.696464] interval_tree_remove+0x184/0x2e8
[ 616.701185] unregister_pfn_address_space+0x4c/0xc0
[ 616.706439] nvgrace_egm_release+0x98/0xd8 [nvgrace_egm]
When booting and shutting down a raw qemu VM with vEGM, we can observe
an open() syscall on the EGM device on boot, a subsequent mmap(), and
then a close() on VM shutdown.
With libvirt + qemu, there is an additional close() on VM shutdown. The
__rb_parent_color field of the egm_region->pfn_address_space.node struct
is not cleared after the first unregister, so when the second close()
happens, unregister_pfn_address_space() is called again on the same
struct. interval_tree_remove() sees a non-zero __rb_parent_color, assumes
the node is still in the tree, and tries to traverse the tree using the
stale parent pointer, resulting in a NULL pointer dereference.
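
To illustrate the double-unregister pattern described above, here is a
minimal userspace sketch in C. It is not the nvgrace_egm code and not
necessarily the fix that will be applied; every toy_* name is hypothetical,
and a doubly linked list stands in for the kernel interval tree. The point
is only that marking the node as unlinked on removal makes a second
unregister a harmless no-op. In the kernel, the RB_CLEAR_NODE() and
RB_EMPTY_NODE() macros from include/linux/rbtree.h serve a comparable
purpose for struct rb_node, but whether they fit this driver is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an intrusive tree node (NOT struct rb_node). */
struct toy_node {
    struct toy_node *prev;
    struct toy_node *next;
    bool linked;            /* true only while the node is in the registry */
};

/* Hypothetical registry; a circular list with a sentinel head. */
struct toy_registry {
    struct toy_node head;
    int count;
};

static void toy_registry_init(struct toy_registry *r)
{
    assert(r != NULL);
    r->head.prev = &r->head;
    r->head.next = &r->head;
    r->head.linked = true;
    r->count = 0;
}

static void toy_node_init(struct toy_node *n)
{
    assert(n != NULL);
    n->prev = NULL;
    n->next = NULL;
    n->linked = false;
}

static void toy_register(struct toy_registry *r, struct toy_node *n)
{
    assert(r != NULL && n != NULL);
    if (n->linked)
        return;             /* already registered, nothing to do */
    n->next = r->head.next;
    n->prev = &r->head;
    r->head.next->prev = n;
    r->head.next = n;
    n->linked = true;
    r->count++;
}

/*
 * Returns true if the node was removed, false if it was already unlinked.
 * The early return is what protects against the second close() path.
 */
static bool toy_unregister(struct toy_registry *r, struct toy_node *n)
{
    assert(r != NULL && n != NULL);
    if (!n->linked)
        return false;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    /* Clear the stale links so a later call cannot follow them. */
    n->prev = NULL;
    n->next = NULL;
    n->linked = false;
    r->count--;
    return true;
}
```

With this guard, calling toy_unregister() twice on the same node removes it
once and then returns false, instead of walking pointers left over from the
first removal.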
** Affects: linux-nvidia (Ubuntu)
Importance: Undecided
Status: New
** Affects: linux-nvidia-6.14 (Ubuntu)
Importance: Undecided
Status: New
** Also affects: linux-nvidia-6.14 (Ubuntu)
Importance: Undecided
Status: New
--
https://bugs.launchpad.net/bugs/2131582
Title:
NULL pointer dereference during vEGM Libvirt VM lifecycle