Private bug reported:
PROBLEM:
The problem occurs only when a non-GPU (CPU-only) VM is launched using our tools
and our Guest OS image. The VM is created, but connecting to it over the console
does not work.
WORKAROUND:
A ‘virsh reset’ or ‘virsh reboot’, or an ssh in to the VM followed by a reboot,
brings the console back over ‘virsh console’.
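The workaround can be wrapped in a small helper for repeated testing. This is only a sketch: the function name reset_then_console_hint and the VIRSH override variable are hypothetical and not part of virsh or nvidia-vm. It assumes 'virsh reset <domain>' behaves as in stock libvirt, that is, a hard reset without a guest shutdown.

```shell
# Hypothetical wrapper around the workaround described above.
# VIRSH may be overridden (for example with a stub) for dry runs.
reset_then_console_hint() {
    vm="$1"
    # 'virsh reset' resets the domain immediately, like a reset button
    "${VIRSH:-virsh}" reset "$vm" || return 1
    printf 'Domain %s reset; retry with: virsh console %s\n' "$vm" "$vm"
}
```

Example: reset_then_console_hint <vmname>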
STEPS to Reproduce:
Login to DGX-2
ssh [email protected] (DGXrox!)
Launch a CPU VM
nvidia-vm [ --domain <vmname> ] --gpu-index 0 --gpu-count 0
virsh console <vmname> << hangs
BTW: the DGX-2 User Guide, https://docs.nvidia.com/dgx/pdf/dgx2-user-guide.pdf,
discusses the nvidia-vm tool in chapter 11.
Path to our Guest OS image:
/var/lib/nvidia/kvm/images$ ls -la
total 4396108
drwxr-xr-x 2 root root 4096 Oct 3 13:32 .
drwxr-xr-x 4 root root 4096 Oct 3 20:24 ..
lrwxrwxrwx 1 root root 19 Oct 3 13:32 dgx-kvm-image -> dgx-kvm-image-4-0-2
lrwxrwxrwx 1 root root 41 Oct 3 13:32 dgx-kvm-image-4-0-2 -> dgx-kvm-image-4.0.2~180918-d47d38.1.qcow2
-r--r--r-- 1 libvirt-qemu kvm 4501602304 Sep 19 18:35 dgx-kvm-image-4.0.2~180918-d47d38.1.qcow2
RCA done so far:
The problem is specific to our Guest OS images; we haven't seen it with stock
Ubuntu or cloud-init images.
It used to work a couple of months ago; we traced the regression back to around
that time.
It is also not an issue when we launch GPU-based VMs.
Verified that CONFIG_FHANDLE is enabled in the kernel inside the VM (booted from
our image):
root@cpuVM:/boot# grep FHANDLE /boot/config-4.15.0-29-generic
CONFIG_FHANDLE=y
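The same check can be scripted so it is easy to repeat against other images or kernels. This is a minimal sketch: the function name kernel_config_enabled is hypothetical, and it assumes the kernel config is available as a plain-text file (as under /boot on the guest above).

```shell
# Hypothetical helper (not part of any NVIDIA or Ubuntu tool):
# succeeds only if CONFIG_<option> is built in (=y) in the given config file.
kernel_config_enabled() {
    option="$1"    # option name without the CONFIG_ prefix, e.g. FHANDLE
    config="$2"    # path to a kernel config file
    grep -q "^CONFIG_${option}=y\$" "$config"
}
```

Example: kernel_config_enabled FHANDLE /boot/config-4.15.0-29-generic && echo enabled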
Initially, the following start command times out:
root@cpuVM:/boot# systemctl enable [email protected]
root@cpuVM:/boot# systemctl start [email protected] << times out
A dependency job for [email protected] failed. See 'journalctl -xe' for details.
root@cpuVM:/boot# journalctl -xe
<snip ..>
-- The start-up result is RESULT.
Oct 09 11:04:24 cpuVM sudo[1330]: nvidia : TTY=pts/0 ; PWD=/boot ; USER=root ; COMMAND=/bin/bash
Oct 09 11:04:24 cpuVM sudo[1330]: pam_unix(sudo:session): session opened for user root by nvidia(uid=0)
Oct 09 11:04:39 cpuVM systemd[1]: Reloading.
Oct 09 11:06:14 cpuVM systemd[1]: dev-ttyS0.device: Job dev-ttyS0.device/start timed out.
Oct 09 11:06:14 cpuVM systemd[1]: Timed out waiting for device dev-ttyS0.device.
-- Subject: Unit dev-ttyS0.device has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit dev-ttyS0.device has failed.
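To spot this failure quickly on other boots, the journal can be filtered for device units that timed out. This is a sketch only: the function name timed_out_devices is hypothetical, and it assumes the message wording matches the log above ("Timed out waiting for device <unit>.device."), which may differ between systemd versions.

```shell
# Hypothetical helper: read journal text on stdin and print each device unit
# that systemd reported as timed out, one per line, without duplicates.
timed_out_devices() {
    sed -n 's/.*Timed out waiting for device \(.*\.device\)\.*$/\1/p' | sort -u
}
```

Example: journalctl -b --no-pager | timed_out_devices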
** Affects: libvirt (Ubuntu)
Importance: Undecided
Status: New
** Tags: kvm
https://bugs.launchpad.net/bugs/1797177
Title:
virsh console <vm> hangs for the first time