-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

[moving to qubes-devel]

On Mon, Jan 29, 2018 at 01:55:20AM +0100, Marek Marczykowski-Górecki wrote:
> Interesting, sometimes it works on my machine, but indeed sometimes
> it doesn't. In particular, it worked just after installation, and after
> one (warm) reboot. But then after a cold reboot it mostly doesn't (I got
> a successful suspend once, then about 10 of them failed).
> 
> It appears that downgrading kernel for sys-net and sys-usb helps:
> 
>     sudo qubes-dom0-update --action=downgrade 'kernel-qubes-vm-4.9*'
>     # set default kernel back to 4.14.13-1 - required for PVH - most of
>     # VMs
>     qubes-prefs default-kernel 4.14.13-1
>     # then set sys-net and sys-usb to 4.9
>     qvm-prefs sys-net kernel 4.9.56-21
>     qvm-prefs sys-usb kernel 4.9.56-21
> 
> This applies only on X1 Carbon. On T460p (one generation older than X1)
> it works just fine with 4.14 in VM.

Some more info:

VM suspend fails for any HVM running the 4.14.13-1 kernel, not only those
with PCI devices (it just happens that the other VMs are PVH by default,
where suspend works just fine). It's easy to test even without a host
suspend:

    virsh -c xen:/// dompmsuspend VMNAME mem

When it works, it finishes immediately. When it fails, the command
waits until the 60s timeout and then fails with:

    error: Domain VMNAME could not be suspended
    error: internal error: Failed to suspend domain '...'
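
To tell the two outcomes apart without watching the terminal, the manual
test above can be wrapped in a small shell function. This is only a
sketch: the function name `try_suspend` and the `VIRSH` override variable
are made up for illustration, and it assumes it runs in dom0 with
permission to talk to libvirt.

```shell
#!/bin/sh
# Sketch: time a single guest suspend attempt and report the outcome.
# VIRSH is a hypothetical override so the function can be exercised
# without a real libvirt connection; it defaults to plain "virsh".
VIRSH=${VIRSH:-virsh}

try_suspend() {
    vm=$1
    start=$(date +%s)
    # Same command as the manual test; its own output is discarded.
    if "$VIRSH" -c xen:/// dompmsuspend "$vm" mem >/dev/null 2>&1; then
        result=ok
    else
        result=failed
    fi
    end=$(date +%s)
    echo "$vm: suspend $result after $((end - start))s"
    # Return success only if the suspend request succeeded.
    [ "$result" = ok ]
}
```

Calling `try_suspend VMNAME` should print a near-zero duration in the
working case and roughly 60s in the failing one. After a successful
attempt the VM stays suspended; waking it with
`virsh -c xen:/// dompmwakeup VMNAME` is an assumption on my side and was
not part of the test described here.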

And in /var/log/libvirt/libxl/libxl-driver.log (loglevel debug):
    
    libxl: libxl_dom_suspend.c:206:domain_suspend_callback_common: issuing PVH/HVM suspend request via XenBus control node
    libxl: libxl_event.c:636:libxl__ev_xswatch_register: watch w=0x7aa824004680 wpath=/local/domain/13/control/shutdown token=15/61: register slotnum=15
    libxl: libxl.c:982:libxl_domain_suspend: ao 0x7aa8240127e0: inprogress: poller=0x7aa824009030, flags=i
    libxl: libxl_event.c:573:watchfd_callback: watch w=0x7aa824004680 wpath=/local/domain/13/control/shutdown token=15/61: event epath=/local/domain/13/control/shutdown
    libxl: libxl_event.c:673:libxl__ev_xswatch_deregister: watch w=0x7aa824004680 wpath=/local/domain/13/control/shutdown token=15/61: deregister slotnum=15
    libxl: libxl_dom_suspend.c:288:domain_suspend_common_pvcontrol_suspending: guest acknowledged suspend request
    libxl: libxl_dom_suspend.c:307:domain_suspend_common_wait_guest: wait for the guest to suspend
    libxl: libxl_event.c:636:libxl__ev_xswatch_register: watch w=0x7aa824004698 wpath=@releaseDomain token=15/62: register slotnum=15
    libxl: libxl_event.c:548:watchfd_callback: watch w=0x7aa824004698 epath=/local/domain/13/control/shutdown token=15/61: counter != 62
    libxl: libxl_event.c:573:watchfd_callback: watch w=0x7aa824004698 wpath=@releaseDomain token=15/62: event epath=@releaseDomain

    (...60s...)

    libxl: libxl_dom_suspend.c:380:suspend_common_wait_guest_timeout: guest did not suspend, timed out
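
When the host has been through several suspend cycles, it can be handy to
count how often the timeout above was hit. A minimal sketch, assuming the
log path quoted earlier and that each libxl message occupies a single
line in the log file; `count_suspend_timeouts` and the `LIBXL_LOG`
variable are made-up names.

```shell
#!/bin/sh
# Sketch: count guest suspend timeouts recorded by libxl.
# LIBXL_LOG is a hypothetical override for the log location.
LIBXL_LOG=${LIBXL_LOG:-/var/log/libvirt/libxl/libxl-driver.log}

count_suspend_timeouts() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so its exit status is deliberately ignored.
    grep -c 'guest did not suspend, timed out' "${1:-$LIBXL_LOG}" || true
}
```

Running `count_suspend_timeouts` before and after a suspend attempt shows
whether that attempt added a new timeout entry. The messages only appear
with the debug log level enabled.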

It fails only on the first attempt after system startup or host
suspend/resume. Later, if you kill the VM and start it again, it works
most of the time (but not always). Once it has worked for a given VM, it
keeps working for that VM until VM shutdown or host suspend.

If the VM is running the 4.9.56 kernel, the problem doesn't happen.

I've also tried disabling PTI on the 4.14.13 kernel in the VM, but it
doesn't appear to change anything.

This is just my observation; it may be totally independent of those
actions (for example, some race condition affected by whether some data
is cached or not...).

Since I've seen similar reports from other users too, I'll try to
convince our anaconda to install both kernel packages on 4.0rc4, to make
the workaround easier to apply. Nonetheless, it would be better to
properly fix the issue.

Simon, do you have any idea? If not, it's probably worth asking on xen-devel.

- -- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAlpvbR4ACgkQ24/THMrX
1ywVqgf+Nk7m5gEo1iA24z9tDl0phrcz8xqBRGSnHjNr+MOfIXy31/AQGqgob0Ct
mjqKQl1gTRHIVuj1hRFCGrmtl1h2Z6sCr6CTmxfg4q2QBHWJDwQQ19QXsWnPIBzM
a8KUmrOvTK1iRVuLGjLfC2DJzdm6Mfn+B1p2YBDFoqUEHzMEuf0nzqC3awY0tCLn
RZk5F8QOy05msG9ElkvOhgON2kmrwEbZNF/txOlY2IotZaz/t/JfR1V2xzNH0Ccs
AghdU+Xk8K4FN8YMH3eCQrMV6QDLHiHCxBl+UuquHjfOYFMMh1mkQVHPKzkIioMP
MSXk1lMCK5TPXW3rUS5PBTr4bk8MMw==
=94FB
-----END PGP SIGNATURE-----
