January 3, 2020 5:53 PM, "Claudia" <claud...@disroot.org> wrote:

> January 1, 2020 5:09 PM, "Claudia" <claud...@disroot.org> wrote:
> 
>> However, I still have a long road ahead of me. I did several suspend/resume
>> cycles, and each time I had a different combination of problems, including
>> the mouse sticking, the keyboard not working, and finally input/output
>> errors and segmentation faults in the terminal. But the Xen problem has been
>> identified nonetheless. I'll try kernel-latest and see if that changes
>> anything.
> 
> Installed kernel-latest from stable, 5.3.11-1.qubes.x86, and no difference as
> far as I can tell. It usually resumes fine the first time, but after the
> second or third cycle, I get a bunch of I/O errors, as though someone
> unplugged the SATA connector. I think this is actually the underlying cause
> of the other symptoms. This is with no VMs running. No swap.
> 
> ata1.00: qc timeout (cmd 0xec)
> ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> ata1.00: limiting SATA link speed to 3.0 Gbps
> ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
> ata1.00: revalidation failed (errno=-5)
> ata1.00: disabled
> sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> sd 0:0:0:0: [sda] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> sd 0:0:0:0: [sda] tag#21 CDB: Write(10) 2a 00 3c 9f [...]
> blk_update_request: I/O error, dev sda, sector [...] op 0x1: (WRITE) flags 0x100000 phys_seg 1 prio class 0
> BTRFS error (device dm-0): bdev /dev/mapper/luks-[...] errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
> 
> Note that this is different from the Fedora 25 resume behavior. In F25 with
> 4.8.6, the screen doesn't power on, but the system seems responsive
> otherwise. For example, ctrl-alt-delete reboots after 60 seconds as expected.
> (In Qubes, after resuming a second or third time and getting disk errors,
> when you try to shut down it will just hang indefinitely.) But F25 was
> running from a USB drive, so I wouldn't necessarily know if there were SATA
> errors in that case.
> 
> I'll see if I can figure out how to apply the patch to the latest 4.1
> (F31-based) and try it from there. In the meantime, if anyone has any ideas,
> please share.


And... SUCCESS 2.0!

Perhaps it's still too early to celebrate, but after six months of 
troubleshooting I think I might finally have working suspend/resume. I did some 
googling around, and eventually came across a rather inconspicuous post[1] from 
2013 in the Xen archives that mentioned something I hadn't tried or heard about 
before. All I had to do was add to the Xen command line "dom0_max_vcpus=1 
dom0_vcpus_pin". And that's it. Couldn't have been simpler. I should not have 
had to go to the 20th page of search results to find out about this.
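
For anyone who wants to script that change, here is a minimal Python sketch. It 
assumes a legacy-boot install where the Xen options live in a single 
double-quoted GRUB_CMDLINE_XEN_DEFAULT line in /etc/default/grub; that location 
is an assumption, not something stated in the linked post, and UEFI installs may 
keep the options elsewhere. The function name add_xen_options is hypothetical. 
It only transforms text and writes nothing to disk.

```python
import re

def add_xen_options(grub_text, options):
    """Return grub_text with each option appended to GRUB_CMDLINE_XEN_DEFAULT.

    Options whose name (the part before '=') is already present are skipped.
    Assumes the variable is set on one line with a double-quoted value.
    """
    pattern = re.compile(
        r'^(GRUB_CMDLINE_XEN_DEFAULT=")([^"]*)(")[ \t]*$', re.MULTILINE
    )

    def rewrite(match):
        current = match.group(2).split()
        present = {opt.split("=", 1)[0] for opt in current}
        for opt in options:
            name = opt.split("=", 1)[0]
            if name not in present:
                current.append(opt)
                present.add(name)
        return match.group(1) + " ".join(current) + match.group(3)

    new_text, count = pattern.subn(rewrite, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_XEN_DEFAULT line not found")
    return new_text
```

Calling add_xen_options(text, ["dom0_max_vcpus=1", "dom0_vcpus_pin"]) on the 
file contents returns the updated text. The GRUB configuration would then 
typically need to be regenerated (for example with grub2-mkconfig) and the 
machine rebooted before Xen sees the new options.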

This runs dom0 on CPU0 and only CPU0. My understanding is that dom0 has to be 
running on the boot CPU at the exact moment of suspend and resume, or something 
like that; I'm not sure of the specifics. Note that this may have a performance 
impact depending on your situation.
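
To confirm after a reboot that the options actually reached Xen, the 
xen_commandline line printed by "xl info" in dom0 can be inspected. The Python 
sketch below checks that line for the two options; the helper name 
missing_xen_options is hypothetical, and it expects the text of "xl info" to be 
passed in rather than running the command itself.

```python
def missing_xen_options(xl_info_text, required):
    """Return the required options absent from the xen_commandline line."""
    for line in xl_info_text.splitlines():
        # Lines look like "key   : value"; split on the first colon only,
        # so values that themselves contain ':' are kept intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "xen_commandline":
            active = set(value.split())
            return [opt for opt in required if opt not in active]
    raise ValueError("no xen_commandline line found")
```

An empty list means both options are present on the running hypervisor's 
command line; anything returned is still missing.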

At first, I thought maybe this would render the Xen patch unnecessary, e.g. 
that it was suspending on one core and resuming on another, causing an apparent 
change in cpuid bits. But I can see from the log that the cpuid capability bits 
are still changing as before. (For those of you just tuning in, the patch and 
instructions are earlier in this thread. However, you probably won't need it 
unless you have an AMD Fam15h processor. Note that there may be security 
implications associated with this patch.)

I've only had a chance to test about 15-20 cycles or so, but it works great so 
far. Suspends fast, resumes fast, lid-switch triggers both suspend and resume, 
WiFi automatically reconnects. I suspended in the middle of a YouTube video and 
it came back up seamlessly. However, after resume all instances of Firefox seem 
to jump to 100% CPU (though not frozen) until I close them; that appears to be 
a known issue outside of Qubes and Xen as well.

Tested on R4.0 stable with kernel-latest 5.3.11-1.qubes.x86 on Xen 
4.8.5-14.fc25 (patched). I haven't tried this yet on the default kernel but I 
think it would probably work just as well. It also very well might work on 
other Qubes/Xen versions. I'll update my HCL accordingly when I have a chance.

[1] https://lists.gt.net/xen/devel/270965#270965
