> marmarek:
> This is a very bad idea to "fix" it. Those missing/changed CPUID bits later on
> will cause issues. And given most of the microcode updates recently are about
> speculative execution, missing those features will make the host vulnerable
> to those issues again. There are multiple ways it can manifest - from crashes
> when Xen uses (now not present) CPU feature, to silent failures when Xen tries
> to use some feature and assume it protects the system, while it does not in
> practice.
> 
> For this particular case (microcode included in BIOS newer than in OS), I see
> two options: make BIOS (coreboot, right?) apply microcode update on resume
> too, or include newer microcode in OS.

I want to make one thing clear: I am **not** suggesting this check be removed 
altogether. I am suggesting adding an **optional**, even undocumented, override 
parameter that defaults to the **current behavior**, which is to panic.
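
To make the proposal concrete, here is a rough Python model of the behavior I have in mind. It is not Xen code: the names `ResumePolicy`, `allow_cpuid_change`, and `recheck_on_resume` are made up for illustration, and it assumes the check is about feature bits that were set before suspend and are clear after resume.

```python
from dataclasses import dataclass


class FeatureMismatch(RuntimeError):
    """Stands in for the hypervisor panic in this model."""


@dataclass(frozen=True)
class ResumePolicy:
    # Hypothetical override; False keeps the current behavior (panic).
    allow_cpuid_change: bool = False


def missing_bits(before, after):
    """Per feature word, the bits set before suspend but clear after resume."""
    return [b & ~a for b, a in zip(before, after)]


def recheck_on_resume(before, after, policy=ResumePolicy()):
    """Return a list of warnings, or raise if bits were lost and no override is set."""
    lost = missing_bits(before, after)
    if not any(lost):
        return []
    report = [f"word {i}: lost {bits:#010x}" for i, bits in enumerate(lost) if bits]
    if not policy.allow_cpuid_change:
        # Default path: identical to today, i.e. refuse to continue.
        raise FeatureMismatch("; ".join(report))
    # Override path: keep running, but say loudly what went missing.
    return report
```

The point of the sketch is that nothing changes unless someone deliberately sets the override, and even then the lost bits are still reported rather than silently ignored.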

I've found the patch to be quite stable so far. Unpatched, a crash (Xen panic) 
at resume is guaranteed; patched, none of the four of us using it has seen any 
noticeable stability issues, afaik. Just saying.

Also, not everyone has the option of coreboot. And we're not even completely 
certain this is a post-resume microcode update issue, either.

> lunarthegray:
> @marmarek the "fix" is a hack for sure but it's currently the only way to get
> some AMD Ryzen laptops to work with Qubes. I built Qubes R4.1 the other day
> and with kernel 5.4 and Xen 4.13 the issue remains. Laptop users often suspend
> and are on the go as I am. There was some discussion on the qubes-users
> mailing list about other solutions. I'm no firmware/Xen expert though. Would
> pinning dom0 to 1 vCPU prevent the issue of missing or changed CPU bits? I'm
> not exactly sure what the fix would be with standard BIOS, as I'm not brave
> enough to flash coreboot on my very new ThinkPad. Should I start trying to get
> in contact with Lenovo? I'm assuming AMD needs to release a microcode patch as
> it's not really an issue with Xen itself.

At least in my case, CPU pinning did not fix this issue. The bits still change 
and would cause a Xen panic as before. Pinning dom0 to CPU0 merely fixed a 
separate post-resume issue with my SATA controller. In that thread, I link to 
the original Xen archives thread about pinning, which had nothing to do with 
Ryzen.

February 9, 2020 2:09 AM, "Marek Marczykowski-Górecki" 
<marma...@invisiblethingslab.com> wrote:
> (continuing discussion from the above PR)
> 
> The patch as it is, is not acceptable, as it may introduce security
> and/or stability issues on some machines. Xen (and Linux too) assumes
> what CPU features it can use based on CPUID flags. If those change
> during system runtime (including suspend/resume) some instructions or
> control registers may no longer be valid (->crash) or safe to use
> (->security issue).

Like I said, it's been very stable for me so far. I've only had one bad resume 
in the months I've been using it, suspending at least once a day. Security 
issues on the other hand are indeed unknown at this point.

Also worth noting that this check is Xen-specific. Afaik, the Linux kernel 
doesn't check for these changes. So everyone using plain old Ubuntu or whatever 
is already subject to the same stability and security implications that this 
patch would introduce.

> If that's just about microcode updates, that's probably BIOS bug - if it
> applies microcode update on system startup, it should do the same on

Weird that it's happening equally on various vendor BIOSes as well as coreboot; 
the only thing they have in common is Ryzen 2xxx-3xxx chips. It doesn't sound 
to me like a **BIOS** bug, per se, unless all these vendors and the coreboot 
developers wrote the same bug independently. More likely an AMD bug, imo.

> system resume too. Anyway it's worth trying updating linux-firmware
> package, which carries microcode updates for AMD. This should make Xen
> apply microcode updates too - before checking those flags.
> I've just uploaded updated version of the package to the current-testing
> repository (both R4.0 and R4.1).

Thanks for the tip. I'll try it when I have a chance. 
`--enablerepo=qubes-dom0-current-testing kernel-latest linux-firmware` I'm 
guessing?

> If that's about something else, then fixing it would require finding
> what exactly is changing (and preferably also why). And only then find
> how to mitigate this issue. If specific flags would turn out to be not
> related to security features or otherwise having unwanted effects, then
> ignoring those changes would be an option. But ignoring _only those
> flags verified to be safe to ignore_, not all of them.

See my other reply about that.
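
For what it's worth, the "only flags verified to be safe" idea from the quote above could look something like this minimal Python sketch. It is only a model, not anything taken from Xen: `classify_changes`, `resume_is_acceptable`, and the `safe_to_ignore` masks are hypothetical, and which bits (if any) belong in such a mask is exactly what would still have to be established.

```python
def classify_changes(before, after, safe_to_ignore):
    """Split the changed bits of each feature word into (ignorable, blocking).

    before, after: lists of integer feature words captured around suspend/resume.
    safe_to_ignore: hypothetical per-word masks of bits verified as harmless.
    """
    ignorable, blocking = [], []
    for b, a, mask in zip(before, after, safe_to_ignore):
        changed = b ^ a                      # bits that differ after resume
        ignorable.append(changed & mask)     # differences covered by the mask
        blocking.append(changed & ~mask)     # differences that must still fail
    return ignorable, blocking


def resume_is_acceptable(before, after, safe_to_ignore):
    """True only if every changed bit is covered by the allowlist masks."""
    _, blocking = classify_changes(before, after, safe_to_ignore)
    return not any(blocking)
```

With empty masks this degenerates to the strict check that exists today, so the allowlist can only ever relax it bit by bit.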

But I would like to mention, there are already all kinds of options and 
parameters throughout the Xen, Qubes, and Linux codebases that come with 
stability/security implications. This isn't Apple iOS. You can easily shoot 
yourself in the foot. That's the nature of the beast. It is not Qubes' purpose 
to hide these from the user or take away control.

By that logic, we should also patch Xen so that "smt=off" is hardcoded, because 
as it is now someone might open xen.cfg, see that parameter, and decide to turn 
SMT back on for performance, which we all know is dangerous. Same with Qubes' 
"no-strict-reset", or dm-crypt's weak upstream default crypto parameters. I 
could go on and on.

So, again, I'm not suggesting we skip this check for everybody. I'm suggesting 
we make it into an undocumented Xen cmdline parameter known only to those who, 
as they say, have been warned. As it is right now, all of us who are affected 
by this are patching our own machines anyway, so what's the difference to 
anyone else?


> --
> Best Regards,
> Marek Marczykowski-Górecki


Thank you for your consideration and for taking the time to follow up on the 
ML. I look forward to hearing your thoughts.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/qubes-users/dba41acb99bd062cc72351b02244f53c%40disroot.org.
