On 31.05.19 17:52, Ralf Ramsauer wrote:
On 5/30/19 3:00 AM, Jan Kiszka wrote:
On 21.05.19 16:34, Ralf Ramsauer wrote:
Recent gcc versions emit SSE instructions for 32-bit inmates (e.g., in the
hex2str or cmdline_parse routines). Since SSE is not enabled, inmates cannot
execute those instructions and will crash.

Enabling SSE is straightforward and uses the same code for 32- and 64-bit
x86: look up SSE availability via cpuid and enable OSFXSR in CR4. If SSE is
not available, stop the inmate.

If AVX is available, activate it (XCR0).

Look up features that need no explicit activation.
Signed-off-by: Ralf Ramsauer <[email protected]>
---
inmates/lib/x86/Makefile | 6 +-
inmates/lib/x86/cpu-features.c | 103 +++++++++++++++++++++++++++++
inmates/lib/x86/header-32.S | 9 ++-
inmates/lib/x86/header-64.S | 9 ++-
inmates/lib/x86/include/asm/regs.h | 36 ++++++++++
5 files changed, 158 insertions(+), 5 deletions(-)
create mode 100644 inmates/lib/x86/cpu-features.c
diff --git a/inmates/lib/x86/Makefile b/inmates/lib/x86/Makefile
index e474ffd0..ed3b04d5 100644
--- a/inmates/lib/x86/Makefile
+++ b/inmates/lib/x86/Makefile
@@ -40,7 +40,7 @@ include $(INMATES_LIB)/Makefile.lib
always := lib.a lib32.a
-TARGETS := header-common.o ioapic.o printk.o setup.o smp.o uart.o
+TARGETS := cpu-features.o header-common.o ioapic.o printk.o setup.o smp.o uart.o
TARGETS += ../alloc.o ../pci.o ../string.o ../cmdline.o ../setup.o
TARGETS += ../uart-8250.o ../printk.o
TARGETS_32_ONLY := header-32.o
@@ -57,6 +57,10 @@ $(obj)/lib32.a: $(addprefix $(obj)/,$(lib32-y))
targets += header-32.o
+# Code of this object is called before SSE/AVX is available. Ensure that the
+# compiler won't generate unsupported instructions for this file.
+CFLAGS_cpu-features.o += -mno-sse
+
$(obj)/%-32.o: c_flags += -m32
$(obj)/%-32.o: $(src)/%.c
$(call if_changed_rule,cc_o_c)
diff --git a/inmates/lib/x86/cpu-features.c b/inmates/lib/x86/cpu-features.c
new file mode 100644
index 00000000..9cf98543
--- /dev/null
+++ b/inmates/lib/x86/cpu-features.c
@@ -0,0 +1,103 @@
+/*
+ * Jailhouse, a Linux-based partitioning hypervisor
+ *
+ * Copyright (c) OTH Regensburg, 2019
+ *
+ * Authors:
+ * Ralf Ramsauer <[email protected]>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ *
+ * Alternatively, you can use or redistribute this file under the following
+ * BSD license:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <inmate.h>
+#include <asm/regs.h>
+
+/* Must only be called from assembler via jmp */
+void arch_init_features(void);
+
+struct x86_cpu_features x86_cpu_features __attribute__((section(".data")));
+
+/*
+ * We arrive here very early, and we don't have a stack. Take care.
+ *
+ * Every booting CPU will call this function. We make the assumption that
+ * all CPUs have the same feature set. So we don't need any locks when
+ * writing to x86_cpu_features.
Any comments on this? Is it okay that this code will run on each CPU,
even if only required on the primary CPU?
Running once on all CPUs is harmless - we are not in a hurry.
Running without stack is a problem, see below.
+ */
+void __attribute__((naked, noreturn, section(".boot")))
What's "naked" supposed to do here? The compiler says it will ignore it.
Huh? Does it produce a warning?
Yes, gcc 7.4 said so.
naked ensures that there are no prologue and epilogue sequences. And as the
comment states: we won't have a stack that early.
BTW: That raises a question: Where do we set the stack for secondary
CPUs on x86?
In header.S, always to stack_top... Looks like the ARM issue is still present
here.
+arch_init_features(void)
+{
+ register u64 features;
+
+ features = cpuid_edx(X86_CPUID_FEATURES, 0);
+ /* Check availability of FPU */
+ x86_cpu_features.fpu = !!(features & X86_FEATURE_FPU);
+
+ /* Discover and enable FXSR */
+ if (features & X86_FEATURE_FXSR) {
+ write_cr4(read_cr4() | X86_CR4_OSFXSR);
+ x86_cpu_features.fxsr = true;
+ }
BTW: Another point that I realized in the meantime: AFAICT, the hypervisor
will very likely hand over with OSFXSR already enabled, as it is set by the
cr4_required1 mask.
The hypervisor hands over in reset state, and there these features are off.
Nevertheless, it should be properly set.
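(Side note for readers following along: even if the bit does arrive already
set, re-setting it is harmless. A minimal host-side sketch — `fake_cr4` is a
hypothetical stand-in for the real register accessors, bit positions per the
Intel SDM — shows that the read-modify-write is idempotent:)

```c
#include <assert.h>

#define X86_CR4_OSFXSR  (1ul << 9)   /* CR4 bit 9: OS supports FXSAVE/FXRSTOR */
#define X86_CR4_OSXSAVE (1ul << 18)  /* CR4 bit 18: OS supports XSAVE */

/* Hypothetical stand-in for the real control register */
static unsigned long fake_cr4;

static unsigned long read_cr4(void) { return fake_cr4; }
static void write_cr4(unsigned long val) { fake_cr4 = val; }
```

ORing in a bit that is already set leaves the register unchanged, so the
sequence is safe regardless of the state the hypervisor hands over.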
+
+ /* Check availability of SSE */
+ x86_cpu_features.sse = !!(features & X86_FEATURE_SSE);
+ x86_cpu_features.sse2 = !!(features & X86_FEATURE_SSE2);
+
+ /* ECX hides the rest */
+ features = cpuid_ecx(X86_CPUID_FEATURES, 0);
+ x86_cpu_features.sse3 = !!(features & X86_FEATURE_SSE3);
+ x86_cpu_features.sse4_1 = !!(features & X86_FEATURE_SSE4_1);
+ x86_cpu_features.sse4_2 = !!(features & X86_FEATURE_SSE4_2);
+ x86_cpu_features.pclmulqdq = !!(features & X86_FEATURE_PCLMULQDQ);
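(Aside for readers: the `!!(features & BIT)` idiom above collapses any set
bit into a clean 0/1 for the bool fields. A minimal user-space sketch — the
struct is reduced to four fields, bit positions are CPUID leaf 1/EDX values
per the Intel SDM — exercises the pattern:)

```c
#include <assert.h>
#include <stdbool.h>

/* CPUID leaf 1, EDX feature bits */
#define X86_FEATURE_FPU   (1u << 0)
#define X86_FEATURE_FXSR  (1u << 24)
#define X86_FEATURE_SSE   (1u << 25)
#define X86_FEATURE_SSE2  (1u << 26)

struct x86_cpu_features {
	bool fpu, fxsr, sse, sse2;
};

/* !! turns any non-zero mask result into exactly 1 (true) */
static void decode_edx(struct x86_cpu_features *f, unsigned int edx)
{
	f->fpu  = !!(edx & X86_FEATURE_FPU);
	f->fxsr = !!(edx & X86_FEATURE_FXSR);
	f->sse  = !!(edx & X86_FEATURE_SSE);
	f->sse2 = !!(edx & X86_FEATURE_SSE2);
}
```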
+
+ if (features & X86_FEATURE_XSAVE) {
+ /* Enable XSAVE related instructions */
+ write_cr4(read_cr4() | X86_CR4_OSXSAVE);
+ x86_cpu_features.xsave = true;
+
+ /*
+ * Intel SDM 13.2: A bit can be set in XCR0 if and only if the
+ * corresponding bit is set in this bitmap. Every processor
+ * that supports the XSAVE feature set will set EAX[0] (x87
+ * state) and EAX[1] (SSE state).
+ *
+ * We can always safely write SSE + FP, but only set AVX if
+ * available.
+ */
+
+ features = cpuid_edax(X86_CPUID_XSTATE, 0);
+ write_xcr0(read_xcr0() | (features & X86_XCR0_AVX) |
+            X86_XCR0_SSE | X86_XCR0_X87);
+ x86_cpu_features.avx = !!(features & X86_XCR0_AVX);
+ }
+
+ /* hand control back to assembler */
+ asm volatile("jmp x86_start");
Hacky... Is there no stack yet to do proper call/ret?
See above, no stack. But I agree on hacky. :-)
But if you want to / have to use a jmp here, you need to tell the
compiler that
this will never "return" (__builtin_unreachable).
What's the difference between noreturn and builtin_unreachable?
The former declares it for the function and, thus, also informs its callers, the
latter marks that some line is never reached. That is needed when you call
something that is noreturn but does not declare it (like your inline assembly).
The noreturn attribute makes sure that, besides omitting the epilogue, we
won't have a ret.
The compiler should have barked at you that your noreturn function seems to
return (the compiler has no idea about the semantic of the inline assembly).
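To make the difference concrete in a host-side sketch — `halt()` here is a
hypothetical stand-in for the `asm("jmp x86_start")` hand-off, i.e. a call
the compiler cannot see through:

```c
#include <stdlib.h>

/* Stand-in for the asm("jmp x86_start") hand-off: never returns,
 * but its prototype does not tell the compiler so. */
static void halt(void)
{
	exit(0);
}

/* noreturn is a promise to callers: control never comes back, so the
 * compiler may drop the epilogue and the ret. Because it cannot see
 * that halt() never returns, gcc would warn that this noreturn
 * function "returns"; __builtin_unreachable() marks the spot as dead
 * code and silences that warning. */
__attribute__((noreturn)) static void hand_off(void)
{
	halt();
	__builtin_unreachable();
}
```

So: noreturn is the function-level declaration, __builtin_unreachable() is
the statement-level marker needed right after a call (or inline asm) that
the compiler cannot prove to be terminal.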
Jan
--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux