On 5/31/19 12:04 PM, Jan Kiszka wrote:
> On 31.05.19 17:52, Ralf Ramsauer wrote:
>> On 5/30/19 3:00 AM, Jan Kiszka wrote:
>>> On 21.05.19 16:34, Ralf Ramsauer wrote:
>>>> Recent gcc versions emit SSE instructions for 32-bit inmates (e.g., in
>>>> hex2str or cmdline_parse routines). Inmates aren't able to execute
>>>> those instructions as SSE is not enabled and will crash.
>>>>
>>>> Enabling SSE is the same code for 32 and 64 bit x86 and
>>>> straightforward: Look up SSE availability via cpuid and enable OSFXSR
>>>> in cr4. If SSE is not available, stop the inmate.
>>>>
>>>> If AVX is available, activate it (XCR0).
>>>>
>>>> Look up features that need no explicit activation.
>>>>
>>>> Signed-off-by: Ralf Ramsauer <[email protected]>
>>>> ---
>>>>    inmates/lib/x86/Makefile           |   6 +-
>>>>    inmates/lib/x86/cpu-features.c     | 103 +++++++++++++++++++++++++++++
>>>>    inmates/lib/x86/header-32.S        |   9 ++-
>>>>    inmates/lib/x86/header-64.S        |   9 ++-
>>>>    inmates/lib/x86/include/asm/regs.h |  36 ++++++++++
>>>>    5 files changed, 158 insertions(+), 5 deletions(-)
>>>>    create mode 100644 inmates/lib/x86/cpu-features.c
>>>>
>>>> diff --git a/inmates/lib/x86/Makefile b/inmates/lib/x86/Makefile
>>>> index e474ffd0..ed3b04d5 100644
>>>> --- a/inmates/lib/x86/Makefile
>>>> +++ b/inmates/lib/x86/Makefile
>>>> @@ -40,7 +40,7 @@ include $(INMATES_LIB)/Makefile.lib
>>>>
>>>>    always := lib.a lib32.a
>>>>
>>>> -TARGETS := header-common.o ioapic.o printk.o setup.o smp.o uart.o
>>>> +TARGETS := cpu-features.o header-common.o ioapic.o printk.o setup.o smp.o uart.o
>>>>    TARGETS += ../alloc.o ../pci.o ../string.o ../cmdline.o ../setup.o
>>>>    TARGETS += ../uart-8250.o ../printk.o
>>>>    TARGETS_32_ONLY := header-32.o
>>>> @@ -57,6 +57,10 @@ $(obj)/lib32.a: $(addprefix $(obj)/,$(lib32-y))
>>>>
>>>>    targets += header-32.o
>>>>
>>>> +# Code of this object is called before SSE/AVX is available. Ensure that the
>>>> +# compiler won't generate unsupported instructions for this file.
>>>> +CFLAGS_cpu-features.o += -mno-sse
>>>> +
>>>>    $(obj)/%-32.o: c_flags += -m32
>>>>    $(obj)/%-32.o: $(src)/%.c
>>>>        $(call if_changed_rule,cc_o_c)
>>>> diff --git a/inmates/lib/x86/cpu-features.c b/inmates/lib/x86/cpu-features.c
>>>> new file mode 100644
>>>> index 00000000..9cf98543
>>>> --- /dev/null
>>>> +++ b/inmates/lib/x86/cpu-features.c
>>>> @@ -0,0 +1,103 @@
>>>> +/*
>>>> + * Jailhouse, a Linux-based partitioning hypervisor
>>>> + *
>>>> + * Copyright (c) OTH Regensburg, 2019
>>>> + *
>>>> + * Authors:
>>>> + *  Ralf Ramsauer <[email protected]>
>>>> + *
>>>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>>>> + * the COPYING file in the top-level directory.
>>>> + *
>>>> + * Alternatively, you can use or redistribute this file under the following
>>>> + * BSD license:
>>>> + *
>>>> + * Redistribution and use in source and binary forms, with or without
>>>> + * modification, are permitted provided that the following conditions
>>>> + * are met:
>>>> + *
>>>> + * 1. Redistributions of source code must retain the above copyright
>>>> + *    notice, this list of conditions and the following disclaimer.
>>>> + *
>>>> + * 2. Redistributions in binary form must reproduce the above copyright
>>>> + *    notice, this list of conditions and the following disclaimer in the
>>>> + *    documentation and/or other materials provided with the distribution.
>>>> + *
>>>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
>>>> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
>>>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
>>>> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
>>>> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
>>>> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
>>>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
>>>> + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
>>>> + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
>>>> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
>>>> + * THE POSSIBILITY OF SUCH DAMAGE.
>>>> + */
>>>> +
>>>> +#include <inmate.h>
>>>> +#include <asm/regs.h>
>>>> +
>>>> +/* Must only be called from assembler via jmp */
>>>> +void arch_init_features(void);
>>>> +
>>>> +struct x86_cpu_features x86_cpu_features __attribute__((section(".data")));
>>>> +
>>>> +/*
>>>> + * We arrive here very early, and we don't have a stack. Take care.
>>>> + *
>>>> + * Every booting CPU will call this function. We make the assumption that all
>>>> + * CPUs have the same feature set. So we don't need any locks when writing to
>>>> + * x86_cpu_features.
>>
>> Any comments on this? Is it okay that this code will run on each CPU,
>> even if only required on the primary CPU?
>>
> 
> Running once on all CPUs is harmless - we are not in a hurry.
> 
> Running without stack is a problem, see below.

Depends. It's perfectly fine to have no stack if we don't use it (even
if we had a corrupt stack…).

> 
>>>> + */
>>>> +void __attribute__((naked, noreturn, section(".boot")))
>>>
>>> What's "naked" supposed to do here? The compiler says it will ignore
>>> it.
>>
>> Huh? Does it produce a warning?
> 
> Yes, gcc 7.4 said so.
> 
>>
>> naked ensures that there are no prologue and epilogue sequences. And as
>> the comment states: We won't have a stack that early.
>>
>> BTW: That raises a question: Where do we set the stack for secondary
>> CPUs on x86?
> 
> In header.S, always to stack_top... Looks like the ARM issue is still
> present here.

I remember that I once fixed some stack corruption issues for ARM
inmates, but it was a bit different there.

Trying to interpret your words: We're currently running on an
overlapping stack for secondary (i.e., all) CPUs, right? That should
probably be fixed first.

Hmm. On x86 we halt all secondary CPUs very early and wait for a
SIPI. Besides setting ap_entry, we could allocate a valid stack on the
primary CPU inside smp_start_cpu() and pass it along.

> 
>>
>>>
>>>> +arch_init_features(void)
>>>> +{
>>>> +    register u64 features;
>>>> +
>>>> +    features = cpuid_edx(X86_CPUID_FEATURES, 0);
>>>> +    /* Check availability of FPU */
>>>> +    x86_cpu_features.fpu = !!(features & X86_FEATURE_FPU);
>>>> +
>>>> +    /* Discover and enable FXSR */
>>>> +    if (features & X86_FEATURE_FXSR) {
>>>> +        write_cr4(read_cr4() | X86_CR4_OSFXSR);
>>>> +        x86_cpu_features.fxsr = true;
>>>> +    }
>>
>> BTW: Another point that I realised in the meanwhile: AFAICT, the
>> hypervisor will very likely hand over with OSFXSR enabled as it is set
>> by the cr4_required1 mask.
> 
> The hypervisor hands over in reset state, and there these features are off.

Ahm. No. On a real AMD machine that's true, yes.

But it behaves differently on QEMU x86: CR4 is handed over with 0x2020,
which means VMXE | PAE. And even if I try to deactivate VMXE, it won't
work. Look at this inmate code:

       printk("CR4 handover: 0x%lx\n", read_cr4());
       write_cr4(read_cr4() & ~0x2000);
       printk("CR4 now: 0x%lx\n", read_cr4());

will output:
CR4 handover: 0x2020
CR4 now: 0x2020

CR4 is write-intercepted, and vmx_set_guest_cr in vmx.c:443 will set it
again on interception.

> 
>>
>> Nevertheless, it should be properly set.
>>
>>>> +
>>>> +    /* Check availability of SSE */
>>>> +    x86_cpu_features.sse = !!(features & X86_FEATURE_SSE);
>>>> +    x86_cpu_features.sse2 = !!(features & X86_FEATURE_SSE2);
>>>> +
>>>> +    /* ECX hides the rest */
>>>> +    features = cpuid_ecx(X86_CPUID_FEATURES, 0);
>>>> +    x86_cpu_features.sse3 = !!(features & X86_FEATURE_SSE3);
>>>> +    x86_cpu_features.sse4_1 = !!(features & X86_FEATURE_SSE4_1);
>>>> +    x86_cpu_features.sse4_2 = !!(features & X86_FEATURE_SSE4_2);
>>>> +    x86_cpu_features.pclmulqdq = !!(features & X86_FEATURE_PCLMULQDQ);
>>>> +
>>>> +    if (features & X86_FEATURE_XSAVE) {
>>>> +        /* Enable XSAVE related instructions */
>>>> +        write_cr4(read_cr4() | X86_CR4_OSXSAVE);
>>>> +        x86_cpu_features.xsave = true;
>>>> +
>>>> +        /*
>>>> +         * Intel SDM 13.2: A bit can be set in XCR0 if and only if the
>>>> +         * corresponding bit is set in this bitmap.  Every processor
>>>> +         * that supports the XSAVE feature set will set EAX[0] (x87
>>>> +         * state) and EAX[1] (SSE state).
>>>> +         *
>>>> +         * We can always safely write SSE + FP, but only set AVX if
>>>> +         * available.
>>>> +         */
>>>> +
>>>> +        features = cpuid_edax(X86_CPUID_XSTATE, 0);
>>>> +        write_xcr0(read_xcr0() | (features & X86_XCR0_AVX) | \
>>>> +               X86_XCR0_SSE | X86_XCR0_X87);
>>>> +        x86_cpu_features.avx = !!(features & X86_XCR0_AVX);
>>>> +    }
>>>> +
>>>> +    /* hand control back to assembler */
>>>> +    asm volatile("jmp x86_start\t\n");
>>>
>>> Hacky... Is there no stack yet to do proper call/ret?
>>
>> See above, no stack. But I agree on hacky. :-)
>>
>>>
>>> But if you want to / have to use a jmp here, you need to tell the
>>> compiler that
>>> this will never "return" (__builtin_unreachable).
>>
>> What's the difference between noreturn and builtin_unreachable?
> 
> The former declares it for the function and, thus, also informs its
> callers, the latter marks that some line is never reached. That is
> needed when you call something that is noreturn but does not declare it
> (like your inline assembly).
> 
>>
>> The noreturn attribute makes sure that, besides the epilogue, we won't
>> have a ret.
> 
> The compiler should have barked at you that your noreturn function seems
> to return (the compiler has no idea about the semantic of the inline
> assembly).

I used gcc 8.3.0. No complaints. Anyway, I get your point.

  Ralf

> 
> Jan
> 
