On 14.08.21 15:55, Dongjiu Geng wrote:
> There is not need to init bootstrap page table in all CPUs,
> bootstrap page table is only for boot, when jump to C language
> environment it will use the formal page table instead of
> bootstrap page table.
>
> Init bootstrap page table in all CPUs will impact hypervisor
> startup time.

Did you measure this?

>
> Signed-off-by: Dongjiu Geng <[email protected]>
> ---
>  hypervisor/arch/arm64/entry.S | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/hypervisor/arch/arm64/entry.S b/hypervisor/arch/arm64/entry.S
> index a9cabf7f..69af05f9 100644
> --- a/hypervisor/arch/arm64/entry.S
> +++ b/hypervisor/arch/arm64/entry.S
> @@ -22,6 +22,9 @@
>  #define LINUX_HVC_SET_VECTORS_LEGACY 1
>  #define LINUX_HVC_SET_VECTORS                0
>
> +#define INITIALIZED_BOOTSTRAP_PT     1
> +#define UNINITIALIZED_BOOTSTRAP_PT   0
> +
>       .data
>  vmexits_total:
>       .quad CPU_STAT_VMEXITS_TOTAL
> @@ -29,6 +32,9 @@ vmexits_total:
>  vmexits_smccc:
>       .quad CPU_STAT_VMEXITS_SMCCC
>
> +bootstrap_pt_status:

Let's make this a boolean, bootstrap_pt_initialized, then it is clear by
itself that 0 means uninitialized.

> +     .byte UNINITIALIZED_BOOTSTRAP_PT
> +
>  /* x11 must contain the virt-to-phys offset */
>  .macro virt2phys, register
>       add     \register, \register, x11
> @@ -120,15 +126,24 @@ el2_entry:
>       cmp     x1, #0x16
>       b.ne    .               /* not hvc */
>
> +     adr     x0, bootstrap_pt_status
> +     ldaxr   w1, [x0]
> +     mov     w2, #INITIALIZED_BOOTSTRAP_PT
> +     /* Check whether bootstrap page table is created */
> +     cmp     w1, w2
> +     b.eq    1f
> +
> +     /* Set inited status */
> +     stlxr   w1, w2, [x0]

This create a nasty race: You declare the pt initialized before you
actually called init_bootstrap_pt. You need to reorder those to make it
correct.

You didn't notice the impact of this race because you also didn't check
w1 being 0, thus w2 actually having been stored. If w1 is returned
non-zero, you need to go back and re-read the memory until you had
exclusive access and know the true value of bootstrap_pt_status/initialized.

>       /* init bootstrap page tables */
>       bl      init_bootstrap_pt
> -
> +1:
>       /* enable temporary mmu mapings for early initialization */
>       adr     x0, bootstrap_pt_l0
> -     adr     x30, 1f         /* set lr manually to ensure... */
> +     adr     x30, 2f         /* set lr manually to ensure... */
>       phys2virt x30           /* ...that we return to a virtual address */
>       b       enable_mmu_el2
> -1:
> +2:
>       /* install the final vectors */
>       adr     x1, hyp_vectors
>       msr     vbar_el2, x1
>

The general idea is not wrong, but the implementation is so far.

However, I'm not yet sure how much this will actually buy us because the
secondary CPUs will now spin (if done correctly) until the first CPU has
initialized the bootstrap pt - rather than re-doing that themselves. You
may only gain time if the CPUs are initialized a little bit time-shifted
or if those concurrent writes + dcache flushes are actually slower than
doing it only once. Please measure with at least 4 cores!

Thanks,
Jan

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/5e9b29de-b93c-0604-40ff-dd29bc47cf09%40web.de.

Reply via email to