On 14.08.21 15:55, Dongjiu Geng wrote: > There is not need to init bootstrap page table in all CPUs, > bootstrap page table is only for boot, when jump to C language > environment it will use the formal page table instead of > bootstrap page table. > > Init bootstrap page table in all CPUs will impact hypervisor > startup time.
Did you measure this? > > Signed-off-by: Dongjiu Geng <[email protected]> > --- > hypervisor/arch/arm64/entry.S | 21 ++++++++++++++++++--- > 1 file changed, 18 insertions(+), 3 deletions(-) > > diff --git a/hypervisor/arch/arm64/entry.S b/hypervisor/arch/arm64/entry.S > index a9cabf7f..69af05f9 100644 > --- a/hypervisor/arch/arm64/entry.S > +++ b/hypervisor/arch/arm64/entry.S > @@ -22,6 +22,9 @@ > #define LINUX_HVC_SET_VECTORS_LEGACY 1 > #define LINUX_HVC_SET_VECTORS 0 > > +#define INITIALIZED_BOOTSTRAP_PT 1 > +#define UNINITIALIZED_BOOTSTRAP_PT 0 > + > .data > vmexits_total: > .quad CPU_STAT_VMEXITS_TOTAL > @@ -29,6 +32,9 @@ vmexits_total: > vmexits_smccc: > .quad CPU_STAT_VMEXITS_SMCCC > > +bootstrap_pt_status: Let's make this a boolean, bootstrap_pt_initialized, then it is clear by itself that 0 means uninitialized. > + .byte UNINITIALIZED_BOOTSTRAP_PT > + > /* x11 must contain the virt-to-phys offset */ > .macro virt2phys, register > add \register, \register, x11 > @@ -120,15 +126,24 @@ el2_entry: > cmp x1, #0x16 > b.ne . /* not hvc */ > > + adr x0, bootstrap_pt_status > + ldaxr w1, [x0] > + mov w2, #INITIALIZED_BOOTSTRAP_PT > + /* Check whether bootstrap page table is created */ > + cmp w1, w2 > + b.eq 1f > + > + /* Set inited status */ > + stlxr w1, w2, [x0] This create a nasty race: You declare the pt initialized before you actually called init_bootstrap_pt. You need to reorder those to make it correct. You didn't notice the impact of this race because you also didn't check w1 being 0, thus w2 actually having been stored. If w1 is returned non-zero, you need to go back and re-read the memory until you had exclusive access and know the true value of bootstrap_pt_status/initialized. > /* init bootstrap page tables */ > bl init_bootstrap_pt > - > +1: > /* enable temporary mmu mapings for early initialization */ > adr x0, bootstrap_pt_l0 > - adr x30, 1f /* set lr manually to ensure... */ > + adr x30, 2f /* set lr manually to ensure... */ > phys2virt x30 /* ...that we return to a virtual address */ > b enable_mmu_el2 > -1: > +2: > /* install the final vectors */ > adr x1, hyp_vectors > msr vbar_el2, x1 > The general idea is not wrong, but the implementation is so far. However, I'm not yet sure how much this will actually buy us because the secondary CPUs will now spin (if done correctly) until the first CPU has initialized the bootstrap pt - rather than re-doing that themselves. You may only gain time if the CPUs are initialized a little bit time-shifted or if those concurrent writes + dcache flushes are actually slower than doing it only once. Please measure with at least 4 cores! Thanks, Jan -- You received this message because you are subscribed to the Google Groups "Jailhouse" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/jailhouse-dev/5e9b29de-b93c-0604-40ff-dd29bc47cf09%40web.de.
