On Tue, 22 Jan 2019 at 16:09, Laszlo Ersek <ler...@redhat.com> wrote:
>
> Performing thread-necromancy on a patch from 2015, which is today known
> as commit b7de7e3cab3f. Also CC'ing Christoffer and Marc.
>
> Question at the bottom.
>
> On 12/07/15 08:06, Ard Biesheuvel wrote:
> > From: "Cohen, Eugene" <eug...@hp.com>
> >
> > This patch updates the ArmPkg variant of InvalidateInstructionCacheRange to
> > flush the data cache only to the point of unification (PoU). This improves
> > performance and also allows invalidation in scenarios where it would be
> > inappropriate to flush to the point of coherency (like when executing code
> > from L2 configured as cache-as-ram).
> >
> > Contributed-under: TianoCore Contribution Agreement 1.0
> > Signed-off-by: Eugene Cohen <eug...@hp.com>
> >
> > Added AARCH64 and ARM/GCC implementations of the above.
> >
> > Contributed-under: TianoCore Contribution Agreement 1.0
> > Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
> > ---
> >  ArmPkg/Include/Library/ArmLib.h                                | 8 +++++++-
> >  ArmPkg/Library/ArmCacheMaintenanceLib/ArmCacheMaintenanceLib.c | 2 +-
> >  ArmPkg/Library/ArmLib/AArch64/AArch64Support.S                 | 6 ++++++
> >  ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.S                     | 6 ++++++
> >  ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.asm                   | 5 +++++
> >  5 files changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/ArmPkg/Include/Library/ArmLib.h 
> > b/ArmPkg/Include/Library/ArmLib.h
> > index 9622444ec63f..85fa1f600ba9 100644
> > --- a/ArmPkg/Include/Library/ArmLib.h
> > +++ b/ArmPkg/Include/Library/ArmLib.h
> > @@ -183,12 +183,18 @@ ArmInvalidateDataCacheEntryByMVA (
> >
> >  VOID
> >  EFIAPI
> > -ArmCleanDataCacheEntryByMVA (
> > +ArmCleanDataCacheEntryToPoUByMVA(
> >    IN  UINTN   Address
> >    );
> >
> >  VOID
> >  EFIAPI
> > +ArmCleanDataCacheEntryByMVA(
> > +IN  UINTN   Address
> > +);
> > +
> > +VOID
> > +EFIAPI
> >  ArmCleanInvalidateDataCacheEntryByMVA (
> >    IN  UINTN   Address
> >    );
> > diff --git a/ArmPkg/Library/ArmCacheMaintenanceLib/ArmCacheMaintenanceLib.c 
> > b/ArmPkg/Library/ArmCacheMaintenanceLib/ArmCacheMaintenanceLib.c
> > index feab4497ac1b..1045f9068f4d 100644
> > --- a/ArmPkg/Library/ArmCacheMaintenanceLib/ArmCacheMaintenanceLib.c
> > +++ b/ArmPkg/Library/ArmCacheMaintenanceLib/ArmCacheMaintenanceLib.c
> > @@ -64,7 +64,7 @@ InvalidateInstructionCacheRange (
> >    IN      UINTN                     Length
> >    )
> >  {
> > -  CacheRangeOperation (Address, Length, ArmCleanDataCacheEntryByMVA);
> > +  CacheRangeOperation (Address, Length, ArmCleanDataCacheEntryToPoUByMVA);
> >    ArmInvalidateInstructionCache ();
> >    return Address;
> >  }
> > diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S 
> > b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > index c530d19e897e..db21f73f0ed7 100644
> > --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > @@ -22,6 +22,7 @@
> >  GCC_ASM_EXPORT (ArmInvalidateInstructionCache)
> >  GCC_ASM_EXPORT (ArmInvalidateDataCacheEntryByMVA)
> >  GCC_ASM_EXPORT (ArmCleanDataCacheEntryByMVA)
> > +GCC_ASM_EXPORT (ArmCleanDataCacheEntryToPoUByMVA)
> >  GCC_ASM_EXPORT (ArmCleanInvalidateDataCacheEntryByMVA)
> >  GCC_ASM_EXPORT (ArmInvalidateDataCacheEntryBySetWay)
> >  GCC_ASM_EXPORT (ArmCleanDataCacheEntryBySetWay)
> > @@ -72,6 +73,11 @@ ASM_PFX(ArmCleanDataCacheEntryByMVA):
> >    ret
> >
> >
> > +ASM_PFX(ArmCleanDataCacheEntryToPoUByMVA):
> > +  dc      cvau, x0    // Clean single data cache line to PoU
> > +  ret
> > +
> > +
> >  ASM_PFX(ArmCleanInvalidateDataCacheEntryByMVA):
> >    dc      civac, x0   // Clean and invalidate single data cache line
> >    ret
> > diff --git a/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.S 
> > b/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.S
> > index 5f030d92de31..7de1b11ef818 100644
> > --- a/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.S
> > +++ b/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.S
> > @@ -19,6 +19,7 @@
> >  GCC_ASM_EXPORT (ArmInvalidateInstructionCache)
> >  GCC_ASM_EXPORT (ArmInvalidateDataCacheEntryByMVA)
> >  GCC_ASM_EXPORT (ArmCleanDataCacheEntryByMVA)
> > +GCC_ASM_EXPORT (ArmCleanDataCacheEntryToPoUByMVA)
> >  GCC_ASM_EXPORT (ArmCleanInvalidateDataCacheEntryByMVA)
> >  GCC_ASM_EXPORT (ArmInvalidateDataCacheEntryBySetWay)
> >  GCC_ASM_EXPORT (ArmCleanDataCacheEntryBySetWay)
> > @@ -69,6 +70,11 @@ ASM_PFX(ArmCleanDataCacheEntryByMVA):
> >    bx      lr
> >
> >
> > +ASM_PFX(ArmCleanDataCacheEntryToPoUByMVA):
> > +  mcr     p15, 0, r0, c7, c11, 1  @clean single data cache line to PoU
> > +  bx      lr
> > +
> > +
> >  ASM_PFX(ArmCleanInvalidateDataCacheEntryByMVA):
> >    mcr     p15, 0, r0, c7, c14, 1  @clean and invalidate single data cache 
> > line
> >    bx      lr
> > diff --git a/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.asm 
> > b/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.asm
> > index df7e22dca2d9..a460bd2da7a9 100644
> > --- a/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.asm
> > +++ b/ArmPkg/Library/ArmLib/ArmV7/ArmV7Support.asm
> > @@ -34,6 +34,11 @@ CTRL_I_BIT      EQU     (1 << 12)
> >    bx      lr
> >
> >
> > + RVCT_ASM_EXPORT ArmCleanDataCacheEntryToPoUByMVA
> > +  mcr     p15, 0, r0, c7, c11, 1  ; clean single data cache line to PoU
> > +  bx      lr
> > +
> > +
> >   RVCT_ASM_EXPORT ArmCleanInvalidateDataCacheEntryByMVA
> >    mcr     p15, 0, r0, c7, c14, 1  ; clean and invalidate single data cache 
> > line
> >    bx      lr
> >
>
> The implementation of the LoadImage boot service calls
> InvalidateInstructionCacheRange():
>
>   CoreLoadImage()                         
> [MdeModulePkg/Core/Dxe/Image/Image.c]
>     CoreLoadImageCommon()                 
> [MdeModulePkg/Core/Dxe/Image/Image.c]
>       CoreLoadPeImage()                   
> [MdeModulePkg/Core/Dxe/Image/Image.c]
>         InvalidateInstructionCacheRange() 
> [ArmPkg/Library/ArmCacheMaintenanceLib/ArmCacheMaintenanceLib.c]
>
> Assuming that this code runs in a virtual machine, is any one the
> following scenarios plausible?
>
> (1) the dcache is flushed to PoU on physical socket #M, and the icache
>     is invalidated to PoU on physical socket #N, M being unequal to N
>
> (2) the dcache cleaning + icache invalidation happen on the same
>     physical socket #M, but the image execution (via StartImage)
>     commences on socket #N
>
> In either case, DRAM (PoC) would not form a "bridge" between the
> physical CPUs sitting in said sockets. If the host kernel / KVM
> re-scheduled VCPU#0 from socket #M to socket #N in the "middle" of the
> above scenarios (1) and (2), could that result in StartImage executing
> *not* what LoadImage loaded?
>

Cache maintenance by virtual address to the PoU is broadcast to the
inner shareable shareability domain. So this should be sufficient to
ensure that the CPU executing the instruction sees the correct data.

Our code uses full system barriers after each stage of cache
maintenance, so the code looks correct to me. However, it does rely on
the host firmware to configure the inner shareable shareability domain
correctly (and use the correct memory attributes for all memory it
maps)

> Because, we got a report
> <https://bugzilla.redhat.com/show_bug.cgi?id=1666280> -- sorry, private
> bug :/ -- in which the *very first* instruction of grub crashes. That
> is, the instruction that forms the entry point of the image. The symptom
> is:
>
> > ESR : EC 0x21  IL 0x1  ISS 0x0000000E
> >
> > Instruction abort: Permission fault, second level
>
> If I'm parsing the ARM ARM correctly:
>
> - "ESR" stands for "Exception Syndrome Register".
>
> - "EC 0x21" stands for Exception Class 0x21, "Instruction Abort taken
>   without a change of Exception level. Used for MMU faults generated by
>   instruction accesses, and for synchronous external aborts, including
>   synchronous parity errors."
>
> - "ISS" stands for "Instruction Specific Syndrome".
>
> - "ISS 0x00E" (24 bit value), given EC 0x21, stands for "Permission
>   fault, second level", as the log itself says.
>

This appears to be a TLB maintenance issue, not a cache maintenance
issue. Level 2 mappings are 2 MB block mappings, and so we usually
have smaller mappings for executables (unless GRUB is much larger than
2 MB and it covers an entire 2 MB naturally aligned block)

Does it happen in DEBUG mode? Is SetUefiImageMemoryAttributes() being
called to remap the memory R-X ?

> The crash reproduces intermittently / unpredictably, and it happens on
> unspecified hardware. Personally I couldn't reproduce it on my APM
> Mustang A3, using the exact same guest ISO.
>
> So, I'm thinking that, as soon as CoreStartImage() calls
> Image->EntryPoint(), the instruction fetch blows up, because we didn't
> clean the dcache / invalidate the icache *to PoC* in CoreLoadImage(),
> and KVM moved VCPU#0 to a different physical socket meanwhile.
>
> Am I totally lost? :)
>
> (I plan to undo commit cf580da1bc4c ("ArmPkg/ArmLib: don't invalidate
> entire I-cache on range operation", 2016-05-12) and commit b7de7e3cab3f
> ("ArmPkg: update InvalidateInstructionCacheRange to flush only to PoU",
> 2015-12-08), and to ask the reporter to reproduce the issue like that,
> but I figured I'd ask first.)
>
> Thanks!
> Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to