Re: [PATCH v3 27/42] target/arm: Use softmmu tlbs for page table walking

2022-10-07 Thread Peter Maydell
On Fri, 7 Oct 2022 at 16:27, Richard Henderson
 wrote:
>
> On 10/7/22 02:01, Peter Maydell wrote:
> > The upcoming v8R support has its stage 2 attributes in the MAIR
> > format, so it might be a little awkward to assume the v8A-stage-2
> > format here rather than being able to add the "if !is_s2_format"
> > condition. I guess we'll deal with that when we get to it...
>
> Ah.  I had wondered whether it would be better to convert the result here,
> so that we always have the MAIR format.  I decided against it within the
> scope of this patch set because it meant that I kept the existing s1+s2
> attribute merging logic unchanged.

Unfortunately, for A-profile you can't just convert the s2 attrs
to MAIR format, because their interpretation depends on the
s1 attr values in the FWB case. So you have to keep the s2
attrs as raw until they get to the point of combination.
(v8R doesn't have any equivalent of FWB.)

thanks
-- PMM
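
[A minimal sketch, not from the thread, of the FWB point above: with
HCR_EL2.FWB set, some stage-2 attr encodings only say what to do with the
stage-1 result, so the stage-2 value cannot be folded into MAIR form on its
own and has to stay raw until the combine step. The function name and the
simplified default case below are invented; only the "use stage 1" and
"force Write-Back" cases follow the FWB scheme.]

#include <stdbool.h>
#include <stdint.h>

/* Illustration only; invented names, not QEMU code. */
static uint8_t combine_s2_attrs_sketch(bool fwb, uint8_t s1_mair, uint8_t s2_attrs)
{
    if (fwb) {
        switch (s2_attrs) {
        case 7:
            return s1_mair;   /* stage 2 says: keep the stage 1 attributes */
        case 6:
            return 0xff;      /* stage 2 forces Normal Write-Back */
        default:
            /* Device/non-cacheable encodings also consult stage 1; elided. */
            return (s2_attrs & 0x4) ? 0x44 : 0x00;
        }
    }
    /*
     * Without FWB, s2_attrs is in the stage 2 MemAttr format and is merged
     * with the stage 1 MAIR byte field by field (merge not shown).
     */
    return s1_mair;
}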



Re: [PATCH v3 27/42] target/arm: Use softmmu tlbs for page table walking

2022-10-07 Thread Richard Henderson

On 10/7/22 02:01, Peter Maydell wrote:

> The upcoming v8R support has its stage 2 attributes in the MAIR
> format, so it might be a little awkward to assume the v8A-stage-2
> format here rather than being able to add the "if !is_s2_format"
> condition. I guess we'll deal with that when we get to it...


Ah.  I had wondered whether it would be better to convert the result here, so that we 
always have the MAIR format.  I decided against it within the scope of this patch set 
because it meant that I kept the existing s1+s2 attribute merging logic unchanged.



>> +    /*
>> +     * Allow S1_ptw_translate to see any fault generated here.
>> +     * Since this may recurse, read and clear.
>> +     */
>> +    fi = cpu->env.tlb_fi;
>> +    if (fi) {
>> +        cpu->env.tlb_fi = NULL;
>> +    } else {
>> +        fi = memset(&local_fi, 0, sizeof(local_fi));
>> +    }


> This makes two architectures now that want to do "call a probe_access
> function, and get information that's known in the architecture-specific
> tlb_fill function", and need to do it via this awkward "have tlb_fill
> know that it should stash the info away in the CPU state struct somewhere"
> trick (the other being s390 tlb_fill_exc/tlb_fill_tec). But I don't
> really have a better idea...


A better idea would be most welcome, if anyone has one...  :-)


r~
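
[A toy sketch, not from the thread and with all names invented, of the
"stash a pointer in the CPU state so tlb_fill can report extra detail"
pattern being discussed; in the patch itself the plumbing is
probe_access_full() plus the new env->tlb_fi field.]

#include <stdbool.h>
#include <stdint.h>

typedef struct FaultInfo { int type; uint64_t addr; } FaultInfo;
typedef struct CPUFooState { FaultInfo *tlb_fi; } CPUFooState;

/* Stands in for the target's tlb_fill hook, which knows the fault details. */
static bool foo_tlb_fill(CPUFooState *env, uint64_t addr)
{
    if (addr & 1) {                 /* pretend odd addresses fault */
        if (env->tlb_fi) {          /* report via the stashed pointer */
            env->tlb_fi->type = 1;
            env->tlb_fi->addr = addr;
        }
        return false;
    }
    return true;
}

/* Caller side: stash the pointer, probe, then always clear it again. */
static bool probe_with_fault_info(CPUFooState *env, uint64_t addr, FaultInfo *fi)
{
    bool ok;

    env->tlb_fi = fi;               /* let the fill hook see where to report */
    ok = foo_tlb_fill(env, addr);
    env->tlb_fi = NULL;             /* clear: the fill path may recurse */
    return ok;
}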



Re: [PATCH v3 27/42] target/arm: Use softmmu tlbs for page table walking

2022-10-07 Thread Peter Maydell
On Sat, 1 Oct 2022 at 17:52, Richard Henderson
 wrote:
>
> So far, limit the change to S1_ptw_translate, arm_ldl_ptw, and
> arm_ldq_ptw.  Use probe_access_full to find the host address,
> and if so use a host load.  If the probe fails, we've got our
> fault info already.  On the off chance that page tables are not
> in RAM, continue to use the address_space_ld* functions.
>
> Signed-off-by: Richard Henderson 
> ---


> -static bool ptw_attrs_are_device(uint64_t hcr, ARMCacheAttrs cacheattrs)
> +static bool S2_attrs_are_device(uint64_t hcr, uint8_t attrs)
>  {
>      /*
>       * For an S1 page table walk, the stage 1 attributes are always
> @@ -202,41 +203,72 @@ static bool ptw_attrs_are_device(uint64_t hcr, ARMCacheAttrs cacheattrs)
>       * With HCR_EL2.FWB == 1 this is when descriptor bit [4] is 0, ie
>       * when cacheattrs.attrs bit [2] is 0.
>       */
> -    assert(cacheattrs.is_s2_format);
>      if (hcr & HCR_FWB) {
> -        return (cacheattrs.attrs & 0x4) == 0;
> +        return (attrs & 0x4) == 0;
>      } else {
> -        return (cacheattrs.attrs & 0xc) == 0;
> +        return (attrs & 0xc) == 0;
>      }

The upcoming v8R support has its stage 2 attributes in the MAIR
format, so it might be a little awkward to assume the v8A-stage-2
format here rather than being able to add the "if !is_s2_format"
condition. I guess we'll deal with that when we get to it...

> +        env->tlb_fi = fi;
> +        flags = probe_access_full(env, addr, MMU_DATA_LOAD,
> +                                  arm_to_core_mmu_idx(s2_mmu_idx),
> +                                  true, hphys, &full, 0);
> +        env->tlb_fi = NULL;
> +
> +        if (unlikely(flags & TLB_INVALID_MASK)) {
> +            goto fail;
> +        }
> +        *gphys = full->phys_addr;
> +        pte_attrs = full->pte_attrs;
> +        pte_secure = full->attrs.secure;
> +    }
> +

> --- a/target/arm/tlb_helper.c
> +++ b/target/arm/tlb_helper.c
> @@ -208,10 +208,21 @@ bool arm_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
>                        bool probe, uintptr_t retaddr)
>  {
>      ARMCPU *cpu = ARM_CPU(cs);
> -    ARMMMUFaultInfo fi = {};
>      GetPhysAddrResult res = {};
> +    ARMMMUFaultInfo local_fi, *fi;
>      int ret;
>
> +    /*
> +     * Allow S1_ptw_translate to see any fault generated here.
> +     * Since this may recurse, read and clear.
> +     */
> +    fi = cpu->env.tlb_fi;
> +    if (fi) {
> +        cpu->env.tlb_fi = NULL;
> +    } else {
> +        fi = memset(&local_fi, 0, sizeof(local_fi));
> +    }

This makes two architectures now that want to do "call a probe_access
function, and get information that's known in the architecture-specific
tlb_fill function", and need to do it via this awkward "have tlb_fill
know that it should stash the info away in the CPU state struct somewhere"
trick (the other being s390 tlb_fill_exc/tlb_fill_tec). But I don't
really have a better idea, so

Reviewed-by: Peter Maydell 

thanks
-- PMM



[PATCH v3 27/42] target/arm: Use softmmu tlbs for page table walking

2022-10-01 Thread Richard Henderson
So far, limit the change to S1_ptw_translate, arm_ldl_ptw, and
arm_ldq_ptw.  Use probe_access_full to find the host address,
and if so use a host load.  If the probe fails, we've got our
fault info already.  On the off chance that page tables are not
in RAM, continue to use the address_space_ld* functions.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h        |   5 +
 target/arm/ptw.c        | 207 ++--
 target/arm/tlb_helper.c |  17 +++-
 3 files changed, 155 insertions(+), 74 deletions(-)
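
[A condensed sketch of the flow the commit message describes, distilled from
the diff below; the helper name and out-parameters are invented and it
assumes the usual ptw.c includes, so it is not the patch itself.]

/* Sketch only: probe the stage-2 softmmu TLB, reporting faults via tlb_fi. */
static bool ptw_probe_sketch(CPUARMState *env, target_ulong addr,
                             ARMMMUIdx s2_mmu_idx, ARMMMUFaultInfo *fi,
                             void **host, CPUTLBEntryFull **full)
{
    int flags;

    env->tlb_fi = fi;                       /* tlb_fill reports faults here */
    flags = probe_access_full(env, addr, MMU_DATA_LOAD,
                              arm_to_core_mmu_idx(s2_mmu_idx),
                              true, host, full, 0);
    env->tlb_fi = NULL;

    if (flags & TLB_INVALID_MASK) {
        return false;    /* probe failed: *fi already describes the fault */
    }
    /*
     * On success: a non-NULL *host means the page table is in RAM and can be
     * read with a host load (ldl_le_p/ldq_le_p); a NULL *host (e.g. MMIO)
     * means falling back to the address_space_ld* path on (*full)->phys_addr.
     */
    return true;
}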

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 732c0c00ac..7108568685 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -225,6 +225,8 @@ typedef struct CPUARMTBFlags {
     target_ulong flags2;
 } CPUARMTBFlags;
 
+typedef struct ARMMMUFaultInfo ARMMMUFaultInfo;
+
 typedef struct CPUArchState {
     /* Regs for current mode.  */
     uint32_t regs[16];
@@ -715,6 +717,9 @@ typedef struct CPUArchState {
     struct CPUBreakpoint *cpu_breakpoint[16];
     struct CPUWatchpoint *cpu_watchpoint[16];
 
+    /* Optional fault info across tlb lookup. */
+    ARMMMUFaultInfo *tlb_fi;
+
     /* Fields up to this point are cleared by a CPU reset */
     struct {} end_reset_fields;
 
diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 45adb9d5a9..ba496c3421 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -9,6 +9,7 @@
 #include "qemu/osdep.h"
 #include "qemu/log.h"
 #include "qemu/range.h"
+#include "exec/exec-all.h"
 #include "cpu.h"
 #include "internals.h"
 #include "idau.h"
@@ -191,7 +192,7 @@ static bool regime_translation_disabled(CPUARMState *env, ARMMMUIdx mmu_idx,
     return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
 }
 
-static bool ptw_attrs_are_device(uint64_t hcr, ARMCacheAttrs cacheattrs)
+static bool S2_attrs_are_device(uint64_t hcr, uint8_t attrs)
 {
     /*
      * For an S1 page table walk, the stage 1 attributes are always
@@ -202,41 +203,72 @@ static bool ptw_attrs_are_device(uint64_t hcr, ARMCacheAttrs cacheattrs)
      * With HCR_EL2.FWB == 1 this is when descriptor bit [4] is 0, ie
      * when cacheattrs.attrs bit [2] is 0.
      */
-    assert(cacheattrs.is_s2_format);
     if (hcr & HCR_FWB) {
-        return (cacheattrs.attrs & 0x4) == 0;
+        return (attrs & 0x4) == 0;
     } else {
-        return (cacheattrs.attrs & 0xc) == 0;
+        return (attrs & 0xc) == 0;
     }
 }
 
 /* Translate a S1 pagetable walk through S2 if needed.  */
-static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
-                               hwaddr addr, bool *is_secure_ptr, bool debug,
-                               ARMMMUFaultInfo *fi)
+static bool S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx, hwaddr addr,
+                             bool *is_secure_ptr, void **hphys, hwaddr *gphys,
+                             bool debug, ARMMMUFaultInfo *fi)
 {
     bool is_secure = *is_secure_ptr;
     ARMMMUIdx s2_mmu_idx = is_secure ? ARMMMUIdx_Stage2_S : ARMMMUIdx_Stage2;
+    bool s2_phys = false;
+    uint8_t pte_attrs;
+    bool pte_secure;
 
-    if (arm_mmu_idx_is_stage1_of_2(mmu_idx) &&
-        !regime_translation_disabled(env, s2_mmu_idx, is_secure)) {
-        GetPhysAddrResult s2 = {};
-        uint64_t hcr;
-        int ret;
+    if (!arm_mmu_idx_is_stage1_of_2(mmu_idx)
+        || regime_translation_disabled(env, s2_mmu_idx, is_secure)) {
+        s2_mmu_idx = is_secure ? ARMMMUIdx_Phys_S : ARMMMUIdx_Phys_NS;
+        s2_phys = true;
+    }
 
-        ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, s2_mmu_idx,
-                                 is_secure, false, debug, &s2, fi);
-        if (ret) {
-            assert(fi->type != ARMFault_None);
-            fi->s2addr = addr;
-            fi->stage2 = true;
-            fi->s1ptw = true;
-            fi->s1ns = !is_secure;
-            return ~0;
+    if (unlikely(debug)) {
+        /*
+         * From gdbstub, do not use softmmu so that we don't modify the
+         * state of the cpu at all, including softmmu tlb contents.
+         */
+        if (s2_phys) {
+            *gphys = addr;
+            pte_attrs = 0;
+            pte_secure = is_secure;
+        } else {
+            GetPhysAddrResult s2 = { };
+            if (!get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, s2_mmu_idx,
+                                    is_secure, false, debug, &s2, fi)) {
+                goto fail;
+            }
+            *gphys = s2.f.phys_addr;
+            pte_attrs = s2.cacheattrs.attrs;
+            pte_secure = s2.f.attrs.secure;
         }
+        *hphys = NULL;
+    } else {
+        CPUTLBEntryFull *full;
+        int flags;
 
-        hcr = arm_hcr_el2_eff_secstate(env, is_secure);
-        if ((hcr & HCR_PTW) && ptw_attrs_are_device(hcr, s2.cacheattrs)) {
+        env->tlb_fi = fi;
+        flags = probe_access_full(env, addr, MMU_DATA_LOAD,
+                                  arm_to_core_mmu_idx(s2_mmu_idx),
+                                  true, hphys, &full, 0);
+