Alistair Francis <alistai...@gmail.com> writes:

>> @@ -310,10 +311,17 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
>>      }
>>
>>      if (size == 0) {
>> -        if (riscv_feature(env, RISCV_FEATURE_MMU)) {
>> +        if (riscv_cpu_mxl(env) == MXL_RV32) {
>> +            satp_mode = SATP32_MODE;
>> +        } else {
>> +            satp_mode = SATP64_MODE;
>> +        }
>> +
>> +        if (riscv_feature(env, RISCV_FEATURE_MMU)
>> +            && get_field(env->satp, satp_mode)) {
>>              /*
>> -             * If size is unknown (0), assume that all bytes
>> -             * from addr to the end of the page will be accessed.
>> +             * If size is unknown (0) and virtual memory is enabled, assume that
>> +             * all bytes from addr to the end of the page will be accessed.
>>              */
>>              pmp_size = -(addr | TARGET_PAGE_MASK);
>
> I'm not sure if we need this at all.
>
> This function is only called from get_physical_address_pmp() which
> then calculates the maximum size using pmp_is_range_in_tlb().
I'm by no means an expert on QEMU and the TCG, so I've spun up GDB to
track down why exactly this function is called with a `size = 0`
argument. It seems that there are, generally, two code paths into this
function for instruction fetching:

1. From `get_page_addr_code`: this invokes `tlb_fill` with `size = 0`
   to check whether an entire page is accessible and can be translated
   given the current MMU / PMP configuration. In my particular example,
   it may rightfully fail at this point. `get_page_addr_code` can
   handle this and will subsequently cause an MMU protection check to
   be run for each instruction translated.

2. From `riscv_tr_translate_insn` through `cpu_lduw_code`, which
   executes `tlb_fill` with `size = 2` (to try to decode a compressed
   instruction), assuming that the above check failed.

So far, so good. In this context, it actually makes sense for
`pmp_hart_has_privs` to interpret `size = 0` as asking whether the
entire page is allowed to be accessed. (See the small demo of the
page-offset arithmetic behind this fallback in the postscript below.)

> I suspect that we could just use sizeof(target_ulong) as the fallback
> for every time size == 0. Then pmp_is_range_in_tlb() will set the
> tlb_size to the maximum possible size of the PMP region.

Given the above, I don't think that this is correct either. The PMP
check would pass even for non-page-sized regions, but the entire page
would then be accessible through the TCG's TLB, as a consequence of
`get_page_addr_code`.

In the current implementation, `get_page_addr_code_hostp` calls
`tlb_fill`, which in turn invokes the RISC-V `tlb_fill` TCG op with the
parameter `probe = false`. This raises a PMP exception in the CPU,
whereas `get_page_addr_code` seems to expect a failing `tlb_fill` to be
side-effect free, so that the MMU protection checks can be re-run per
instruction in the TCG code generation phase. (A toy model of this
probe contract is included in the postscript as well.)

I think that this is sufficient evidence to conclude that my initial
patch is actually incorrect. However, I am unsure how this issue can be
solved properly. One approach which seems to work is to change
`get_page_addr_code_hostp` to use a non-faulting page-table read
instead:

@@ -1510,11 +1510,15 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr,
     uintptr_t mmu_idx = cpu_mmu_index(env, true);
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
+    CPUState *cs = env_cpu(env);
+    CPUClass *cc = CPU_GET_CLASS(cs);
     void *p;

     if (unlikely(!tlb_hit(entry->addr_code, addr))) {
         if (!VICTIM_TLB_HIT(addr_code, addr)) {
-            tlb_fill(env_cpu(env), addr, 0, MMU_INST_FETCH, mmu_idx, 0);
+            /* Non-faulting page-table read: */
+            cc->tcg_ops->tlb_fill(cs, addr, 0, MMU_INST_FETCH, mmu_idx, true,
+                                  0);
             index = tlb_index(env, mmu_idx, addr);
             entry = tlb_entry(env, mmu_idx, addr);

However, given that this touches the generic TCG implementation, I
cannot judge whether it is correct or has any unintended side effects
for other targets. If it is correct, I'd be happy to send a proper
patch.

-Leon
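
P.S. As promised above, here is a small standalone demo (my own sketch,
not QEMU code; TARGET_PAGE_MASK is hard-coded for 4 KiB pages here) of
the `pmp_size = -(addr | TARGET_PAGE_MASK)` fallback quoted at the top.
The bitwise OR sets every bit above the page offset, so the
two's-complement negation yields exactly the number of bytes from
`addr` to the end of its page:

#include <inttypes.h>
#include <stdio.h>

#define TARGET_PAGE_SIZE 4096u    /* assumption: 4 KiB target pages */
#define TARGET_PAGE_MASK (~(uint64_t)(TARGET_PAGE_SIZE - 1))

int main(void)
{
    uint64_t addrs[] = { 0x80000000u, 0x80000800u, 0x80000ffeu };

    for (size_t i = 0; i < sizeof(addrs) / sizeof(addrs[0]); i++) {
        /* addrs[i] | TARGET_PAGE_MASK sets all bits above the page
         * offset, so its negation is the page size minus that offset. */
        uint64_t pmp_size = -(addrs[i] | TARGET_PAGE_MASK);
        printf("addr=0x%08" PRIx64 " -> pmp_size=%" PRIu64 "\n",
               addrs[i], pmp_size);
    }
    return 0;
}

This prints 4096, 2048 and 2 respectively, i.e. the "rest of the page"
that the PMP check conservatively assumes will be accessed.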
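
P.P.S. And here is the toy model of the probe contract mentioned above
(again my own sketch, not QEMU code; toy_tlb_fill and cpu_exception are
made-up names). With probe == true the fill merely reports failure to
its caller; with probe == false it unwinds through the exception path
and never returns, which is precisely the side effect that
`get_page_addr_code` does not seem to expect:

#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

static jmp_buf cpu_exception;     /* stands in for the CPU's trap path */

/* Toy tlb_fill: translation always fails (e.g. the PMP denies the
 * fetch), mirroring the situation discussed above. */
static bool toy_tlb_fill(bool probe)
{
    if (probe) {
        return false;             /* non-faulting: just report failure */
    }
    longjmp(cpu_exception, 1);    /* faulting: models cpu_loop_exit_restore */
}

int main(void)
{
    if (setjmp(cpu_exception)) {
        printf("probe=false: exception raised, guest takes a PMP trap\n");
        return 0;
    }

    if (!toy_tlb_fill(true)) {
        printf("probe=true: failure reported, no side effects\n");
    }

    toy_tlb_fill(false);          /* never returns */
    return 1;                     /* not reached */
}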