On 06.09.2013, at 10:10, Alexey Kardashevskiy wrote: > On the real hardware, RTAS is called in real mode and therefore > ignores top 4 bits of the address passed in the call. > > This fixes QEMU to use softmmu which can chop top 4 bits > if MSR DR is not set. > > Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > --- > Changes: > v2: > * masking from replaced with the use of cpu_ldl_data which can handle > realmode case properly > --- > hw/ppc/spapr_hcall.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c > index 063bd36..30f90bf 100644 > --- a/hw/ppc/spapr_hcall.c > +++ b/hw/ppc/spapr_hcall.c > @@ -4,6 +4,7 @@ > #include "hw/ppc/spapr.h" > #include "mmu-hash64.h" > #include "cpu-models.h" > +#include "exec/softmmu_exec.h" > > #include <libfdt.h> > > @@ -523,10 +524,11 @@ static target_ulong h_cede(PowerPCCPU *cpu, > sPAPREnvironment *spapr, > static target_ulong h_rtas(PowerPCCPU *cpu, sPAPREnvironment *spapr, > target_ulong opcode, target_ulong *args) > { > + CPUPPCState *env = &cpu->env; > target_ulong rtas_r3 = args[0]; > - uint32_t token = ldl_be_phys(rtas_r3); > - uint32_t nargs = ldl_be_phys(rtas_r3 + 4); > - uint32_t nret = ldl_be_phys(rtas_r3 + 8); > + uint32_t token = cpu_ldl_data(env, rtas_r3); > + uint32_t nargs = cpu_ldl_data(env, rtas_r3 + 4); > + uint32_t nret = cpu_ldl_data(env, rtas_r3 + 8);
Wow - this really aren't that many memory accesses. So looking through rtas calls that come after this dispatch, from what I see they mostly use rtas_ld and rtas_st. They are defined as static inline uint32_t rtas_ld(target_ulong phys, int n) { return ldl_be_phys(phys + 4*n); } static inline void rtas_st(target_ulong phys, int n, uint32_t val) { stl_be_phys(phys + 4*n, val); } which means we need to change them as well to make rtas subcall memory accesses resolve properly. Which brings me to my next question: Why don't we use rtas_ld in the 3 calls above? It looks like a perfect fit really. Also, just changing the call to cpu_ldl_data will potentially break I think. Imagine you do cpu_synchronize_state with DR=1 in a previous call. Now comes an RTAS call. Because you don't do a cpu_synchronize_state() call here, you will still have the DR=1 in MSR, accessing virtual memory at a potentially very low address that may not be mapped in. So before you do the cpu_ldl_data you need to either do cpu_synchronize_state() - which can be slow - or you need to manually change MSR.DR to 0 so that we end up accessing the correct memory location always. If you encapsulate that in rtas_ld and rtas_st it might end up looking quite simple: static inline uint32_t rtas_ld(CPUPPCState *env, target_ulong phys, int n) { uint32_t r; target_ulong msr = env->msr; /* Make sure we access RTAS data in guest real mode */ env->msr ~= MSR_DR; r = ldl_be_phys(phys + 4*n); env->msr = msr; return r; } Or if you think it's too complicated, you can simply make it static inline uint64_t ppc64_phys_to_real(uint64_t addr) { return addr & ~0xF000000000000000ULL; } static inline uint32_t rtas_ld(target_ulong phys, int n) { return ldl_be_phys(ppc64_phys_to_real(phys) + 4*n); } All things considered, I might even prefer the latter solution as it makes the code flow more obvious. But I'll leave the final call to you. Alex