On Tue, Apr 15, 2025 at 3:22 PM Ziqiao Kong <ziqiaok...@gmail.com> wrote: > > On Tue, Apr 15, 2025 at 3:19 PM Ziqiao Kong <ziqiaok...@gmail.com> wrote: > > > > Hello Philippe, > > > > On Tue, Apr 15, 2025 at 3:15 PM Philippe Mathieu-Daudé > > <phi...@linaro.org> wrote: > > > > > > On 15/4/25 09:04, Ziqiao Kong wrote: > > > > Accidentally not cc all recipients. Sorry for the confusion. Below is > > > > the duplicated message: > > > > > > > > Hello Philippe, > > > > > > > > On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé > > > > <phi...@linaro.org> wrote: > > > >> > > > >> Hi, > > > >> > > > >> On 14/4/25 18:59, Ziqiao Kong wrote: > > > >>> Hello Philippe, > > > >>> > > > >>> Any further concern regarding this series? I certainly would like to > > > >>> investigate > > > >>> and help =). > > > >> > > > >> Short term I can't keep looking because I'm busy with other stuffs and > > > >> tagged this patch for another review, because there is some endianness > > > >> code smell in get_physical_address(). I understand your change fixes > > > >> your issue, but I'm skeptical about it, in part because there are no > > > >> such use in the whole code base. My change suggestion is just a > > > >> starting > > > >> point, more is needed. > > > > > > > > Thanks for responding. > > > > > > > > Actually, the pattern of this usage is actually very common in the code > > > > base and > > > > that's why I fixed in this way. Sorry I should have put this in the > > > > cover letter to > > > > justify my fix. Below is an incomplete list of the code using this > > > > pattern: > > > > > > > > - target/i386/tcg/system/excp_helper.c:129 > > > > > > > > if (likely(in->haddr)) { > > > > old = cpu_to_le32(old); > > > > new = cpu_to_le32(new); > > > > return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old; > > > > } > > > > > > > > - target/arm/ptw.c: 840 > > > > > > > > if (ptw->out_be) { > > > > old_val = cpu_to_be64(old_val); > > > > new_val = cpu_to_be64(new_val); > > > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val); > > > > cur_val = be64_to_cpu(cur_val); > > > > } else { > > > > old_val = cpu_to_le64(old_val); > > > > new_val = cpu_to_le64(new_val); > > > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val); > > > > cur_val = le64_to_cpu(cur_val); > > > > } > > > > > > Doh OK... > > > > > > > > > > > You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all > > > > matches. > > > > > > > > > > > >> > > > >>> > > > >>> Bests, > > > >>> Ziqiao > > > >>> > > > >>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaok...@gmail.com> > > > >>> wrote: > > > >>>> > > > >>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé > > > >>>> <phi...@linaro.org> wrote: > > > >>>>> > > > >>>>> Hi, > > > >>>>> > > > >>>>> On 14/4/25 05:46, Ziqiao Kong wrote: > > > >>>>>> On big endian systems, pte and updated_pte hold big endian host > > > >>>>>> data > > > >>>>>> while pte_pa points to little endian target data. This means the > > > >>>>>> branch > > > >>>>>> at cpu_helper.c:1669 will be always satisfied and restart > > > >>>>>> translation, > > > >>>>>> causing an endless translation loop. > > > >>>>>> > > > >>>>> > > > >>>>> Cc: qemu-sta...@nongnu.org > > > >>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers") > > > >>>>> > > > >>>>>> Signed-off-by: Ziqiao Kong <ziqiaok...@gmail.com> > > > >>>>>> --- > > > >>>>>> target/riscv/cpu_helper.c | 4 ++-- > > > >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) > > > >>>>>> > > > >>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c > > > >>>>>> index 6c4391d96b..bc146771c8 100644 > > > >>>>>> --- a/target/riscv/cpu_helper.c > > > >>>>>> +++ b/target/riscv/cpu_helper.c > > > >>>>>> @@ -1662,9 +1662,9 @@ static int > > > >>>>>> get_physical_address(CPURISCVState *env, hwaddr *physical, > > > >>>>>> target_ulong *pte_pa = > > > >>>>>> qemu_map_ram_ptr(mr->ram_block, addr1); > > > >>>>>> target_ulong old_pte; > > > >>>>>> if (riscv_cpu_sxl(env) == MXL_RV32) { > > > >>>>>> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, > > > >>>>>> pte, updated_pte); > > > >>>>>> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, > > > >>>>>> cpu_to_le32(pte), cpu_to_le32(updated_pte)); > > > > > > Then don't we need: > > > > > > old_pte = le32_to_cpu(old_pte); > > > > > > >>>>>> } else { > > > >>>>>> - old_pte = qatomic_cmpxchg(pte_pa, pte, > > > >>>>>> updated_pte); > > > >>>>>> + old_pte = qatomic_cmpxchg(pte_pa, > > > >>>>>> cpu_to_le64(pte), cpu_to_le64(updated_pte)); > > > > > > old_pte = le64_to_cpu(old_pte); > > > > > > ?
Oh you are correct. I will draft a v3. > > > > Note old_pte is no longer used later (it is defined within the scope here) > > and > > dropped immediately after qatomic_cmpxchg so we don't need that extra bswap. > > Also pte is not modified inplace previously, unlike the code samples above. > > > > > > > > > >>>>>> } > > > >>>>>> if (old_pte != pte) { > > > >>>>>> goto restart; > > > >>>>> > > > >>>>> If PTEs are always stored in LE order, maybe what we want is > > > >>>>> earlier: > > > >>>>> > > > >>>>> -- >8 -- > > > >>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c > > > >>>>> index 619c76cc001..b6ac2800240 100644 > > > >>>>> --- a/target/riscv/cpu_helper.c > > > >>>>> +++ b/target/riscv/cpu_helper.c > > > >>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState > > > >>>>> *env, hwaddr *physical, > > > >>>>> if (riscv_cpu_mxl(env) == MXL_RV32) { > > > >>>>> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res); > > > >>>>> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, > > > >>>>> &res); > > > >>>>> } else { > > > >>>>> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res); > > > >>>>> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, > > > >>>>> &res); > > > >>>> > > > >>>> Unfortunately, this doesn't work in two ways: > > > >>>> > > > >>>> 1. Note pte is used in the following code and that means pte must > > > >>>> hold > > > >>>> a correct value from the > > > >>>> view of host endian (in my case, big endian not little endian). > > > >>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while > > > >>>> address_space_leq will dispatch to ldq_p. > > > >>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so > > > >>>> making no effects. > > > >>>> > > > >>>> Per my testing, this patch doesn't have any effect indeed. To have a > > > >>>> brief view what is happening, > > > >>>> see the logs just before atomic_cmpxchg: > > > >>>> > > > >>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f > > > >>>> > > > >>>>> } > > > >>>>> --- > > > >> > > >