On 2023/1/30 13:43, Richard Henderson wrote:
On 1/29/23 16:03, LIU Zhiwei wrote:
Thanks. It's a bug. We should load all memory addresses to local TCG
temps first.
Do you think we should probe all the memory addresses for the store
pair instructions? If so, can we avoid the use of a helper function?
Depends on what the hardware does. Even with a trap in the middle the
stores are restartable, since no register state changes.
I refer to the specification of LDP and STP on AARCH64. The
specification allows
"any access performed before the exception was taken is repeated".
In detail,
"If, according to these rules, an instruction is executed as a sequence of
accesses, exceptions, including interrupts,
can be taken during that sequence, regardless of the memory type being
accessed. If any of these exceptions are
returned from using their preferred return address, the instruction that
generated the sequence of accesses is
re-executed, and so any access performed before the exception was taken is
repeated. See also Taking an interrupt
during a multi-access load or store on page D1-4664."
However, I see that QEMU implements LDP and STP differently. LDP
only writes the first register once it has ensured that the second
access will not trap.
So I have two questions here.
1) One about the QEMU implementation of LDP. Can we implement LDP
as two direct loads to cpu registers instead of to local TCG temps?
2) One about the comment below. Why do register state changes make the
instruction non-restartable? Do you mean that if the first register
changes, it may influence the address calculation after the trap?
"Even with a trap in the middle the stores are restartable, since no register state
changes."
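To make question 2 concrete, here is a minimal, self-contained C sketch of
one way a register write before a trap can break re-execution. It is a toy
model, not QEMU code: ToyCpu, toy_load, ldp_direct and ldp_via_temps are
hypothetical names, memory is word-indexed, and any address outside the
array is assumed to fault. The case shown is a load pair whose first
destination is also the base register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS  4
#define MEM_WORDS 8   /* words 0..7 are mapped; any other address faults */

/* Toy machine state (hypothetical, for illustration only). */
typedef struct {
    uint64_t reg[NUM_REGS];
    uint64_t mem[MEM_WORDS];
} ToyCpu;

/* Returns false on a simulated fault; *val is left untouched in that case. */
static bool toy_load(const ToyCpu *cpu, uint64_t addr, uint64_t *val)
{
    if (addr >= MEM_WORDS) {
        return false;
    }
    *val = cpu->mem[addr];
    return true;
}

/* Load pair that writes rd1 before the second access is known to succeed. */
static bool ldp_direct(ToyCpu *cpu, int rd1, int rd2, int rb)
{
    assert(rd1 >= 0 && rd1 < NUM_REGS && rd2 >= 0 && rd2 < NUM_REGS);
    uint64_t addr = cpu->reg[rb];

    if (!toy_load(cpu, addr, &cpu->reg[rd1])) {
        return false;
    }
    /* If this faults, rd1 has already changed. With rd1 == rb the base
     * is gone, so re-executing the insn would use the wrong address. */
    return toy_load(cpu, addr + 1, &cpu->reg[rd2]);
}

/* Load pair that commits to registers only after both accesses succeed. */
static bool ldp_via_temps(ToyCpu *cpu, int rd1, int rd2, int rb)
{
    assert(rd1 >= 0 && rd1 < NUM_REGS && rd2 >= 0 && rd2 < NUM_REGS);
    uint64_t addr = cpu->reg[rb];
    uint64_t t1, t2;

    if (!toy_load(cpu, addr, &t1) || !toy_load(cpu, addr + 1, &t2)) {
        return false;   /* no register state has changed */
    }
    cpu->reg[rd1] = t1;
    cpu->reg[rd2] = t2;
    return true;
}
```

With reg[0] = 7 as the base and rd1 = 0, the second access (word 8) faults
in both variants. After ldp_direct the base register holds the loaded data,
while after ldp_via_temps it still holds 7, so only the latter can simply
be re-executed once the fault is handled.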
But if you'd like no changes before verifying both stores, for this case you
can pack the pair into a larger data type: TCGv_i64 for pair of
32-bit, and TCGv_i128 for pair of 64-bit.
Patches for TCGv_i128 [1] are just finishing review; patches to
describe atomicity of the larger operation are also on list [2].
Anyway, the idea is that you issue one TCG memory operation, the
entire operation is validated, and then the stores happen.
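As a rough illustration of that idea for a pair of 32-bit values, here is
a self-contained C sketch. It does not use the TCG API: pack_pair32,
store64_checked and stp32 are hypothetical names, the toy memory is a plain
byte array in which anything past MEM_BYTES is assumed to fault, and a
little-endian guest layout is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_BYTES 16   /* bytes 0..15 are mapped; anything beyond faults */

/* Pack the pair into one 64-bit value, first element in the low half
 * (little-endian guest layout is an assumption of this sketch). */
static uint64_t pack_pair32(uint32_t first, uint32_t second)
{
    return (uint64_t)first | ((uint64_t)second << 32);
}

/* A single 8-byte store: the whole range is validated up front,
 * and only then is any byte written. */
static bool store64_checked(uint8_t *mem, uint64_t addr, uint64_t val)
{
    if (addr > MEM_BYTES - 8) {
        return false;   /* simulated fault; memory is untouched */
    }
    for (int i = 0; i < 8; i++) {
        mem[addr + i] = (uint8_t)(val >> (8 * i));
    }
    return true;
}

/* Store pair expressed as one wider memory operation. */
static bool stp32(uint8_t *mem, uint64_t addr, uint32_t first, uint32_t second)
{
    assert(mem != 0);
    return store64_checked(mem, addr, pack_pair32(first, second));
}
```

Because the check covers all eight bytes before the first write, a fault on
the second half can never leave the first half already stored.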
The main reason is that assembler can do this check. Is it necessary
to check this in QEMU?
Yes. Consider what happens when the insn is encoded with .long. Does
the hardware trap with an illegal instruction exception? Is the behavior simply
unspecified? The manual could be improved to specify, akin to the Arm
terms: UNDEFINED, CONSTRAINED UNPREDICTABLE, IMPLEMENTATION DEFINED, etc.
Thanks, I will fix the manual.
Best Regards,
Zhiwei
r~
[1]
https://patchew.org/QEMU/20230126043824.54819-1-richard.hender...@linaro.org/
[2]
https://patchew.org/QEMU/20221118094754.242910-1-richard.hender...@linaro.org/