Re: [Qemu-devel] [Qemu-arm] [patch 1/1]about armv8's prefetch decode

Wangjintang Fri, 24 Mar 2017 19:23:42 -0700

Hi Peter,
        A more detailed explanation is below.

> -----Original Message-----
> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
> Sent: Friday, March 24, 2017 6:06 PM
> To: Wangjintang
> Cc: Pranith Kumar; Shlomo Pongratz (A); Wanghaibin (Benjamin); qemu-arm;
> qemu-devel; Ori Chalak (A)
> Subject: Re: [Qemu-arm] [patch 1/1]about armv8's prefetch decode
> No, these changes look wrong. PRFM instructions do not need to
> do anything and should definitely not be emitting any intermediate
> code. In particular if you let execution fall through and try
> do_gpr_ld() then it will really do a load, which might cause
> an exception -- this is specifically forbidden for PRFM.
> Architecturally the ARM ARM says "it is valid for the PE to
> treat any or all prefetch instructions as a NOP", which is
> what QEMU does.
> 
> The existing code is correct. In general you should not
> expect to be able to deduce the guest instructions from
> the intermediate code representation.
>

"It is valid for the PE to treat any or all prefetch instructions as a NOP":
from the software point of view, that is correct.
The patch treats the prefetch as a load instruction, but does not modify
the rm/rt registers. Only when the PRFM instruction is emitted to the
intermediate code and performs a real load can we get the memory
address associated with the prefetch instruction. Because the rm/rt registers
are not modified, the application still runs correctly.
BTW, the newly added code is disabled by default, so it has no effect on
ordinary users.
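
To make the idea concrete, here is a minimal standalone sketch in Python. It is
not the patch and uses no QEMU code: the names record_prfm, regs, trace and
trace_prefetch_enabled are hypothetical, and the toy model treats register
index 31 as SP. It handles only the PRFM (immediate, unsigned offset) encoding
and, unlike the patch, it only computes the address instead of performing a
load. Register contents are read but never written, and the option is off by
default.

```python
# Toy model only: none of these names come from QEMU or from the patch.
PRFM_IMM_MASK = 0xFFC00000   # size, fixed bits and opc fields
PRFM_IMM_VALUE = 0xF9800000  # PRFM (immediate, unsigned offset)


def record_prfm(insn, regs, trace, trace_prefetch_enabled=False):
    """Record the address targeted by a PRFM instruction, if enabled.

    Returns True when insn is a PRFM (immediate, unsigned offset).
    regs is only read, never written, and no memory is accessed.
    """
    if (insn & PRFM_IMM_MASK) != PRFM_IMM_VALUE:
        return False
    if trace_prefetch_enabled:
        imm12 = (insn >> 10) & 0xFFF   # unsigned offset, scaled by 8
        rn = (insn >> 5) & 0x1F        # base register (31 is SP here)
        address = (regs[rn] + imm12 * 8) & 0xFFFFFFFFFFFFFFFF
        trace.append(("prefetch", address))
    return True


# prfm pldl1keep, [x1, #16]
regs = [0] * 32
regs[1] = 0x1000
trace = []
record_prfm(0xF9800000 | (2 << 10) | (1 << 5), regs, trace,
            trace_prefetch_enabled=True)
```

With the option left at its default, the same call recognises the instruction
but records nothing, which matches the NOP behaviour.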

In our case, we need the full instruction trace plus the memory addresses
accessed by ld/st instructions; the trace is the input to a chip
cycle-accurate model, similar to Flexus + QEMU.
The current code skips generating intermediate code for prefetch instructions,
so we can see the prefetch instruction itself, but not the memory address
it refers to.
We have measured that the ratio of prefetch instructions is about 2%~3% when
running Dhrystone in system mode. That ratio is high.
 ________________                        ________________
|                |                      |                |
|                |                      |                |
|      Qemu      |                      |      chip      |
|                |  instruction trace   | cycle-accurate |
|                | -------------------> |     model      |
|                |     memory trace     |                |
|________________|                      |________________|
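
As an illustration of how such a ratio can be computed from an instruction
trace, here is a small Python sketch. The record layout (kind, pc) and the
function name prefetch_ratio are assumptions made for this example, not the
format our tools actually use.

```python
def prefetch_ratio(records):
    """Return the fraction of trace records whose kind is "prefetch".

    records is an iterable of (kind, pc) tuples (hypothetical format).
    """
    records = list(records)
    total = len(records)
    if total == 0:
        return 0.0
    prefetches = sum(1 for kind, _pc in records if kind == "prefetch")
    return prefetches / total


# Synthetic trace: 3 prefetches among 100 instructions.
sample = [("prefetch", 0x400000 + 4 * i) for i in range(3)]
sample += [("other", 0x400100 + 4 * i) for i in range(97)]
ratio = prefetch_ratio(sample)
```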

Ori Chalak explains this as follows:
" Indeed, prefetch instruction affects only the micro architecture,
and hence not needed for running correctly the generated code.
However, we developed a performance simulator for a detailed
ARMv8 CPU model, and use Qemu to resolve the functionality.
And for this purpose we need to translate all instructions that
may affect the pipeline behavior, caches, etc.

This is not the major usage of Qemu, however there may be
others doing this and it may help them.
http://www.linux-kvm.org/images/4/45/01x09-Christopher_Covington-Using_Upstream_QEMU_for_CASS.pdf
"

Best Regards,
Wang jintang / Jed
Huawei Technologies Co., Ltd.
Email: wangjint...@huawei.com
http://www.huawei.com
