On Fri, 11 Jan 2019 at 12:52, Peter Maydell <peter.mayd...@linaro.org> wrote:
>
> On Thu, 10 Jan 2019 at 19:33, Matwey V. Kornilov
> <matwey.korni...@gmail.com> wrote:
> > I am running the same application compiled for aarch64 and armv7l on
> > x86_64 platform using qemu-user-linux tools.
> >
> > I see dramatic performance difference (30 times) between emulated
> > architectures: aarch64 runs for ~4 minutes, armv7l runs for ~2 hours.
> > I do understand that CPU architecture emulation is an inherently slow
> > thing, but my question is about the difference.
> >
> > How could I debug to understand what is the reason for such a big
> > difference? I've already tried to run stress-ng compiled for these two
> > architectures, but it leads to the same performance per second.
> >
> > I am running qemu 2.11, should I try other version?
>
> Yes, do try 3.1 -- we have done some overall TCG performance
> improvements.

Indeed, qemu-arm from master runs for 4 minutes where 2.11 runs for 2
hours for me. That is an impressive improvement.

>
> For a big difference between target architectures like that,
> I would try starting by using some host performance tools on
> the two runs to see where all the time is being taken in
> the armv7l guest run -- is it all in translated guest code,
> or is there more time (proportionally) spent in particular
> parts of the QEMU C code? Does the armv7l version do
> many more or different syscalls (check with the QEMU -strace
> option) ?
>
> Also you should check performance on h/w 32 bit vs
> 64-bit Arm if you can, to confirm that it's not just
> that the guest application runs much slower there.
> (If you don't have the arm hardware you could at least
> check x86 32-bit vs 64-bit.)
>
> thanks
> -- PMM
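
For the syscall comparison suggested above, a small script can tally
syscalls per guest from two saved -strace logs. This is only a sketch:
it assumes each log line looks like "<pid> <name>(<args>) = <ret>",
which should be verified against the actual output, and the helper
names count_syscalls and compare are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape of QEMU -strace output: "<pid> <name>(<args>) = <ret>".
# Lines that do not match (guest program output, wrapped lines) are skipped.
SYSCALL_RE = re.compile(r"^\s*\d+\s+([A-Za-z_][A-Za-z0-9_]*)\(")


def count_syscalls(lines):
    """Count how many times each syscall name appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def compare(counts_a, counts_b):
    """Return (name, count_a, count_b) rows, largest absolute difference first.

    Ties are broken alphabetically by syscall name.
    """
    names = set(counts_a) | set(counts_b)
    rows = ((name, counts_a[name], counts_b[name]) for name in names)
    return sorted(rows, key=lambda row: (-abs(row[1] - row[2]), row[0]))
```

The logs could be captured with something like
"qemu-arm -strace ./app 2> arm.log" and the aarch64 equivalent (./app
and the log names are placeholders; the redirect assumes the trace goes
to stderr), then compared with
compare(count_syscalls(open("arm.log")), count_syscalls(open("aarch64.log"))).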



-- 
With best regards,
Matwey V. Kornilov
