On Fri, 08 Aug 2025 05:07, Pierrick Bouvier <pierrick.bouv...@linaro.org> wrote:
>This documentation summarizes how to use the plugin, and presents two
>examples of the possibilities offered by it, in system and user mode.
>
>It also explains how to rebuild and reproduce those examples.
>
>Signed-off-by: Pierrick Bouvier <pierrick.bouv...@linaro.org>
>---
> docs/about/emulation.rst | 197 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 197 insertions(+)
>
>diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
>index 456d01d5b08..9ce47ac2712 100644
>--- a/docs/about/emulation.rst
>+++ b/docs/about/emulation.rst
>@@ -816,6 +816,203 @@ This plugin can limit the number of Instructions Per Second that are executed::
>       The lower the number the more accurate time will be, but the less efficient the plugin.
>       Defaults to ips/10
> 
>+Uftrace
>+.......
>+
>+``contrib/plugins/uftrace.c``
>+
>+This plugin generates a binary trace compatible with
>+`uftrace <https://github.com/namhyung/uftrace>`_.
>+
>+The plugin supports aarch64 and x64, and works in user and system mode, which
>+allows tracing a system boot, something that is usually not possible.

Now it is!

>+
>+In user mode, the memory mapping is directly copied from ``/proc/self/maps`` at
>+the end of execution. Uftrace should be able to retrieve symbols by itself,
>+without any additional step.
>+In system mode, the default memory mapping is empty, and you can generate
>+one (and associated symbols) using ``contrib/plugins/uftrace_symbols.py``.
>+Symbols must be present in ELF binaries.
>+
>+It tracks the call stack (based on frame pointer analysis). Thus, your program
>+and its dependencies must be compiled using ``-fno-omit-frame-pointer
>+-mno-omit-leaf-frame-pointer``. In 2024, `Ubuntu and Fedora enabled it by
>+default again on x64
>+<https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html>`_.
>+On aarch64, this is less of a problem, as they are usually part of the ABI,
>+except for leaf functions. That's true for user space applications, but not
>+necessarily for bare metal code. You can read this `section
>+<uftrace_build_system_example>` to easily build a system with frame pointers.
>+
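
For readers unfamiliar with frame-pointer unwinding, here is a minimal
Python sketch of the idea. It is not the plugin's code: the ``memory``
dict and all addresses are made up for illustration, and it only assumes
the usual frame record layout (saved frame pointer at ``[fp]``, return
address at ``[fp + 8]``).

```python
def walk_frame_pointers(memory, fp, max_depth=64):
    """Return the return addresses found by following frame records.

    memory: dict mapping address -> 64-bit value (hypothetical stand-in
            for guest memory reads)
    fp:     current frame pointer (x29 on aarch64, rbp on x64)
    """
    return_addresses = []
    while fp != 0 and len(return_addresses) < max_depth:
        # Frame record: [fp] = caller's frame pointer, [fp + 8] = return address
        next_fp = memory.get(fp)
        ret = memory.get(fp + 8)
        if next_fp is None or ret is None:
            break  # unreadable memory: stop walking
        return_addresses.append(ret)
        if next_fp != 0 and next_fp <= fp:
            break  # stack grows down, so caller frames live at higher addresses
        fp = next_fp
    return return_addresses


# Three nested frames; the outermost one has a null saved frame pointer.
memory = {
    0x1000: 0x1020, 0x1008: 0x400100,
    0x1020: 0x1040, 0x1028: 0x400200,
    0x1040: 0x0,    0x1048: 0x400300,
}
callers = walk_frame_pointers(memory, 0x1000)
print([hex(a) for a in callers])
```

If a function is built without a frame pointer, its record is simply
missing from this chain, which is why the compile flags above matter.
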
>+When tracing long scenarios (> 1 min), the generated trace can become very long,
>+making it hard to extract data from it. In this case, a simple solution is to
>+trace execution while generating a timestamped output log using
>+``qemu-system-aarch64 ... | ts "%s"``. Then, ``uftrace --time-range=start~end``
>+can be used to restrict the trace to that part of the execution.
>+
>+Performance-wise, the overhead compared to normal TCG execution is around 5x to 15x.
>+
>+.. list-table:: Uftrace plugin arguments
>+  :widths: 20 80
>+  :header-rows: 1
>+
>+  * - Option
>+    - Description
>+  * - trace-privilege-level=[on|off]
>+    - Generate separate traces for each privilege level (Exception Level +
>+      Security State on aarch64, Rings on x64).
>+
>+.. list-table:: uftrace_symbols.py arguments
>+  :widths: 20 80
>+  :header-rows: 1
>+
>+  * - Option
>+    - Description
>+  * - elf_file [elf_file ...]
>+    - path to an ELF file. Use /path/to/file:0xdeadbeef to add a mapping offset.
>+  * - --prefix-symbols
>+    - prepend binary name to symbols
>+
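
To make the ``/path/to/file:0xdeadbeef`` form concrete, here is a small
sketch of how such an argument could be split into a path and an offset.
The helper name ``parse_elf_spec`` is hypothetical and this is not taken
from ``uftrace_symbols.py``; it only illustrates the syntax described in
the table.

```python
def parse_elf_spec(spec):
    """Split 'path[:0xOFFSET]' into (path, offset); offset defaults to 0."""
    path, sep, offset = spec.rpartition(":")
    # Only treat the suffix as an offset when it looks like a hex literal,
    # so plain paths without ':' are returned unchanged.
    if sep and offset.lower().startswith("0x"):
        return path, int(offset, 16)
    return spec, 0


print(parse_elf_spec("u-boot/u-boot:0x60000000"))
print(parse_elf_spec("linux/vmlinux"))
```
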
>+Example user trace
>+++++++++++++++++++
>+
>+As an example, we can trace qemu itself running git::
>+
>+    $ ./build/qemu-aarch64 -plugin \
>+      build/contrib/plugins/libuftrace.so \
>+      ./build/qemu-aarch64 /usr/bin/git --help
>+
>+    # and generate a chrome trace directly
>+    $ uftrace dump --chrome | gzip > ~/qemu_aarch64_git_help.json.gz
>+
>+For convenience, you can download this trace `qemu_aarch64_git_help.json.gz
>+<https://fileserver.linaro.org/s/N8X8fnZ5yGRZLsT/download/qemu_aarch64_git_help.json.gz>`_.

We should be able to add static files in the docs/ folder that sphinx 
html can link to for images and json. WDYT?

>+Download it and open this trace on https://ui.perfetto.dev/. You can zoom in/out
>+using w,a,s,d keys. Some sequences taken from this trace:

You can use :kbd:`W` etc for nice key formatting

>+
>+- Loading program and its interpreter
>+
>+.. image:: https://fileserver.linaro.org/s/fie8JgX76yyL5cq/preview
>+   :height: 200px
>+
>+- open syscall
>+
>+.. image:: https://fileserver.linaro.org/s/rsXPTeZZPza4PcE/preview
>+   :height: 200px
>+
>+- TB creation
>+
>+.. image:: https://fileserver.linaro.org/s/GXY6NKMw5EeRCew/preview
>+   :height: 200px
>+
>+It's usually better to use ``uftrace record`` directly. However, tracing
>+binaries through qemu-user can be convenient when you don't want to recompile
>+them (``uftrace record`` requires instrumentation), as long as symbols are
>+present.
>+
>+Example system trace
>+++++++++++++++++++++
>+
>+A full trace example (chrome trace, from instructions below) generated from a
>+system boot can be found `here
>+<https://fileserver.linaro.org/s/WsemLboPEzo24nw/download/aarch64_boot.json.gz>`_.
>+Download it and open this trace on https://ui.perfetto.dev/. You can see code
>+executed for all privilege levels, and zoom in/out using w,a,s,d keys. You can
>+find below some sequences taken from this trace:
>+
>+- Two first stages of boot sequence in Arm Trusted Firmware (EL3 and S-EL1)
>+
>+.. image:: https://fileserver.linaro.org/s/kkxBS552W7nYESX/preview
>+   :height: 200px
>+
>+- U-boot initialization (until code relocation, after which we can't track it)
>+
>+.. image:: https://fileserver.linaro.org/s/LKTgsXNZFi5GFNC/preview
>+   :height: 200px
>+
>+- Stat and open syscalls in kernel
>+
>+.. image:: https://fileserver.linaro.org/s/dXe4MfraKg2F476/preview
>+   :height: 200px
>+
>+- Timer interrupt
>+
>+.. image:: https://fileserver.linaro.org/s/TM5yobYzJtP7P3C/preview
>+   :height: 200px
>+
>+- Poweroff sequence (from kernel back to firmware, NS-EL2 to EL3)
>+
>+.. image:: https://fileserver.linaro.org/s/oR2PtyGKJrqnfRf/preview
>+   :height: 200px
>+
>+Build and run system example
>+++++++++++++++++++++++++++++
>+
>+.. _uftrace_build_system_example:
>+
>+Building a full system image with frame pointers is not trivial.
>+
>+We provide a `simple way <https://github.com/pbo-linaro/qemu-linux-stack>`_ to
>+build an aarch64 system, combining Arm Trusted firmware, U-boot, Linux kernel
>+and debian userland. It's based on containers (``podman`` only) and
>+``qemu-user-static (binfmt)`` to make sure it's easily reproducible and does not depend
>+on the machine where you build it.
>+
>+You can follow the exact same instructions for a x64 system, combining edk2,
>+Linux, and Ubuntu, simply by switching to
>+`x86_64 <https://github.com/pbo-linaro/qemu-linux-stack/tree/x86_64>`_ branch.
>+
>+To build the system::
>+
>+    # Install dependencies
>+    $ sudo apt install -y podman qemu-user-static
>+
>+    $ git clone https://github.com/pbo-linaro/qemu-linux-stack
>+    $ cd qemu-linux-stack
>+    $ ./build.sh
>+
>+    # system can be started using:
>+    $ ./run.sh /path/to/qemu-system-aarch64
>+
>+To generate a uftrace for a system boot from that::
>+
>+    # run true and poweroff the system
>+    $ env INIT=true ./run.sh path/to/qemu-system-aarch64 \
>+      -plugin path/to/contrib/plugins/libuftrace.so,trace-privilege-level=on
>+
>+    # generate symbols and memory mapping
>+    $ path/to/contrib/plugins/uftrace_symbols.py \
>+      --prefix-symbols \
>+      arm-trusted-firmware/build/qemu/debug/bl1/bl1.elf \
>+      arm-trusted-firmware/build/qemu/debug/bl2/bl2.elf \
>+      arm-trusted-firmware/build/qemu/debug/bl31/bl31.elf \
>+      u-boot/u-boot:0x60000000 \
>+      linux/vmlinux
>+
>+    # inspect trace with
>+    $ uftrace replay
>+
>+Uftrace can filter the trace and dump flamegraphs or a chrome trace.
>+The latter is very useful for visualizing the boot process::
>+
>+    $ uftrace dump --chrome > boot.json
>+    # Open your browser, and load boot.json on https://ui.perfetto.dev/.
>+
>+Long visual chrome traces can't be easily opened, thus, it might be
>+interesting to generate them around a particular point of execution::
>+
>+    # execute qemu and timestamp output log
>+    $ env INIT=true ./run.sh path/to/qemu-system-aarch64 \
>+      -plugin path/to/contrib/plugins/libuftrace.so,trace-privilege-level=on |&
>+      ts "%s" | tee exec.log
>+
>+    $ cat exec.log  | grep 'Run /init'
>+      1753122320 [   11.834391] Run /init as init process
>+      # init was launched at 1753122320
>+
>+    # generate trace around init execution (2 seconds):
>+    $ uftrace dump --chrome --time-range=1753122320~1753122322 > init.json
>+
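
The grep-and-copy step above could also be scripted. A rough sketch,
assuming the log has the ``ts "%s"`` layout shown in the example (epoch
seconds as the first field); the function name ``time_range_around`` is
made up for illustration:

```python
def time_range_around(log_lines, marker, duration_s=2):
    """Build a 'start~end' string for uftrace --time-range.

    Looks for the first line containing `marker` and uses the leading
    epoch timestamp written by ts "%s" as the start of the range.
    Returns None when the marker is not found.
    """
    for line in log_lines:
        if marker in line:
            start = int(line.split()[0])
            return f"{start}~{start + duration_s}"
    return None


log = ["1753122320 [   11.834391] Run /init as init process"]
print(time_range_around(log, "Run /init"))
```

With a real log, pass ``open("exec.log")`` instead of the list and feed
the result to ``uftrace dump --chrome --time-range=...``.
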
> Other emulation features
> ------------------------
> 
>-- 
>2.47.2
>


Sounds comprehensive all in all. I will try to follow the instructions 
and post a Tested-by

For the doc text:

Reviewed-by: Manos Pitsidianakis <manos.pitsidiana...@linaro.org> 
