analyze and dump JITed code with perf.
Example of use:
perf record -k 1 qemu-x86_64 -perf ./a.out
perf inject -j -i perf.data -o perf.data.jitted
perf report -i perf.data.jitted
vandersonmr (2):
accel/tcg: adding integration with linux perf
tb-stats: adding TBStatistics info into perf dump
Adding TBStatistics information to the TB symbol names in Linux perf.
This commit depends on the following patch series:
[PATCH v5 00/10] Measure Tiny Code Generation Quality
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/perf/jitdump.c | 16 +++-
1 file changed, 15 insertions(+), 1 delet
This commit adds Linux perf support so that
QEMU's JITed code can be analyzed and each
TB's PC is visible in the perf output.
When "-perf" is used, QEMU creates a jitdump file in
the current working directory. The file format
specification can be found at:
https://github.com/torvalds/linux/blob/ma
This allows controlling the collection of statistics.
It is also possible to set the level of collection:
all, jit, or exec.
tb_stats filter restricts collection to the TBs
in the last_search list.
The goal of this command is to allow the dynamic exploration
of the TCG behavior and qu
These commands allow the exploration of the TBs
generated by TCG: finding which ones are
hotter or have more guest/host instructions,
and examining their guest, host and IR code.
The goal of this command is to allow the dynamic exploration
of TCG behavior and code quality. Therefore, for now, a
corres
Replace all other CONFIG_PROFILER statistics and migrate them to
the TBStatistics system. However, TCGProfiler still exists and can
be used to store global statistics and times. All TB-related
statistics go to TBStatistics.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/tb-stats.c | 91 +
Adding tb_stats [start|pause|stop|filter] command to HMP.
This allows controlling the collection of statistics.
It is also possible to set the level of collection:
all, jit, or exec.
tb_stats filter restricts collection to the TBs
in the last_search list.
The goal of this command is t
Adding the "info cfg id depth" command to HMP.
This command allows the exploration of a TB's
neighbors by dumping [and opening] a .dot
file with the TB's CFG neighbors colored
by their hotness.
The goal of this command is to allow the dynamic exploration
of TCG behavior and code quality. Therefore, for now
-d tb_stats[[,level=(+all+jit+exec+time)][,dump_limit=]]
"dump_limit" is used to limit the number of dumped TBStats in
linux-user mode.
[all+jit+exec+time] controls the profiling level used
by the TBStats. It can be used as follows:
-d tb_stats
-d tb_stats,level=jit+time
-d tb_stats,dump_limit=15
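
The option syntax above can be illustrated with a small parser. This is only a sketch in Python, not QEMU's own parsing code: the names `TBStatsOptions` and `parse_tb_stats` are hypothetical, and because the snippet does not say which level applies when none is given, the sketch leaves the level set empty in that case rather than guessing a default.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Level names taken from the option description above.
VALID_LEVELS = {"all", "jit", "exec", "time"}


@dataclass
class TBStatsOptions:
    """Hypothetical container for the parsed option (not a QEMU type)."""
    levels: Set[str] = field(default_factory=set)
    dump_limit: Optional[int] = None


def parse_tb_stats(arg: str) -> TBStatsOptions:
    """Parse the value passed to -d, e.g. 'tb_stats,level=jit+time'."""
    name, *params = arg.split(",")
    if name != "tb_stats":
        raise ValueError(f"not a tb_stats option: {arg!r}")
    opts = TBStatsOptions()
    for param in params:
        key, sep, value = param.partition("=")
        if key == "level" and sep:
            # Levels are joined with '+'; empty tokens are ignored.
            levels = {tok for tok in value.split("+") if tok}
            unknown = levels - VALID_LEVELS
            if unknown:
                raise ValueError(f"unknown level(s): {sorted(unknown)}")
            opts.levels = levels
        elif key == "dump_limit" and sep:
            opts.dump_limit = int(value)
        else:
            raise ValueError(f"unknown parameter: {param!r}")
    return opts
```

For instance, `parse_tb_stats("tb_stats,level=jit+time")` yields the level set `{"jit", "time"}` with no dump limit.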
.
dumps, in linux-user mode, the hottest TBs if -d tb_stats is used.
Example of output for the 3 hottest TBs:
TB id:1 | phys:0x34d54 virt:0x00034d54 flags:0xf0
| exec:4828932/0 guest inst cov:16.38%
| trans:1 ints: g:3 op:82 op_opt:34 spills:3
| h/g (host bytes /
If a TB has a TBS (TBStatistics) with the TB_EXEC_STATS
enabled, then we instrument the start code of this TB
to atomically count the number of times it is executed.
We count both the number of "normal" executions and atomic
executions of a TB.
The execution count of the TB is stored in its respec
If a TB has a TBS (TBStatistics) with the TB_JIT_STATS
enabled, then we collect statistics on its translation
process and code translation.
Collecting the number of host instructions is not
simple, as it would require modifying several
target source files. So, for now, we are only
We add some of the statistics collected in the TCGProfiler
to the TBStats, so that the statistics exist not only for the
whole emulation but for each TB. Then, we remove these stats
from TCGProfiler and reconstruct the information for
"info jit" using the sum of all TBStats statistics.
The goal is
To store statistics for each TB, we created a TBStatistics structure
which is linked with the TBs. TBStatistics can stay alive after
tb_flush and be relinked to a regenerated TB, so the statistics can
be accumulated even across flushes.
The goal is to have all present and future qemu/tcg statisti
.
- change info tbs to info tb-list
- fix crash when dumping tb's targets
- fix "liveness/code time" calculation
v5:
- full replacement of CONFIG_PROFILER
- several fixes
- adds "info cfg"
- adds TB's targets to dump
vandersonmr (11):
accel: introducing
.
- fix crash when dumping tb's targets
- fix "liveness/code time" calculation
v5:
- full replacement of CONFIG_PROFILER
- several fixes
- adds "info cfg"
- adds TB's targets to dump
vandersonmr (10):
accel: collecting TB execution count
accel: collecting JIT
| exec:872032/0 guest inst cov:1.97%
| trans:1 ints: g:2 op:56 op_opt:26 spills:1
| h/g (host bytes / guest insts): 68.00
| time to gen at 2.4GHz => code:1692.08(ns) IR:473.75(ns)
| targets: 0x000ec1c5 (id:4), 0x000ec1cb (id:13)
Signed-off-by: va
Adding -d tb_stats to control TBStatistics collection:
-d tb_stats[[,level=(+all+jit+exec+time)][,dump_limit=]]
"dump_limit" is used to limit the number of dumped TBStats in
linux-user mode.
[all+jit+exec+time] controls the profiling level used
by the TBStats. It can be used as follows:
-d tb_stat
Signed-off-by: vandersonmr
---
accel/tcg/tb-stats.c | 96 +++
accel/tcg/translate-all.c | 8 +---
include/exec/tb-stats.h | 11 +
tcg/tcg.c | 93 -
tcg/tcg.h | 10
5 files ch
/code time" calculation
v5:
- full replacement of CONFIG_PROFILER
- several fixes
- adds "info cfg"
- adds TB's targets to dump
vandersonmr (10):
accel: introducing TBStatistics structure
accel: collecting TB execution count
accel: collecting JIT statistics
accel: replacin
This commit adds Linux perf support so that
QEMU's JITed code can be analyzed and each
TB's PC is visible in the perf output.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/Makefile.objs | 1 +
accel/tcg/perf/Makefile.objs | 1 +
accel/tcg/perf/jitdump.c | 180 +
The goal of this command is to allow the dynamic exploration
of TCG behavior and code quality. Therefore, for now, a
corresponding QMP command is not worthwhile.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/tb-stats.c | 398 ++-
accel/tcg/translate
while emulating.
Collecting these statistics and information is useful for understanding
QEMU performance and for helping to add trace support to QEMU.
v5:
- full replacement of CONFIG_PROFILER
- several fixes
- adds "info cfg"
- adds TB's targets to dump
vandersonm
Adding TBStatistics information to the TB symbol names in Linux perf.
This commit depends on the following patch series:
[PATCH v5 00/10] Measure Tiny Code Generation Quality
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/perf/jitdump.c | 15 ++-
1 file changed, 14 insertions(+), 1 deleti
dumps, in linux-user mode, the hottest TBs if -d tb_stats is used.
Signed-off-by: Vanderson M. do Rosario
---
linux-user/exit.c | 4
1 file changed, 4 insertions(+)
diff --git a/linux-user/exit.c b/linux-user/exit.c
index bdda720553..7226104959 100644
--- a/linux-user/exit.c
+++ b/linux-us
Replace all other CONFIG_PROFILER statistics and migrate them to
the TBStatistics system. However, TCGProfiler still exists and can
be used to store global statistics and times. All TB-related
statistics go to TBStatistics.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/tb-stats.c | 95 +
Adding -d tb_stats:[limit:[all|jit|exec]] to control TBStatistics
collection. "limit" is used to limit the number of TBStats in the
linux-user dump. [all|jit|exec] controls the profiling level used
by the TBStats: all, only jit stats or only execution count stats.
Signed-off-by: Vanderson M. do Ro
If a TB has a TBS (TBStatistics) with the TB_EXEC_STATS
enabled, then we instrument the start code of the TB
to atomically count the number of times it is executed.
The execution count of the TB is stored in its respective
TBS.
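
The counting scheme described above can be modeled in a few lines. This is a Python sketch rather than the actual TCG instrumentation: in QEMU the increment is emitted into the generated code at the start of the TB, whereas here a hypothetical `run_tb` wrapper stands in for that prologue and a lock stands in for the atomic add. All names are assumptions made for illustration.

```python
import threading


class TBStatistics:
    """Simplified stand-in for a TB's statistics record (hypothetical)."""

    def __init__(self, exec_stats_enabled: bool):
        self.exec_stats_enabled = exec_stats_enabled
        self.exec_count = 0
        # The lock models the atomic increment used by the real code.
        self._lock = threading.Lock()

    def count_execution(self) -> None:
        with self._lock:
            self.exec_count += 1


def run_tb(tbs, body):
    """Run a TB body, counting the execution first if stats are enabled."""
    if tbs is not None and tbs.exec_stats_enabled:
        tbs.count_execution()
    return body()
```

Because the increment is guarded, several vCPU threads executing the same TB concurrently still produce an exact total in this model.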
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/tcg-runtime.c
Adding tb_stats [start|pause|stop|filter] command to HMP.
This allows controlling the collection of statistics.
It is also possible to set the level of collection:
all, jit, or exec.
The goal of this command is to allow the dynamic exploration
of the TCG behavior and quality. Therefore, for now, a
To store statistics for each TB we created a TBStatistics structure
which is linked with the TBs. The TBStatistics can stay alive after
tb_flush and be relinked to a regenerated TB. So the statistics can
be accumulated even through flushes.
TBStatistics will also be referred to as TBS or tbstats.
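
The idea of statistics outliving a flush can be sketched as two tables with different lifetimes. This is a minimal Python model, not the QEMU data structures: `TranslationCache` and its methods are hypothetical, and keying the statistics by (phys_pc, pc, flags) is an assumption chosen only because those fields appear in the TB dumps.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Key = Tuple[int, int, int]  # (phys_pc, pc, flags): assumed lookup key


@dataclass
class TBStatistics:
    exec_count: int = 0
    translations: int = 0


@dataclass
class TranslationBlock:
    key: Key
    stats: TBStatistics  # link from the TB to its long-lived statistics


class TranslationCache:
    def __init__(self) -> None:
        self.tbs: Dict[Key, TranslationBlock] = {}  # cleared by flush()
        self.stats: Dict[Key, TBStatistics] = {}    # survives flush()

    def translate(self, phys_pc: int, pc: int, flags: int) -> TranslationBlock:
        key = (phys_pc, pc, flags)
        tb = self.tbs.get(key)
        if tb is None:
            # Reuse the existing statistics if this TB was generated before.
            stats = self.stats.setdefault(key, TBStatistics())
            stats.translations += 1
            tb = TranslationBlock(key, stats)
            self.tbs[key] = tb
        return tb

    def execute(self, phys_pc: int, pc: int, flags: int) -> None:
        self.translate(phys_pc, pc, flags).stats.exec_count += 1

    def flush(self) -> None:
        """Drop all TBs, as tb_flush does, but keep their statistics."""
        self.tbs.clear()
```

Executing the same block before and after `flush()` accumulates into one record, while `translations` shows how many times the block was regenerated.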
while emulating.
Collecting these statistics and information is useful for understanding
QEMU performance and for helping to add trace support to QEMU.
vandersonmr (7):
accel: introducing TBStatistics structure
accel: collecting TB execution count
accel: collecting JIT statistics
accel
(fake_fprintf) but counting the number
of instructions.
Signed-off-by: vandersonmr
---
accel/tcg/translate-all.c | 18 +++
accel/tcg/translator.c| 5 ++
disas.c | 108 ++
include/disas/disas.h | 1 +
include/exec/tb-stats.h | 14
behavior and code quality. Therefore, for now, a
corresponding QMP command is not worthwhile.
Signed-off-by: vandersonmr
---
accel/tcg/tb-stats.c | 275 +++
hmp-commands-info.hx | 23 +++
include/exec/tb-stats.h | 37 +
include/qemu/log-for
adding options to list tbs by some metric and
investigate their code.
Signed-off-by: Vanderson M. do Rosario
---
hmp-commands-info.hx | 22 ++
monitor/misc.c | 69
2 files changed, 91 insertions(+)
diff --git a/hmp-commands-info.hx
add an option to dump the N hottest TBs.
-d hot_tbs:N
and also add all tbstats dump functions.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/Makefile.objs | 1 +
accel/tcg/tb-stats.c | 293 +++
include/exec/cpu-all.h | 43 +
in
Filling other tb statistics such as number of times the
tb is compiled, its number of guest/host/IR instructions...
Signed-off-by: vandersonmr
---
accel/tcg/translate-all.c | 14 +
accel/tcg/translator.c| 4 ++
disas.c | 107
adding the option to start collecting the tb
statistics later using the start_stats command.
Signed-off-by: vandersonmr
---
hmp-commands.hx | 15 +++
monitor/misc.c | 15 +++
2 files changed, 30 insertions(+)
diff --git a/hmp-commands.hx b/hmp-commands.hx
index
We add the option to instrument each TB to
count the number of times it is executed and
store this in its TBStatistics struct.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/tcg-runtime.c | 7 +++
accel/tcg/tcg-runtime.h | 2 ++
accel/tcg/translator.c| 1 +
include/exec/gen
We want to store statistics for each TB even after flushes.
We do not want to modify or grow the TB struct.
So we create a new struct to contain these statistics and
we link one to each TB as it is generated.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/translate-all.c | 60 +
add an option to dump the N hottest TBs.
-d hot_tbs:N
Signed-off-by: vandersonmr
---
include/qemu/log-for-trace.h | 2 ++
linux-user/exit.c| 3 +++
util/log.c | 9 +
3 files changed, 14 insertions(+)
diff --git a/include/qemu/log-for-trace.h b/include
Adding a function to dump the N hottest TBs.
The block PC, execution count and ops are dumped to the log.
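
Selecting the hottest TBs amounts to a top-N selection by execution count. The following Python sketch shows one way to do it; it is not the patch's implementation, and `TBRecord`, `hottest_tbs`, `format_dump` and the output format are hypothetical names and choices made for illustration.

```python
import heapq
from typing import Iterable, List, NamedTuple


class TBRecord(NamedTuple):
    """Per-TB fields named in the description: PC, execution count, ops."""
    pc: int
    exec_count: int
    ops: int


def hottest_tbs(records: Iterable[TBRecord], n: int) -> List[TBRecord]:
    """Return the n most executed TBs, hottest first."""
    return heapq.nlargest(n, records, key=lambda r: r.exec_count)


def format_dump(records: Iterable[TBRecord], n: int) -> List[str]:
    """Render one log line per selected TB (format is an assumption)."""
    return [
        f"pc:0x{r.pc:08x} exec:{r.exec_count} ops:{r.ops}"
        for r in hottest_tbs(records, n)
    ]
```

`heapq.nlargest` avoids sorting the full set of records when only a few entries are wanted, which matters when many TBs have been generated.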
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/translate-all.c | 45 +++
include/exec/exec-all.h | 2 ++
2 files changed, 47 insertions(+)
diff -
We collect the number of times each TB is executed
and store it in its TBStatistics.
We also count the number of times the execution counter overflows.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/tcg-runtime.c | 10 ++
accel/tcg/tcg-runtime.h | 2 ++
accel/tcg/translato
It adds a new structure which is linked with each TB and stores its statistics.
We collect the execution count of the TBs and store it in this new structure.
The information stored in this new struct is then used to support a new
command-line option, -d hot_tbs:N, which dumps information about the N hottest TBs
We want to store statistics for each TB even after flushes.
We do not want to modify or grow the TB struct.
So we create a new struct to contain these statistics and
link it to each TB while they are created.
Signed-off-by: Vanderson M. do Rosario
---
accel/tcg/translate-all.c | 40 ++
Added -execfreq to enable execution frequency counting and dump
all the TBs' addresses and their execution frequency at the end
of the execution.
Signed-off-by: vandersonmr
---
linux-user/exit.c | 5 +
linux-user/main.c | 7 +++
2 files changed, 12 insertions(+)
diff --git a/linux
A new hash map was added to store the accumulated execution
frequency of the TBs even after tb_flush events. A dump
function was also added as a way to visualize these frequencies.
Signed-off-by: vandersonmr
---
accel/tcg/translate-all.c | 59 +++
accel/tcg
This is the first series of patches related to the TCGCodeQuality GSoC project
More at https://wiki.qemu.org/Features/TCGCodeQuality
It adds an option to instrument TBs and collect their execution frequency.
The execution frequency is then stored/accumulated in an auxiliary structure
every time a
A uint64_t counter was added in the TranslationBlock struct and
it is incremented every time that the TB is executed.
Signed-off-by: vandersonmr
---
accel/tcg/tcg-runtime.c | 6 ++
accel/tcg/tcg-runtime.h | 2 ++
include/exec/exec-all.h | 1 +
include/exec/gen-icount.h | 7