On 5/3/2023 4:28 PM, Richard Henderson wrote:
> On 4/21/23 14:24, Fei Wu wrote:
>> From: "Vanderson M. do Rosario" <vanderson...@gmail.com>
>>
>> If a TB has a TBS (TBStatistics) with the TB_EXEC_STATS
>> enabled, then we instrument the start code of this TB
>> to atomically count the number of times it is executed.
>> We count both the number of "normal" executions and atomic
>> executions of a TB.
>>
>> The execution count of the TB is stored in its respective
>> TBS.
>>
>> All TBStatistics are created by default with the flags from
>> default_tbstats_flag.
>>
>> Signed-off-by: Vanderson M. do Rosario <vanderson...@gmail.com>
>> Message-Id: <20190829173437.5926-3-vanderson...@gmail.com>
>> [AJB: Fix author]
>> Signed-off-by: Alex Bennée <alex.ben...@linaro.org>
>> ---
>>  accel/tcg/cpu-exec.c      |  6 ++++++
>>  accel/tcg/tb-stats.c      |  6 ++++++
>>  accel/tcg/tcg-runtime.c   |  8 ++++++++
>>  accel/tcg/tcg-runtime.h   |  2 ++
>>  accel/tcg/translate-all.c |  7 +++++--
>>  accel/tcg/translator.c    | 10 ++++++++++
>>  include/exec/gen-icount.h |  1 +
>>  include/exec/tb-stats.h   | 18 ++++++++++++++++++
>>  8 files changed, 56 insertions(+), 2 deletions(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index c815f2dbfd..d89f9fe493 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -25,6 +25,7 @@
>>  #include "trace.h"
>>  #include "disas/disas.h"
>>  #include "exec/exec-all.h"
>> +#include "exec/tb-stats.h"
>>  #include "tcg/tcg.h"
>>  #include "qemu/atomic.h"
>>  #include "qemu/rcu.h"
>> @@ -564,7 +565,12 @@ void cpu_exec_step_atomic(CPUState *cpu)
>>              mmap_unlock();
>>          }
>> +        if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
>> +            tb->tb_stats->executions.atomic++;
>> +        }
>
> The write is protected by the exclusive lock, but the read might be
> accessible from the monitor, iiuc.  Which means you should use
> atomic_set(), for non-tearable write after non-atomic increment.
>
The writes are serialized, and 'atomic' is an aligned integer (unsigned
long), so shouldn't a read in parallel with the write be fine? It
returns the value either before or after the increment, never a partial
one.
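
For illustration, here is a minimal standalone sketch of the pattern
Richard describes (plain C11 with pthreads, not QEMU code; names such as
count_execution, monitor_read and exec_count are made up). The increment
itself stays non-atomic because the writer is serialized, but the value is
published with an atomic store, so a lock-free reader can never observe a
torn value. In QEMU terms, qatomic_set()/qatomic_read() would take the
place of the relaxed store/load used here:

/*
 * Hypothetical standalone model, not QEMU code: the writer is
 * serialized by a lock (standing in for the exclusive region), so the
 * increment need not be an atomic read-modify-write, but the store is
 * done atomically so a concurrent reader (the monitor) cannot tear it.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic unsigned long exec_count;
static pthread_mutex_t excl_lock = PTHREAD_MUTEX_INITIALIZER;

static void count_execution(void)
{
    pthread_mutex_lock(&excl_lock);   /* writes are serialized */
    unsigned long v = atomic_load_explicit(&exec_count,
                                           memory_order_relaxed);
    atomic_store_explicit(&exec_count, v + 1, memory_order_relaxed);
    pthread_mutex_unlock(&excl_lock);
}

static unsigned long monitor_read(void)
{
    /* reader takes no lock; a relaxed load returns either the old or
     * the new value, never a partial one */
    return atomic_load_explicit(&exec_count, memory_order_relaxed);
}

int main(void)
{
    for (int i = 0; i < 5; i++) {
        count_execution();
    }
    printf("executions: %lu\n", monitor_read());
    return 0;
}

On mainstream hosts an aligned unsigned long load/store does not tear in
practice, but going through the atomic accessors documents the concurrent
access and keeps the compiler from splitting the write.
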
>> @@ -148,3 +149,10 @@ void HELPER(exit_atomic)(CPUArchState *env)
>>  {
>>      cpu_loop_exit_atomic(env_cpu(env), GETPC());
>>  }
>> +
>> +void HELPER(inc_exec_freq)(void *ptr)
>> +{
>> +    TBStatistics *stats = (TBStatistics *) ptr;
>> +    tcg_debug_assert(stats);
>> +    qatomic_inc(&stats->executions.normal);
>> +}
>
> Ug.  Do we really need an atomic update?
>
> If we have multiple threads executing through the same TB, we'll get
> significant slow-down at the cost of not missing increments.  If we
> allow a non-atomic update, we'll get much less slow-down at the cost of
> missing a few increments.  But this is statistical only, so how much
> does it really matter?
>
This sounds reasonable to me. Alex, what's your take here?

Richard, could you please review the rest of this series? So far I have
only seen your reviews on patches 01 and 02.

Thanks,
Fei

>
> r~
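
To make the trade-off concrete, here is a small hypothetical test program
(again plain C11/pthreads, not QEMU code; the counter and function names
are invented) that counts the same events twice: once with an atomic
read-modify-write, which never loses increments but bounces the cache line
between threads, and once with a plain load/add/store, which is cheaper
under contention but may drop a few updates:

/*
 * Hypothetical standalone illustration of the trade-off: exact_count
 * uses an atomic RMW (the qatomic_inc() style), approx_count uses a
 * racy load+store that can lose increments under contention.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define ITERS    1000000
#define NTHREADS 4

static _Atomic unsigned long exact_count;   /* atomic RMW: no lost updates */
static _Atomic unsigned long approx_count;  /* racy update: may lose some */

static void *worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < ITERS; i++) {
        /* never loses an increment, serializes the cache line */
        atomic_fetch_add_explicit(&exact_count, 1, memory_order_relaxed);

        /* cheaper non-atomic increment; the store is still atomic so
         * the value can never tear when read elsewhere */
        unsigned long v = atomic_load_explicit(&approx_count,
                                               memory_order_relaxed);
        atomic_store_explicit(&approx_count, v + 1, memory_order_relaxed);
    }
    return NULL;
}

int main(void)
{
    pthread_t tids[NTHREADS];

    for (int i = 0; i < NTHREADS; i++) {
        pthread_create(&tids[i], NULL, worker, NULL);
    }
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(tids[i], NULL);
    }

    printf("exact:  %lu (always %d)\n",
           atomic_load(&exact_count), NTHREADS * ITERS);
    printf("approx: %lu (may be lower: some increments lost)\n",
           atomic_load(&approx_count));
    return 0;
}

With several threads the approx value usually comes out at or somewhat
below the exact one, which is the kind of small statistical error Richard
argues is acceptable for an execution-frequency counter.
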