[Qemu-devel] [PATCH 0/2] Final TCG clean-up patches
This set of patches moves the remaining global variables to tcg_ctx.
The second patch also introduces a new TBContext for translation blocks
and moves the translation block globals there. We place tb_ctx inside
tcg_ctx and get a noticeable speed-up.

After this patchset was applied, I noticed a ~4-5% speed-up of code
generation.

Here is the test procedure:
1. Boot Linux Kernel 5 times.
2. For each iteration, wait until JIT cycles are stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 662.4
max: 696
avg: 672.28
standard deviation: ~17 ~= 3.5%
Average cycles/op = 672 +- 17

After clean-up:
min: 635
max: 650.5
avg: 640.14
standard deviation: ~8 ~= 1.6%
Average cycles/op = 640 +- 8

Evgeny Voevodin (2):
  TCG: Final globals clean-up
  TCG: Move translation block variables to new context inside tcg_ctx:
    tb_ctx

 cpu-exec.c              |  18 +++--
 include/exec/exec-all.h |  27 +---
 linux-user/main.c       |   6 +-
 tcg/tcg.c               |   2 +-
 tcg/tcg.h               |  16 -
 translate-all.c         | 173 +++
 6 files changed, 130 insertions(+), 112 deletions(-)

-- 
1.7.9.5
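As a cross-check of the ~4-5% figure, the relative change can be computed from the two reported averages. This is only an illustrative sketch; speedup_percent() is a hypothetical helper and not part of the patch set.

```c
#include <stdio.h>

/* Relative reduction in cycles/op, in percent:
 * (before - after) / before * 100.
 * Hypothetical helper, used only to cross-check the cover letter. */
double speedup_percent(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Print the speed-up for one before/after pair of averages. */
void report_speedup(double before, double after)
{
    printf("speed-up: %.2f%%\n", speedup_percent(before, after));
}
```

With the averages above, speedup_percent(672.28, 640.14) evaluates to roughly 4.8, which is consistent with the ~4-5% estimate.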
[Qemu-devel] [PATCH 2/2] TCG: Move translation block variables to new context inside tcg_ctx: tb_ctx
It is worth cleaning up the translation block variables and moving them
into one context, as was suggested by Swirl. Also, if we use this context
directly inside tcg_ctx, it speeds up code generation a bit.

Signed-off-by: Evgeny Voevodin <evgenyvoevo...@gmail.com>
---
 cpu-exec.c              | 18 -
 include/exec/exec-all.h | 27 +
 linux-user/main.c       |  6 +--
 tcg/tcg.h               |  2 +
 translate-all.c         | 96 +++
 5 files changed, 79 insertions(+), 70 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 19ebb4a..ff9a884 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -23,8 +23,6 @@
 #include "qemu/atomic.h"
 #include "sysemu/qtest.h"
 
-int tb_invalidated_flag;
-
 //#define CONFIG_DEBUG_EXEC
 
 bool qemu_cpu_has_work(CPUState *cpu)
@@ -90,13 +88,13 @@ static TranslationBlock *tb_find_slow(CPUArchState *env,
     tb_page_addr_t phys_pc, phys_page1;
     target_ulong virt_page2;
 
-    tb_invalidated_flag = 0;
+    tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
 
     /* find translated block using physical mappings */
     phys_pc = get_page_addr_code(env, pc);
     phys_page1 = phys_pc & TARGET_PAGE_MASK;
     h = tb_phys_hash_func(phys_pc);
-    ptb1 = &tb_phys_hash[h];
+    ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
     for(;;) {
         tb = *ptb1;
         if (!tb)
@@ -128,8 +126,8 @@ static TranslationBlock *tb_find_slow(CPUArchState *env,
     /* Move the last found TB to the head of the list */
     if (likely(*ptb1)) {
         *ptb1 = tb->phys_hash_next;
-        tb->phys_hash_next = tb_phys_hash[h];
-        tb_phys_hash[h] = tb;
+        tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
+        tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
     }
     /* we add the TB in the virtual pc hash table */
     env->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = tb;
@@ -563,16 +561,16 @@ int cpu_exec(CPUArchState *env)
 #endif
                 }
 #endif /* DEBUG_DISAS || CONFIG_DEBUG_EXEC */
-                spin_lock(&tb_lock);
+                spin_lock(&tcg_ctx.tb_ctx.tb_lock);
                 tb = tb_find_fast(env);
                 /* Note: we do it here to avoid a gcc bug on Mac OS X when
                    doing it in tb_find_slow */
-                if (tb_invalidated_flag) {
+                if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
                     /* as some TB could have been invalidated because
                        of memory exceptions while generating the code, we
                        must recompute the hash index here */
                     next_tb = 0;
-                    tb_invalidated_flag = 0;
+                    tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
                 }
 #ifdef CONFIG_DEBUG_EXEC
                 qemu_log_mask(CPU_LOG_EXEC, "Trace %p [" TARGET_FMT_lx "] %s\n",
@@ -585,7 +583,7 @@ int cpu_exec(CPUArchState *env)
                 if (next_tb != 0 && tb->page_addr[1] == -1) {
                     tb_add_jump((TranslationBlock *)(next_tb & ~3), next_tb & 3, tb);
                 }
-                spin_unlock(&tb_lock);
+                spin_unlock(&tcg_ctx.tb_ctx.tb_lock);
 
                 /* cpu_interrupt might be called while translating the
                    TB, but before it is linked into a potentially
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index d235ef8..f685c28 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -168,6 +168,25 @@ struct TranslationBlock {
     uint32_t icount;
 };
 
+#include "exec/spinlock.h"
+
+typedef struct TBContext TBContext;
+
+struct TBContext {
+
+    TranslationBlock *tbs;
+    TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
+    int nb_tbs;
+    /* any access to the tbs or the page table must use this lock */
+    spinlock_t tb_lock;
+
+    /* statistics */
+    int tb_flush_count;
+    int tb_phys_invalidate_count;
+
+    int tb_invalidated_flag;
+};
+
 static inline unsigned int tb_jmp_cache_hash_page(target_ulong pc)
 {
     target_ulong tmp;
@@ -192,8 +211,6 @@ void tb_free(TranslationBlock *tb);
 void tb_flush(CPUArchState *env);
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
 
-extern TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
-
 #if defined(USE_DIRECT_JUMP)
 
 #if defined(CONFIG_TCG_INTERPRETER)
@@ -275,12 +292,6 @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
     }
 }
 
-#include "exec/spinlock.h"
-
-extern spinlock_t tb_lock;
-
-extern int tb_invalidated_flag;
-
 /* The return address may point to the start of the next instruction.
    Subtracting one gets us the call instruction itself.  */
 #if defined(CONFIG_TCG_INTERPRETER)
diff --git a/linux-user/main.c b/linux-user/main.c
index 0181bc2..8f09abd 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -111,7 +111,7 @@ static int pending_cpus;
 /* Make sure everything is in a consistent state for calling fork().  */
 void fork_start(void)
 {
-    pthread_mutex_lock(&tb_lock);
+    pthread_mutex_lock
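The move-to-front lookup that tb_find_slow() performs on tb_phys_hash reads the same once the table lives in a context nested inside the code generation context. Below is a minimal, simplified sketch of that access pattern. All names (SketchTB, SketchTBContext, SketchTCGContext, sketch_tb_lookup, SKETCH_HASH_SIZE) are hypothetical stand-ins and not the actual QEMU definitions; the real lookup compares more than a single pc value.

```c
#include <stddef.h>

#define SKETCH_HASH_SIZE 8      /* stand-in for CODE_GEN_PHYS_HASH_SIZE */

typedef struct SketchTB {
    struct SketchTB *phys_hash_next;   /* next block in the same bucket */
    unsigned long pc;                  /* simplified lookup key */
} SketchTB;

/* Translation block state, grouped as TBContext groups it in the patch. */
typedef struct SketchTBContext {
    SketchTB *tb_phys_hash[SKETCH_HASH_SIZE];
    int tb_invalidated_flag;
} SketchTBContext;

/* The block context is embedded by value, not referenced by pointer,
 * so it shares the memory of the enclosing context. */
typedef struct SketchTCGContext {
    SketchTBContext tb_ctx;
} SketchTCGContext;

/* Find the block for pc in bucket h and move it to the head of the list. */
SketchTB *sketch_tb_lookup(SketchTCGContext *s, unsigned h, unsigned long pc)
{
    SketchTB **ptb = &s->tb_ctx.tb_phys_hash[h];
    SketchTB *tb;

    while ((tb = *ptb) != NULL) {
        if (tb->pc == pc) {
            /* Unlink the block, then reinsert it as the bucket head. */
            *ptb = tb->phys_hash_next;
            tb->phys_hash_next = s->tb_ctx.tb_phys_hash[h];
            s->tb_ctx.tb_phys_hash[h] = tb;
            return tb;
        }
        ptb = &tb->phys_hash_next;
    }
    return NULL;
}
```

Embedding the struct by value keeps the hash table and the flag in the same object as the rest of the context, which may be part of why the commit message reports a small speed-up; the sketch itself makes no performance claim.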
[Qemu-devel] [PATCH 1/2] TCG: Final globals clean-up
Signed-off-by: Evgeny Voevodin <evgenyvoevo...@gmail.com>
---
 tcg/tcg.c       |  2 +-
 tcg/tcg.h       | 14 ++--
 translate-all.c | 97 ---
 3 files changed, 61 insertions(+), 52 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 9275e37..c8a843e 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -263,7 +263,7 @@ void tcg_context_init(TCGContext *s)
 void tcg_prologue_init(TCGContext *s)
 {
     /* init global prologue and epilogue */
-    s->code_buf = code_gen_prologue;
+    s->code_buf = s->code_gen_prologue;
     s->code_ptr = s->code_buf;
     tcg_target_qemu_prologue(s);
     flush_icache_range((tcg_target_ulong)s->code_buf,
diff --git a/tcg/tcg.h b/tcg/tcg.h
index a427972..4086e98 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -462,6 +462,15 @@ struct TCGContext {
     uint16_t gen_opc_icount[OPC_BUF_SIZE];
     uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
 
+    /* Code generation */
+    int code_gen_max_blocks;
+    uint8_t *code_gen_prologue;
+    uint8_t *code_gen_buffer;
+    size_t code_gen_buffer_size;
+    /* threshold to flush the translated code buffer */
+    size_t code_gen_buffer_max_size;
+    uint8_t *code_gen_ptr;
+
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
     /* labels info for qemu_ld/st IRs
        The labels help to generate TLB miss case codes at the end of TB */
@@ -658,12 +667,11 @@ TCGv_i64 tcg_const_i64(int64_t val);
 TCGv_i32 tcg_const_local_i32(int32_t val);
 TCGv_i64 tcg_const_local_i64(int64_t val);
 
-extern uint8_t *code_gen_prologue;
-
 /* TCG targets may use a different definition of tcg_qemu_tb_exec. */
 #if !defined(tcg_qemu_tb_exec)
 # define tcg_qemu_tb_exec(env, tb_ptr) \
-    ((tcg_target_ulong (*)(void *, void *))code_gen_prologue)(env, tb_ptr)
+    ((tcg_target_ulong (*)(void *, void *))tcg_ctx.code_gen_prologue)(env, \
+                                                                      tb_ptr)
 #endif
 
 void tcg_register_jit(void *buf, size_t buf_size);
diff --git a/translate-all.c b/translate-all.c
index d367fc4..d666562 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -72,21 +72,13 @@
 
 #define SMC_BITMAP_USE_THRESHOLD 10
 
-/* Code generation and translation blocks */
+/* Translation blocks */
 static TranslationBlock *tbs;
-static int code_gen_max_blocks;
 TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
 static int nb_tbs;
 /* any access to the tbs or the page table must use this lock */
 spinlock_t tb_lock = SPIN_LOCK_UNLOCKED;
 
-uint8_t *code_gen_prologue;
-static uint8_t *code_gen_buffer;
-static size_t code_gen_buffer_size;
-/* threshold to flush the translated code buffer */
-static size_t code_gen_buffer_max_size;
-static uint8_t *code_gen_ptr;
-
 typedef struct PageDesc {
     /* list of TBs intersecting this ram page */
     TranslationBlock *first_tb;
@@ -514,7 +506,7 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
     if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
         tb_size = MAX_CODE_GEN_BUFFER_SIZE;
     }
-    code_gen_buffer_size = tb_size;
+    tcg_ctx.code_gen_buffer_size = tb_size;
     return tb_size;
 }
 
@@ -524,7 +516,7 @@ static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
 
 static inline void *alloc_code_gen_buffer(void)
 {
-    map_exec(static_code_gen_buffer, code_gen_buffer_size);
+    map_exec(static_code_gen_buffer, tcg_ctx.code_gen_buffer_size);
     return static_code_gen_buffer;
 }
 #elif defined(USE_MMAP)
@@ -547,8 +539,8 @@ static inline void *alloc_code_gen_buffer(void)
        Leave the choice of exact location with the kernel.  */
     flags |= MAP_32BIT;
     /* Cannot expect to map more than 800MB in low memory.  */
-    if (code_gen_buffer_size > 800u * 1024 * 1024) {
-        code_gen_buffer_size = 800u * 1024 * 1024;
+    if (tcg_ctx.code_gen_buffer_size > 800u * 1024 * 1024) {
+        tcg_ctx.code_gen_buffer_size = 800u * 1024 * 1024;
     }
 # elif defined(__sparc__)
     start = 0x4000ul;
@@ -556,17 +548,17 @@ static inline void *alloc_code_gen_buffer(void)
     start = 0x9000ul;
 # endif
 
-    buf = mmap((void *)start, code_gen_buffer_size,
+    buf = mmap((void *)start, tcg_ctx.code_gen_buffer_size,
                PROT_WRITE | PROT_READ | PROT_EXEC, flags, -1, 0);
     return buf == MAP_FAILED ? NULL : buf;
 }
 #else
 static inline void *alloc_code_gen_buffer(void)
 {
-    void *buf = g_malloc(code_gen_buffer_size);
+    void *buf = g_malloc(tcg_ctx.code_gen_buffer_size);
 
     if (buf) {
-        map_exec(buf, code_gen_buffer_size);
+        map_exec(buf, tcg_ctx.code_gen_buffer_size);
     }
     return buf;
 }
@@ -574,27 +566,30 @@ static inline void *alloc_code_gen_buffer(void)
 
 static inline void code_gen_alloc(size_t tb_size)
 {
-    code_gen_buffer_size = size_code_gen_buffer(tb_size);
-    code_gen_buffer = alloc_code_gen_buffer();
-    if (code_gen_buffer == NULL) {
+    tcg_ctx.code_gen_buffer_size = size_code_gen_buffer(tb_size);
+    tcg_ctx.code_gen_buffer
Re: [Qemu-devel] [PATCH] exynos4210/mct: Avoid infinite loop on non incremental timers
On 12/01/2012 09:08 PM, Jean-Christophe DUBOIS wrote:
> Check for a 0 distance value to avoid infinite loop when the expired
> FRC timer was not programmed with auto-increment.
>
> With this change the behavior is coherent with the same type of code
> in the exynos4210_gfrc_restart() function in the same file.
>
> Linux seems to mostly use this timer with auto-increment, which
> explains why it is not a problem most of the time.
>
> However, other OSes might have a problem with this if they don't use
> the auto-increment feature.
>
> Signed-off-by: Jean-Christophe DUBOIS <j...@tribudubois.net>
> ---
>  hw/exynos4210_mct.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
> index e79cd6a..31a41d5 100644
> --- a/hw/exynos4210_mct.c
> +++ b/hw/exynos4210_mct.c
> @@ -568,7 +568,7 @@ static void exynos4210_gfrc_event(void *opaque)
>      /* Reload FRC to reach nearest comparator */
>      s->g_timer.curr_comp = exynos4210_gcomp_find(s);
>      distance = exynos4210_gcomp_get_distance(s, s->g_timer.curr_comp);
> -    if (distance > MCT_GT_COUNTER_STEP) {
> +    if ((distance > MCT_GT_COUNTER_STEP) || !distance) {

You don't need additional braces here.

>          distance = MCT_GT_COUNTER_STEP;
>      }
>      exynos4210_gfrc_set_count(&s->g_timer, distance);
> --
> 1.7.9.5

Doesn't apply to current master, please rebase:

Applying: exynos4210/mct: Avoid infinite loop on non incremental timers
error: patch failed: hw/exynos4210_mct.c:568
error: hw/exynos4210_mct.c: patch does not apply

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Centre,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH v2] exynos4210/mct: Avoid infinite loop on non incremental timers
On 12/04/2012 02:55 AM, Jean-Christophe DUBOIS wrote:
> Check for a 0 distance value to avoid infinite loop when the expired
> FRC timer was not programmed with auto-increment.
>
> With this change the behavior is coherent with the same type of code
> in the exynos4210_gfrc_restart() function in the same file.
>
> Linux seems to mostly use this timer with auto-increment, which
> explains why it is not a problem most of the time.
>
> However, other OSes might have a problem with this if they don't use
> the auto-increment feature.
>
> Signed-off-by: Jean-Christophe DUBOIS <j...@tribudubois.net>
> ---
>  hw/exynos4210_mct.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
> index e79cd6a..37dbda9 100644
> --- a/hw/exynos4210_mct.c
> +++ b/hw/exynos4210_mct.c
> @@ -568,7 +568,7 @@ static void exynos4210_gfrc_event(void *opaque)
>      /* Reload FRC to reach nearest comparator */
>      s->g_timer.curr_comp = exynos4210_gcomp_find(s);
>      distance = exynos4210_gcomp_get_distance(s, s->g_timer.curr_comp);
> -    if (distance > MCT_GT_COUNTER_STEP) {
> +    if (distance > MCT_GT_COUNTER_STEP || !distance) {
>          distance = MCT_GT_COUNTER_STEP;
>      }
>      exynos4210_gfrc_set_count(&s->g_timer, distance);

Reviewed-by: Evgeny Voevodin <e.voevo...@samsung.com>

P.S.: Next time, please don't forget to CC the appropriate people so
that they don't miss your patch.

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
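The effect of the added !distance check can be shown in isolation. This is a minimal sketch, not the device model: SKETCH_COUNTER_STEP is a placeholder value standing in for MCT_GT_COUNTER_STEP, and sketch_reload_distance() is a hypothetical helper that only mirrors the condition in the hunk above.

```c
#include <stdint.h>

/* Placeholder step; the real MCT_GT_COUNTER_STEP is defined in
 * hw/exynos4210_mct.c and may differ. */
#define SKETCH_COUNTER_STEP 0x100000000ULL

/* Clamp the reload distance the way the patched condition does:
 * too large OR zero both fall back to one full counter step. */
uint64_t sketch_reload_distance(uint64_t distance)
{
    if (distance > SKETCH_COUNTER_STEP || !distance) {
        distance = SKETCH_COUNTER_STEP;
    }
    return distance;
}
```

Without the zero case, a timer that was not programmed with auto-increment would be reloaded with a count of 0, which presumably makes the event fire again at once; that is the infinite loop the commit message describes.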
Re: [Qemu-devel] [PATCH 0/5] TCG global gen_opc_ arrays clean-up
On 11/26/2012 08:19 AM, Evgeny Voevodin wrote:
> On 11/21/2012 11:43 AM, Evgeny Voevodin wrote:
>> This set of patches moves global variables to tcg_ctx:
>> gen_opc_instr_start
>> gen_opc_icount
>> gen_opc_pc
>>
>> Build tested for all targets.
>> Execution tested on Exynos4210 target.
>>
>> After this patchset was applied, I noticed no speed-up or slow-down
>> of code generation.
>>
>> Here is the test procedure:
>> 1. Boot Linux Kernel 5 times.
>> 2. For each iteration, wait until JIT cycles are stable for ~10 seconds.
>> 3. Write down the cycles/op.
>>
>> Here are the results (tested on gcc-4.6):
>>
>> Before clean-up:
>> min: 655.5
>> max: 659.3
>> avg: 657.2
>> standard deviation: ~2 ~= 0.4%
>> Average cycles/op = 657 +- 2
>>
>> After clean-up:
>> min: 654.6
>> max: 657.1
>> avg: 655.5
>> standard deviation: ~1 ~= 0.2%
>> Average cycles/op = 656 +- 1
>>
>> Evgeny Voevodin (5):
>>   tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.
>>   TCG: Use gen_opc_pc from context instead of global variable.
>>   TCG: Use gen_opc_icount from context instead of global variable.
>>   TCG: Use gen_opc_instr_start from context instead of global variable.
>>   TCG: Remove unused global gen_opc_ arrays.
>>
>>  exec-all.h                    |  4
>>  target-alpha/translate.c      | 12 ++--
>>  target-arm/translate.c        | 12 ++--
>>  target-cris/translate.c       | 14 +++---
>>  target-i386/translate.c       | 19 ++-
>>  target-lm32/translate.c       | 12 ++--
>>  target-m68k/translate.c       | 12 ++--
>>  target-microblaze/translate.c | 12 ++--
>>  target-mips/translate.c       | 12 ++--
>>  target-openrisc/translate.c   | 12 ++--
>>  target-ppc/translate.c        | 12 ++--
>>  target-s390x/translate.c      | 12 ++--
>>  target-sh4/translate.c        | 12 ++--
>>  target-sparc/translate.c      | 12 ++--
>>  target-unicore32/translate.c  | 12 ++--
>>  target-xtensa/translate.c     | 10 +-
>>  tcg/tcg.h                     |  3 +++
>>  translate-all.c               |  9 +++--
>>  18 files changed, 100 insertions(+), 103 deletions(-)
>
> Ping?
> +CC: Alexander Graf <ag...@suse.de>; Paul Brook <p...@codesourcery.com>

Ping??

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Centre,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH 0/5] TCG global gen_opc_ arrays clean-up
On 11/21/2012 11:43 AM, Evgeny Voevodin wrote:
> This set of patches moves global variables to tcg_ctx:
> gen_opc_instr_start
> gen_opc_icount
> gen_opc_pc
>
> Build tested for all targets.
> Execution tested on Exynos4210 target.
>
> After this patchset was applied, I noticed no speed-up or slow-down
> of code generation.
>
> Here is the test procedure:
> 1. Boot Linux Kernel 5 times.
> 2. For each iteration, wait until JIT cycles are stable for ~10 seconds.
> 3. Write down the cycles/op.
>
> Here are the results (tested on gcc-4.6):
>
> Before clean-up:
> min: 655.5
> max: 659.3
> avg: 657.2
> standard deviation: ~2 ~= 0.4%
> Average cycles/op = 657 +- 2
>
> After clean-up:
> min: 654.6
> max: 657.1
> avg: 655.5
> standard deviation: ~1 ~= 0.2%
> Average cycles/op = 656 +- 1
>
> Evgeny Voevodin (5):
>   tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.
>   TCG: Use gen_opc_pc from context instead of global variable.
>   TCG: Use gen_opc_icount from context instead of global variable.
>   TCG: Use gen_opc_instr_start from context instead of global variable.
>   TCG: Remove unused global gen_opc_ arrays.
>
>  exec-all.h                    |  4
>  target-alpha/translate.c      | 12 ++--
>  target-arm/translate.c        | 12 ++--
>  target-cris/translate.c       | 14 +++---
>  target-i386/translate.c       | 19 ++-
>  target-lm32/translate.c       | 12 ++--
>  target-m68k/translate.c       | 12 ++--
>  target-microblaze/translate.c | 12 ++--
>  target-mips/translate.c       | 12 ++--
>  target-openrisc/translate.c   | 12 ++--
>  target-ppc/translate.c        | 12 ++--
>  target-s390x/translate.c      | 12 ++--
>  target-sh4/translate.c        | 12 ++--
>  target-sparc/translate.c      | 12 ++--
>  target-unicore32/translate.c  | 12 ++--
>  target-xtensa/translate.c     | 10 +-
>  tcg/tcg.h                     |  3 +++
>  translate-all.c               |  9 +++--
>  18 files changed, 100 insertions(+), 103 deletions(-)

Ping?
+CC: Alexander Graf <ag...@suse.de>; Paul Brook <p...@codesourcery.com>

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Centre,
e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH 4/5] TCG: Use gen_opc_instr_start from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 target-alpha/translate.c      | 6 +++---
 target-arm/translate.c        | 6 +++---
 target-cris/translate.c       | 6 +++---
 target-i386/translate.c       | 8
 target-lm32/translate.c       | 6 +++---
 target-m68k/translate.c       | 6 +++---
 target-microblaze/translate.c | 6 +++---
 target-mips/translate.c       | 6 +++---
 target-openrisc/translate.c   | 6 +++---
 target-ppc/translate.c        | 6 +++---
 target-s390x/translate.c      | 6 +++---
 target-sh4/translate.c        | 6 +++---
 target-sparc/translate.c      | 6 +++---
 target-unicore32/translate.c  | 6 +++---
 target-xtensa/translate.c     | 4 ++--
 translate-all.c               | 3 ++-
 16 files changed, 47 insertions(+), 46 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 8b73fbb..71fe1a1 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3410,10 +3410,10 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
             if (lj < j) {
                 lj++;
                 while (lj < j)
-                    gen_opc_instr_start[lj++] = 0;
+                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
             }
             tcg_ctx.gen_opc_pc[lj] = ctx.pc;
-            gen_opc_instr_start[lj] = 1;
+            tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
@@ -3468,7 +3468,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
         while (lj <= j)
-            gen_opc_instr_start[lj++] = 0;
+            tcg_ctx.gen_opc_instr_start[lj++] = 0;
     } else {
         tb->size = ctx.pc - pc_start;
         tb->icount = num_insns;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 4695d8b..3cf3604 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9838,11 +9838,11 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
             if (lj < j) {
                 lj++;
                 while (lj < j)
-                    gen_opc_instr_start[lj++] = 0;
+                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
             }
             tcg_ctx.gen_opc_pc[lj] = dc->pc;
             gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
-            gen_opc_instr_start[lj] = 1;
+            tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
 
@@ -9977,7 +9977,7 @@ done_generating:
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
         while (lj <= j)
-            gen_opc_instr_start[lj++] = 0;
+            tcg_ctx.gen_opc_instr_start[lj++] = 0;
     } else {
         tb->size = dc->pc - pc_start;
         tb->icount = num_insns;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 6ec8c3c..60bdc24 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3301,7 +3301,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
             if (lj < j) {
                 lj++;
                 while (lj < j) {
-                    gen_opc_instr_start[lj++] = 0;
+                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
                 }
             }
             if (dc->delayed_branch == 1) {
@@ -3309,7 +3309,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
             } else {
                 tcg_ctx.gen_opc_pc[lj] = dc->pc;
             }
-            gen_opc_instr_start[lj] = 1;
+            tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
 
@@ -3439,7 +3439,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
         while (lj <= j) {
-            gen_opc_instr_start[lj++] = 0;
+            tcg_ctx.gen_opc_instr_start[lj++] = 0;
         }
     } else {
         tb->size = dc->pc - pc_start;
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 80fb695..f394ea6 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7988,11 +7988,11 @@ static inline void gen_intermediate_code_internal(CPUX86State *env,
             if (lj < j) {
                 lj++;
                 while (lj < j)
-                    gen_opc_instr_start[lj++] = 0;
+                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
             }
             tcg_ctx.gen_opc_pc[lj] = pc_ptr;
             gen_opc_cc_op[lj] = dc->cc_op;
-            gen_opc_instr_start[lj] = 1;
+            tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
@@ -8037,7 +8037,7 @@ static inline void
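The bookkeeping repeated in every target above (zero-fill gen_opc_instr_start up to the current op index, then record pc, start flag and icount) can be sketched on its own. This is a simplified illustration under stated assumptions: SketchCtx, SKETCH_BUF_SIZE, sketch_record_insn() and sketch_pc_for_op() are hypothetical names, and the walk-back lookup is only an assumed consumer of these arrays, added to show why the zero-fill matters.

```c
#include <stdint.h>

#define SKETCH_BUF_SIZE 16      /* stand-in for OPC_BUF_SIZE */

/* The three per-op arrays, grouped in one context as in TCGContext. */
typedef struct SketchCtx {
    uint64_t gen_opc_pc[SKETCH_BUF_SIZE];
    uint16_t gen_opc_icount[SKETCH_BUF_SIZE];
    uint8_t gen_opc_instr_start[SKETCH_BUF_SIZE];
} SketchCtx;

/* Record that the guest instruction at pc begins at op index j.
 * lj is the last index recorded so far (-1 before the first call).
 * Returns the new value of lj. */
int sketch_record_insn(SketchCtx *s, int lj, int j, uint64_t pc, int num_insns)
{
    if (lj < j) {
        lj++;
        while (lj < j) {
            /* Ops between two instruction starts are marked as 0. */
            s->gen_opc_instr_start[lj++] = 0;
        }
    }
    s->gen_opc_pc[lj] = pc;
    s->gen_opc_instr_start[lj] = 1;
    s->gen_opc_icount[lj] = (uint16_t)num_insns;
    return lj;
}

/* Assumed consumer: walk back from op index j to the nearest
 * instruction start and return the guest pc stored there. */
uint64_t sketch_pc_for_op(const SketchCtx *s, int j)
{
    while (j > 0 && !s->gen_opc_instr_start[j]) {
        j--;
    }
    return s->gen_opc_pc[j];
}
```

Because the arrays are reached through a context pointer here, the same code would work for any context instance, which is the direction the series moves in by replacing the globals with tcg_ctx fields.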
[Qemu-devel] [PATCH 5/5] TCG: Remove unused global gen_opc_ arrays.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 exec-all.h      | 4
 translate-all.c | 4
 2 files changed, 8 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 21aacda..b18d4ca 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -70,10 +70,6 @@ typedef struct TranslationBlock TranslationBlock;
 
 #define OPPARAM_BUF_SIZE (OPC_BUF_SIZE * MAX_OPC_PARAM)
 
-extern target_ulong gen_opc_pc[OPC_BUF_SIZE];
-extern uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-extern uint16_t gen_opc_icount[OPC_BUF_SIZE];
-
 #include "qemu-log.h"
 
 void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
diff --git a/translate-all.c b/translate-all.c
index 2f616bf..f22e3ee 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,10 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-target_ulong gen_opc_pc[OPC_BUF_SIZE];
-uint16_t gen_opc_icount[OPC_BUF_SIZE];
-uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-
 void cpu_gen_init(void)
 {
     tcg_context_init(&tcg_ctx);
-- 
1.7.9.5
[Qemu-devel] [PATCH 1/5] tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 9481e35..f6e255f 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -455,6 +455,9 @@ struct TCGContext {
 
     uint16_t *gen_opc_ptr;
     TCGArg *gen_opparam_ptr;
+    target_ulong gen_opc_pc[OPC_BUF_SIZE];
+    uint16_t gen_opc_icount[OPC_BUF_SIZE];
+    uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
 
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
     /* labels info for qemu_ld/st IRs
-- 
1.7.9.5
[Qemu-devel] [PATCH 3/5] TCG: Use gen_opc_icount from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 target-alpha/translate.c      | 2 +-
 target-arm/translate.c        | 2 +-
 target-cris/translate.c       | 2 +-
 target-i386/translate.c       | 2 +-
 target-lm32/translate.c       | 2 +-
 target-m68k/translate.c       | 2 +-
 target-microblaze/translate.c | 2 +-
 target-mips/translate.c       | 2 +-
 target-openrisc/translate.c   | 2 +-
 target-ppc/translate.c        | 2 +-
 target-s390x/translate.c      | 2 +-
 target-sh4/translate.c        | 2 +-
 target-sparc/translate.c      | 2 +-
 target-unicore32/translate.c  | 2 +-
 target-xtensa/translate.c     | 2 +-
 translate-all.c               | 2 +-
 16 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index bcde367..8b73fbb 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3414,7 +3414,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
             }
             tcg_ctx.gen_opc_pc[lj] = ctx.pc;
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount[lj] = num_insns;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
             gen_io_start();
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 8ea8bba..4695d8b 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9843,7 +9843,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
             tcg_ctx.gen_opc_pc[lj] = dc->pc;
             gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount[lj] = num_insns;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
 
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 745cd7a..6ec8c3c 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3310,7 +3310,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
                 tcg_ctx.gen_opc_pc[lj] = dc->pc;
             }
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount[lj] = num_insns;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
 
         /* Pretty disas.  */
diff --git a/target-i386/translate.c b/target-i386/translate.c
index aea843c..80fb695 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7993,7 +7993,7 @@ static inline void gen_intermediate_code_internal(CPUX86State *env,
             tcg_ctx.gen_opc_pc[lj] = pc_ptr;
             gen_opc_cc_op[lj] = dc->cc_op;
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount[lj] = num_insns;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
             gen_io_start();
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index fcafb06..4e029e0 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1056,7 +1056,7 @@ static void gen_intermediate_code_internal(CPULM32State *env,
             }
             tcg_ctx.gen_opc_pc[lj] = dc->pc;
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount[lj] = num_insns;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
 
         /* Pretty disas.  */
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 74772dd..0762085 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -3023,7 +3023,7 @@ gen_intermediate_code_internal(CPUM68KState *env, TranslationBlock *tb,
             }
             tcg_ctx.gen_opc_pc[lj] = dc->pc;
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount[lj] = num_insns;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
             gen_io_start();
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 6803f73..d975756 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1792,7 +1792,7 @@ gen_intermediate_code_internal(CPUMBState *env, TranslationBlock *tb,
             }
             tcg_ctx.gen_opc_pc[lj] = dc->pc;
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount[lj] = num_insns;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
 
         /* Pretty disas.  */
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 17d5ece..81807cf 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -15559,7 +15559,7 @@ gen_intermediate_code_internal (CPUMIPSState *env, TranslationBlock *tb,
             gen_opc_hflags[lj] = ctx.hflags & MIPS_HFLAG_BMASK;
             gen_opc_btarget[lj] = ctx.btarget;
             gen_opc_instr_start[lj] = 1;
-            gen_opc_icount
[Qemu-devel] [PATCH 2/5] TCG: Use gen_opc_pc from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 target-alpha/translate.c      | 4 ++--
 target-arm/translate.c        | 4 ++--
 target-cris/translate.c       | 6 +++---
 target-i386/translate.c       | 9 +
 target-lm32/translate.c       | 4 ++--
 target-m68k/translate.c       | 4 ++--
 target-microblaze/translate.c | 4 ++--
 target-mips/translate.c       | 4 ++--
 target-openrisc/translate.c   | 4 ++--
 target-ppc/translate.c        | 4 ++--
 target-s390x/translate.c      | 4 ++--
 target-sh4/translate.c        | 4 ++--
 target-sparc/translate.c      | 4 ++--
 target-unicore32/translate.c  | 4 ++--
 target-xtensa/translate.c     | 4 ++--
 15 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 4045f78..bcde367 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3412,7 +3412,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
                 while (lj < j)
                     gen_opc_instr_start[lj++] = 0;
             }
-            gen_opc_pc[lj] = ctx.pc;
+            tcg_ctx.gen_opc_pc[lj] = ctx.pc;
             gen_opc_instr_start[lj] = 1;
             gen_opc_icount[lj] = num_insns;
         }
@@ -3551,5 +3551,5 @@ CPUAlphaState * cpu_alpha_init (const char *cpu_model)
 
 void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb, int pc_pos)
 {
-    env->pc = gen_opc_pc[pc_pos];
+    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
 }
diff --git a/target-arm/translate.c b/target-arm/translate.c
index c42110a..8ea8bba 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9840,7 +9840,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
                 while (lj < j)
                     gen_opc_instr_start[lj++] = 0;
             }
-            gen_opc_pc[lj] = dc->pc;
+            tcg_ctx.gen_opc_pc[lj] = dc->pc;
             gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
             gen_opc_instr_start[lj] = 1;
             gen_opc_icount[lj] = num_insns;
@@ -10043,6 +10043,6 @@ void cpu_dump_state(CPUARMState *env, FILE *f, fprintf_function cpu_fprintf,
 
 void restore_state_to_opc(CPUARMState *env, TranslationBlock *tb, int pc_pos)
 {
-    env->regs[15] = gen_opc_pc[pc_pos];
+    env->regs[15] = tcg_ctx.gen_opc_pc[pc_pos];
     env->condexec_bits = gen_opc_condexec_bits[pc_pos];
 }
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 0b0e86d..745cd7a 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3305,9 +3305,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
                 }
             }
             if (dc->delayed_branch == 1) {
-                gen_opc_pc[lj] = dc->ppc | 1;
+                tcg_ctx.gen_opc_pc[lj] = dc->ppc | 1;
             } else {
-                gen_opc_pc[lj] = dc->pc;
+                tcg_ctx.gen_opc_pc[lj] = dc->pc;
             }
             gen_opc_instr_start[lj] = 1;
             gen_opc_icount[lj] = num_insns;
@@ -3621,5 +3621,5 @@ CRISCPU *cpu_cris_init(const char *cpu_model)
 
 void restore_state_to_opc(CPUCRISState *env, TranslationBlock *tb, int pc_pos)
 {
-    env->pc = gen_opc_pc[pc_pos];
+    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
 }
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 8e676ba..aea843c 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7990,7 +7990,7 @@ static inline void gen_intermediate_code_internal(CPUX86State *env,
                 while (lj < j)
                     gen_opc_instr_start[lj++] = 0;
             }
-            gen_opc_pc[lj] = pc_ptr;
+            tcg_ctx.gen_opc_pc[lj] = pc_ptr;
             gen_opc_cc_op[lj] = dc->cc_op;
             gen_opc_instr_start[lj] = 1;
             gen_opc_icount[lj] = num_insns;
@@ -8081,15 +8081,16 @@ void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb, int pc_pos)
         qemu_log("RESTORE:\n");
         for(i = 0;i <= pc_pos; i++) {
             if (gen_opc_instr_start[i]) {
-                qemu_log("0x%04x: " TARGET_FMT_lx "\n", i, gen_opc_pc[i]);
+                qemu_log("0x%04x: " TARGET_FMT_lx "\n", i,
+                        tcg_ctx.gen_opc_pc[i]);
             }
         }
         qemu_log("pc_pos=0x%x eip=" TARGET_FMT_lx " cs_base=%x\n",
-                pc_pos, gen_opc_pc[pc_pos] - tb->cs_base,
+                pc_pos, tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base,
                 (uint32_t)tb->cs_base);
     }
 #endif
-    env->eip = gen_opc_pc[pc_pos] - tb->cs_base;
+    env->eip = tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base;
     cc_op = gen_opc_cc_op[pc_pos];
     if (cc_op != CC_OP_DYNAMIC)
         env->cc_op = cc_op;
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index af98649..fcafb06 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1054,7 +1054,7 @@ static void gen_intermediate_code_internal(CPULM32State *env
[Qemu-devel] [PATCH 0/5] TCG global gen_opc_ arrays clean-up
This set of patches moves global variables to tcg_ctx:
gen_opc_instr_start
gen_opc_icount
gen_opc_pc

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed no speed-up or slow-down
of code generation.

Here is the test procedure:
1. Boot Linux Kernel 5 times.
2. For each iteration, wait until JIT cycles are stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 655.5
max: 659.3
avg: 657.2
standard deviation: ~2 ~= 0.4%
Average cycles/op = 657 +- 2

After clean-up:
min: 654.6
max: 657.1
avg: 655.5
standard deviation: ~1 ~= 0.2%
Average cycles/op = 656 +- 1

Evgeny Voevodin (5):
  tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.
  TCG: Use gen_opc_pc from context instead of global variable.
  TCG: Use gen_opc_icount from context instead of global variable.
  TCG: Use gen_opc_instr_start from context instead of global variable.
  TCG: Remove unused global gen_opc_ arrays.

 exec-all.h                    |  4
 target-alpha/translate.c      | 12 ++--
 target-arm/translate.c        | 12 ++--
 target-cris/translate.c       | 14 +++---
 target-i386/translate.c       | 19 ++-
 target-lm32/translate.c       | 12 ++--
 target-m68k/translate.c       | 12 ++--
 target-microblaze/translate.c | 12 ++--
 target-mips/translate.c       | 12 ++--
 target-openrisc/translate.c   | 12 ++--
 target-ppc/translate.c        | 12 ++--
 target-s390x/translate.c      | 12 ++--
 target-sh4/translate.c        | 12 ++--
 target-sparc/translate.c      | 12 ++--
 target-unicore32/translate.c  | 12 ++--
 target-xtensa/translate.c     | 10 +-
 tcg/tcg.h                     |  3 +++
 translate-all.c               |  9 +++--
 18 files changed, 100 insertions(+), 103 deletions(-)

-- 
1.7.9.5
Re: [Qemu-devel] [PATCH v6 0/7] TCG global variables clean-up
On 11/12/2012 01:27 PM, Evgeny Voevodin wrote:
> This set of patches moves global variables to tcg_ctx:
> gen_opc_ptr
> gen_opparam_ptr
> gen_opc_buf
> gen_opparam_buf
> [...]

Ping?

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH v6 0/7] TCG global variables clean-up
This set of patches moves the following global variables to tcg_ctx:
    gen_opc_ptr
    gen_opparam_ptr
    gen_opc_buf
    gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code
generation. Probably, this is due to better data caching.

Here is the test procedure:
1. Boot Linux Kernel 5 times.
2. For each iteration wait until JIT cycles/op is stable for ~10 seconds
3. Write down the cycles/op

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%
Average cycles/op = 733 +- 2

After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%
Average cycles/op = 728 +- 3

Speed-up of TCG code generation = 0.7%

Changelog:
v5->v6:
    Fixed broken patches.
    Rebased.
v4->v5:
    Rebased.
    Fixed authorship.
    All patches are reviewed-by Richard Henderson <r...@twiddle.net>
v3->v4:
    Rebased.
    Added target-cris/translate.c: Code style clean-up
v2->v3:
    Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
    Rebased.
v1->v2:
    Introduced TCGContext *tcg_cur_ctx global to use in those places
    where we don't have an interface to pass pointer to tcg_ctx.
    Code style clean-up

Evgeny Voevodin (7):
  target-cris/translate.c: Code style clean-up
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.
  TCG: Remove unused global variables

 gen-icount.h                  |    2 +-
 target-alpha/translate.c      |   10 +-
 target-arm/translate.c        |   10 +-
 target-cris/translate.c       | 5040 +
 target-i386/translate.c       |   10 +-
 target-lm32/translate.c       |   13 +-
 target-m68k/translate.c       |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c       |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c        |   11 +-
 target-s390x/translate.c      |   11 +-
 target-sh4/translate.c        |   10 +-
 target-sparc/translate.c      |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c     |    8 +-
 tcg/optimize.c                |   62 +-
 tcg/tcg-op.h                  |  324 +--
 tcg/tcg.c                     |   85 +-
 tcg/tcg.h                     |   10 +-
 translate-all.c               |    3 -
 21 files changed, 2859 insertions(+), 2817 deletions(-)

-- 
1.7.9.5
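As a quick check of the arithmetic behind the quoted speed-up, the sketch below recomputes it from the two averages given in the cover letter. The function names are invented for this note. The five individual boot measurements are not in the thread, so only the published averages and deviations can be used.

```c
#include <assert.h>

/* Relative speed-up in percent when the cost per op drops
 * from 'before' to 'after' (both in cycles/op). */
static double speedup_percent(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

/* A deviation expressed as a percentage of the mean. */
static double relative_spread_percent(double spread, double mean)
{
    assert(mean > 0.0);
    return 100.0 * spread / mean;
}
```

With the posted averages, speedup_percent(733.0, 727.8) is 100 * 5.2 / 733.0, i.e. about 0.71%, which matches the rounded 0.7% in the cover letter. The 5.2 cycles/op difference is of the same order as the quoted deviations (~2 and ~3 cycles/op), so the figure is probably best read as an approximate indication.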
[Qemu-devel] [PATCH v6 3/7] TCG: Use gen_opc_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Reviewed-by: Richard Henderson r...@twiddle.net --- target-alpha/translate.c |8 ++--- target-arm/translate.c|8 ++--- target-cris/translate.c | 10 +++--- target-i386/translate.c |8 ++--- target-lm32/translate.c | 10 +++--- target-m68k/translate.c |8 ++--- target-microblaze/translate.c | 10 +++--- target-mips/translate.c |9 +++--- target-openrisc/translate.c | 10 +++--- target-ppc/translate.c|9 +++--- target-s390x/translate.c |9 +++--- target-sh4/translate.c|8 ++--- target-sparc/translate.c |8 ++--- target-unicore32/translate.c |8 ++--- target-xtensa/translate.c |6 ++-- tcg/tcg-op.h | 70 - tcg/tcg.c | 16 +- 17 files changed, 109 insertions(+), 106 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 8c4dd02..f160f83 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, or exhaust instruction count, stop generation. 
*/ if (ret == NO_EXIT ((ctx.pc (TARGET_PAGE_SIZE - 1)) == 0 -|| gen_opc_ptr = gen_opc_end +|| tcg_ctx.gen_opc_ptr = gen_opc_end || num_insns = max_insns || singlestep || env-singlestep_enabled)) { @@ -3463,9 +3463,9 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index 7d8f8e5..014f358 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9881,7 +9881,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, * Also stop translation when a page boundary is reached. This * ensures prefetch aborts occur at the right place. 
*/ num_insns ++; -} while (!dc-is_jmp gen_opc_ptr gen_opc_end +} while (!dc-is_jmp tcg_ctx.gen_opc_ptr gen_opc_end !env-singlestep_enabled !singlestep dc-pc next_page_start @@ -9962,7 +9962,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, done_generating: gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; #ifdef DEBUG_DISAS if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) { @@ -9974,7 +9974,7 @@ done_generating: } #endif if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index 023980e..02969d4 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) { @@ -3381,7 +3381,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, break; } } while (!dc-is_jmp !dc-cpustate_changed - gen_opc_ptr gen_opc_end + tcg_ctx.gen_opc_ptr gen_opc_end !singlestep (dc-pc next_page_start) num_insns max_insns); @@ -3434,9 +3434,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, } } gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj
[Qemu-devel] [PATCH v6 5/7] TCG: Use gen_opc_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Reviewed-by: Richard Henderson r...@twiddle.net --- target-alpha/translate.c |6 ++-- target-arm/translate.c|6 ++-- target-cris/translate.c |8 +++--- target-i386/translate.c |6 ++-- target-lm32/translate.c |9 +++--- target-m68k/translate.c |6 ++-- target-microblaze/translate.c |9 +++--- target-mips/translate.c |6 ++-- target-openrisc/translate.c |9 +++--- target-ppc/translate.c|6 ++-- target-s390x/translate.c |6 ++-- target-sh4/translate.c|6 ++-- target-sparc/translate.c |6 ++-- target-unicore32/translate.c |6 ++-- target-xtensa/translate.c |4 +-- tcg/optimize.c| 62 - tcg/tcg.c | 30 ++-- 17 files changed, 97 insertions(+), 94 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index f160f83..4045f78 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, int max_insns; pc_start = tb-pc; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; ctx.tb = tb; ctx.env = env; @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index 014f358..c42110a 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; 
dc-is_jmp = DISAS_NEXT; dc-pc = pc_start; @@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9974,7 +9974,7 @@ done_generating: } #endif if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index 02969d4..0b0e86d 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, dc-env = env; dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = DISAS_NEXT; dc-ppc = pc_start; @@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) { @@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) { gen_opc_instr_start[lj++] = 0; @@ -3452,7 +3452,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, log_target_disas(env, pc_start, dc-pc - pc_start, dc-env-pregs[PR_VR]); qemu_log(\nisize=%d osize=%td\n, -dc-pc - pc_start, tcg_ctx.gen_opc_ptr - gen_opc_buf); +dc-pc - pc_start, tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf); } #endif #endif diff --git a/target-i386/translate.c b/target-i386/translate.c index 2658bf2..8e676ba 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -7962,7 +7962,7 @@ static inline void 
gen_intermediate_code_internal(CPUX86State *env, cpu_ptr0 = tcg_temp_new_ptr
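The search_pc hunks repeated across the targets all follow one pattern: the index of the next op is the difference between the write pointer and the start of the op buffer, any entries skipped since the last marked index are cleared, and the new index is marked as an instruction start. Below is a minimal sketch of that pattern. The comparison operators were lost in this archive copy, so the "<" signs are my reading of the code; PcSearchContext, instr_start and note_insn_start() are hypothetical names, and the buffer size is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_OPC_BUF_SIZE 32    /* arbitrary size for this sketch */

/* Hypothetical context holding the op buffer and its write pointer. */
typedef struct PcSearchContext {
    uint16_t gen_opc_buf[SKETCH_OPC_BUF_SIZE];
    uint16_t *gen_opc_ptr;
} PcSearchContext;

/* One flag per op index: 1 if a guest instruction starts at that op. */
static uint8_t instr_start[SKETCH_OPC_BUF_SIZE];

/* 'lj' is the last index already handled (-1 before the first call).
 * Returns the new value of 'lj'. */
static int note_insn_start(const PcSearchContext *s, int lj)
{
    /* Index of the next op to be emitted. */
    int j = (int)(s->gen_opc_ptr - s->gen_opc_buf);

    assert(j >= 0 && j < SKETCH_OPC_BUF_SIZE);
    if (lj < j) {
        lj++;
        while (lj < j) {
            instr_start[lj++] = 0;   /* ops in the middle of an instruction */
        }
    }
    instr_start[lj] = 1;             /* first op of the new instruction */
    return lj;
}
```

Once gen_opc_buf is a member of the context, both operands of the pointer subtraction have to come from the same object, which is why the patch touches every "gen_opc_ptr - gen_opc_buf" expression.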
[Qemu-devel] [PATCH v6 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Reviewed-by: Richard Henderson r...@twiddle.net --- gen-icount.h |2 +- tcg/tcg-op.h | 254 +- tcg/tcg.c| 36 - 3 files changed, 146 insertions(+), 146 deletions(-) diff --git a/gen-icount.h b/gen-icount.h index 430cb44..248cf5b 100644 --- a/gen-icount.h +++ b/gen-icount.h @@ -16,7 +16,7 @@ static inline void gen_icount_start(void) count = tcg_temp_local_new_i32(); tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32)); /* This is a horrid hack to allow fixing up the value later. */ -icount_arg = gen_opparam_ptr + 1; +icount_arg = tcg_ctx.gen_opparam_ptr + 1; tcg_gen_subi_i32(count, count, 0xdeadbeef); tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label); diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 9bc890f..0b3cb0b 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc) static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); } static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); } static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = arg1; +*tcg_ctx.gen_opparam_ptr++ = arg1; } static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2); } static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); 
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2); } static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = arg2; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = arg2; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = arg1; -*gen_opparam_ptr++ = arg2; +*tcg_ctx.gen_opparam_ptr++ = arg1; +*tcg_ctx.gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2, TCGv_i32 arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); -*gen_opparam_ptr++ = GET_TCGV_I32(arg3); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3); } static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2, TCGv_i64 arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); -*gen_opparam_ptr++ = GET_TCGV_I64(arg3); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3); } static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2, TCGArg arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); -*gen_opparam_ptr++ = arg3; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2); 
+*tcg_ctx.gen_opparam_ptr++ = arg3; } static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2, TCGArg arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); -*gen_opparam_ptr++ = arg3; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_ctx.gen_opparam_ptr++ = arg3; } static inline void
[Qemu-devel] [PATCH v6 2/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.h |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index c2ae873..6ffec1d 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -450,6 +450,12 @@ struct TCGContext {
     int goto_tb_issue_mask;
 #endif
 
+    uint16_t gen_opc_buf[OPC_BUF_SIZE];
+    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+    uint16_t *gen_opc_ptr;
+    TCGArg *gen_opparam_ptr;
+
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
     /* labels info for qemu_ld/st IRs
        The labels help to generate TLB miss case codes at the end of TB */
-- 
1.7.9.5
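To illustrate what the four new members are for, here is a small sketch in which the buffers and their write pointers live in one structure, a start routine resets the pointers, and an emit helper appends one opcode and one argument through the context (the same shape as the tcg_gen_op1i() change in patch 4/7). SketchContext, sketch_ctx, sketch_func_start() and sketch_gen_op1i() are invented names, the TCGArg typedef is a stand-in and the buffer sizes are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define OPC_BUF_SIZE 640                       /* assumed, for illustration */
#define OPPARAM_BUF_SIZE (OPC_BUF_SIZE * 10)   /* assumed, for illustration */

typedef uintptr_t TCGArg;                      /* stand-in typedef */

/* Hypothetical miniature of the context after the patch: buffers and
 * their write pointers are members of the same structure. */
typedef struct SketchContext {
    uint16_t gen_opc_buf[OPC_BUF_SIZE];
    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];

    uint16_t *gen_opc_ptr;
    TCGArg *gen_opparam_ptr;
} SketchContext;

static SketchContext sketch_ctx;

/* Reset both write pointers to the start of their buffers. */
static void sketch_func_start(SketchContext *s)
{
    s->gen_opc_ptr = s->gen_opc_buf;
    s->gen_opparam_ptr = s->gen_opparam_buf;
}

/* Append one opcode with a single argument via the global context. */
static void sketch_gen_op1i(uint16_t opc, TCGArg arg1)
{
    assert(sketch_ctx.gen_opc_ptr < sketch_ctx.gen_opc_buf + OPC_BUF_SIZE);
    assert(sketch_ctx.gen_opparam_ptr <
           sketch_ctx.gen_opparam_buf + OPPARAM_BUF_SIZE);
    *sketch_ctx.gen_opc_ptr++ = opc;
    *sketch_ctx.gen_opparam_ptr++ = arg1;
}
```

Keeping the pointers next to the buffers they point into is what the cover letter's "better data caching" remark refers to. Whether that explains the measured difference is the author's guess, not something this sketch demonstrates.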
[Qemu-devel] [PATCH v6 6/7] TCG: Use gen_opparam_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a039001..ea0bd3a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
     s->gen_opc_ptr = s->gen_opc_buf;
-    s->gen_opparam_ptr = gen_opparam_buf;
+    s->gen_opparam_ptr = s->gen_opparam_buf;
 
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
     /* Initialize qemu_ld/st labels to assist code generation at the end of TB
@@ -897,7 +897,7 @@ void tcg_dump_ops(TCGContext *s)
 
     first_insn = 1;
     opc_ptr = s->gen_opc_buf;
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     while (opc_ptr < s->gen_opc_ptr) {
         c = *opc_ptr++;
         def = &tcg_op_defs[c];
@@ -1440,8 +1440,9 @@ static void tcg_liveness_analysis(TCGContext *s)
         op_index--;
     }
 
-    if (args != gen_opparam_buf)
+    if (args != s->gen_opparam_buf) {
         tcg_abort();
+    }
 }
 #else
 /* dummy liveness analysis */
@@ -,7 +2223,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
     s->gen_opparam_ptr =
-        tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+        tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2249,7 +2250,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
     s->code_buf = gen_code_buf;
     s->code_ptr = gen_code_buf;
 
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     op_index = 0;
 
     for(;;) {
-- 
1.7.9.5
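The hunks above all read the two buffers in lockstep: one cursor walks the opcodes from the start of the op buffer up to the write pointer, and a second cursor advances through the argument buffer by however many arguments each opcode takes. A minimal sketch of that walk follows, with an invented three-opcode table. WalkContext, op_nb_args, walk_emit() and walk_count_args() are hypothetical names for illustration and are not part of tcg.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented opcode set; the table gives the argument count per opcode. */
enum { OP_END = 0, OP_MOVI = 1, OP_ADD = 2, OP_COUNT };
static const int op_nb_args[OP_COUNT] = { 0, 2, 3 };

#define WALK_OPC_MAX   16
#define WALK_PARAM_MAX 64

typedef struct WalkContext {
    uint16_t opc_buf[WALK_OPC_MAX];
    uintptr_t param_buf[WALK_PARAM_MAX];
    uint16_t *opc_ptr;       /* next free opcode slot */
    uintptr_t *param_ptr;    /* next free argument slot */
} WalkContext;

/* Append one op and its arguments to the two buffers. */
static void walk_emit(WalkContext *s, uint16_t opc, const uintptr_t *args)
{
    assert(opc < OP_COUNT);
    assert(s->opc_ptr < s->opc_buf + WALK_OPC_MAX);
    assert(s->param_ptr + op_nb_args[opc] <= s->param_buf + WALK_PARAM_MAX);
    *s->opc_ptr++ = opc;
    for (int i = 0; i < op_nb_args[opc]; i++) {
        *s->param_ptr++ = args[i];
    }
}

/* Walk the ops from the buffer start to the write pointer and return
 * how many arguments were consumed along the way. */
static size_t walk_count_args(const WalkContext *s)
{
    const uint16_t *opc = s->opc_buf;
    const uintptr_t *args = s->param_buf;

    while (opc < s->opc_ptr) {
        uint16_t c = *opc++;
        args += op_nb_args[c];
    }
    return (size_t)(args - s->param_buf);
}
```

After a consistent sequence of walk_emit() calls, walk_count_args() equals param_ptr - param_buf. A mismatch would mean the two buffers are out of step, which is the kind of condition the "args != s->gen_opparam_buf" check in the liveness pass guards against when it walks backwards.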
[Qemu-devel] [PATCH v6 7/7] TCG: Remove unused global variables
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.c       |    4 ----
 tcg/tcg.h       |    4 ----
 translate-all.c |    3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index ea0bd3a..4f75696 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
     *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 6ffec1d..9481e35 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -465,10 +465,6 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
 
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5
Re: [Qemu-devel] [PATCH v5 5/7] TCG: Use gen_opc_buf from context instead of global variable.
On 11/10/2012 04:39 PM, Blue Swirl wrote: On Tue, Nov 6, 2012 at 4:41 AM, Evgeny Voevodin e.voevo...@samsung.com wrote: Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Reviewed-by: Richard Henderson r...@twiddle.net --- target-alpha/translate.c |6 ++-- target-arm/translate.c|6 ++-- target-cris/translate.c |9 +++--- target-i386/translate.c |6 ++-- target-lm32/translate.c |9 +++--- target-m68k/translate.c |6 ++-- target-microblaze/translate.c |9 +++--- target-mips/translate.c |6 ++-- target-openrisc/translate.c |9 +++--- target-ppc/translate.c|6 ++-- target-s390x/translate.c |6 ++-- target-sh4/translate.c|6 ++-- target-sparc/translate.c |6 ++-- target-unicore32/translate.c |6 ++-- target-xtensa/translate.c |4 +-- tcg/optimize.c| 62 - tcg/tcg.c | 30 ++-- 17 files changed, 98 insertions(+), 94 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 6676cbf..91c761a 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, int max_insns; pc_start = tb-pc; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; ctx.tb = tb; ctx.env = env; @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index ff5d294..0602b31 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9727,7 +9727,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env, dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = DISAS_NEXT; dc-pc = pc_start; @@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9974,7 +9974,7 @@ done_generating: } #endif if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index e34288e..0adc07b 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, dc-env = env; dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = DISAS_NEXT; dc-ppc = pc_start; @@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) { @@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) { gen_opc_instr_start[lj++] = 0; @@ -3452,7 +3452,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, log_target_disas(pc_start, dc-pc - pc_start, dc-env-pregs[PR_VR]); qemu_log(\nisize=%d osize=%td\n, -dc-pc - pc_start, gtcg_ctx.en_opc_ptr - gen_opc_buf); +dc-pc - pc_start, gtcg_ctx.en_opc_ptr - tcg_ctx.gen_opc_buf); +tcg_ctx.gen_opc_buf); Broken patch: 
/src/qemu/target-cris/translate.c:3456: error: statement with no effect /src
Re: [Qemu-devel] [PATCH v5 0/7] TCG global variables clean-up
On 11/06/2012 08:41 AM, Evgeny Voevodin wrote:
> This set of patches moves global variables to tcg_ctx:
> gen_opc_ptr
> gen_opparam_ptr
> gen_opc_buf
> gen_opparam_buf
> [...]

Is anybody going to apply this before I have to rebase again?

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH v5 0/7] TCG global variables clean-up
This set of patches moves the following global variables to tcg_ctx:
    gen_opc_ptr
    gen_opparam_ptr
    gen_opc_buf
    gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code
generation. Probably, this is due to better data caching.

Here is the test procedure:
1. Boot Linux Kernel 5 times.
2. For each iteration wait until JIT cycles/op is stable for ~10 seconds
3. Write down the cycles/op

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%
Average cycles/op = 733 +- 2

After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%
Average cycles/op = 728 +- 3

Speed-up of TCG code generation = 0.7%

Changelog:
v4->v5:
    Rebased.
    Fixed authorship.
    All patches are reviewed-by Richard Henderson <r...@twiddle.net>
v3->v4:
    Rebased.
    Added target-cris/translate.c: Code style clean-up
v2->v3:
    Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
    Rebased.
v1->v2:
    Introduced TCGContext *tcg_cur_ctx global to use in those places
    where we don't have an interface to pass pointer to tcg_ctx.
    Code style clean-up

Evgeny Voevodin (7):
  target-cris/translate.c: Code style clean-up
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.
  TCG: Remove unused global variables

 gen-icount.h                  |    2 +-
 target-alpha/translate.c      |   10 +-
 target-arm/translate.c        |   10 +-
 target-cris/translate.c       | 5041 +
 target-i386/translate.c       |   10 +-
 target-lm32/translate.c       |   13 +-
 target-m68k/translate.c       |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c       |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c        |   11 +-
 target-s390x/translate.c      |   11 +-
 target-sh4/translate.c        |   10 +-
 target-sparc/translate.c      |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c     |    8 +-
 tcg/optimize.c                |   62 +-
 tcg/tcg-op.h                  |  324 +--
 tcg/tcg.c                     |   85 +-
 tcg/tcg.h                     |   10 +-
 translate-all.c               |    3 -
 21 files changed, 2860 insertions(+), 2817 deletions(-)

-- 
1.7.9.5
[Qemu-devel] [PATCH v5 2/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.h |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index c2ae873..6ffec1d 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -450,6 +450,12 @@ struct TCGContext {
     int goto_tb_issue_mask;
 #endif
 
+    uint16_t gen_opc_buf[OPC_BUF_SIZE];
+    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+    uint16_t *gen_opc_ptr;
+    TCGArg *gen_opparam_ptr;
+
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
     /* labels info for qemu_ld/st IRs
        The labels help to generate TLB miss case codes at the end of TB */
-- 
1.7.9.5
[Qemu-devel] [PATCH v5 6/7] TCG: Use gen_opparam_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index ea27bd4..d281af9 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
     s->gen_opc_ptr = s->gen_opc_buf;
-    s->gen_opparam_ptr = gen_opparam_buf;
+    s->gen_opparam_ptr = s->gen_opparam_buf;
 
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
     /* Initialize qemu_ld/st labels to assist code generation at the end of TB
@@ -897,7 +897,7 @@ void tcg_dump_ops(TCGContext *s)
 
     first_insn = 1;
     opc_ptr = s->gen_opc_buf;
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     while (opc_ptr < s->gen_opc_ptr) {
         c = *opc_ptr++;
         def = &tcg_op_defs[c];
@@ -1440,8 +1440,9 @@ static void tcg_liveness_analysis(TCGContext *s)
         op_index--;
     }
 
-    if (args != gen_opparam_buf)
+    if (args != s->gen_opparam_buf) {
         tcg_abort();
+    }
 }
 #else
 /* dummy liveness analysis */
@@ -,7 +2223,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
     s->gen_opparam_ptr =
-        tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+        tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2249,7 +2250,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
     s->code_buf = gen_code_buf;
     s->code_ptr = gen_code_buf;
 
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     op_index = 0;
 
     for(;;) {
-- 
1.7.9.5
[Qemu-devel] [PATCH v5 5/7] TCG: Use gen_opc_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Reviewed-by: Richard Henderson r...@twiddle.net --- target-alpha/translate.c |6 ++-- target-arm/translate.c|6 ++-- target-cris/translate.c |9 +++--- target-i386/translate.c |6 ++-- target-lm32/translate.c |9 +++--- target-m68k/translate.c |6 ++-- target-microblaze/translate.c |9 +++--- target-mips/translate.c |6 ++-- target-openrisc/translate.c |9 +++--- target-ppc/translate.c|6 ++-- target-s390x/translate.c |6 ++-- target-sh4/translate.c|6 ++-- target-sparc/translate.c |6 ++-- target-unicore32/translate.c |6 ++-- target-xtensa/translate.c |4 +-- tcg/optimize.c| 62 - tcg/tcg.c | 30 ++-- 17 files changed, 98 insertions(+), 94 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 6676cbf..91c761a 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, int max_insns; pc_start = tb-pc; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; ctx.tb = tb; ctx.env = env; @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index ff5d294..0602b31 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; 
dc-is_jmp = DISAS_NEXT; dc-pc = pc_start; @@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9974,7 +9974,7 @@ done_generating: } #endif if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index e34288e..0adc07b 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, dc-env = env; dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = DISAS_NEXT; dc-ppc = pc_start; @@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) { @@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) { gen_opc_instr_start[lj++] = 0; @@ -3452,7 +3452,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, log_target_disas(pc_start, dc-pc - pc_start, dc-env-pregs[PR_VR]); qemu_log(\nisize=%d osize=%td\n, -dc-pc - pc_start, gtcg_ctx.en_opc_ptr - gen_opc_buf); +dc-pc - pc_start, gtcg_ctx.en_opc_ptr - tcg_ctx.gen_opc_buf); +tcg_ctx.gen_opc_buf); } #endif #endif diff --git a/target-i386/translate.c b/target-i386/translate.c index 5f977d9..1563677 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -7958,7 +7958,7 @@ static 
inline void gen_intermediate_code_internal(CPUX86State *env
[Qemu-devel] [PATCH v5 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Reviewed-by: Richard Henderson r...@twiddle.net --- gen-icount.h |2 +- tcg/tcg-op.h | 254 +- tcg/tcg.c| 36 - 3 files changed, 146 insertions(+), 146 deletions(-) diff --git a/gen-icount.h b/gen-icount.h index 430cb44..248cf5b 100644 --- a/gen-icount.h +++ b/gen-icount.h @@ -16,7 +16,7 @@ static inline void gen_icount_start(void) count = tcg_temp_local_new_i32(); tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32)); /* This is a horrid hack to allow fixing up the value later. */ -icount_arg = gen_opparam_ptr + 1; +icount_arg = tcg_ctx.gen_opparam_ptr + 1; tcg_gen_subi_i32(count, count, 0xdeadbeef); tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label); diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 9bc890f..0b3cb0b 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc) static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); } static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); } static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = arg1; +*tcg_ctx.gen_opparam_ptr++ = arg1; } static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2); } static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); 
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2); } static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = arg2; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = arg2; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = arg1; -*gen_opparam_ptr++ = arg2; +*tcg_ctx.gen_opparam_ptr++ = arg1; +*tcg_ctx.gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2, TCGv_i32 arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); -*gen_opparam_ptr++ = GET_TCGV_I32(arg3); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3); } static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2, TCGv_i64 arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); -*gen_opparam_ptr++ = GET_TCGV_I64(arg3); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3); } static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2, TCGArg arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); -*gen_opparam_ptr++ = arg3; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2); 
+*tcg_ctx.gen_opparam_ptr++ = arg3; } static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2, TCGArg arg3) { *tcg_ctx.gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); -*gen_opparam_ptr++ = arg3; +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_ctx.gen_opparam_ptr++ = arg3; } static inline void
[Qemu-devel] [PATCH v5 7/7] TCG: Remove unused global variables
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.c |4 
 tcg/tcg.h |4 
 translate-all.c |3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index d281af9..359be16 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
     *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 6ffec1d..9481e35 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -465,10 +465,6 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5
[Qemu-devel] [PATCH v5 3/7] TCG: Use gen_opc_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Reviewed-by: Richard Henderson r...@twiddle.net --- target-alpha/translate.c |8 ++--- target-arm/translate.c|8 ++--- target-cris/translate.c | 10 +++--- target-i386/translate.c |8 ++--- target-lm32/translate.c | 10 +++--- target-m68k/translate.c |8 ++--- target-microblaze/translate.c | 10 +++--- target-mips/translate.c |9 +++--- target-openrisc/translate.c | 10 +++--- target-ppc/translate.c|9 +++--- target-s390x/translate.c |9 +++--- target-sh4/translate.c|8 ++--- target-sparc/translate.c |8 ++--- target-unicore32/translate.c |8 ++--- target-xtensa/translate.c |6 ++-- tcg/tcg-op.h | 70 - tcg/tcg.c | 16 +- 17 files changed, 109 insertions(+), 106 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index f707d8d..6676cbf 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, or exhaust instruction count, stop generation. 
*/ if (ret == NO_EXIT ((ctx.pc (TARGET_PAGE_SIZE - 1)) == 0 -|| gen_opc_ptr = gen_opc_end +|| tcg_ctx.gen_opc_ptr = gen_opc_end || num_insns = max_insns || singlestep || env-singlestep_enabled)) { @@ -3463,9 +3463,9 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index 25433da..ff5d294 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9881,7 +9881,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, * Also stop translation when a page boundary is reached. This * ensures prefetch aborts occur at the right place. 
*/ num_insns ++; -} while (!dc-is_jmp gen_opc_ptr gen_opc_end +} while (!dc-is_jmp tcg_ctx.gen_opc_ptr gen_opc_end !env-singlestep_enabled !singlestep dc-pc next_page_start @@ -9962,7 +9962,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, done_generating: gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; #ifdef DEBUG_DISAS if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) { @@ -9974,7 +9974,7 @@ done_generating: } #endif if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index 27b82cf..e34288e 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) { @@ -3381,7 +3381,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, break; } } while (!dc-is_jmp !dc-cpustate_changed - gen_opc_ptr gen_opc_end + tcg_ctx.gen_opc_ptr gen_opc_end !singlestep (dc-pc next_page_start) num_insns max_insns); @@ -3434,9 +3434,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, } } gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj
Re: [Qemu-devel] [PATCH v4 0/7] TCG global variables clean-up
On 10/31/2012 11:01 PM, Richard Henderson wrote:
> On 2012-10-31 16:19, Evgeny Voevodin wrote:
>> Evgeny (2):
>>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>>   TCG: Remove unused global variables
>>
>> Evgeny Voevodin (5):
>>   target-cris/translate.c: Code style clean-up
>>   TCG: Use gen_opc_ptr from context instead of global variable.
>>   TCG: Use gen_opparam_ptr from context instead of global variable.
>>   TCG: Use gen_opc_buf from context instead of global variable.
>>   TCG: Use gen_opparam_buf from context instead of global variable.
>>
>>  gen-icount.h |2 +-
>>  target-alpha/translate.c | 10 +-
>>  target-arm/translate.c| 10 +-
>>  target-cris/translate.c | 5041 +
>>  target-i386/translate.c | 10 +-
>>  target-lm32/translate.c | 13 +-
>>  target-m68k/translate.c | 10 +-
>>  target-microblaze/translate.c | 13 +-
>>  target-mips/translate.c | 11 +-
>>  target-openrisc/translate.c | 13 +-
>>  target-ppc/translate.c| 11 +-
>>  target-s390x/translate.c | 11 +-
>>  target-sh4/translate.c| 10 +-
>>  target-sparc/translate.c | 10 +-
>>  target-unicore32/translate.c | 10 +-
>>  target-xtensa/translate.c |8 +-
>>  tcg/optimize.c| 62 +-
>>  tcg/tcg-op.h | 324 +--
>>  tcg/tcg.c | 85 +-
>>  tcg/tcg.h | 10 +-
>>  translate-all.c |3 -
>>  21 files changed, 2860 insertions(+), 2817 deletions(-)
>
> Reviewed-by: Richard Henderson <r...@twiddle.net>
>
> r~

Thanks.
Will anybody apply this?

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH v3 0/6] TCG global variables clean-up
On 10/30/2012 10:46 PM, Blue Swirl wrote:
> On Mon, Oct 29, 2012 at 9:14 AM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:
>> This set of patches moves global variables to tcg_ctx:
>> gen_opc_ptr
>> gen_opparam_ptr
>> gen_opc_buf
>> gen_opparam_buf
>
> Patches don't apply, please rebase.

Ok. When I sent them they applied correctly. I'll rebase.

> Also checkpatch.pl complains about tabs.

There are tabs everywhere in target-cris/translate.c.
Should I remove the tabs only from the lines my patches touch, or from the whole file?
Actually, cleaning up tabs was out of my scope...

>> Build tested for all targets.
>> Execution tested on Exynos4210 target.
>>
>> After this patchset was applied, I noticed 0.7% speed-up of code generation.
>> Probably, this is due to better data caching.
>>
>> Here is the test procedure:
>> 1. Boot Linux Kernel 5 times.
>> 2. For each iteration wait while JIT cycles is stable for ~10 seconds
>> 3. Write down the cycles/op
>>
>> Here are the results (tested on gcc-4.6):
>>
>> Before clean-up:
>> min: 731.5
>> max: 734.8
>> avg: 733.0
>> standard deviation: ~2 ~= 0.2%
>> Average cycles/op = 733 +- 2
>>
>> After clean-up:
>> min: 725.0
>> max: 730.5
>> avg: 727.8
>> standard deviation: ~3 ~= 0.4%
>> Average cycles/op = 728 +- 3
>>
>> Speed-up of TCG code generation = 0.7%
>>
>> Changelog:
>> v2->v3:
>> Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
>> Rebased.
>> v1->v2:
>> Introduced TCGContext *tcg_cur_ctx global to use in those places where we
>> don't have an interface to pass pointer to tcg_ctx.
>> Code style clean-up
>>
>> Evgeny (2):
>>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>>   TCG: Remove unused global variables
>>
>> Evgeny Voevodin (4):
>>   TCG: Use gen_opc_ptr from context instead of global variable.
>>   TCG: Use gen_opparam_ptr from context instead of global variable.
>>   TCG: Use gen_opc_buf from context instead of global variable.
>>   TCG: Use gen_opparam_buf from context instead of global variable.
>>
>>  gen-icount.h |2 +-
>>  target-alpha/translate.c | 10 +-
>>  target-arm/translate.c| 10 +-
>>  target-cris/translate.c | 13 +-
>>  target-i386/translate.c | 10 +-
>>  target-lm32/translate.c | 13 +-
>>  target-m68k/translate.c | 10 +-
>>  target-microblaze/translate.c | 13 +-
>>  target-mips/translate.c | 11 +-
>>  target-openrisc/translate.c | 13 +-
>>  target-ppc/translate.c| 11 +-
>>  target-s390x/translate.c | 11 +-
>>  target-sh4/translate.c| 10 +-
>>  target-sparc/translate.c | 10 +-
>>  target-unicore32/translate.c | 10 +-
>>  target-xtensa/translate.c |8 +-
>>  tcg/optimize.c| 62
>>  tcg/tcg-op.h | 324 -
>>  tcg/tcg.c | 85 ++-
>>  tcg/tcg.h | 10 +-
>>  translate-all.c |3 -
>>  21 files changed, 326 insertions(+), 323 deletions(-)
>>
>> --
>> 1.7.9.5

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH v4 3/7] TCG: Use gen_opc_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- target-alpha/translate.c |8 ++--- target-arm/translate.c|8 ++--- target-cris/translate.c | 10 +++--- target-i386/translate.c |8 ++--- target-lm32/translate.c | 10 +++--- target-m68k/translate.c |8 ++--- target-microblaze/translate.c | 10 +++--- target-mips/translate.c |9 +++--- target-openrisc/translate.c | 10 +++--- target-ppc/translate.c|9 +++--- target-s390x/translate.c |9 +++--- target-sh4/translate.c|8 ++--- target-sparc/translate.c |8 ++--- target-unicore32/translate.c |8 ++--- target-xtensa/translate.c |6 ++-- tcg/tcg-op.h | 70 - tcg/tcg.c | 16 +- 17 files changed, 109 insertions(+), 106 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index f707d8d..6676cbf 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, or exhaust instruction count, stop generation. 
*/ if (ret == NO_EXIT ((ctx.pc (TARGET_PAGE_SIZE - 1)) == 0 -|| gen_opc_ptr = gen_opc_end +|| tcg_ctx.gen_opc_ptr = gen_opc_end || num_insns = max_insns || singlestep || env-singlestep_enabled)) { @@ -3463,9 +3463,9 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index 25433da..ff5d294 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9881,7 +9881,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, * Also stop translation when a page boundary is reached. This * ensures prefetch aborts occur at the right place. 
*/ num_insns ++; -} while (!dc-is_jmp gen_opc_ptr gen_opc_end +} while (!dc-is_jmp tcg_ctx.gen_opc_ptr gen_opc_end !env-singlestep_enabled !singlestep dc-pc next_page_start @@ -9962,7 +9962,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, done_generating: gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; #ifdef DEBUG_DISAS if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) { @@ -9974,7 +9974,7 @@ done_generating: } #endif if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index 27b82cf..e34288e 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) { @@ -3381,7 +3381,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, break; } } while (!dc-is_jmp !dc-cpustate_changed - gen_opc_ptr gen_opc_end + tcg_ctx.gen_opc_ptr gen_opc_end !singlestep (dc-pc next_page_start) num_insns max_insns); @@ -3434,9 +3434,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, } } gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) { gen_opc_instr_start[lj
[Qemu-devel] [PATCH v4 0/7] TCG global variables clean-up
This set of patches moves the following global variables into tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
Probably, this is due to better data caching.

Here is the test procedure:
1. Boot Linux Kernel 5 times.
2. For each iteration, wait until the JIT cycles count has been stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%
Average cycles/op = 733 +- 2

After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%
Average cycles/op = 728 +- 3

Speed-up of TCG code generation = 0.7%

Changelog:
v3->v4:
Rebased.
Added target-cris/translate.c: Code style clean-up
v2->v3:
Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
Rebased.
v1->v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we
don't have an interface to pass pointer to tcg_ctx.
Code style clean-up

Evgeny (2):
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Remove unused global variables

Evgeny Voevodin (5):
  target-cris/translate.c: Code style clean-up
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.

 gen-icount.h |2 +-
 target-alpha/translate.c | 10 +-
 target-arm/translate.c| 10 +-
 target-cris/translate.c | 5041 +
 target-i386/translate.c | 10 +-
 target-lm32/translate.c | 13 +-
 target-m68k/translate.c | 10 +-
 target-microblaze/translate.c | 13 +-
 target-mips/translate.c | 11 +-
 target-openrisc/translate.c | 13 +-
 target-ppc/translate.c| 11 +-
 target-s390x/translate.c | 11 +-
 target-sh4/translate.c| 10 +-
 target-sparc/translate.c | 10 +-
 target-unicore32/translate.c | 10 +-
 target-xtensa/translate.c |8 +-
 tcg/optimize.c| 62 +-
 tcg/tcg-op.h | 324 +--
 tcg/tcg.c | 85 +-
 tcg/tcg.h | 10 +-
 translate-all.c |3 -
 21 files changed, 2860 insertions(+), 2817 deletions(-)

-- 
1.7.9.5
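The 0.7% figure in the cover letter can be reproduced from the two reported averages. The helper below is only a worked check of that arithmetic; speedup_percent is a hypothetical name, and it assumes the speed-up is measured relative to the "before" average, which the cover letter does not state explicitly.

```c
/* Relative reduction in cycles/op, in percent of the "before" value.
   Assumption: the baseline is the "before" average. */
static double speedup_percent(double before_cycles, double after_cycles)
{
    return (before_cycles - after_cycles) / before_cycles * 100.0;
}
```

With the reported averages, speedup_percent(733.0, 727.8) evaluates to (733.0 - 727.8) / 733.0 * 100, roughly 0.71, which rounds to the 0.7% quoted. Note that the difference of 5.2 cycles/op is of the same order as the reported standard deviations (~2 and ~3).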
[Qemu-devel] [PATCH v4 2/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext
From: Evgeny <e.voevo...@samsung.com>

Signed-off-by: Evgeny <e.voevo...@samsung.com>
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index a6c9256..b229061 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -431,6 +431,12 @@ struct TCGContext {
     int temps_in_use;
     int goto_tb_issue_mask;
 #endif
+
+    uint16_t gen_opc_buf[OPC_BUF_SIZE];
+    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+    uint16_t *gen_opc_ptr;
+    TCGArg *gen_opparam_ptr;
 };
 
 extern TCGContext tcg_ctx;
-- 
1.7.9.5
[Qemu-devel] [PATCH v4 6/7] TCG: Use gen_opparam_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a7c3832..1fd1731 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
     s->gen_opc_ptr = s->gen_opc_buf;
-    s->gen_opparam_ptr = gen_opparam_buf;
+    s->gen_opparam_ptr = s->gen_opparam_buf;
 }
 
 static inline void tcg_temp_alloc(TCGContext *s, int n)
@@ -889,7 +889,7 @@ void tcg_dump_ops(TCGContext *s)
 
     first_insn = 1;
     opc_ptr = s->gen_opc_buf;
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     while (opc_ptr < s->gen_opc_ptr) {
         c = *opc_ptr++;
         def = &tcg_op_defs[c];
@@ -1432,8 +1432,9 @@ static void tcg_liveness_analysis(TCGContext *s)
         op_index--;
     }
 
-    if (args != gen_opparam_buf)
+    if (args != s->gen_opparam_buf) {
         tcg_abort();
+    }
 }
 #else
 /* dummy liveness analysis */
@@ -2214,7 +2215,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
     s->gen_opparam_ptr =
-        tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+        tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2241,7 +2242,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
     s->code_buf = gen_code_buf;
     s->code_ptr = gen_code_buf;
 
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     op_index = 0;
 
     for(;;) {
-- 
1.7.9.5
[Qemu-devel] [PATCH v4 5/7] TCG: Use gen_opc_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- target-alpha/translate.c |6 ++-- target-arm/translate.c|6 ++-- target-cris/translate.c |9 +++--- target-i386/translate.c |6 ++-- target-lm32/translate.c |9 +++--- target-m68k/translate.c |6 ++-- target-microblaze/translate.c |9 +++--- target-mips/translate.c |6 ++-- target-openrisc/translate.c |9 +++--- target-ppc/translate.c|6 ++-- target-s390x/translate.c |6 ++-- target-sh4/translate.c|6 ++-- target-sparc/translate.c |6 ++-- target-unicore32/translate.c |6 ++-- target-xtensa/translate.c |4 +-- tcg/optimize.c| 62 - tcg/tcg.c | 30 ++-- 17 files changed, 98 insertions(+), 94 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 6676cbf..91c761a 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, int max_insns; pc_start = tb-pc; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; ctx.tb = tb; ctx.env = env; @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index ff5d294..0602b31 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = DISAS_NEXT; dc-pc = pc_start; @@ 
-9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9974,7 +9974,7 @@ done_generating: } #endif if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index e34288e..0adc07b 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, dc-env = env; dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = DISAS_NEXT; dc-ppc = pc_start; @@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; if (lj j) { lj++; while (lj j) { @@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, gen_icount_end(tb, num_insns); *tcg_ctx.gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_ctx.gen_opc_ptr - gen_opc_buf; +j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf; lj++; while (lj = j) { gen_opc_instr_start[lj++] = 0; @@ -3452,7 +3452,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, log_target_disas(pc_start, dc-pc - pc_start, dc-env-pregs[PR_VR]); qemu_log(\nisize=%d osize=%td\n, -dc-pc - pc_start, gtcg_ctx.en_opc_ptr - gen_opc_buf); +dc-pc - pc_start, gtcg_ctx.en_opc_ptr - tcg_ctx.gen_opc_buf); +tcg_ctx.gen_opc_buf); } #endif #endif diff --git a/target-i386/translate.c b/target-i386/translate.c index 5f977d9..1563677 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -7958,7 +7958,7 @@ static inline void 
gen_intermediate_code_internal(CPUX86State *env, cpu_ptr0 = tcg_temp_new_ptr(); cpu_ptr1
[Qemu-devel] [PATCH v4 7/7] TCG: Remove unused global variables
From: Evgeny <e.voevo...@samsung.com>

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.c |4 
 tcg/tcg.h |4 
 translate-all.c |3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 1fd1731..a9c9d6f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
     *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index b229061..c09c188 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -440,10 +440,6 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5
[Qemu-devel] [PATCH v4 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 gen-icount.h |    2 +-
 tcg/tcg-op.h |  254 +-
 tcg/tcg.c    |   36 -
 3 files changed, 146 insertions(+), 146 deletions(-)

diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..248cf5b 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -16,7 +16,7 @@ static inline void gen_icount_start(void)
     count = tcg_temp_local_new_i32();
     tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32));
     /* This is a horrid hack to allow fixing up the value later. */
-    icount_arg = gen_opparam_ptr + 1;
+    icount_arg = tcg_ctx.gen_opparam_ptr + 1;
     tcg_gen_subi_i32(count, count, 0xdeadbeef);

     tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 9bc890f..0b3cb0b 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc)
 static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
 }

 static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
 }

 static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = arg1;
+    *tcg_ctx.gen_opparam_ptr++ = arg1;
 }

 static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
 }

 static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
 }

 static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = arg2;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = arg2;
 }

 static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = arg2;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = arg2;
 }

 static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = arg1;
-    *gen_opparam_ptr++ = arg2;
+    *tcg_ctx.gen_opparam_ptr++ = arg1;
+    *tcg_ctx.gen_opparam_ptr++ = arg2;
 }

 static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
                                    TCGv_i32 arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
 }

 static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
                                    TCGv_i64 arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
 }

 static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
                                     TCGv_i32 arg2, TCGArg arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *gen_opparam_ptr++ = arg3;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = arg3;
 }

 static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
                                     TCGv_i64 arg2, TCGArg arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *gen_opparam_ptr++ = arg3;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = arg3;
 }

 static inline void tcg_gen_ldst_op_i32(TCGOpcode opc, TCGv_i32 val
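The "horrid hack" comment in the gen-icount.h hunk describes a back-patching idiom: the address of a parameter slot is saved before the op is emitted, so that the 0xdeadbeef placeholder can be overwritten later. A minimal sketch of that idiom follows. It is not QEMU code: `DemoParamCtx`, `demo_start` and `demo_emit_movi` are hypothetical names, and the two-slot layout (destination, then immediate) is an assumption chosen to mirror the `+ 1` offset in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PARAM_BUF_SIZE 32           /* arbitrary size for the sketch */
#define DEMO_PLACEHOLDER    0xdeadbeefu  /* same marker value as in the patch */

/* Hypothetical stand-in for the parameter part of a code generation context. */
typedef struct DemoParamCtx {
    uintptr_t param_buf[DEMO_PARAM_BUF_SIZE];
    uintptr_t *param_ptr;                /* next free parameter slot */
} DemoParamCtx;

/* Reset the write cursor to the start of the buffer. */
static void demo_start(DemoParamCtx *s)
{
    s->param_ptr = s->param_buf;
}

/*
 * Emit the parameters of a "dest = imm" op: slot 0 holds the destination,
 * slot 1 the immediate. The address of the immediate slot is computed
 * before emission (cursor + 1) and returned so the caller can patch it.
 */
static uintptr_t *demo_emit_movi(DemoParamCtx *s, uintptr_t dest, uintptr_t imm)
{
    uintptr_t *imm_slot = s->param_ptr + 1;

    *s->param_ptr++ = dest;
    *s->param_ptr++ = imm;
    return imm_slot;
}
```

A caller would emit the op with `DEMO_PLACEHOLDER`, keep the returned pointer, and store the real value through it once that value is known. The pointer stays valid only as long as the buffer is not reset or moved, which is one reason the original comment calls the approach a hack.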
Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up
On 10/27/2012 06:34 PM, Blue Swirl wrote:
> On Fri, Oct 26, 2012 at 6:32 AM, Evgeny Voevodin e.voevo...@samsung.com wrote:
>> Today I did more precise testing using --enable-profiler.
>> Here is the test procedure:
>> 1. Boot the Linux kernel 5 times.
>> 2. For each iteration, wait until JIT cycles are stable for ~10 seconds.
>> 3. Write down the cycles/op.
>>
>> Here are the results:
>>
>> Before clean-up:
>> min: 731.9
>> max: 735.8
>> avg: 734.3
>> standard deviation: ~2 = 0.3%
>> Average cycles/op = 734 +- 2
>>
>> After clean-up:
>> min: 747.2
>> max: 751.7
>> avg: 750.5
>> standard deviation: ~2 = 0.3%
>> Average cycles/op = 750 +- 2
>>
>> Slow-down of TCG code generation = 2.2%
>>
>> After clean-up with TCGContext *const tcg_cur_ctx:
>> min: 730.6
>> max: 733.2
>> avg: 728.7
>> standard deviation: ~2 = 0.3%
>> Average cycles/op = 729 +- 2
>>
>> Slow-down of TCG code generation = 0%
>>
>> I suggest defining tcg_cur_ctx as TCGContext *const. Then we will get rid
>> of the TCG code generation slow-down and also will have no usage of
>> global variables.
>
> How does this compare with the original version without pointers? I
> think that version may be safer to be assumed to be optimized by the
> compiler.

I did more testing with different gcc versions and different patch series:

gcc version   v1 clean-up,   v2 clean-up,    master
              no pointer     const pointer
gcc-4.4       754.3          752.1           769.8
gcc-4.5       770.8          779.8           774.8
gcc-4.6       731.8          729.8           737

Conclusion:
- The first clean-up series, without the pointer, is faster than master in
  all cases. This is probably because data is cached more efficiently.
- The second clean-up series, with the constant pointer, is faster than
  master with gcc-4.4 and gcc-4.6. With gcc-4.5 it seems that the const
  pointer is not optimised as I assumed.

I think it's worth generating a third series without the pointer and with the
code clean-up included in the second. What do you think?
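The two access styles compared in this message can be put side by side in a small sketch. It is not taken from the patches: `DemoCtx`, `demo_ctx`, `demo_cur_ctx` and the two counter functions are hypothetical names. It only illustrates what "const pointer" means here, namely a pointer whose own value can never change and that always refers to the single global context.

```c
/* Hypothetical stand-in for the code generation context. */
typedef struct DemoCtx {
    unsigned long ops_emitted;
} DemoCtx;

/* The single global context, zero-initialised at program start. */
DemoCtx demo_ctx;

/*
 * Constant pointer to the global context: the pointee may be modified,
 * the pointer itself may not. Because the target is fixed at link time,
 * a compiler is allowed, but not required, to generate the same code as
 * for direct access.
 */
DemoCtx *const demo_cur_ctx = &demo_ctx;

/* Direct access to the global object (the "no pointer" style). */
static void count_direct(void)
{
    demo_ctx.ops_emitted++;
}

/* Access through the constant pointer (the "const pointer" style). */
static void count_via_pointer(void)
{
    demo_cur_ctx->ops_emitted++;
}
```

Both functions update the same object. Whether they compile to identical machine code depends on the compiler and its version, which is consistent with the gcc-4.5 figures reported in this thread.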
[Qemu-devel] [PATCH v3 3/6] TCG: Use gen_opparam_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 gen-icount.h |    2 +-
 tcg/tcg-op.h |  254 +-
 tcg/tcg.c    |   36 -
 3 files changed, 146 insertions(+), 146 deletions(-)

diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..248cf5b 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -16,7 +16,7 @@ static inline void gen_icount_start(void)
     count = tcg_temp_local_new_i32();
     tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32));
     /* This is a horrid hack to allow fixing up the value later. */
-    icount_arg = gen_opparam_ptr + 1;
+    icount_arg = tcg_ctx.gen_opparam_ptr + 1;
     tcg_gen_subi_i32(count, count, 0xdeadbeef);

     tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 50b1c62..d6daea4 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc)
 static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
 }

 static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
 }

 static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = arg1;
+    *tcg_ctx.gen_opparam_ptr++ = arg1;
 }

 static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
 }

 static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
 }

 static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = arg2;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = arg2;
 }

 static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = arg2;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = arg2;
 }

 static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = arg1;
-    *gen_opparam_ptr++ = arg2;
+    *tcg_ctx.gen_opparam_ptr++ = arg1;
+    *tcg_ctx.gen_opparam_ptr++ = arg2;
 }

 static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
                                    TCGv_i32 arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
 }

 static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
                                    TCGv_i64 arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
 }

 static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
                                     TCGv_i32 arg2, TCGArg arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *gen_opparam_ptr++ = arg3;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = arg3;
 }

 static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
                                     TCGv_i64 arg2, TCGArg arg3)
 {
     *tcg_ctx.gen_opc_ptr++ = opc;
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *gen_opparam_ptr++ = arg3;
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+    *tcg_ctx.gen_opparam_ptr++ = arg3;
 }

 static inline void tcg_gen_ldst_op_i32(TCGOpcode opc, TCGv_i32 val
[Qemu-devel] [PATCH v3 5/6] TCG: Use gen_opparam_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index c4e663b..f332463 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif

     s->gen_opc_ptr = s->gen_opc_buf;
-    s->gen_opparam_ptr = gen_opparam_buf;
+    s->gen_opparam_ptr = s->gen_opparam_buf;
 }

 static inline void tcg_temp_alloc(TCGContext *s, int n)
@@ -885,7 +885,7 @@ void tcg_dump_ops(TCGContext *s)

     first_insn = 1;
     opc_ptr = s->gen_opc_buf;
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     while (opc_ptr < s->gen_opc_ptr) {
         c = *opc_ptr++;
         def = &tcg_op_defs[c];
@@ -1409,8 +1409,9 @@ static void tcg_liveness_analysis(TCGContext *s)
         op_index--;
     }

-    if (args != gen_opparam_buf)
+    if (args != s->gen_opparam_buf) {
         tcg_abort();
+    }
 }
 #else
 /* dummy liveness analysis */
@@ -2104,7 +2105,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,

 #ifdef USE_TCG_OPTIMIZATIONS
     s->gen_opparam_ptr =
-        tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+        tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif

 #ifdef CONFIG_PROFILER
@@ -2131,7 +2132,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
     s->code_buf = gen_code_buf;
     s->code_ptr = gen_code_buf;

-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     op_index = 0;

     for(;;) {
--
1.7.9.5
[Qemu-devel] [PATCH v3 0/6] TCG global variables clean-up
This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
This is probably due to better data caching.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until JIT cycles are stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%
Average cycles/op = 733 +- 2

After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%
Average cycles/op = 728 +- 3

Speed-up of TCG code generation = 0.7%

Changelog:
v2->v3:
Removed tcg_cur_ctx since it gives a slow-down on gcc-4.5.
Rebased.
v1->v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we don't
have an interface to pass pointer to tcg_ctx.
Code style clean-up

Evgeny (2):
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Remove unused global variables

Evgeny Voevodin (4):
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.

 gen-icount.h                  |    2 +-
 target-alpha/translate.c      |   10 +-
 target-arm/translate.c        |   10 +-
 target-cris/translate.c       |   13 +-
 target-i386/translate.c       |   10 +-
 target-lm32/translate.c       |   13 +-
 target-m68k/translate.c       |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c       |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c        |   11 +-
 target-s390x/translate.c      |   11 +-
 target-sh4/translate.c        |   10 +-
 target-sparc/translate.c      |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c     |    8 +-
 tcg/optimize.c                |   62
 tcg/tcg-op.h                  |  324 -
 tcg/tcg.c                     |   85 ++-
 tcg/tcg.h                     |   10 +-
 translate-all.c               |    3 -
 21 files changed, 326 insertions(+), 323 deletions(-)

--
1.7.9.5
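The arithmetic behind the speed-up figure in this cover letter can be written out as a short sketch. The helper names `demo_mean` and `demo_speedup_percent` are hypothetical, and the individual per-boot samples are not given in the message, so only the reported averages can be checked.

```c
#include <stddef.h>

/* Arithmetic mean of n samples; n must be greater than zero. */
static double demo_mean(const double *samples, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return sum / (double)n;
}

/*
 * Relative improvement in percent when a cost (here cycles/op) drops
 * from `before` to `after`. A positive result is a speed-up, a negative
 * one a slow-down.
 */
static double demo_speedup_percent(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

With the reported averages, `demo_speedup_percent(733.0, 727.8)` is roughly 0.71, which rounds to the 0.7% quoted above. The two reported intervals, 733 +- 2 and 728 +- 3, meet at 731, so the difference appears to sit close to the measurement noise.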
[Qemu-devel] [PATCH v3 6/6] TCG: Remove unused global variables
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c       |    4 ----
 tcg/tcg.h       |    4 ----
 translate-all.c |    3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index f332463..53bf109 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;

-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
     *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 43b4317..b1f4e49 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -431,10 +431,6 @@ struct TCGContext {
 };

 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];

 /* pool based memory allocation */

diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;

-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
--
1.7.9.5
[Qemu-devel] [PATCH v3 4/6] TCG: Use gen_opc_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c      |    6 ++--
 target-arm/translate.c        |    6 ++--
 target-cris/translate.c       |    9 +++---
 target-i386/translate.c       |    6 ++--
 target-lm32/translate.c       |    9 +++---
 target-m68k/translate.c       |    6 ++--
 target-microblaze/translate.c |    9 +++---
 target-mips/translate.c       |    6 ++--
 target-openrisc/translate.c   |    9 +++---
 target-ppc/translate.c        |    6 ++--
 target-s390x/translate.c      |    6 ++--
 target-sh4/translate.c        |    6 ++--
 target-sparc/translate.c      |    6 ++--
 target-unicore32/translate.c  |    6 ++--
 target-xtensa/translate.c     |    4 +--
 tcg/optimize.c                |   62 -
 tcg/tcg.c                     |   30 ++--
 17 files changed, 98 insertions(+), 94 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 6676cbf..91c761a 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
     int max_insns;

     pc_start = tb->pc;
-    gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;

     ctx.tb = tb;
     ctx.env = env;
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
             }
         }
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
     gen_icount_end(tb, num_insns);
     *tcg_ctx.gen_opc_ptr = INDEX_op_end;
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
         while (lj <= j)
             gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index ff5d294..0602b31 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,

     dc->tb = tb;

-    gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;

     dc->is_jmp = DISAS_NEXT;
     dc->pc = pc_start;
@@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
                 }
             }
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -9974,7 +9974,7 @@ done_generating:
     }
 #endif
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
         while (lj <= j)
             gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 903907b..c54e3df 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3202,7 +3202,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 	dc->env = env;
 	dc->tb = tb;

-	gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+	gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;

 	dc->is_jmp = DISAS_NEXT;
 	dc->ppc = pc_start;
@@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 		check_breakpoint(env, dc);

 		if (search_pc) {
-			j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+			j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 			if (lj < j) {
 				lj++;
 				while (lj < j)
@@ -3401,7 +3401,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 	gen_icount_end(tb, num_insns);
 	*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 	if (search_pc) {
-		j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+		j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 		lj++;
 		while (lj <= j)
 			gen_opc_instr_start[lj++] = 0;
@@ -3416,7 +3416,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 		log_target_disas(pc_start, dc->pc - pc_start,
 				 dc->env->pregs[PR_VR]);
 		qemu_log("\nisize=%d osize=%td\n",
-			dc->pc - pc_start, tcg_ctx.gen_opc_ptr - gen_opc_buf);
+			dc->pc - pc_start, tcg_ctx.gen_opc_ptr -
+			tcg_ctx.gen_opc_buf);
 	}
 #endif
 #endif
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 5f977d9..1563677 100644
--- a/target-i386/translate.c
+++ b
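The `search_pc` hunks in this patch all use the same idiom: the index of the next opcode is obtained as a pointer difference between the write cursor and the start of the opcode buffer, and the instruction-start table is then zero-filled up to that index. A reduced sketch follows. It is not QEMU code: the names prefixed with `demo_` and the buffer size are hypothetical, and because the comparison operators were lost in the archived text, the use of `<` in the loop is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUF_SIZE 16   /* arbitrary size for the sketch */

/* Hypothetical stand-ins for gen_opc_buf and gen_opc_instr_start. */
static uint16_t demo_opc_buf[DEMO_BUF_SIZE];
static uint8_t demo_instr_start[DEMO_BUF_SIZE];

/*
 * Given the index `lj` of the last slot already handled and the current
 * opcode write cursor, mark every slot from lj + 1 up to, but excluding,
 * the cursor index as "not the start of a guest instruction".
 * Returns the updated lj, which equals the cursor index when any
 * padding was needed.
 */
static int demo_pad_instr_start(int lj, const uint16_t *opc_ptr)
{
    /* Index of the next opcode, computed as a pointer difference. */
    int j = (int)(opc_ptr - demo_opc_buf);

    if (lj < j) {
        lj++;
        while (lj < j) {
            demo_instr_start[lj++] = 0;
        }
    }
    return lj;
}
```

In the patch the only change to this idiom is where the buffer base comes from: `gen_opc_buf` becomes `tcg_ctx.gen_opc_buf`, so both operands of the subtraction now belong to the same context object.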
[Qemu-devel] [PATCH v3 1/6] tcg/tcg.h: Duplicate global TCG variables in TCGContext
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny e.voevo...@samsung.com
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.h |    6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 45e94f5..43b4317 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -422,6 +422,12 @@ struct TCGContext {
     int temps_in_use;
     int goto_tb_issue_mask;
 #endif
+
+    uint16_t gen_opc_buf[OPC_BUF_SIZE];
+    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+    uint16_t *gen_opc_ptr;
+    TCGArg *gen_opparam_ptr;
 };

 extern TCGContext tcg_ctx;
--
1.7.9.5
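The layout this patch introduces can be pictured with a reduced stand-in for the real structure. The sketch below is not QEMU code: `MiniCtx`, `MiniArg`, the `MINI_*` sizes and the three helper functions are hypothetical, and the real `TCGContext` has many more fields. It only shows the two buffers and their write cursors living in one object, so that every access goes through a single context.

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary sizes, chosen for the sketch only. */
#define MINI_OPC_BUF_SIZE     64
#define MINI_OPPARAM_BUF_SIZE 256

typedef uintptr_t MiniArg;   /* stand-in for TCGArg */

/* Stand-in for TCGContext: buffers and cursors are kept together. */
typedef struct MiniCtx {
    uint16_t gen_opc_buf[MINI_OPC_BUF_SIZE];
    MiniArg gen_opparam_buf[MINI_OPPARAM_BUF_SIZE];

    uint16_t *gen_opc_ptr;      /* next free opcode slot */
    MiniArg *gen_opparam_ptr;   /* next free parameter slot */
} MiniCtx;

/* Reset both cursors to the start of their buffers. */
static void mini_func_start(MiniCtx *s)
{
    s->gen_opc_ptr = s->gen_opc_buf;
    s->gen_opparam_ptr = s->gen_opparam_buf;
}

/* Append one opcode that takes a single parameter. */
static void mini_emit_op1(MiniCtx *s, uint16_t opc, MiniArg arg1)
{
    *s->gen_opc_ptr++ = opc;
    *s->gen_opparam_ptr++ = arg1;
}

/* Number of opcodes emitted since the last reset. */
static ptrdiff_t mini_op_count(const MiniCtx *s)
{
    return s->gen_opc_ptr - s->gen_opc_buf;
}
```

Because the helpers take the context as a parameter, nothing in the sketch depends on file-scope state. The sketch performs no bounds checking, so a caller must not emit more entries than the buffers hold.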
[Qemu-devel] [PATCH v3 2/6] TCG: Use gen_opc_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c      |    8 ++---
 target-arm/translate.c        |    8 ++---
 target-cris/translate.c       |   10 +++---
 target-i386/translate.c       |    8 ++---
 target-lm32/translate.c       |   10 +++---
 target-m68k/translate.c       |    8 ++---
 target-microblaze/translate.c |   10 +++---
 target-mips/translate.c       |    9 +++---
 target-openrisc/translate.c   |   10 +++---
 target-ppc/translate.c        |    9 +++---
 target-s390x/translate.c      |    9 +++---
 target-sh4/translate.c        |    8 ++---
 target-sparc/translate.c      |    8 ++---
 target-unicore32/translate.c  |    8 ++---
 target-xtensa/translate.c     |    6 ++--
 tcg/tcg-op.h                  |   70 -
 tcg/tcg.c                     |   16 +-
 17 files changed, 109 insertions(+), 106 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f707d8d..6676cbf 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
             }
         }
         if (search_pc) {
-            j = gen_opc_ptr - gen_opc_buf;
+            j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
            or exhaust instruction count, stop generation. */
         if (ret == NO_EXIT
             && ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0
-                || gen_opc_ptr >= gen_opc_end
+                || tcg_ctx.gen_opc_ptr >= gen_opc_end
                 || num_insns >= max_insns
                 || singlestep
                 || env->singlestep_enabled)) {
@@ -3463,9 +3463,9 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
     }

     gen_icount_end(tb, num_insns);
-    *gen_opc_ptr = INDEX_op_end;
+    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
     if (search_pc) {
-        j = gen_opc_ptr - gen_opc_buf;
+        j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
         lj++;
         while (lj <= j)
             gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 25433da..ff5d294 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
                 }
             }
         if (search_pc) {
-            j = gen_opc_ptr - gen_opc_buf;
+            j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -9881,7 +9881,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
          * Also stop translation when a page boundary is reached. This
          * ensures prefetch aborts occur at the right place. */
         num_insns ++;
-    } while (!dc->is_jmp && gen_opc_ptr < gen_opc_end &&
+    } while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
              !env->singlestep_enabled &&
              !singlestep &&
              dc->pc < next_page_start &&
@@ -9962,7 +9962,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,

 done_generating:
     gen_icount_end(tb, num_insns);
-    *gen_opc_ptr = INDEX_op_end;
+    *tcg_ctx.gen_opc_ptr = INDEX_op_end;

 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -9974,7 +9974,7 @@ done_generating:
     }
 #endif
     if (search_pc) {
-        j = gen_opc_ptr - gen_opc_buf;
+        j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
         lj++;
         while (lj <= j)
             gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 755de65..903907b 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 		check_breakpoint(env, dc);

 		if (search_pc) {
-			j = gen_opc_ptr - gen_opc_buf;
+			j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 			if (lj < j) {
 				lj++;
 				while (lj < j)
@@ -3348,7 +3348,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 		if (!(tb->pc & 1) && env->singlestep_enabled)
 			break;
 	} while (!dc->is_jmp && !dc->cpustate_changed
-		 && gen_opc_ptr < gen_opc_end
+		 && tcg_ctx.gen_opc_ptr < gen_opc_end
 		 && !singlestep
 		 && (dc->pc < next_page_start)
 		 && num_insns < max_insns);
@@ -3399,9 +3399,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 		}
 	}
 	gen_icount_end(tb, num_insns);
-	*gen_opc_ptr = INDEX_op_end;
+	*tcg_ctx.gen_opc_ptr = INDEX_op_end
Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up
Today I did more precise testing using --enable-profiler.
Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until JIT cycles are stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results:

Before clean-up:
min: 731.9
max: 735.8
avg: 734.3
standard deviation: ~2 = 0.3%
Average cycles/op = 734 +- 2

After clean-up:
min: 747.2
max: 751.7
avg: 750.5
standard deviation: ~2 = 0.3%
Average cycles/op = 750 +- 2

Slow-down of TCG code generation = 2.2%

After clean-up with TCGContext *const tcg_cur_ctx:
min: 730.6
max: 733.2
avg: 728.7
standard deviation: ~2 = 0.3%
Average cycles/op = 729 +- 2

Slow-down of TCG code generation = 0%

I suggest defining tcg_cur_ctx as TCGContext *const. Then we will get rid
of the TCG code generation slow-down and also will have no usage of
global variables.

On 10/25/2012 10:45 AM, Evgeny Voevodin wrote:
> Here are the results of tests before and after this patch series was applied:
>
> * EEMBC CoreMark (before -> after)
>   - Guest: Exynos4210 ARMv7, Linux (Custom buildroot image)
>   - Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
>   - Results: 1148.105626 -> 1161.440186 (+1.16%)
>
> * nbench (before -> after)
>   - Guest: Exynos4210 ARMv7, Linux (Custom buildroot image)
>   - Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
>   - Results
>     . MEMORY INDEX: 1.864 -> 1.862 (-0.11%)
>     . INTEGER INDEX: 2.518 -> 2.523 (+0.2%)
>     . FLOATING-POINT INDEX: 0.385 -> 0.394 (+2.34%)
>
> These tests suggest that it became even faster :)) But I'm quite sceptical
> about such results. nbench prints a warning if its results are not 95%
> statistically accurate, so we can be sure that the nbench result is 95%
> accurate, and the differences shown above are clearly within that accuracy.
> I don't know the accuracy of CoreMark.
>
> So the main conclusion we can draw is that this patch series didn't
> introduce any slow-down larger than the inaccuracy of the measurement.
> Is this enough?
>
> On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:
>> This set of patches moves global variables to tcg_ctx:
>> gen_opc_ptr
>> gen_opparam_ptr
>> gen_opc_buf
>> gen_opparam_buf
>>
>> Build tested for all targets.
>> Execution tested on ARM.
>> I didn't notice any slow-down of kernel boot after this set was applied.
>>
>> Changelog:
>> v1->v2:
>> Introduced TCGContext *tcg_cur_ctx global to use in those places where we don't
>> have an interface to pass pointer to tcg_ctx.
>> Code style clean-up
>>
>> Evgeny (2):
>>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>>   TCG: Remove unused global variables
>>
>> Evgeny Voevodin (5):
>>   translate-all.c: Introduce TCGContext *tcg_cur_ctx
>>   TCG: Use gen_opc_ptr from context instead of global variable.
>>   TCG: Use gen_opparam_ptr from context instead of global variable.
>>   TCG: Use gen_opc_buf from context instead of global variable.
>>   TCG: Use gen_opparam_buf from context instead of global variable.
>>
>>  gen-icount.h                  |    2 +-
>>  target-alpha/translate.c      |   10 +-
>>  target-arm/translate.c        |   10 +-
>>  target-cris/translate.c       |   13 +-
>>  target-i386/translate.c       |   10 +-
>>  target-lm32/translate.c       |   13 +-
>>  target-m68k/translate.c       |   10 +-
>>  target-microblaze/translate.c |   13 +-
>>  target-mips/translate.c       |   11 +-
>>  target-openrisc/translate.c   |   13 +-
>>  target-ppc/translate.c        |   11 +-
>>  target-s390x/translate.c      |   11 +-
>>  target-sh4/translate.c        |   10 +-
>>  target-sparc/translate.c      |   10 +-
>>  target-unicore32/translate.c  |   10 +-
>>  target-xtensa/translate.c     |    8 +-
>>  tcg/optimize.c                |   62
>>  tcg/tcg-op.h                  |  324 -
>>  tcg/tcg.c                     |   85 ++-
>>  tcg/tcg.h                     |   11 +-
>>  translate-all.c               |    4 +-
>>  21 files changed, 328 insertions(+), 323 deletions(-)

--
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up
Here are the results of tests run before and after this patch series was applied:

* EEMBC CoreMark (before -> after)
  - Guest: Exynos4210 ARMv7, Linux (custom buildroot image)
  - Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
  - Results: 1148.105626 -> 1161.440186 (+1.16%)

* nbench (before -> after)
  - Guest: Exynos4210 ARMv7, Linux (custom buildroot image)
  - Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
  - Results:
    . MEMORY INDEX: 1.864 -> 1.862 (-0.11%)
    . INTEGER INDEX: 2.518 -> 2.523 (+0.2%)
    . FLOATING-POINT INDEX: 0.385 -> 0.394 (+2.34%)

Taken at face value, these tests show that it became even faster :))
But I'm quite sceptical about such results. nbench prints a warning if its
results are not 95% statistically accurate, so we can be sure that the nbench
result is 95% accurate. The differences shown above are obviously within that
margin. I don't know the accuracy of CoreMark.

So the main conclusion we can draw is that this patch series didn't introduce
any slow-down larger than the inaccuracy of the measurement.
Is this enough?

On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:
> This set of patches moves global variables to tcg_ctx:
> gen_opc_ptr
> gen_opparam_ptr
> gen_opc_buf
> gen_opparam_buf
>
> Build tested for all targets.
> Execution tested on ARM.
> I didn't notice any slow-down of kernel boot after this set was applied.
>
> Changelog:
> v1->v2:
> Introduced TCGContext *tcg_cur_ctx global to use in those places where we
> don't have an interface to pass pointer to tcg_ctx.
> Code style clean-up
>
> Evgeny (2):
>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>   TCG: Remove unused global variables
>
> Evgeny Voevodin (5):
>   translate-all.c: Introduce TCGContext *tcg_cur_ctx
>   TCG: Use gen_opc_ptr from context instead of global variable.
>   TCG: Use gen_opparam_ptr from context instead of global variable.
>   TCG: Use gen_opc_buf from context instead of global variable.
>   TCG: Use gen_opparam_buf from context instead of global variable.
>
>  gen-icount.h                  |   2 +-
>  target-alpha/translate.c      |  10 +-
>  target-arm/translate.c        |  10 +-
>  target-cris/translate.c       |  13 +-
>  target-i386/translate.c       |  10 +-
>  target-lm32/translate.c       |  13 +-
>  target-m68k/translate.c       |  10 +-
>  target-microblaze/translate.c |  13 +-
>  target-mips/translate.c       |  11 +-
>  target-openrisc/translate.c   |  13 +-
>  target-ppc/translate.c        |  11 +-
>  target-s390x/translate.c      |  11 +-
>  target-sh4/translate.c        |  10 +-
>  target-sparc/translate.c      |  10 +-
>  target-unicore32/translate.c  |  10 +-
>  target-xtensa/translate.c     |   8 +-
>  tcg/optimize.c                |  62
>  tcg/tcg-op.h                  | 324 -
>  tcg/tcg.c                     |  85 ++-
>  tcg/tcg.h                     |  11 +-
>  translate-all.c               |   4 +-
>  21 files changed, 328 insertions(+), 323 deletions(-)

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
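The percentages quoted in the benchmark reply above appear to be plain relative changes. A minimal sketch of that arithmetic follows; the helper name percent_change is made up for illustration and is not part of the patch series or of either benchmark.

```c
#include <assert.h>

/* Relative change from `before` to `after`, expressed in percent.
 * Hypothetical helper for illustration only. */
static double percent_change(double before, double after)
{
    assert(before != 0.0); /* the ratio is undefined for a zero baseline */
    return (after - before) / before * 100.0;
}
```

With the CoreMark figures, percent_change(1148.105626, 1161.440186) comes out at roughly 1.16, which matches the value quoted in the mail.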
Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up
Any other comments on the patches? I didn't see a consensus.
Do we need a pointer to the TCG context?
As I said before, I didn't notice any slow-down with it.

On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:
> This set of patches moves global variables to tcg_ctx:
> gen_opc_ptr
> gen_opparam_ptr
> gen_opc_buf
> gen_opparam_buf
>
> Build tested for all targets.
> Execution tested on ARM.
> I didn't notice any slow-down of kernel boot after this set was applied.
>
> Changelog:
> v1->v2:
> Introduced TCGContext *tcg_cur_ctx global to use in those places where we
> don't have an interface to pass pointer to tcg_ctx.
> Code style clean-up
>
> Evgeny (2):
>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>   TCG: Remove unused global variables
>
> Evgeny Voevodin (5):
>   translate-all.c: Introduce TCGContext *tcg_cur_ctx
>   TCG: Use gen_opc_ptr from context instead of global variable.
>   TCG: Use gen_opparam_ptr from context instead of global variable.
>   TCG: Use gen_opc_buf from context instead of global variable.
>   TCG: Use gen_opparam_buf from context instead of global variable.
>
>  gen-icount.h                  |   2 +-
>  target-alpha/translate.c      |  10 +-
>  target-arm/translate.c        |  10 +-
>  target-cris/translate.c       |  13 +-
>  target-i386/translate.c       |  10 +-
>  target-lm32/translate.c       |  13 +-
>  target-m68k/translate.c       |  10 +-
>  target-microblaze/translate.c |  13 +-
>  target-mips/translate.c       |  11 +-
>  target-openrisc/translate.c   |  13 +-
>  target-ppc/translate.c        |  11 +-
>  target-s390x/translate.c      |  11 +-
>  target-sh4/translate.c        |  10 +-
>  target-sparc/translate.c      |  10 +-
>  target-unicore32/translate.c  |  10 +-
>  target-xtensa/translate.c     |   8 +-
>  tcg/optimize.c                |  62
>  tcg/tcg-op.h                  | 324 -
>  tcg/tcg.c                     |  85 ++-
>  tcg/tcg.h                     |  11 +-
>  translate-all.c               |   4 +-
>  21 files changed, 328 insertions(+), 323 deletions(-)

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up
On 10/25/2012 07:17 AM, 陳韋任 (Wei-Ren Chen) wrote:
> On Thu, Oct 25, 2012 at 07:06:37AM +0400, Evgeny Voevodin wrote:
>> Any other comments on the patches? I didn't get the consensus.
>> Do we need a pointer to tcg context?
>> As I said before, I didn't notice any slow-down with it.
>>
>> On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:
>>> This set of patches moves global variables to tcg_ctx:
>>> gen_opc_ptr
>>> gen_opparam_ptr
>>> gen_opc_buf
>>> gen_opparam_buf
>>>
>>> Build tested for all targets.
>>> Execution tested on ARM.
>>> I didn't notice any slow-down of kernel boot after this set was applied.
>
> Would you like to try running some benchmarks after the kernel has booted?
> Like Yeongkyoon Lee did with his qemu_ld/qemu_st work [1]: EEMBC, nbench,
> or even SPEC. ;)
>
> Regards,
> chenwj
>
> [1] http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03630.html

Sure.

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH v2 7/7] TCG: Remove unused global variables
From: Evgeny <e.voevo...@samsung.com>

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.c       |    4
 tcg/tcg.h       |    4
 translate-all.c |    3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index f332463..53bf109 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
     *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index d326b36..19426f9 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -432,10 +432,6 @@ struct TCGContext {
 
 extern TCGContext tcg_ctx;
 extern TCGContext *tcg_cur_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
 
diff --git a/translate-all.c b/translate-all.c
index ccdcddf..3351012 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -34,9 +34,6 @@
 TCGContext tcg_ctx;
 TCGContext *tcg_cur_ctx = &tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5
[Qemu-devel] [PATCH v2 1/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext
From: Evgeny <e.voevo...@samsung.com>

Signed-off-by: Evgeny <e.voevo...@samsung.com>
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.h |    6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 45e94f5..43b4317 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -422,6 +422,12 @@ struct TCGContext {
     int temps_in_use;
     int goto_tb_issue_mask;
 #endif
+
+    uint16_t gen_opc_buf[OPC_BUF_SIZE];
+    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+    uint16_t *gen_opc_ptr;
+    TCGArg *gen_opparam_ptr;
 };
 
 extern TCGContext tcg_ctx;
-- 
1.7.9.5
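The patch above places the opcode and parameter buffers, together with their write pointers, inside the context structure. The following is a minimal standalone sketch of that layout, not QEMU code: SketchContext, SketchArg, the buffer sizes and all sketch_* functions are invented for illustration, and the real TCGContext has many more members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes and types only; the real values live in QEMU headers. */
#define SKETCH_OPC_BUF_SIZE 8
#define SKETCH_OPPARAM_BUF_SIZE 32
typedef uintptr_t SketchArg;

/* Buffers and their write pointers live together in one context. */
typedef struct SketchContext {
    uint16_t opc_buf[SKETCH_OPC_BUF_SIZE];
    SketchArg opparam_buf[SKETCH_OPPARAM_BUF_SIZE];
    uint16_t *opc_ptr;
    SketchArg *opparam_ptr;
} SketchContext;

/* Reset both write pointers to the start of the context's own buffers. */
static void sketch_func_start(SketchContext *s)
{
    s->opc_ptr = s->opc_buf;
    s->opparam_ptr = s->opparam_buf;
}

/* Append one opcode with a single argument. */
static void sketch_gen_op1(SketchContext *s, uint16_t opc, SketchArg arg1)
{
    assert(s->opc_ptr < s->opc_buf + SKETCH_OPC_BUF_SIZE);
    assert(s->opparam_ptr < s->opparam_buf + SKETCH_OPPARAM_BUF_SIZE);
    *s->opc_ptr++ = opc;
    *s->opparam_ptr++ = arg1;
}

/* Number of opcodes recorded so far. */
static size_t sketch_op_count(const SketchContext *s)
{
    return (size_t)(s->opc_ptr - s->opc_buf);
}

/* Usage example: emit two ops into a fresh context and count them. */
static size_t sketch_demo(void)
{
    SketchContext ctx;

    sketch_func_start(&ctx);
    sketch_gen_op1(&ctx, 1, 10);
    sketch_gen_op1(&ctx, 2, 20);
    return sketch_op_count(&ctx);
}
```

Because every function takes the context explicitly, a second SketchContext could be used side by side without touching any global state, which seems to be the direction the series is heading in.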
[Qemu-devel] [PATCH v2 3/7] TCG: Use gen_opc_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- target-alpha/translate.c |8 ++--- target-arm/translate.c|8 ++--- target-cris/translate.c | 10 +++--- target-i386/translate.c |8 ++--- target-lm32/translate.c | 10 +++--- target-m68k/translate.c |8 ++--- target-microblaze/translate.c | 10 +++--- target-mips/translate.c |9 +++--- target-openrisc/translate.c | 10 +++--- target-ppc/translate.c|9 +++--- target-s390x/translate.c |9 +++--- target-sh4/translate.c|8 ++--- target-sparc/translate.c |8 ++--- target-unicore32/translate.c |8 ++--- target-xtensa/translate.c |6 ++-- tcg/tcg-op.h | 70 - tcg/tcg.c | 16 +- 17 files changed, 109 insertions(+), 106 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index f707d8d..751d457 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, or exhaust instruction count, stop generation. 
*/ if (ret == NO_EXIT ((ctx.pc (TARGET_PAGE_SIZE - 1)) == 0 -|| gen_opc_ptr = gen_opc_end +|| tcg_cur_ctx-gen_opc_ptr = gen_opc_end || num_insns = max_insns || singlestep || env-singlestep_enabled)) { @@ -3463,9 +3463,9 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_cur_ctx-gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index daccb15..41b1671 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9825,7 +9825,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9872,7 +9872,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, * Also stop translation when a page boundary is reached. This * ensures prefetch aborts occur at the right place. 
*/ num_insns ++; -} while (!dc-is_jmp gen_opc_ptr gen_opc_end +} while (!dc-is_jmp tcg_cur_ctx-gen_opc_ptr gen_opc_end !env-singlestep_enabled !singlestep dc-pc next_page_start @@ -9953,7 +9953,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, done_generating: gen_icount_end(tb, num_insns); -*gen_opc_ptr = INDEX_op_end; +*tcg_cur_ctx-gen_opc_ptr = INDEX_op_end; #ifdef DEBUG_DISAS if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) { @@ -9965,7 +9965,7 @@ done_generating: } #endif if (search_pc) { -j = gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index 755de65..9bec8b5 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { - j = gen_opc_ptr - gen_opc_buf; + j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3348,7 +3348,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, if (!(tb-pc 1) env-singlestep_enabled) break; } while (!dc-is_jmp !dc-cpustate_changed - gen_opc_ptr gen_opc_end + tcg_cur_ctx-gen_opc_ptr gen_opc_end !singlestep (dc-pc next_page_start) num_insns max_insns); @@ -3399,9 +3399,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, } } gen_icount_end(tb, num_insns); - *gen_opc_ptr = INDEX_op_end; + *tcg_cur_ctx
[Qemu-devel] [PATCH v2 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- gen-icount.h |2 +- tcg/tcg-op.h | 254 +- tcg/tcg.c| 36 - 3 files changed, 146 insertions(+), 146 deletions(-) diff --git a/gen-icount.h b/gen-icount.h index 430cb44..be0bd7e 100644 --- a/gen-icount.h +++ b/gen-icount.h @@ -16,7 +16,7 @@ static inline void gen_icount_start(void) count = tcg_temp_local_new_i32(); tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32)); /* This is a horrid hack to allow fixing up the value later. */ -icount_arg = gen_opparam_ptr + 1; +icount_arg = tcg_cur_ctx-gen_opparam_ptr + 1; tcg_gen_subi_i32(count, count, 0xdeadbeef); tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label); diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 4793909..39ce9de 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc) static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg1); } static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg1); } static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = arg1; +*tcg_cur_ctx-gen_opparam_ptr++ = arg1; } static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg2); } static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg1); 
+*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg2); } static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = arg2; +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = arg2; +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = arg1; -*gen_opparam_ptr++ = arg2; +*tcg_cur_ctx-gen_opparam_ptr++ = arg1; +*tcg_cur_ctx-gen_opparam_ptr++ = arg2; } static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2, TCGv_i32 arg3) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); -*gen_opparam_ptr++ = GET_TCGV_I32(arg3); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg2); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg3); } static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2, TCGv_i64 arg3) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); -*gen_opparam_ptr++ = GET_TCGV_I64(arg3); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg2); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg3); } static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2, TCGArg arg3) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I32(arg1); -*gen_opparam_ptr++ = GET_TCGV_I32(arg2); -*gen_opparam_ptr++ = arg3; +*tcg_cur_ctx-gen_opparam_ptr++ = 
GET_TCGV_I32(arg1); +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I32(arg2); +*tcg_cur_ctx-gen_opparam_ptr++ = arg3; } static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2, TCGArg arg3) { *tcg_cur_ctx-gen_opc_ptr++ = opc; -*gen_opparam_ptr++ = GET_TCGV_I64(arg1); -*gen_opparam_ptr++ = GET_TCGV_I64(arg2); -*gen_opparam_ptr++ = arg3; +*tcg_cur_ctx-gen_opparam_ptr++ = GET_TCGV_I64(arg1); +*tcg_cur_ctx-gen_opparam_ptr
[Qemu-devel] [PATCH v2 5/7] TCG: Use gen_opc_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- target-alpha/translate.c |6 ++-- target-arm/translate.c|6 ++-- target-cris/translate.c |9 +++--- target-i386/translate.c |6 ++-- target-lm32/translate.c |9 +++--- target-m68k/translate.c |6 ++-- target-microblaze/translate.c |9 +++--- target-mips/translate.c |6 ++-- target-openrisc/translate.c |9 +++--- target-ppc/translate.c|6 ++-- target-s390x/translate.c |6 ++-- target-sh4/translate.c|6 ++-- target-sparc/translate.c |6 ++-- target-unicore32/translate.c |6 ++-- target-xtensa/translate.c |4 +-- tcg/optimize.c| 62 - tcg/tcg.c | 30 ++-- 17 files changed, 98 insertions(+), 94 deletions(-) diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 751d457..7454ebd 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, int max_insns; pc_start = tb-pc; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_cur_ctx-gen_opc_buf + OPC_MAX_SIZE; ctx.tb = tb; ctx.env = env; @@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, } } if (search_pc) { -j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - tcg_cur_ctx-gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env, gen_icount_end(tb, num_insns); *tcg_cur_ctx-gen_opc_ptr = INDEX_op_end; if (search_pc) { -j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - tcg_cur_ctx-gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-arm/translate.c b/target-arm/translate.c index 41b1671..3c82a0d 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9718,7 +9718,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, dc-tb = tb; -gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; +gen_opc_end = tcg_cur_ctx-gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = 
DISAS_NEXT; dc-pc = pc_start; @@ -9825,7 +9825,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env, } } if (search_pc) { -j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - tcg_cur_ctx-gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -9965,7 +9965,7 @@ done_generating: } #endif if (search_pc) { -j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; +j = tcg_cur_ctx-gen_opc_ptr - tcg_cur_ctx-gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; diff --git a/target-cris/translate.c b/target-cris/translate.c index 9bec8b5..587df16 100644 --- a/target-cris/translate.c +++ b/target-cris/translate.c @@ -3202,7 +3202,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, dc-env = env; dc-tb = tb; - gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; + gen_opc_end = tcg_cur_ctx-gen_opc_buf + OPC_MAX_SIZE; dc-is_jmp = DISAS_NEXT; dc-ppc = pc_start; @@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, check_breakpoint(env, dc); if (search_pc) { - j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; + j = tcg_cur_ctx-gen_opc_ptr - tcg_cur_ctx-gen_opc_buf; if (lj j) { lj++; while (lj j) @@ -3401,7 +3401,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, gen_icount_end(tb, num_insns); *tcg_cur_ctx-gen_opc_ptr = INDEX_op_end; if (search_pc) { - j = tcg_cur_ctx-gen_opc_ptr - gen_opc_buf; + j = tcg_cur_ctx-gen_opc_ptr - tcg_cur_ctx-gen_opc_buf; lj++; while (lj = j) gen_opc_instr_start[lj++] = 0; @@ -3416,7 +3416,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb, log_target_disas(pc_start, dc-pc - pc_start, dc-env-pregs[PR_VR]); qemu_log(\nisize=%d osize=%td\n, - dc-pc - pc_start, tcg_cur_ctx-gen_opc_ptr - gen_opc_buf); + dc-pc - pc_start, tcg_cur_ctx-gen_opc_ptr - + tcg_cur_ctx-gen_opc_buf); } #endif #endif diff --git a/target-i386
[Qemu-devel] [PATCH v2 6/7] TCG: Use gen_opparam_buf from context instead of global variable.
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index c4e663b..f332463 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
     s->gen_opc_ptr = s->gen_opc_buf;
-    s->gen_opparam_ptr = gen_opparam_buf;
+    s->gen_opparam_ptr = s->gen_opparam_buf;
 }
 
 static inline void tcg_temp_alloc(TCGContext *s, int n)
@@ -885,7 +885,7 @@ void tcg_dump_ops(TCGContext *s)
 
     first_insn = 1;
     opc_ptr = s->gen_opc_buf;
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     while (opc_ptr < s->gen_opc_ptr) {
         c = *opc_ptr++;
         def = &tcg_op_defs[c];
@@ -1409,8 +1409,9 @@ static void tcg_liveness_analysis(TCGContext *s)
         op_index--;
     }
 
-    if (args != gen_opparam_buf)
+    if (args != s->gen_opparam_buf) {
         tcg_abort();
+    }
 }
 #else
 /* dummy liveness analysis */
@@ -2104,7 +2105,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
     s->gen_opparam_ptr =
-        tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+        tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2131,7 +2132,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
     s->code_buf = gen_code_buf;
     s->code_ptr = gen_code_buf;
 
-    args = gen_opparam_buf;
+    args = s->gen_opparam_buf;
     op_index = 0;
 
     for(;;) {
-- 
1.7.9.5
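Several hunks in the patch above walk the opcode buffer while a separate args pointer tracks the parameter buffer, and the liveness hunk aborts if args does not end up back at the start of the buffer. The sketch below illustrates that pairing in isolation. It assumes each opcode consumes a fixed number of arguments; the opcode numbers, the walk_nb_args table and the walk_* function names are invented here and do not come from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes and their argument counts. */
enum { WALK_OP_END = 0, WALK_OP_MOV = 1, WALK_OP_ADD = 2 };
static const int walk_nb_args[] = { 0, 2, 3 };

/* Forward walk: total number of arguments the opcode stream consumes. */
static size_t walk_count_args(const uint16_t *opc_buf, const uint16_t *opc_end)
{
    size_t total = 0;
    const uint16_t *p;

    for (p = opc_buf; p < opc_end; p++) {
        assert(*p <= WALK_OP_ADD);
        total += (size_t)walk_nb_args[*p];
    }
    return total;
}

/* Backward walk: step back over each op's arguments and report whether
 * we land exactly on the start of the argument buffer. */
static int walk_backward_ok(const uint16_t *opc_buf, const uint16_t *opc_end,
                            size_t nb_args_total)
{
    size_t remaining = nb_args_total;
    const uint16_t *p = opc_end;

    while (p > opc_buf) {
        size_t n;

        p--;
        assert(*p <= WALK_OP_ADD);
        n = (size_t)walk_nb_args[*p];
        if (n > remaining) {
            return 0; /* would step before the start of the buffer */
        }
        remaining -= n;
    }
    return remaining == 0;
}

/* Usage example: a MOV (2 args) followed by an ADD (3 args). */
static int walk_demo(void)
{
    const uint16_t ops[] = { WALK_OP_MOV, WALK_OP_ADD };

    return walk_count_args(ops, ops + 2) == 5 &&
           walk_backward_ok(ops, ops + 2, 5);
}
```

The backward check uses a remaining-count instead of raw pointer subtraction so that a mismatched stream is reported rather than producing an out-of-bounds pointer.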
[Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up
This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on ARM.
I didn't notice any slow-down of kernel boot after this set was applied.

Changelog:
v1->v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we
don't have an interface to pass pointer to tcg_ctx.
Code style clean-up

Evgeny (2):
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Remove unused global variables

Evgeny Voevodin (5):
  translate-all.c: Introduce TCGContext *tcg_cur_ctx
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.

 gen-icount.h                  |   2 +-
 target-alpha/translate.c      |  10 +-
 target-arm/translate.c        |  10 +-
 target-cris/translate.c       |  13 +-
 target-i386/translate.c       |  10 +-
 target-lm32/translate.c       |  13 +-
 target-m68k/translate.c       |  10 +-
 target-microblaze/translate.c |  13 +-
 target-mips/translate.c       |  11 +-
 target-openrisc/translate.c   |  13 +-
 target-ppc/translate.c        |  11 +-
 target-s390x/translate.c      |  11 +-
 target-sh4/translate.c        |  10 +-
 target-sparc/translate.c      |  10 +-
 target-unicore32/translate.c  |  10 +-
 target-xtensa/translate.c     |   8 +-
 tcg/optimize.c                |  62
 tcg/tcg-op.h                  | 324 -
 tcg/tcg.c                     |  85 ++-
 tcg/tcg.h                     |  11 +-
 translate-all.c               |   4 +-
 21 files changed, 328 insertions(+), 323 deletions(-)

-- 
1.7.9.5
[Qemu-devel] [PATCH v2 2/7] translate-all.c: Introduce TCGContext *tcg_cur_ctx
We will use this pointer from functions where we don't have an
interface to pass tcg_ctx as a parameter.

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 tcg/tcg.h       |    1 +
 translate-all.c |    1 +
 2 files changed, 2 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 43b4317..d326b36 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -431,6 +431,7 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
+extern TCGContext *tcg_cur_ctx;
 extern uint16_t *gen_opc_ptr;
 extern TCGArg *gen_opparam_ptr;
 extern uint16_t gen_opc_buf[];
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..ccdcddf 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -32,6 +32,7 @@
 
 /* code generation context */
 TCGContext tcg_ctx;
+TCGContext *tcg_cur_ctx = &tcg_ctx;
 
 uint16_t gen_opc_buf[OPC_BUF_SIZE];
 TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-- 
1.7.9.5
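The idea in this patch, a global pointer that helper functions can use when they have no context parameter, can be sketched in a few lines of standalone C. The names CurCtx, the_ctx, cur_ctx and emit_op0 are placeholders chosen for this sketch; they only stand in for TCGContext, tcg_ctx, tcg_cur_ctx and the tcg-op.h helpers.

```c
#include <assert.h>
#include <stdint.h>

#define CUR_OPC_BUF_SIZE 16

typedef struct CurCtx {
    uint16_t opc_buf[CUR_OPC_BUF_SIZE];
    uint16_t *opc_ptr;
} CurCtx;

static CurCtx the_ctx;              /* stands in for tcg_ctx */
static CurCtx *cur_ctx = &the_ctx;  /* stands in for tcg_cur_ctx */

/* Like the tcg-op.h helpers, this function has no context parameter,
 * so it reaches the buffers through the global pointer. */
static void emit_op0(uint16_t opc)
{
    assert(cur_ctx->opc_ptr < cur_ctx->opc_buf + CUR_OPC_BUF_SIZE);
    *cur_ctx->opc_ptr++ = opc;
}

/* Usage example: reset the write pointer, emit two ops, count them. */
static int emit_demo(void)
{
    the_ctx.opc_ptr = the_ctx.opc_buf;
    emit_op0(7);
    emit_op0(9);
    return (int)(cur_ctx->opc_ptr - the_ctx.opc_buf);
}
```

Pointing cur_ctx at a different CurCtx would redirect every emit_op0 call without changing its signature, which is presumably the appeal of the pointer over naming the global structure directly.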
Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up
On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:
> This set of patches moves global variables to tcg_ctx:
> gen_opc_ptr
> gen_opparam_ptr
> gen_opc_buf
> gen_opparam_buf
>
> Build tested for all targets.
> Execution tested on ARM.
> I didn't notice any slow-down of kernel boot after this set was applied.
>
> Changelog:
> v1->v2:
> Introduced TCGContext *tcg_cur_ctx global to use in those places where we
> don't have an interface to pass pointer to tcg_ctx.
> Code style clean-up
>
> Evgeny (2):
>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>   TCG: Remove unused global variables

It seems that I cherry-picked commits that were made before I set my user
name correctly. I hope I don't need to generate a v3 because of that.

> Evgeny Voevodin (5):
>   translate-all.c: Introduce TCGContext *tcg_cur_ctx
>   TCG: Use gen_opc_ptr from context instead of global variable.
>   TCG: Use gen_opparam_ptr from context instead of global variable.
>   TCG: Use gen_opc_buf from context instead of global variable.
>   TCG: Use gen_opparam_buf from context instead of global variable.
>
>  gen-icount.h                  |   2 +-
>  target-alpha/translate.c      |  10 +-
>  target-arm/translate.c        |  10 +-
>  target-cris/translate.c       |  13 +-
>  target-i386/translate.c       |  10 +-
>  target-lm32/translate.c       |  13 +-
>  target-m68k/translate.c       |  10 +-
>  target-microblaze/translate.c |  13 +-
>  target-mips/translate.c       |  11 +-
>  target-openrisc/translate.c   |  13 +-
>  target-ppc/translate.c        |  11 +-
>  target-s390x/translate.c      |  11 +-
>  target-sh4/translate.c        |  10 +-
>  target-sparc/translate.c      |  10 +-
>  target-unicore32/translate.c  |  10 +-
>  target-xtensa/translate.c     |   8 +-
>  tcg/optimize.c                |  62
>  tcg/tcg-op.h                  | 324 -
>  tcg/tcg.c                     |  85 ++-
>  tcg/tcg.h                     |  11 +-
>  translate-all.c               |   4 +-
>  21 files changed, 328 insertions(+), 323 deletions(-)

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH v2 2/7] translate-all.c: Introduce TCGContext *tcg_cur_ctx
On 10/24/2012 01:18 AM, Richard Henderson wrote:
> On 2012-10-23 16:21, Evgeny Voevodin wrote:
>> We will use this pointer from functions where we don't have an
>> interface to pass tcg_ctx as a parameter.
>
> I don't think this is worthwhile. It'll just make the whole
> thing slower, passing around unnecessary pointers.
>
> r~

1. I didn't notice any slow-down of the kernel boot process. Maybe it's worth
running more tests with self-modifying code, but I don't think so, because of
point 2.

2. The most intensive usage of tcg_cur_ctx is in the tcg/tcg-op.h functions.
If we look at them carefully, we will see that there are only a few functions
for which a single extra pointer dereference could lead to any significant
slow-down: those that perform just one or two operations and return. We should
also keep in mind that the pointer is dereferenced only once, since it is then
kept in a register for further operations.

Of course some slow-down should be present, but I found it negligible
(actually, I didn't find it at all).
If there are some common tests for TCG generation speed, I can try to run
them and report the results.

We could also declare tcg_cur_ctx as const; in that case I guess that
dereferencing tcg_cur_ctx should not lead to any slow-down.

Alternatively, I can drop tcg_cur_ctx and use tcg_ctx.xxx instead, as in the
first series.

What about the rest of the patches?

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
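The three access styles discussed in this reply (direct use of the global structure, a plain global pointer, and a const-qualified pointer) can be put side by side in a small sketch. All names below are invented for illustration. The sketch only shows that the three forms are functionally equivalent; it measures nothing, and whether a compiler actually folds the const pointer into a direct access depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stdint.h>

typedef struct DerefCtx {
    uint16_t buf[16];
    uint16_t *ptr;
} DerefCtx;

static DerefCtx g_ctx;                             /* like tcg_ctx */
static DerefCtx *g_ctx_ptr = &g_ctx;               /* may be re-pointed at run time */
static DerefCtx *const g_ctx_const_ptr = &g_ctx;   /* target fixed at definition */

/* Direct access: the address of g_ctx is known at link time. */
static void emit_direct(uint16_t opc)
{
    *g_ctx.ptr++ = opc;
}

/* Through a mutable pointer: the pointer value must be loaded first. */
static void emit_via_ptr(uint16_t opc)
{
    *g_ctx_ptr->ptr++ = opc;
}

/* Through a const pointer: the compiler is allowed to treat the target
 * as known, although nothing obliges it to. */
static void emit_via_const_ptr(uint16_t opc)
{
    *g_ctx_const_ptr->ptr++ = opc;
}

/* Usage example: all three variants append to the same buffer. */
static int deref_demo(void)
{
    assert(g_ctx_ptr == g_ctx_const_ptr);
    g_ctx.ptr = g_ctx.buf;
    emit_direct(1);
    emit_via_ptr(2);
    emit_via_const_ptr(3);
    return (int)(g_ctx.ptr - g_ctx.buf);
}
```

Comparing the generated assembly of the three emit functions for a given compiler would be one way to check the "single extra dereference" argument made above.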
Re: [Qemu-devel] [PATCH 0/6] *** TCG global variables clean-up ***
On 10/19/2012 09:55 PM, Blue Swirl wrote:
> On Fri, Oct 19, 2012 at 12:42 PM, Evgeny <e.voevo...@samsung.com> wrote:
>> This set of patches moves global variables to tcg_ctx:
>> gen_opc_ptr
>> gen_opparam_ptr
>> gen_opc_buf
>> gen_opparam_buf
>>
>> Where it was possible I used s->...
>> Where we don't have an interface to pass a pointer to tcg_ctx, I used
>> tcg_ctx.xxx since it is a global variable too.
>
> Maybe a pointer should be added so that the references become
> tcg_ctx_ptr->xxx. This would incur unnecessary pointer dereference
> penalties though.

I thought about using TCGContext *tcg_cur_ctx, but I decided that we don't
need it until we want to use multiple TCG contexts.
If we would like to introduce such a pointer, then I'd prefer to hold it in
CPUArchState inside CPU_COMMON and initialize it in qemu_tcg_cpu_thread_fn
for each CPU:

    env->tcg_cur_ctx = &tcg_ctx;

>> Build tested for all targets.
>> Execution tested on ARM.
>> I didn't notice any slow-down of kernel boot after this set was applied.
>>
>> Evgeny (6):
>>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>>   TCG: Use gen_opc_ptr from context instead of global variable.
>>   TCG: Use gen_opparam_ptr from context instead of global variable.
>>   TCG: Use gen_opc_buf from context instead of global variable.
>>   TCG: Use gen_opparam_buf from context instead of global variable.
>>   TCG: Remove unused global variables
>>
>>  gen-icount.h                  |   2 +-
>>  target-alpha/translate.c      |  10 +-
>>  target-arm/translate.c        |  12 +-
>>  target-cris/translate.c       |  12 +-
>>  target-i386/translate.c       |  12 +-
>>  target-lm32/translate.c       |  12 +-
>>  target-m68k/translate.c       |  10 +-
>>  target-microblaze/translate.c |  12 +-
>>  target-mips/translate.c       |  10 +-
>>  target-openrisc/translate.c   |  12 +-
>>  target-ppc/translate.c        |  10 +-
>>  target-s390x/translate.c      |  10 +-
>>  target-sh4/translate.c        |  10 +-
>>  target-sparc/translate.c      |  10 +-
>>  target-unicore32/translate.c  |  12 +-
>>  target-xtensa/translate.c     |   8 +-
>>  tcg/optimize.c                |  62
>>  tcg/tcg-op.h                  | 324 -
>>  tcg/tcg.c                     |  84 +--
>>  tcg/tcg.h                     |  10 +-
>>  translate-all.c               |   3 -
>>  21 files changed, 321 insertions(+), 326 deletions(-)
>>
>> --
>> 1.7.9.5

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
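The alternative mentioned in this reply, keeping the context pointer in the per-CPU state and setting it when each CPU thread starts, might look roughly like the sketch below. This is a guess at the shape of the idea, not QEMU code: PerCpuCtx, SketchCPUState and the sketch_* functions are invented names, and the real CPUArchState and qemu_tcg_cpu_thread_fn are far more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the code generation context. */
typedef struct PerCpuCtx {
    int nb_ops;
} PerCpuCtx;

/* Stand-in for the per-CPU state that would carry the pointer. */
typedef struct SketchCPUState {
    int cpu_index;
    PerCpuCtx *cur_ctx;
} SketchCPUState;

/* A single shared context, as in the mail: every CPU points at it. */
static PerCpuCtx shared_ctx;

/* What a per-CPU thread start-up hook could do. */
static void sketch_cpu_thread_init(SketchCPUState *env, int index)
{
    env->cpu_index = index;
    env->cur_ctx = &shared_ctx;
}

/* Code that has the CPU state at hand reaches the context through it. */
static void sketch_record_op(SketchCPUState *env)
{
    assert(env->cur_ctx != NULL);
    env->cur_ctx->nb_ops++;
}

/* Usage example: two CPUs sharing one context. */
static int per_cpu_demo(void)
{
    SketchCPUState cpu0, cpu1;

    shared_ctx.nb_ops = 0;
    sketch_cpu_thread_init(&cpu0, 0);
    sketch_cpu_thread_init(&cpu1, 1);
    sketch_record_op(&cpu0);
    sketch_record_op(&cpu1);
    return shared_ctx.nb_ops;
}
```

Giving each CPU its own PerCpuCtx would only require changing the init function, but code without access to the CPU state would still need another route to the context.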
Re: [Qemu-devel] [PATCH 5/6] TCG: Use gen_opparam_buf from context instead of global variable.
On 10/19/2012 09:53 PM, Blue Swirl wrote:
> On Fri, Oct 19, 2012 at 12:42 PM, Evgeny <e.voevo...@samsung.com> wrote:
>> Signed-off-by: Evgeny <e.voevo...@samsung.com>
>> ---
>>  tcg/tcg.c |   10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/tcg/tcg.c b/tcg/tcg.c
>> index 3da1d83..77b15a0 100644
>> --- a/tcg/tcg.c
>> +++ b/tcg/tcg.c
>> @@ -302,7 +302,7 @@ void tcg_func_start(TCGContext *s)
>>  #endif
>>
>>      s->gen_opc_ptr = s->gen_opc_buf;
>> -    s->gen_opparam_ptr = gen_opparam_buf;
>> +    s->gen_opparam_ptr = s->gen_opparam_buf;
>>  }
>>
>>  static inline void tcg_temp_alloc(TCGContext *s, int n)
>> @@ -889,7 +889,7 @@ void tcg_dump_ops(TCGContext *s)
>>
>>      first_insn = 1;
>>      opc_ptr = s->gen_opc_buf;
>> -    args = gen_opparam_buf;
>> +    args = s->gen_opparam_buf;
>>      while (opc_ptr < s->gen_opc_ptr) {
>>          c = *opc_ptr++;
>>          def = &tcg_op_defs[c];
>> @@ -1413,7 +1413,7 @@ static void tcg_liveness_analysis(TCGContext *s)
>>          op_index--;
>>      }
>>
>> -    if (args != gen_opparam_buf)
>> +    if (args != s->gen_opparam_buf)
>
> Please add braces.

Ok.
Maybe I should introduce a little code clean-up in the scope of my patches?
I mean, remove tabs and so on...
Then maybe it would be better as a separate patch?

>>          tcg_abort();
>>  }
>>  #else
>> @@ -2108,7 +2108,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
>>
>>  #ifdef USE_TCG_OPTIMIZATIONS
>>      s->gen_opparam_ptr =
>> -        tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
>> +        tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
>>  #endif
>>
>>  #ifdef CONFIG_PROFILER
>> @@ -2135,7 +2135,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
>>      s->code_buf = gen_code_buf;
>>      s->code_ptr = gen_code_buf;
>>
>> -    args = gen_opparam_buf;
>> +    args = s->gen_opparam_buf;
>>      op_index = 0;
>>
>>      for(;;) {
>> --
>> 1.7.9.5

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH 0/6] *** TCG global variables clean-up ***
On 10/22/2012 07:40 AM, Evgeny Voevodin wrote:
> On 10/19/2012 09:55 PM, Blue Swirl wrote:
>> On Fri, Oct 19, 2012 at 12:42 PM, Evgeny <e.voevo...@samsung.com> wrote:
>>> This set of patches moves global variables to tcg_ctx:
>>> gen_opc_ptr
>>> gen_opparam_ptr
>>> gen_opc_buf
>>> gen_opparam_buf
>>>
>>> Where it was possible I used s->...
>>> Where we don't have an interface to pass a pointer to tcg_ctx, I used
>>> tcg_ctx.xxx since it is a global variable too.
>>
>> Maybe a pointer should be added so that the references become
>> tcg_ctx_ptr->xxx. This would incur unnecessary pointer dereference
>> penalties though.
>
> I thought about usage of TCGContext * tcg_cur_ctx, but I decided that we
> don't need this until we want to use multiple TCG contexts.
> If we would like to introduce such a pointer, then I'd prefer to hold it in
> CPUArchState inside CPU_COMMON and initialize it in qemu_tcg_cpu_thread_fn
> for each CPU:
>
>     env->tcg_cur_ctx = &tcg_ctx;

Oh, I just found out that we don't have access to CPUArchState from
tcg-op.h, so we can't get rid of the global variable here until we change
the interfaces of all the functions...

>>> Build tested for all targets.
>>> Execution tested on ARM.
>>> I didn't notice any slow-down of kernel boot after this set was applied.
>>>
>>> Evgeny (6):
>>>   tcg/tcg.h: Duplicate global TCG variables in TCGContext
>>>   TCG: Use gen_opc_ptr from context instead of global variable.
>>>   TCG: Use gen_opparam_ptr from context instead of global variable.
>>>   TCG: Use gen_opc_buf from context instead of global variable.
>>>   TCG: Use gen_opparam_buf from context instead of global variable.
>>>   TCG: Remove unused global variables
>>>
>>>  gen-icount.h                  |   2 +-
>>>  target-alpha/translate.c      |  10 +-
>>>  target-arm/translate.c        |  12 +-
>>>  target-cris/translate.c       |  12 +-
>>>  target-i386/translate.c       |  12 +-
>>>  target-lm32/translate.c       |  12 +-
>>>  target-m68k/translate.c       |  10 +-
>>>  target-microblaze/translate.c |  12 +-
>>>  target-mips/translate.c       |  10 +-
>>>  target-openrisc/translate.c   |  12 +-
>>>  target-ppc/translate.c        |  10 +-
>>>  target-s390x/translate.c      |  10 +-
>>>  target-sh4/translate.c        |  10 +-
>>>  target-sparc/translate.c      |  10 +-
>>>  target-unicore32/translate.c  |  12 +-
>>>  target-xtensa/translate.c     |   8 +-
>>>  tcg/optimize.c                |  62
>>>  tcg/tcg-op.h                  | 324 -
>>>  tcg/tcg.c                     |  84 +--
>>>  tcg/tcg.h                     |  10 +-
>>>  translate-all.c               |   3 -
>>>  21 files changed, 321 insertions(+), 326 deletions(-)
>>>
>>> --
>>> 1.7.9.5

-- 
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] Building QEMU with multiple CPU targets.
On 10/08/2012 02:54 PM, Peter Maydell wrote:
> On 8 October 2012 07:39, Peter Crosthwaite peter.crosthwa...@petalogix.com wrote:
>> I'm currently investigating the possibility of building QEMU with
>> multiple CPU architectures active concurrently. That is, I have a
>> binary with both a target-arm and a target-microblaze and wish to run
>> them as a heterogeneous multiprocessor platform. Given the recent QOM
>> development in making CPUs just another object, shouldn't it be
>> possible with a bit of Makefile and configure rework to build
>> qemu-system-arm+microblaze and then create machine models
>> instantiating both CPU types? Are the major complications here from
>> either a Make or QOM perspective?
>
> I certainly think this would be a nice feature to have, but I
> suspect the makefile/QOM bits are probably the easy parts :-)
> At the moment things like the translated code cache are basically
> globals and would need to be moved to be per-CPU. Also there are
> still various settings that are compile time which would need to
> become runtime (though we just got rid of the 'size of physical
> address type' one at least).

Did anybody start this work? I'm interested in localization of TCG per CPU.

> -- PMM

--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH] hw/arm_gic.c: Fix improper DPRINTF output.
On 10/02/2012 07:50 AM, Evgeny Voevodin wrote:
> s->cpu_enabled is an array, so s->cpu_enabled ? "En" : "Dis" returns "En"
> always. We should use s->cpu_enabled[cpu] here.
>
> Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
> ---
>  hw/arm_gic.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/arm_gic.c b/hw/arm_gic.c
> index 55871fa..4024dae 100644
> --- a/hw/arm_gic.c
> +++ b/hw/arm_gic.c
> @@ -566,7 +566,7 @@ static void gic_cpu_write(gic_state *s, int cpu, int offset, uint32_t value)
>      switch (offset) {
>      case 0x00: /* Control */
>          s->cpu_enabled[cpu] = (value & 1);
> -        DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled ? "En" : "Dis");
> +        DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled[cpu] ? "En" : "Dis");
>          break;
>      case 0x04: /* Priority mask */
>          s->priority_mask[cpu] = (value & 0xff);

Did anybody pick this up?

--
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH] hw/arm_gic.c: Fix improper DPRINTF output.
s->cpu_enabled is an array, so s->cpu_enabled ? "En" : "Dis" returns "En"
always. We should use s->cpu_enabled[cpu] here.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/arm_gic.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm_gic.c b/hw/arm_gic.c
index 55871fa..4024dae 100644
--- a/hw/arm_gic.c
+++ b/hw/arm_gic.c
@@ -566,7 +566,7 @@ static void gic_cpu_write(gic_state *s, int cpu, int offset, uint32_t value)
     switch (offset) {
     case 0x00: /* Control */
         s->cpu_enabled[cpu] = (value & 1);
-        DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled ? "En" : "Dis");
+        DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled[cpu] ? "En" : "Dis");
         break;
     case 0x04: /* Priority mask */
         s->priority_mask[cpu] = (value & 0xff);
--
1.7.9.5
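Why the old DPRINTF always printed "En" can be reproduced outside QEMU with a small sketch. The demo_gic_state type and the prefix_* helpers are hypothetical; only the field name cpu_enabled is taken from the patch. The buggy helper spells out the array-to-pointer conversion that the compiler applies implicitly when an array is used as a condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_NCPU 4

/* Hypothetical stand-in for the GIC state; only the field from the patch. */
typedef struct {
    bool cpu_enabled[DEMO_NCPU];
} demo_gic_state;

/* Old behaviour: the array converts to a pointer to its first element,
 * and that pointer is never NULL, so the result is always "En". */
static const char *prefix_buggy(const demo_gic_state *s)
{
    const bool *as_pointer = s->cpu_enabled;   /* implicit conversion made visible */
    return as_pointer ? "En" : "Dis";
}

/* Fixed behaviour: test the element that belongs to this CPU. */
static const char *prefix_fixed(const demo_gic_state *s, int cpu)
{
    assert(cpu >= 0 && cpu < DEMO_NCPU);
    return s->cpu_enabled[cpu] ? "En" : "Dis";
}
```

With every CPU disabled, prefix_buggy() still reports "En" while prefix_fixed() reports "Dis", which matches the symptom described in the commit message.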
Re: [Qemu-devel] [RFC v2 00/12] Virtio-mmio refactoring.
On 09/27/2012 09:31 PM, KONRAD Frédéric wrote:
> Hi,
>
> We actually want to add virtio models for arm, therefore these patches
> are really helpful.
>
> We will try it, start looking at the issues.
>
> Any feedback ?

Ok. Feel free.

> On 17/09/2012 12:00, Evgeny Voevodin wrote:
>> The previous RFC can be found at
>> http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03665.html
>> Yes, long time ago...
>> Since I'm not sure when I'll be able to continue on this, I'm
>> publishing this work as is.
>> In this patchset I tried to split virtio-xxx-pci devices into
>> virtio-pci + virtio-xxx (blk, net, serial,...). Also a virtio-mmio
>> transport is introduced, based on Peter's work which is accessible here:
>> http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01870.html
>>
>> The main idea was to let users specify
>>   -device virtio-pci,id=virtio-pci.0
>>   -device virtio-blk,transport=virtio-pci.0,...
>> and
>>   -device virtio-mmio,id=virtio-mmio.0
>>   -device virtio-blk,transport=virtio-mmio.0,...
>>
>> I created virtio-pci and virtio-mmio transport devices and tried to
>> enclose back-end functionality into virtio-blk, virtio-net, etc. On
>> initialization, a transport device creates a bus to which a back-end
>> device can be connected. Each back-end device is implemented in the
>> corresponding source file. As for the PCI transport, I temporarily
>> placed it in a new virtio-pci-new.c file so as not to break the
>> still-present virtio-xxx-pci devices.
>>
>> Known issues to be resolved:
>> 1. On creation of a back-end we need to determine somehow whether
>>    props were explicitly set by the user.
>> 2. A back-end device can't be initialized if there is no free bus
>>    created by a transport, so you can't specify
>>      -device virtio-blk,transport=virtio-pci.0,...
>>      -device virtio-pci,id=virtio-pci.0
>> 3. Implement virtio-xxx-devices such that they just create virtio-pci
>>    and virtio-xxx devices during initialization.
>> 4. Refactor all remaining back-ends since I just tried blk, net,
>>    serial and balloon.
>> 5. Refactor s390
>> 6. Further?
>>
>> Evgeny Voevodin (9):
>>   Virtio: Add transport bindings.
>>   hw/qdev-properties.c: Add transport property.
>>   hw/pci.c: Make pci_add_option_rom global visible
>>   hw/virtio-serial-bus.c: Add virtio-serial device.
>>   hw/virtio-balloon.c: Add virtio-balloon device.
>>   hw/virtio-net.c: Add virtio-net device.
>>   hw/virtio-blk.c: Add virtio-blk device.
>>   hw/virtio-pci-new.c: Add VirtIOPCI device.
>>   hw/exynos4210.c: Create two virtio-mmio transport instances.
>>
>> Peter Maydell (3):
>>   virtio: Add support for guest setting of queue size
>>   virtio: Support transports which can specify the vring alignment
>>   Add MMIO based virtio transport
>>
>>  hw/Makefile.objs       |   3 +
>>  hw/exynos4210.c        |  13 +
>>  hw/pci.c               |   3 +-
>>  hw/pci.h               |   2 +
>>  hw/qdev-properties.c   |  29 ++
>>  hw/qdev.h              |   3 +
>>  hw/virtio-balloon.c    |  42 +++
>>  hw/virtio-balloon.h    |   9 +
>>  hw/virtio-blk.c        |  65
>>  hw/virtio-blk.h        |  15 +
>>  hw/virtio-mmio.c       | 400 +
>>  hw/virtio-net.c        |  59 +++
>>  hw/virtio-net.h        |  16 +
>>  hw/virtio-pci-new.c    | 925
>>  hw/virtio-pci.h        |  18 +
>>  hw/virtio-serial-bus.c |  44 +++
>>  hw/virtio-serial.h     |  11 +
>>  hw/virtio-transport.c  | 147
>>  hw/virtio-transport.h  |  74
>>  hw/virtio.c            |  20 +-
>>  hw/virtio.h            |   2 +
>>  21 files changed, 1896 insertions(+), 4 deletions(-)
>>  create mode 100644 hw/virtio-mmio.c
>>  create mode 100644 hw/virtio-pci-new.c
>>  create mode 100644 hw/virtio-transport.c
>>  create mode 100644 hw/virtio-transport.h

--
Kind regards,
Evgeny Voevodin,
Technical Leader, Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com
[Qemu-devel] [RFC v2 08/12] hw/virtio-balloon.c: Add virtio-balloon device.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/virtio-balloon.c | 42 ++ hw/virtio-balloon.h |9 + 2 files changed, 51 insertions(+) diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c index dd1a650..d6fe2aa 100644 --- a/hw/virtio-balloon.c +++ b/hw/virtio-balloon.c @@ -20,6 +20,7 @@ #include cpu.h #include balloon.h #include virtio-balloon.h +#include virtio-transport.h #include kvm.h #include exec-memory.h @@ -272,3 +273,44 @@ void virtio_balloon_exit(VirtIODevice *vdev) unregister_savevm(s-qdev, virtio-balloon, s); virtio_cleanup(vdev); } + +/ VirtIOBaloon Device **/ + +static int virtio_balloondev_init(DeviceState *dev) +{ +VirtIODevice *vdev; +VirtIOBaloonState *s = VIRTIO_BALLOON_FROM_QDEV(dev); +vdev = virtio_balloon_init(dev); +if (!vdev) { +return -1; +} + +assert(s-trl != NULL); + +return virtio_call_backend_init_cb(dev, s-trl, vdev); +} + +static Property virtio_balloon_properties[] = { +DEFINE_PROP_END_OF_LIST(), +}; + +static void virtio_balloon_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +dc-init = virtio_balloondev_init; +dc-props = virtio_balloon_properties; +} + +static TypeInfo virtio_balloon_info = { +.name = virtio-balloon, +.parent = TYPE_DEVICE, +.instance_size = sizeof(VirtIOBaloonState), +.class_init = virtio_balloon_class_init, +}; + +static void virtio_baloon_register_types(void) +{ +type_register_static(virtio_balloon_info); +} + +type_init(virtio_baloon_register_types) diff --git a/hw/virtio-balloon.h b/hw/virtio-balloon.h index 73300dd..b925186 100644 --- a/hw/virtio-balloon.h +++ b/hw/virtio-balloon.h @@ -15,8 +15,10 @@ #ifndef _QEMU_VIRTIO_BALLOON_H #define _QEMU_VIRTIO_BALLOON_H +#include sysbus.h #include virtio.h #include pci.h +#include virtio-transport.h /* from Linux's linux/virtio_balloon.h */ @@ -52,4 +54,11 @@ typedef struct VirtIOBalloonStat { uint64_t val; } QEMU_PACKED VirtIOBalloonStat; +typedef struct { +DeviceState qdev; +VirtIOTransportLink *trl; +} 
VirtIOBaloonState; + +#define VIRTIO_BALLOON_FROM_QDEV(dev) DO_UPCAST(VirtIOBaloonState, qdev, dev) + #endif -- 1.7.9.5
[Qemu-devel] [RFC v2 01/12] virtio: Add support for guest setting of queue size
From: Peter Maydell peter.mayd...@linaro.org

The MMIO virtio transport spec allows the guest to tell the host how
large the queue size is. Add virtio_queue_set_num() function which
implements this in the QEMU common virtio support code.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio.c |6 ++
 hw/virtio.h |1 +
 2 files changed, 7 insertions(+)

diff --git a/hw/virtio.c b/hw/virtio.c
index 209c763..5334326 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -628,6 +628,12 @@ target_phys_addr_t virtio_queue_get_addr(VirtIODevice *vdev, int n)
     return vdev->vq[n].pa;
 }
 
+void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
+{
+    vdev->vq[n].vring.num = num;
+    virtqueue_init(&vdev->vq[n]);
+}
+
 int virtio_queue_get_num(VirtIODevice *vdev, int n)
 {
     return vdev->vq[n].vring.num;
diff --git a/hw/virtio.h b/hw/virtio.h
index 7a4f564..eb9953f 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -177,6 +177,7 @@ void virtio_config_writew(VirtIODevice *vdev, uint32_t addr, uint32_t data);
 void virtio_config_writel(VirtIODevice *vdev, uint32_t addr, uint32_t data);
 void virtio_queue_set_addr(VirtIODevice *vdev, int n, target_phys_addr_t addr);
 target_phys_addr_t virtio_queue_get_addr(VirtIODevice *vdev, int n);
+void virtio_queue_set_num(VirtIODevice *vdev, int n, int num);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
 void virtio_queue_notify(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
--
1.7.9.5
[Qemu-devel] [RFC v2 03/12] Virtio: Add transport bindings.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/Makefile.objs |1 + hw/virtio-transport.c | 147 + hw/virtio-transport.h | 74 + 3 files changed, 222 insertions(+) create mode 100644 hw/virtio-transport.c create mode 100644 hw/virtio-transport.h diff --git a/hw/Makefile.objs b/hw/Makefile.objs index 6dfebd2..db4c16d 100644 --- a/hw/Makefile.objs +++ b/hw/Makefile.objs @@ -2,6 +2,7 @@ hw-obj-y = usb/ ide/ hw-obj-y += loader.o hw-obj-$(CONFIG_VIRTIO) += virtio-console.o hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o +hw-obj-$(CONFIG_VIRTIO) += virtio-transport.o hw-obj-y += fw_cfg.o hw-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o hw-obj-$(CONFIG_PCI) += msix.o msi.o diff --git a/hw/virtio-transport.c b/hw/virtio-transport.c new file mode 100644 index 000..76360ba --- /dev/null +++ b/hw/virtio-transport.c @@ -0,0 +1,147 @@ +/* + * Virtio transport bindings + * + * Copyright (c) 2011 - 2012 Samsung Electronics Co., Ltd. + * + * Author: + * Evgeny Voevodin e.voevo...@samsung.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, see http://www.gnu.org/licenses/. + */ + +#include virtio-transport.h + +#define VIRTIO_TRANSPORT_BUS virtio-transport + +static QTAILQ_HEAD(, VirtIOTransportLink) transport_links = +QTAILQ_HEAD_INITIALIZER(transport_links); + +/* + * Find transport device by its ID. 
+ */ +VirtIOTransportLink* virtio_find_transport(const char *name) +{ +VirtIOTransportLink *trl; + +assert(name != NULL); + +QTAILQ_FOREACH(trl, transport_links, sibling) { +if (trl-tr-id != NULL) { +if (!strcmp(name, trl-tr-id)) { +return trl; +} +} +} + +return NULL; +} + +/* + * Count transport devices by ID. + */ +uint32_t virtio_count_transports(const char *name) +{ +VirtIOTransportLink *trl; +uint32_t i = 0; + +QTAILQ_FOREACH(trl, transport_links, sibling) { +if (name == NULL) { +i++; +continue; +} + +if (trl-tr-id != NULL) { +if (!strncmp(name, trl-tr-id,strlen(name))) { +i++; +} +} +} +return i; +} + +/* + * Initialize new transport device + */ +char* virtio_init_transport(DeviceState *dev, VirtIOTransportLink **trl, +const char* name, virtio_backend_init_cb cb) +{ +VirtIOTransportLink *link = g_malloc0(sizeof(VirtIOTransportLink)); +char *buf; +size_t len; +uint32_t i; + +assert(dev != NULL); +assert(name != NULL); +assert(trl != NULL); + +i = virtio_count_transports(name); +len = strlen(name) + 16; +buf = g_malloc(len); +snprintf(buf, len, %s.%d, name, i); +qbus_create(TYPE_VIRTIO_BUS, dev, buf); + +/* Add new transport */ +QTAILQ_INSERT_TAIL(transport_links, link, sibling); +link-tr = dev; +link-cb = cb; +// TODO: Add a link property +*trl = link; +return buf; +} + +/* + * Unplug back-end from system bus and plug it into transport bus. + */ +void virtio_plug_into_transport(DeviceState *dev, VirtIOTransportLink *trl) +{ +BusChild *kid; + +/* Unplug back-end from system bus */ +QTAILQ_FOREACH(kid, qdev_get_parent_bus(dev)-children, sibling) { +if (kid-child == dev) { +QTAILQ_REMOVE(qdev_get_parent_bus(dev)-children, kid, sibling); +break; +} +} + +/* Plug back-end into transport's bus */ +qdev_set_parent_bus(dev, QLIST_FIRST(trl-tr-child_bus)); + +} + +/* + * Execute call-back on back-end initialization. + * Performs initialization of MMIO or PCI transport. 
+ */ +int virtio_call_backend_init_cb(DeviceState *dev, VirtIOTransportLink *trl, +VirtIODevice *vdev) +{ +if (trl-cb) { +return trl-cb(dev, vdev, trl); +} + +return 0; +} + +static const TypeInfo virtio_bus_info = { +.name = TYPE_VIRTIO_BUS, +.parent = TYPE_BUS, +.instance_size = sizeof(BusState), +}; + +static void virtio_register_types(void) +{ +type_register_static(virtio_bus_info); +} + +type_init(virtio_register_types) diff --git a/hw/virtio-transport.h b/hw/virtio-transport.h new file mode 100644 index 000..43200dc --- /dev/null +++ b/hw/virtio-transport.h @@ -0,0 +1,74 @@ +/* + * Virtio transport header + * + * Copyright (c) 2011 - 2012 Samsung Electronics Co., Ltd. + * + * Author: + * Evgeny Voevodin e.voevo...@samsung.com
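The naming scheme used by virtio_init_transport() above (count the transports already registered whose id starts with the given name, then name the new bus "<name>.<count>") can be tried in isolation with the sketch below. It is an approximation under stated assumptions: a fixed array stands in for the transport_links tail queue, each transport's id is passed in explicitly, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_TRANSPORTS 8

/* Hypothetical registry standing in for the transport_links tail queue. */
static const char *demo_ids[DEMO_MAX_TRANSPORTS];
static uint32_t demo_nr_ids;

/* Count registered ids that start with `name`; NULL counts every entry.
 * The prefix match mirrors the strncmp() call in the patch. */
static uint32_t demo_count_transports(const char *name)
{
    uint32_t i, n = 0;

    for (i = 0; i < demo_nr_ids; i++) {
        if (name == NULL) {
            n++;
            continue;
        }
        if (demo_ids[i] != NULL &&
            !strncmp(name, demo_ids[i], strlen(name))) {
            n++;
        }
    }
    return n;
}

/* Write "<name>.<index>" into buf, then register the transport's id. */
static void demo_init_transport(const char *name, const char *id,
                                char *buf, size_t len)
{
    assert(name != NULL && buf != NULL);
    assert(demo_nr_ids < DEMO_MAX_TRANSPORTS);

    snprintf(buf, len, "%s.%u", name, (unsigned)demo_count_transports(name));
    demo_ids[demo_nr_ids++] = id;
}
```

One property of the prefix match worth keeping in mind: a name that is itself a prefix of another transport's id (for example "virtio" against "virtio-pci.0") is counted as well, so the index reflects all ids sharing that prefix rather than exact matches only.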
[Qemu-devel] [RFC v2 02/12] virtio: Support transports which can specify the vring alignment
From: Peter Maydell peter.mayd...@linaro.org

Support virtio transports which can specify the vring alignment
(ie where the guest communicates this to the host) by providing
a new virtio_queue_set_align() function. (The default alignment
remains as before.)

FIXME save/load support for this new field!

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio.c | 14 --
 hw/virtio.h |1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index 5334326..4f47d4e 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -19,7 +19,9 @@
 #include "qemu-barrier.h"
 
 /* The alignment to use between consumer and producer parts of vring.
- * x86 pagesize again. */
+ * x86 pagesize again. This is the default, used by transports like PCI
+ * which don't provide a means for the guest to tell the host the alignment.
+ */
 #define VIRTIO_PCI_VRING_ALIGN 4096
 
 typedef struct VRingDesc
@@ -53,6 +55,7 @@ typedef struct VRingUsed
 typedef struct VRing
 {
     unsigned int num;
+    unsigned int align;
     target_phys_addr_t desc;
     target_phys_addr_t avail;
     target_phys_addr_t used;
@@ -90,7 +93,7 @@ static void virtqueue_init(VirtQueue *vq)
     vq->vring.avail = pa + vq->vring.num * sizeof(VRingDesc);
     vq->vring.used = vring_align(vq->vring.avail +
                                  offsetof(VRingAvail, ring[vq->vring.num]),
-                                 VIRTIO_PCI_VRING_ALIGN);
+                                 vq->vring.align);
 }
 
 static inline uint64_t vring_desc_addr(target_phys_addr_t desc_pa, int i)
@@ -646,6 +649,12 @@ int virtio_queue_get_id(VirtQueue *vq)
     return vq - &vdev->vq[0];
 }
 
+void virtio_queue_set_align(VirtIODevice *vdev, int n, int align)
+{
+    vdev->vq[n].vring.align = align;
+    virtqueue_init(&vdev->vq[n]);
+}
+
 void virtio_queue_notify_vq(VirtQueue *vq)
 {
     if (vq->vring.desc) {
@@ -686,6 +695,7 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
         abort();
 
     vdev->vq[i].vring.num = queue_size;
+    vdev->vq[i].vring.align = VIRTIO_PCI_VRING_ALIGN;
     vdev->vq[i].handle_output = handle_output;
 
     return &vdev->vq[i];
diff --git a/hw/virtio.h b/hw/virtio.h
index eb9953f..3f16367 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -179,6 +179,7 @@ void virtio_queue_set_addr(VirtIODevice *vdev, int n, target_phys_addr_t addr);
 target_phys_addr_t virtio_queue_get_addr(VirtIODevice *vdev, int n);
 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
+void virtio_queue_set_align(VirtIODevice *vdev, int n, int align);
 void virtio_queue_notify(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
 void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector);
--
1.7.9.5
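The effect of the new align field on virtqueue_init() can be checked with a standalone sketch of the same address arithmetic. The assumptions are labeled in the code: descriptors are taken to be 16 bytes, the available ring to have a 4-byte header followed by 2-byte entries (which is what offsetof(VRingAvail, ring[num]) evaluates to for that layout), and the alignment to be a power of two. The Demo* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes of the ring structures (see lead-in). */
#define DEMO_DESC_SIZE   16u  /* one descriptor */
#define DEMO_AVAIL_HDR    4u  /* flags + idx, 16 bits each */
#define DEMO_AVAIL_ENTRY  2u  /* one 16-bit ring entry */

typedef struct {
    uint64_t desc;   /* descriptor table */
    uint64_t avail;  /* available ring */
    uint64_t used;   /* used ring, aligned */
} DemoVRingLayout;

/* Round addr up to the next multiple of align (power of two). */
static uint64_t demo_vring_align(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Same arithmetic as virtqueue_init(), with the alignment as a parameter. */
static DemoVRingLayout demo_vring_layout(uint64_t pa, unsigned num,
                                         uint64_t align)
{
    DemoVRingLayout l;

    l.desc = pa;
    l.avail = pa + (uint64_t)num * DEMO_DESC_SIZE;
    l.used = demo_vring_align(l.avail + DEMO_AVAIL_HDR +
                              (uint64_t)num * DEMO_AVAIL_ENTRY, align);
    return l;
}
```

For a 256-entry queue at 0x10000, the default 4096-byte alignment places the used ring at 0x12000, while a guest-selected alignment of 64 would place it at 0x11240. This is why virtio_queue_set_align() has to re-run virtqueue_init() after changing the field.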
[Qemu-devel] [RFC v2 06/12] Add MMIO based virtio transport
From: Peter Maydell peter.mayd...@linaro.org

Add support for the generic MMIO based virtio transport.

This patch is a modified version of a patch by
Peter Maydell peter.mayd...@linaro.org.
The changes add a virtio-mmio bridge device which provides a virtio-mmio
bus. A virtio-mmio-transport device is connected to this bus and in turn
provides a virtio-transport bus, to which virtio backends can then be
connected.

Also this patch includes some fixes for bugs spotted by
Ying-Shiuan Pan ys...@itri.org.tw.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com

Conflicts:

	Makefile.objs
---
 hw/Makefile.objs |1 +
 hw/virtio-mmio.c | 400 ++
 2 files changed, 401 insertions(+)
 create mode 100644 hw/virtio-mmio.c

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index db4c16d..0c112af 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -3,6 +3,7 @@ hw-obj-y += loader.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-transport.o
+hw-obj-$(CONFIG_VIRTIO) += virtio-mmio.o
 hw-obj-y += fw_cfg.o
 hw-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o
 hw-obj-$(CONFIG_PCI) += msix.o msi.o
diff --git a/hw/virtio-mmio.c b/hw/virtio-mmio.c
new file mode 100644
index 000..88e5d9f
--- /dev/null
+++ b/hw/virtio-mmio.c
@@ -0,0 +1,400 @@
+/*
+ * Virtio MMIO bindings
+ *
+ * Copyright (c) 2011 Linaro Limited
+ *
+ * Authors:
+ *  Peter Maydell peter.mayd...@linaro.org
+ *  Evgeny Voevodin e.voevo...@samsung.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+/* TODO:
+ *  * save/load support
+ *  * test net, serial, balloon
+ */
+
+#include "sysbus.h"
+#include "virtio.h"
+#include "virtio-transport.h"
+#include "virtio-blk.h"
+#include "virtio-net.h"
+#include "virtio-serial.h"
+#include "host-utils.h"
+
+/* #define DEBUG_VIRTIO_MMIO */
+
+#ifdef DEBUG_VIRTIO_MMIO
+
+#define DPRINTF(fmt, ...) \
+do { printf("virtio_mmio: " fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while (0)
+#endif
+
+/* Memory mapped register offsets */
+#define VIRTIO_MMIO_MAGIC 0x0
+#define VIRTIO_MMIO_VERSION 0x4
+#define VIRTIO_MMIO_DEVICEID 0x8
+#define VIRTIO_MMIO_VENDORID 0xc
+#define VIRTIO_MMIO_HOSTFEATURES 0x10
+#define VIRTIO_MMIO_HOSTFEATURESSEL 0x14
+#define VIRTIO_MMIO_GUESTFEATURES 0x20
+#define VIRTIO_MMIO_GUESTFEATURESSEL 0x24
+#define VIRTIO_MMIO_GUESTPAGESIZE 0x28
+#define VIRTIO_MMIO_QUEUESEL 0x30
+#define VIRTIO_MMIO_QUEUENUMMAX 0x34
+#define VIRTIO_MMIO_QUEUENUM 0x38
+#define VIRTIO_MMIO_QUEUEALIGN 0x3c
+#define VIRTIO_MMIO_QUEUEPFN 0x40
+#define VIRTIO_MMIO_QUEUENOTIFY 0x50
+#define VIRTIO_MMIO_INTERRUPTSTATUS 0x60
+#define VIRTIO_MMIO_INTERRUPTACK 0x64
+#define VIRTIO_MMIO_STATUS 0x70
+/* Device specific config space starts here */
+#define VIRTIO_MMIO_CONFIG 0x100
+
+#define VIRT_MAGIC 0x74726976 /* 'virt' */
+#define VIRT_VERSION 1
+#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
+
+enum VIRTIO_MMIO_MAPPINGS {
+    VIRTIO_MMIO_IOMAP,
+    VIRTIO_MMIO_IOMEM,
+};
+
+typedef struct {
+    SysBusDevice busdev;
+    VirtIODevice *vdev;
+    VirtIOTransportLink *trl;
+
+    MemoryRegion iomap; /* hold base address */
+    MemoryRegion iomem; /* hold io funcs */
+    MemoryRegion alias;
+    qemu_irq irq;
+    uint32_t int_enable;
+    uint32_t host_features;
+    uint32_t host_features_sel;
+    uint32_t guest_features_sel;
+    uint32_t guest_page_shift;
+} VirtIOMMIO;
+
+static uint64_t virtio_mmio_read(void *opaque, target_phys_addr_t offset,
+                                 unsigned size)
+{
+    VirtIOMMIO *s = (VirtIOMMIO *)opaque;
+    VirtIODevice *vdev = s->vdev;
+    DPRINTF("virtio_mmio_read offset 0x%x\n", (int)offset);
+    if (offset >= VIRTIO_MMIO_CONFIG) {
+        offset -= VIRTIO_MMIO_CONFIG;
+        switch (size) {
+        case 1:
+            return virtio_config_readb(vdev, offset);
+        case 2:
+            return virtio_config_readw(vdev, offset);
+        case 4:
+            return virtio_config_readl(vdev, offset);
+        default:
+            abort();
+        }
+    }
+    if (size != 4) {
+        DPRINTF("wrong size access to register!\n");
+        return 0;
+    }
+    switch (offset) {
+    case VIRTIO_MMIO_MAGIC:
+        return VIRT_MAGIC;
+    case VIRTIO_MMIO_VERSION:
+        return VIRT_VERSION;
+    case VIRTIO_MMIO_DEVICEID:
+        return vdev->device_id;
+    case
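The read path shown in this patch has two regions: fixed 32-bit registers below VIRTIO_MMIO_CONFIG and a device-specific config window from that offset upward. A reduced, standalone sketch of that dispatch follows. The register offsets and constants are copied from the defines in the patch; the DemoDevice type, the byte-only config window and the choice to return 0 for anything unhandled are simplifications made for this illustration, not behaviour taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets and values copied from the defines in the patch. */
#define DEMO_MMIO_MAGIC     0x0
#define DEMO_MMIO_VERSION   0x4
#define DEMO_MMIO_DEVICEID  0x8
#define DEMO_MMIO_VENDORID  0xc
#define DEMO_MMIO_CONFIG    0x100

#define DEMO_VIRT_MAGIC     0x74726976u  /* 'virt' */
#define DEMO_VIRT_VERSION   1u
#define DEMO_VIRT_VENDOR    0x554D4551u  /* 'QEMU' */

/* Hypothetical device model: an id plus a small config area. */
typedef struct {
    uint32_t device_id;
    uint8_t config[16];
} DemoDevice;

static uint64_t demo_mmio_read(const DemoDevice *d, uint64_t offset,
                               unsigned size)
{
    assert(d != NULL);

    /* Device-specific config window; byte access only in this sketch. */
    if (offset >= DEMO_MMIO_CONFIG) {
        offset -= DEMO_MMIO_CONFIG;
        if (size != 1 || offset >= sizeof(d->config)) {
            return 0;
        }
        return d->config[offset];
    }

    /* Fixed registers accept 32-bit accesses only. */
    if (size != 4) {
        return 0;
    }

    switch (offset) {
    case DEMO_MMIO_MAGIC:
        return DEMO_VIRT_MAGIC;
    case DEMO_MMIO_VERSION:
        return DEMO_VIRT_VERSION;
    case DEMO_MMIO_DEVICEID:
        return d->device_id;
    case DEMO_MMIO_VENDORID:
        return DEMO_VIRT_VENDOR;
    default:
        return 0;
    }
}
```

A guest driver probing such a device would typically read the magic value first, then the version and device id, before touching the queue registers.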
[Qemu-devel] [RFC v2 07/12] hw/virtio-serial-bus.c: Add virtio-serial device.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/virtio-serial-bus.c | 44 hw/virtio-serial.h | 11 +++ 2 files changed, 55 insertions(+) diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c index 82073f5..699a485 100644 --- a/hw/virtio-serial-bus.c +++ b/hw/virtio-serial-bus.c @@ -24,6 +24,7 @@ #include sysbus.h #include trace.h #include virtio-serial.h +#include virtio-transport.h /* The virtio-serial bus on top of which the ports will ride as devices */ struct VirtIOSerialBus { @@ -1014,3 +1015,46 @@ static void virtio_serial_register_types(void) } type_init(virtio_serial_register_types) + +/ VirtIOSer Device **/ + +static int virtio_serialdev_init(DeviceState *dev) +{ +VirtIODevice *vdev; +VirtIOSerState *s = VIRTIO_SERIAL_FROM_QDEV(dev); +vdev = virtio_serial_init(dev, s-serial); +if (!vdev) { +return -1; +} + +assert(s-trl != NULL); + +return virtio_call_backend_init_cb(dev, s-trl, vdev); +} + +static Property virtio_serial_properties[] = { +DEFINE_PROP_UINT32(max_ports, VirtIOSerState, + serial.max_virtserial_ports, 31), +DEFINE_PROP_END_OF_LIST(), +}; + +static void virtio_serial_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +dc-init = virtio_serialdev_init; +dc-props = virtio_serial_properties; +} + +static TypeInfo virtio_serial_info = { +.name = virtio-serial, +.parent = TYPE_DEVICE, +.instance_size = sizeof(VirtIOSerState), +.class_init = virtio_serial_class_init, +}; + +static void virtio_ser_register_types(void) +{ +type_register_static(virtio_serial_info); +} + +type_init(virtio_ser_register_types) diff --git a/hw/virtio-serial.h b/hw/virtio-serial.h index 16e3982..c6b916a 100644 --- a/hw/virtio-serial.h +++ b/hw/virtio-serial.h @@ -15,8 +15,10 @@ #ifndef _QEMU_VIRTIO_SERIAL_H #define _QEMU_VIRTIO_SERIAL_H +#include sysbus.h #include qdev.h #include virtio.h +#include virtio-transport.h /* == Interface shared between the guest kernel and qemu == */ @@ -173,6 +175,15 @@ struct VirtIOSerialPort 
{ bool throttled; }; +typedef struct { +DeviceState qdev; +/* virtio-serial */ +virtio_serial_conf serial; +VirtIOTransportLink *trl; +} VirtIOSerState; + +#define VIRTIO_SERIAL_FROM_QDEV(dev) DO_UPCAST(VirtIOSerState, qdev, dev) + /* Interface to the virtio-serial bus */ /* -- 1.7.9.5
[Qemu-devel] [RFC v2 09/12] hw/virtio-net.c: Add virtio-net device.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/virtio-net.c | 59 +++ hw/virtio-net.h | 16 +++ 2 files changed, 75 insertions(+) diff --git a/hw/virtio-net.c b/hw/virtio-net.c index b1998b2..b7cfb1c 100644 --- a/hw/virtio-net.c +++ b/hw/virtio-net.c @@ -13,6 +13,8 @@ #include iov.h #include virtio.h +#include virtio-transport.h +#include virtio-pci.h #include net.h #include net/checksum.h #include net/tap.h @@ -1080,3 +1082,60 @@ void virtio_net_exit(VirtIODevice *vdev) qemu_del_net_client(n-nic-nc); virtio_cleanup(n-vdev); } + +/ VirtIONet Device **/ + +static int virtio_netdev_init(DeviceState *dev) +{ +VirtIODevice *vdev; +VirtIONetState *s = VIRTIO_NET_FROM_QDEV(dev); + +assert(s-trl != NULL); + +vdev = virtio_net_init(dev, s-nic, s-net); + +/* Pass default host_features to transport */ +s-trl-host_features = s-host_features; + +if (virtio_call_backend_init_cb(dev, s-trl, vdev) != 0) { +return -1; +} + +/* Binding should be ready here, let's get final features */ +if (vdev-binding-get_features) { + s-host_features = vdev-binding-get_features(vdev-binding_opaque); +} +return 0; +} + +static Property virtio_net_properties[] = { +DEFINE_VIRTIO_NET_FEATURES(VirtIONetState, host_features), +DEFINE_NIC_PROPERTIES(VirtIONetState, nic), +DEFINE_PROP_UINT32(x-txtimer, VirtIONetState, net.txtimer, +TX_TIMER_INTERVAL), +DEFINE_PROP_INT32(x-txburst, VirtIONetState, net.txburst, TX_BURST), +DEFINE_PROP_STRING(tx, VirtIONetState, net.tx), +DEFINE_PROP_TRANSPORT(transport, VirtIONetState, trl), +DEFINE_PROP_END_OF_LIST(), +}; + +static void virtio_net_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +dc-init = virtio_netdev_init; +dc-props = virtio_net_properties; +} + +static TypeInfo virtio_net_info = { +.name = virtio-net, +.parent = TYPE_DEVICE, +.instance_size = sizeof(VirtIONetState), +.class_init = virtio_net_class_init, +}; + +static void virtio_net_register_types(void) +{ +type_register_static(virtio_net_info); +} + 
+type_init(virtio_net_register_types) diff --git a/hw/virtio-net.h b/hw/virtio-net.h index 36aa463..8dd49d3 100644 --- a/hw/virtio-net.h +++ b/hw/virtio-net.h @@ -14,7 +14,9 @@ #ifndef _QEMU_VIRTIO_NET_H #define _QEMU_VIRTIO_NET_H +#include sysbus.h #include virtio.h +#include virtio-transport.h #include net.h #include pci.h @@ -187,4 +189,18 @@ struct virtio_net_ctrl_mac { DEFINE_PROP_BIT(ctrl_rx, _state, _field, VIRTIO_NET_F_CTRL_RX, true), \ DEFINE_PROP_BIT(ctrl_vlan, _state, _field, VIRTIO_NET_F_CTRL_VLAN, true), \ DEFINE_PROP_BIT(ctrl_rx_extra, _state, _field, VIRTIO_NET_F_CTRL_RX_EXTRA, true) + +typedef struct { +DeviceState qdev; +/* virtio-net */ +NICConf nic; +virtio_net_conf net; + +uint32_t host_features; + +VirtIOTransportLink *trl; +} VirtIONetState; + +#define VIRTIO_NET_FROM_QDEV(dev) DO_UPCAST(VirtIONetState, qdev, dev) + #endif -- 1.7.9.5
[Qemu-devel] [RFC v2 11/12] hw/virtio-pci-new.c: Add VirtIOPCI device.
This commit adds the VirtIOPCI device implementation, which is temporarily
held in the virtio-pci-new.c file. We need this file until the
virtio-xxx-pci devices in hw/virtio-pci.c are reimplemented so that they
just create virtio-pci and virtio-xxx devices during initialization.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/Makefile.objs    |1 +
 hw/virtio-pci-new.c | 925 +++
 hw/virtio-pci.h     | 18 +
 3 files changed, 944 insertions(+)
 create mode 100644 hw/virtio-pci-new.c

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 0c112af..e5bda7f 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -2,6 +2,7 @@ hw-obj-y = usb/ ide/
 hw-obj-y += loader.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
+hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci-new.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-transport.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-mmio.o
 hw-obj-y += fw_cfg.o
diff --git a/hw/virtio-pci-new.c b/hw/virtio-pci-new.c
new file mode 100644
index 000..c1650b5
--- /dev/null
+++ b/hw/virtio-pci-new.c
@@ -0,0 +1,925 @@
+/*
+ * Virtio PCI Bindings
+ *
+ * Copyright IBM, Corp. 2007
+ * Copyright (c) 2009 CodeSourcery
+ *
+ * Authors:
+ *  Anthony Liguori aligu...@us.ibm.com
+ *  Paul Brook p...@codesourcery.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ */ + +#include inttypes.h + +#include virtio.h +#include virtio-transport.h +#include virtio-blk.h +#include virtio-net.h +#include virtio-serial.h +#include virtio-scsi.h +#include virtio-balloon.h +#include pci.h +#include qemu-error.h +#include msi.h +#include msix.h +#include net.h +#include loader.h +#include kvm.h +#include blockdev.h +#include virtio-pci.h +#include range.h + +/* from Linux's linux/virtio_pci.h */ + +/* A 32-bit r/o bitmask of the features supported by the host */ +#define VIRTIO_PCI_HOST_FEATURES0 + +/* A 32-bit r/w bitmask of features activated by the guest */ +#define VIRTIO_PCI_GUEST_FEATURES 4 + +/* A 32-bit r/w PFN for the currently selected queue */ +#define VIRTIO_PCI_QUEUE_PFN8 + +/* A 16-bit r/o queue size for the currently selected queue */ +#define VIRTIO_PCI_QUEUE_NUM12 + +/* A 16-bit r/w queue selector */ +#define VIRTIO_PCI_QUEUE_SEL14 + +/* A 16-bit r/w queue notifier */ +#define VIRTIO_PCI_QUEUE_NOTIFY 16 + +/* An 8-bit device status register. */ +#define VIRTIO_PCI_STATUS 18 + +/* An 8-bit r/o interrupt status register. Reading the value will return the + * current contents of the ISR and will also clear it. This is effectively + * a read-and-acknowledge. */ +#define VIRTIO_PCI_ISR 19 + +/* MSI-X registers: only enabled if MSI-X is enabled. */ +/* A 16-bit vector for configuration changes. */ +#define VIRTIO_MSI_CONFIG_VECTOR20 +/* A 16-bit vector for selected queue notifications. */ +#define VIRTIO_MSI_QUEUE_VECTOR 22 + +/* Config space size */ +#define VIRTIO_PCI_CONFIG_NOMSI 20 +#define VIRTIO_PCI_CONFIG_MSI 24 +#define VIRTIO_PCI_REGION_SIZE(dev) (msix_present(dev) ? \ + VIRTIO_PCI_CONFIG_MSI : \ + VIRTIO_PCI_CONFIG_NOMSI) + +/* The remaining space is defined by each driver as the per-driver + * configuration space */ +#define VIRTIO_PCI_CONFIG(dev) (msix_enabled(dev) ? \ + VIRTIO_PCI_CONFIG_MSI : \ + VIRTIO_PCI_CONFIG_NOMSI) + +/* How many bits to shift physical queue address written to QUEUE_PFN. 
+ * 12 is historical, and due to x86 page size. */ +#define VIRTIO_PCI_QUEUE_ADDR_SHIFT12 + +/* Flags track per-device state like workarounds for quirks in older guests. */ +#define VIRTIO_PCI_FLAG_BUS_MASTER_BUG (1 0) + +/* QEMU doesn't strictly need write barriers since everything runs in + * lock-step. We'll leave the calls to wmb() in though to make it obvious for + * KVM or if kqemu gets SMP support. + */ +#define wmb() do { } while (0) + +/* HACK for virtio to determine if it's running a big endian guest */ +bool virtio_is_big_endian(void); + +/* virtio device */ + +static void virtio_pci_notify(void *opaque, uint16_t vector) +{ +VirtIOPCI *s = opaque; +if (msix_enabled(s-pci_dev)) { +msix_notify(s-pci_dev, vector); +} +else { +qemu_set_irq(s-pci_dev.irq[0], s-vdev-isr 1); +} +} + +static void virtio_pci_save_config(void * opaque, QEMUFile *f) +{ +VirtIOPCI *s = opaque; +pci_device_save(s-pci_dev, f); +msix_save(s-pci_dev, f); +if (msix_present(s-pci_dev)) { +qemu_put_be16(f, s-vdev-config_vector); +} +} + +static
[Qemu-devel] [RFC v2 10/12] hw/virtio-blk.c: Add virtio-blk device.
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/virtio-blk.c | 65 +++ hw/virtio-blk.h | 15 + 2 files changed, 80 insertions(+) diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 6f6d172..0a23352 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -16,6 +16,8 @@ #include trace.h #include hw/block-common.h #include blockdev.h +#include virtio-transport.h +#include virtio-pci.h #include virtio-blk.h #include scsi-defs.h #ifdef __linux__ @@ -665,3 +667,66 @@ void virtio_blk_exit(VirtIODevice *vdev) blockdev_mark_auto_del(s-bs); virtio_cleanup(vdev); } + +/ VirtIOBlk Device **/ + +static int virtio_blkdev_init(DeviceState *dev) +{ +VirtIODevice *vdev; +VirtIOBlockState *s = VIRTIO_BLK_FROM_QDEV(dev); + +assert(s-trl != NULL); + +vdev = virtio_blk_init(dev, s-blk); +if (!vdev) { +return -1; +} + +/* Pass default host_features to transport */ +s-trl-host_features = s-host_features; + +if (virtio_call_backend_init_cb(dev, s-trl, vdev) != 0) { +return -1; +} + +/* Binding should be ready here, let's get final features */ +if (vdev-binding-get_features) { + s-host_features = vdev-binding-get_features(vdev-binding_opaque); +} +return 0; +} + +static Property virtio_blkdev_properties[] = { +DEFINE_BLOCK_PROPERTIES(VirtIOBlockState, blk.conf), +DEFINE_BLOCK_CHS_PROPERTIES(VirtIOBlockState, blk.conf), +DEFINE_PROP_STRING(serial, VirtIOBlockState, blk.serial), +#ifdef __linux__ +DEFINE_PROP_BIT(scsi, VirtIOBlockState, blk.scsi, 0, true), +#endif +DEFINE_PROP_BIT(config-wce, VirtIOBlockState, blk.config_wce, 0, true), +DEFINE_VIRTIO_BLK_FEATURES(VirtIOBlockState, host_features), + +DEFINE_PROP_TRANSPORT(transport, VirtIOBlockState, trl), +DEFINE_PROP_END_OF_LIST(), +}; + +static void virtio_blkdev_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +dc-init = virtio_blkdev_init; +dc-props = virtio_blkdev_properties; +} + +static TypeInfo virtio_blkdev_info = { +.name = virtio-blk, +.parent = TYPE_DEVICE, +.instance_size = 
sizeof(VirtIOBlockState), +.class_init = virtio_blkdev_class_init, +}; + +static void virtio_blk_register_types(void) +{ +type_register_static(virtio_blkdev_info); +} + +type_init(virtio_blk_register_types) diff --git a/hw/virtio-blk.h b/hw/virtio-blk.h index f0740d0..0886818 100644 --- a/hw/virtio-blk.h +++ b/hw/virtio-blk.h @@ -14,7 +14,9 @@ #ifndef _QEMU_VIRTIO_BLK_H #define _QEMU_VIRTIO_BLK_H +#include sysbus.h #include virtio.h +#include virtio-transport.h #include hw/block-common.h /* from Linux's linux/virtio_blk.h */ @@ -111,4 +113,17 @@ struct VirtIOBlkConf DEFINE_VIRTIO_COMMON_FEATURES(_state, _field), \ DEFINE_PROP_BIT(config-wce, _state, _field, VIRTIO_BLK_F_CONFIG_WCE, true) + +typedef struct { +DeviceState qdev; +/* virtio-blk */ +VirtIOBlkConf blk; + +uint32_t host_features; + +VirtIOTransportLink *trl; +} VirtIOBlockState; + +#define VIRTIO_BLK_FROM_QDEV(dev) DO_UPCAST(VirtIOBlockState, qdev, dev) + #endif -- 1.7.9.5
[Qemu-devel] [RFC v2 00/12] Virtio-mmio refactoring.
The previous RFC can be found at http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03665.html

Yes, that was a long time ago... Since I'm not sure when I'll be able to continue with this, I'm publishing the work as is.

In this patchset I tried to split the virtio-xxx-pci devices into virtio-pci + virtio-xxx (blk, net, serial, ...). A virtio-mmio transport is also introduced, based on Peter's work, which is available here: http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01870.html

The main idea was to let users specify

  -device virtio-pci,id=virtio-pci.0
  -device virtio-blk,transport=virtio-pci.0,...

and

  -device virtio-mmio,id=virtio-mmio.0
  -device virtio-blk,transport=virtio-mmio.0,...

I created virtio-pci and virtio-mmio transport devices and tried to enclose the back-end functionality in virtio-blk, virtio-net, etc. When a transport device is initialized, it creates a bus to which a back-end device can be connected. Each back-end device is implemented in the corresponding source file. As for the PCI transport, I temporarily placed it in a new virtio-pci-new.c file so as not to break the virtio-xxx-pci devices that are still present.

Known issues to be resolved:
1. On creation of a back-end we need to determine somehow whether props were explicitly set by the user.
2. A back-end device can't be initialized if no free bus has been created by a transport, so you can't specify
     -device virtio-blk,transport=virtio-pci.0,...
     -device virtio-pci,id=virtio-pci.0
3. Implement virtio-xxx-devices such that they just create virtio-pci and virtio-xxx devices during initialization.
4. Refactor all remaining back-ends, since I only tried blk, net, serial and balloon.
5. Refactor s390.
6. Further?

Evgeny Voevodin (9):
  Virtio: Add transport bindings.
  hw/qdev-properties.c: Add transport property.
  hw/pci.c: Make pci_add_option_rom global visible
  hw/virtio-serial-bus.c: Add virtio-serial device.
  hw/virtio-balloon.c: Add virtio-balloon device.
  hw/virtio-net.c: Add virtio-net device.
  hw/virtio-blk.c: Add virtio-blk device.
  hw/virtio-pci-new.c: Add VirtIOPCI device.
  hw/exynos4210.c: Create two virtio-mmio transport instances.

Peter Maydell (3):
  virtio: Add support for guest setting of queue size
  virtio: Support transports which can specify the vring alignment
  Add MMIO based virtio transport

 hw/Makefile.objs       |    3
 hw/exynos4210.c        |   13
 hw/pci.c               |    3
 hw/pci.h               |    2
 hw/qdev-properties.c   |   29
 hw/qdev.h              |    3
 hw/virtio-balloon.c    |   42
 hw/virtio-balloon.h    |    9
 hw/virtio-blk.c        |   65
 hw/virtio-blk.h        |   15
 hw/virtio-mmio.c       |  400
 hw/virtio-net.c        |   59
 hw/virtio-net.h        |   16
 hw/virtio-pci-new.c    |  925
 hw/virtio-pci.h        |   18
 hw/virtio-serial-bus.c |   44
 hw/virtio-serial.h     |   11
 hw/virtio-transport.c  |  147
 hw/virtio-transport.h  |   74
 hw/virtio.c            |   20
 hw/virtio.h            |    2
 21 files changed, 1896 insertions(+), 4 deletions(-)
 create mode 100644 hw/virtio-mmio.c
 create mode 100644 hw/virtio-pci-new.c
 create mode 100644 hw/virtio-transport.c
 create mode 100644 hw/virtio-transport.h

-- 
1.7.9.5
[Qemu-devel] [RFC v2 12/12] hw/exynos4210.c: Create two virtio-mmio transport instances.
NB: This is for test purposes only. Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/exynos4210.c | 13 + 1 file changed, 13 insertions(+) diff --git a/hw/exynos4210.c b/hw/exynos4210.c index 00d4db8..70fcdd6 100644 --- a/hw/exynos4210.c +++ b/hw/exynos4210.c @@ -26,6 +26,7 @@ #include sysbus.h #include arm-misc.h #include loader.h +#include virtio-transport.h #include exynos4210.h #define EXYNOS4210_CHIPID_ADDR 0x1000 @@ -72,6 +73,10 @@ /* Display controllers (FIMD) */ #define EXYNOS4210_FIMD0_BASE_ADDR 0x11C0 +/* VirtIO MMIO */ +#define EXYNOS4210_VIRTIO_MMIO0_BASE_ADDR 0x10AD +#define EXYNOS4210_VIRTIO_MMIO1_BASE_ADDR 0x10AC + static uint8_t chipid_and_omr[] = { 0x11, 0x02, 0x21, 0x43, 0x09, 0x00, 0x00, 0x00 }; @@ -334,5 +339,13 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem, s-irq_table[exynos4210_get_irq(11, 2)], NULL); +sysbus_create_simple(VIRTIO_MMIO, + EXYNOS4210_VIRTIO_MMIO0_BASE_ADDR, + s-irq_table[exynos4210_get_irq(37, 3)]); + +sysbus_create_simple(VIRTIO_MMIO, + EXYNOS4210_VIRTIO_MMIO1_BASE_ADDR, + s-irq_table[exynos4210_get_irq(37, 2)]); + return s; } -- 1.7.9.5
[Qemu-devel] [RFC v2 05/12] hw/pci.c: Make pci_add_option_rom global visible
We need to use this function to load rom for virtio-net backend. Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/pci.c |3 +-- hw/pci.h |2 ++ 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index f855cf3..bba69ef 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -77,7 +77,6 @@ static const TypeInfo pci_bus_info = { static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); static void pci_update_mappings(PCIDevice *d); static void pci_set_irq(void *opaque, int irq_num, int level); -static int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom); static void pci_del_option_rom(PCIDevice *pdev); static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET; @@ -1733,7 +1732,7 @@ static void pci_patch_ids(PCIDevice *pdev, uint8_t *ptr, int size) } /* Add an option rom for the device */ -static int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom) +int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom) { int size; char *path; diff --git a/hw/pci.h b/hw/pci.h index 4b6ab3d..5f47618 100644 --- a/hw/pci.h +++ b/hw/pci.h @@ -274,6 +274,8 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num, uint8_t attr, MemoryRegion *memory); pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int region_num); +int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom); + int pci_add_capability(PCIDevice *pdev, uint8_t cap_id, uint8_t offset, uint8_t size); -- 1.7.9.5
[Qemu-devel] [RFC v2 04/12] hw/qdev-properties.c: Add transport property.
Virtio back-end devices can be plugged into both transports: VIRTIO_PCI and VIRTIO_MMIO. In order to choose the desired transport we have a property transport in every back-end state struct. By specifying -device virtio-blk-pci user chooses VIRTIO_PCI transport and transport property is set automatically. But in order to provide full control to user we need to have transport property available to be set through command line: -device virtio-pci,id=virtio-pci.0 -device virtio-blk,transport=virtio-pci.0,... Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com --- hw/qdev-properties.c | 29 + hw/qdev.h|3 +++ 2 files changed, 32 insertions(+) diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c index 8aca0d4..4226a02 100644 --- a/hw/qdev-properties.c +++ b/hw/qdev-properties.c @@ -2,6 +2,7 @@ #include qdev.h #include qerror.h #include blockdev.h +#include virtio-transport.h #include hw/block-common.h #include net/hub.h @@ -526,6 +527,34 @@ PropertyInfo qdev_prop_drive = { .release = release_drive, }; +/* --- virtio transport --- */ + +static int parse_transport(DeviceState *dev, const char *str, void **ptr) +{ +VirtIOTransportLink *trl; + +trl = virtio_find_transport(str); + +if (trl == NULL) { +return -ENOENT; +} + +*ptr = trl; + +return 0; +} + +static void set_transport(Object *obj, Visitor *v, void *opaque, + const char *name, Error **errp) +{ +set_pointer(obj, v, opaque, parse_transport, name, errp); +} + +PropertyInfo qdev_prop_transport = { +.name = transport, +.set = set_transport, +}; + /* --- character device --- */ static int parse_chr(DeviceState *dev, const char *str, void **ptr) diff --git a/hw/qdev.h b/hw/qdev.h index d699194..bd6aa6e 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -233,6 +233,7 @@ extern PropertyInfo qdev_prop_macaddr; extern PropertyInfo qdev_prop_losttickpolicy; extern PropertyInfo qdev_prop_bios_chs_trans; extern PropertyInfo qdev_prop_drive; +extern PropertyInfo qdev_prop_transport; extern PropertyInfo qdev_prop_netdev; extern 
PropertyInfo qdev_prop_vlan; extern PropertyInfo qdev_prop_pci_devfn; @@ -294,6 +295,8 @@ extern PropertyInfo qdev_prop_pci_host_devaddr; DEFINE_PROP(_n, _s, _f, qdev_prop_vlan, NetClientState*) #define DEFINE_PROP_DRIVE(_n, _s, _f) \ DEFINE_PROP(_n, _s, _f, qdev_prop_drive, BlockDriverState *) +#define DEFINE_PROP_TRANSPORT(_n, _s, _f) \ +DEFINE_PROP(_n, _s, _f, qdev_prop_transport, VirtIOTransportLink *) #define DEFINE_PROP_MACADDR(_n, _s, _f) \ DEFINE_PROP(_n, _s, _f, qdev_prop_macaddr, MACAddr) #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \ -- 1.7.9.5
Re: [Qemu-devel] [RFC v2 04/12] hw/qdev-properties.c: Add transport property.
On 09/17/2012 04:42 PM, Paolo Bonzini wrote:
> On 17/09/2012 12:00, Evgeny Voevodin wrote:
>> Virtio back-end devices can be plugged into both transports: VIRTIO_PCI and VIRTIO_MMIO. In order to choose the desired transport we have a property "transport" in every back-end state struct. By specifying -device virtio-blk-pci user chooses VIRTIO_PCI transport and transport property is set automatically. But in order to provide full control to user we need to have transport property available to be set through command line:
>> -device virtio-pci,id=virtio-pci.0 -device virtio-blk,transport=virtio-pci.0,...
>
> What's the difference between this and "bus"? i.e.
> -device virtio-pci,id=virtio-pci-0 -device virtio-blk,bus=virtio-pci-0.0,...
>
> Paolo

The difference is that for the transport I used a linked list, like, say, bdrv_states in block.c. That is much simpler than using buses. I was also planning to use a link. In this approach buses are used only to reflect the hierarchy of devices in the emulator manager. And yes, the cover letter contains quite misleading information, because attaching to a transport is based on a list of links, not on buses. Sorry, I forgot that when I wrote the cover letter.

-- 
Kind regards, Evgeny Voevodin, Technical Leader, Mobile Group, Samsung Moscow Research Center, e-mail: e.voevo...@samsung.com
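The list-based lookup described in the reply can be sketched roughly as follows. This is only an illustration under stated assumptions: TransportLink, transport_register and transport_find are hypothetical names, not the VirtIOTransportLink or virtio_find_transport code from the series, which is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a transport link; not the QEMU structure. */
typedef struct TransportLink {
    const char *id;               /* e.g. "virtio-pci.0" */
    struct TransportLink *next;   /* next registered transport */
} TransportLink;

/* Global list of registered transports, newest first. */
static TransportLink *transport_list;

/* Put a transport at the head of the global list. */
static void transport_register(TransportLink *trl)
{
    assert(trl != NULL && trl->id != NULL);
    trl->next = transport_list;
    transport_list = trl;
}

/* Return the transport whose id matches, or NULL if none is registered. */
static TransportLink *transport_find(const char *id)
{
    for (TransportLink *t = transport_list; t != NULL; t = t->next) {
        if (strcmp(t->id, id) == 0) {
            return t;
        }
    }
    return NULL;
}
```

With a lookup like this, a back-end that names a transport before that transport has been registered simply gets NULL back, which would match the ordering restriction listed among the known issues in the cover letter.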
[Qemu-devel] [Bug 1036987] [NEW] compilation error due to bug in savevm.c
Public bug reported:

Since commit 302dfbeb21fc5154c24ca50d296e865a3778c7da

    Add xbzrle_encode_buffer and xbzrle_decode_buffer functions

    For performance we are encoding long word at a time. For nzrun we use long-word-at-a-time NULL-detection tricks from strcmp(): using ((lword - 0x0101010101010101) & (~lword) & 0x8080808080808080) test to find out if any byte in the long word is zero.

    Signed-off-by: Benoit Hudzia <benoit.hud...@sap.com>
    Signed-off-by: Petter Svard <pett...@cs.umu.se>
    Signed-off-by: Aidan Shribman <aidan.shrib...@sap.com>
    Signed-off-by: Orit Wasserman <owass...@redhat.com>
    Signed-off-by: Eric Blake <ebl...@redhat.com>
    Reviewed-by: Luiz Capitulino <lcapitul...@redhat.com>
    Reviewed-by: Eric Blake <ebl...@redhat.com>

arrived in the master branch, I can't compile qemu at all:

savevm.c:2476:13: error: overflow in implicit constant conversion [-Werror=overflow]

** Affects: qemu
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1036987
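The zero-byte test quoted in the commit message can be tried in isolation. The following is a minimal sketch, assuming a fixed 64-bit word instead of the host-dependent long used in savevm.c; has_zero_byte is a name chosen here for illustration.

```c
#include <stdint.h>

/* Return nonzero if at least one of the eight bytes of `word` is 0x00.
 * Subtracting 0x01 from every byte makes a zero byte borrow into its top bit;
 * `& ~word` discards bytes whose top bit was already set before subtracting. */
static int has_zero_byte(uint64_t word)
{
    const uint64_t ones  = UINT64_C(0x0101010101010101);
    const uint64_t highs = UINT64_C(0x8080808080808080);

    return ((word - ones) & ~word & highs) != 0;
}
```

The constants only fit a 64-bit word, which is where the compile error on 32-bit hosts comes from: there long is 32 bits wide and the 64-bit constant no longer fits.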
[Qemu-devel] [PATCH] savevm.c: Fix compilation error on 32bit host.
Casting 0x0101010101010101ULL to long truncates it to 32 bits on 32-bit hosts and does not truncate it on 64-bit hosts.

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 savevm.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/savevm.c b/savevm.c
index 0ea10c9..9ab4d83 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2473,7 +2473,7 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
         /* word at a time for speed, use of 32-bit long okay */
         if (!res) {
             /* truncation to 32-bit long okay */
-            long mask = 0x0101010101010101ULL;
+            long mask = (long)0x0101010101010101ULL;
             while (i < slen) {
                 xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
                 if ((xor - mask) & ~xor & (mask << 7)) {
-- 
1.7.9.5
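As a side note, the same mask can also be built without relying on truncation at all. The sketch below is not what the patch does; it is an alternative shown for comparison, it uses unsigned long rather than long, and the function names are made up for this example.

```c
#include <limits.h>

/* 0x01 in every byte of an unsigned long, whatever its width:
 * 0xFFFFFFFF / 0xFF == 0x01010101, and likewise for 64 bits. */
static unsigned long byte_ones(void)
{
    return ULONG_MAX / 0xFF;
}

/* The same value via an explicit narrowing cast, in the spirit of the patch.
 * Conversion to an unsigned type is well defined (modulo 2^N), so this keeps
 * the low 32 bits on a 32-bit host and all 64 bits on a 64-bit host. */
static unsigned long byte_ones_cast(void)
{
    return (unsigned long)0x0101010101010101ULL;
}
```

Both functions should return the same value on a given host; the explicit cast mainly tells the compiler that the narrowing is intended.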
Re: [Qemu-devel] [PATCH] savevm.c: Fix compilation error on 32bit host.
On 08/15/2012 01:30 PM, Peter Maydell wrote: On 15 August 2012 10:10, Evgeny Voevodin e.voevo...@samsung.com wrote: Casting of 0x0101010101010101ULL to long will truncate it to 32 bits on 32bit hosts, and won't truncate on 64bit hosts. Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com Dup of http://patchwork.ozlabs.org/patch/177217/ I'm afraid. -- PMM Don't be afraid, it's true. I didn't see it in the mailing list and didn't know that the bug is already fixed there. -- Kind regards, Evgeny Voevodin, Leading Software Engineer, ASWG, Moscow RD center, Samsung Electronics e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH 08/23] exynos4: Suppress unused default drives
On 09.08.2012 17:31, Markus Armbruster wrote: Cc: Evgeny Voevodin e.voevo...@samsung.com Cc: Maksim Kozlov m.koz...@samsung.com Cc: Igor Mitsyanko i.mitsya...@samsung.com Cc: Dmitry Solodkiy d.solod...@samsung.com Suppress default floppy, CD-ROM and SD card drives for machines nuri, smdkc210. Signed-off-by: Markus Armbruster arm...@redhat.com --- hw/exynos4_boards.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/hw/exynos4_boards.c b/hw/exynos4_boards.c index 4bb0a60..1512c27 100644 --- a/hw/exynos4_boards.c +++ b/hw/exynos4_boards.c @@ -160,12 +160,18 @@ static QEMUMachine exynos4_machines[EXYNOS4_NUM_OF_BOARDS] = { .desc = Samsung NURI board (Exynos4210), .init = nuri_init, .max_cpus = EXYNOS4210_NCPUS, +.no_floppy = 1, +.no_cdrom = 1, +.no_sdcard = 1, }, [EXYNOS4_BOARD_SMDKC210] = { .name = smdkc210, .desc = Samsung SMDKC210 board (Exynos4210), .init = smdkc210_init, .max_cpus = EXYNOS4210_NCPUS, +.no_floppy = 1, +.no_cdrom = 1, +.no_sdcard = 1, }, }; Recently I looked in monitor and thought about how to dismiss them from our board ) Acked-by: Evgeny Voevodin e.voevo...@samsung.com -- Kind regards, Evgeny Voevodin, Technical Leader, Mobile Group, Samsung Moscow Research Center, e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [RFC][PATCH v2 4/4] configure: add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization
On 05.07.2012 17:55, Andreas Färber wrote: Am 05.07.2012 15:23, schrieb Yeongkyoon Lee: Add an option --enable-ldst-optimization to enable CONFIG_QEMU_LDST_OPTIMIZATION macro for TCG qemu_ld/st optimization. It only works with CONFIG_SOFTMMU and doesn't work with CONFIG_TCG_PASS_AREG0. Signed-off-by: Yeongkyoon Lee yeongkyoon@samsung.com --- configure | 15 +++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/configure b/configure index 9f071b7..2b364cc 100755 --- a/configure +++ b/configure [...] @@ -3463,6 +3466,11 @@ echo EXESUF=$EXESUF $config_host_mak echo LIBS_QGA+=$libs_qga $config_host_mak echo POD2MAN=$POD2MAN $config_host_mak +if [ $ldst_optimization = yes -a $cpu != i386 -a $cpu != x86_64 ] ; then + echo ERROR: qemu_ld/st optimization is only available on i386 or x86_64 hosts + exit 1 +fi [snip] I assume that Samsung is interested in optimizing the Exynos emulation. Nope ) Originally it's from x86 Tizen emulator ) I think there was already a patchset posted converting target-arm to CONFIG_PASS_TCG_AREG0, only with some slowdowns to be investigated... What is the obstacle for supporting AREG0 mode in your optimization? 
Regards, Andreas + # generate list of library paths for linker script $ld --verbose -v 2 /dev/null | grep SEARCH_DIR ${config_host_ld} @@ -3696,11 +3704,18 @@ fi symlink $source_path/Makefile.target $target_dir/Makefile +target_ldst_optimization=$ldst_optimization + case $target_arch2 in alpha | sparc* | xtensa* | ppc*) echo CONFIG_TCG_PASS_AREG0=y $config_target_mak +# qemu_ld/st optimization is not available with CONFIG_TCG_PASS_AREG0 +target_ldst_optimization=no ;; esac +if [ $target_ldst_optimization = yes -a $target_softmmu = yes ] ; then +echo CONFIG_QEMU_LDST_OPTIMIZATION=y $config_target_mak +fi echo TARGET_SHORT_ALIGNMENT=$target_short_alignment $config_target_mak echo TARGET_INT_ALIGNMENT=$target_int_alignment $config_target_mak -- Kind regards, Evgeny Voevodin, Technical Leader, Mobile Group, SMRC, Samsung Electronics e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH 2/3] hw/exynos4210_pwm.c: Fix STOP status in tick handler.
The START/STOP bit was not cleared correctly.

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 hw/exynos4210_pwm.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/exynos4210_pwm.c b/hw/exynos4210_pwm.c
index 6243e59..0c22828 100644
--- a/hw/exynos4210_pwm.c
+++ b/hw/exynos4210_pwm.c
@@ -200,7 +200,7 @@ static void exynos4210_pwm_tick(void *opaque)
         ptimer_run(p->timer[id].ptimer, 1);
     } else {
         /* stop timer, set status to STOP, see Basic Timer Operation */
-        p->reg_tcon = ~TCON_TIMER_START(id);
+        p->reg_tcon &= ~TCON_TIMER_START(id);
         ptimer_stop(p->timer[id].ptimer);
     }
 }
-- 
1.7.9.5
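The effect of such a fix can be shown on a toy register. This sketch assumes that the change replaces a plain assignment with a masked update, which is what the archived diff suggests; TIMER_START below is a hypothetical bit layout, not the real TCON_TIMER_START macro.

```c
#include <stdint.h>

/* Hypothetical layout for illustration: one START bit per timer, 4 bits apart. */
#define TIMER_START(id) (UINT32_C(1) << ((id) * 4))

/* Plain assignment: overwrites the whole register, so every bit except the
 * START bit of timer `id` ends up set, whatever the register held before. */
static uint32_t stop_timer_overwrite(uint32_t tcon, int id)
{
    (void)tcon; /* previous contents are lost */
    return ~TIMER_START(id);
}

/* Masked update: clears only the START bit of timer `id`. */
static uint32_t stop_timer_mask(uint32_t tcon, int id)
{
    return tcon & ~TIMER_START(id);
}
```

For example, with timers 0 and 1 running (register value 0x11), stopping timer 1 with the masked update leaves 0x01, while the overwrite leaves 0xFFFFFFEF and so disturbs every other field in the register.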
[Qemu-devel] [PATCH 0/3] ARM: Exynos4210 bugfixes
The first patch is already on the list: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg03717.html
It fixes a critical bug in MCT that hangs Linux kernel v3.0. I chose to include it in this patch set.

The second patch fixes the setting of the STOP status bit in PWM (not critical, but only because the latest kernels use MCT as the clock source).

The third patch fixes a misleading initialization of the IROM mirror.

Evgeny Voevodin (2):
  hw/exynos4210_pwm.c: Fix STOP status in tick handler.
  hw/exynos4210.c: Fix misleading initialization of IROM mirror

Stanislav Vorobiov (1):
  ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.

 hw/exynos4210.c     |    2 +-
 hw/exynos4210_mct.c |    4
 hw/exynos4210_pwm.c |    2 +-
 3 files changed, 2 insertions(+), 6 deletions(-)

-- 
1.7.9.5
[Qemu-devel] [PATCH 3/3] hw/exynos4210.c: Fix misleading initialization of IROM mirror
We want to mirror the whole IROM and should pass zero instead of EXYNOS4210_IROM_BASE_ADDR (although it is also zero), since memory_region_init_alias takes an offset within the original region as its argument.

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 hw/exynos4210.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/exynos4210.c b/hw/exynos4210.c
index 9c20b3f..80a00b9 100644
--- a/hw/exynos4210.c
+++ b/hw/exynos4210.c
@@ -216,7 +216,7 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
     /* mirror of iROM */
     memory_region_init_alias(&s->irom_alias_mem, "exynos4210.irom_alias",
                              &s->irom_mem,
-                             EXYNOS4210_IROM_BASE_ADDR,
+                             0,
                              EXYNOS4210_IROM_SIZE);
     memory_region_set_readonly(&s->irom_alias_mem, true);
     memory_region_add_subregion(system_mem, EXYNOS4210_IROM_MIRROR_BASE_ADDR,
-- 
1.7.9.5
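Why the old argument was misleading rather than wrong can be seen in a toy model of an alias. This is a sketch only: Region, AliasModel and alias_target are invented for this note and do not model the QEMU memory API beyond the one point made in the commit message, namely that the offset is relative to the original region.

```c
#include <stdint.h>

/* Toy model of a region mapped at a guest address. */
typedef struct {
    uint64_t base;   /* where the original region is mapped */
    uint64_t size;
} Region;

/* Toy model of an alias: a window into `orig` starting at `offset`. */
typedef struct {
    const Region *orig;
    uint64_t offset; /* relative to the start of orig, not a guest address */
    uint64_t size;
} AliasModel;

/* Guest address inside the original region reached by byte `pos` of the alias. */
static uint64_t alias_target(const AliasModel *a, uint64_t pos)
{
    return a->orig->base + a->offset + pos;
}
```

With a region based at 0, passing its base address as the offset happens to give the right result. With any other base the alias would point past the intended start, which is why passing a literal 0 states the intent more clearly.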
[Qemu-devel] [PATCH 1/3] ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.
From: Stanislav Vorobiov <s.vorob...@samsung.com>

After a long period of time the Linux kernel hung because ptimer_get_count may return 0 before the timer interrupt occurs, causing the FRC to jump back in time.

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 hw/exynos4210_mct.c |    4
 1 file changed, 4 deletions(-)

diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
index 7474fcf..7a22b1f 100644
--- a/hw/exynos4210_mct.c
+++ b/hw/exynos4210_mct.c
@@ -376,10 +376,6 @@ static uint64_t exynos4210_gfrc_get_count(Exynos4210MCTGT *s)
 {
     uint64_t count = 0;
     count = ptimer_get_count(s->ptimer_frc);
-    if (!count) {
-        /* Timer event was generated and s->reg.cnt holds adequate value */
-        return s->reg.cnt;
-    }
     count = s->count - count;
     return s->reg.cnt + count;
 }
-- 
1.7.9.5
Re: [Qemu-devel] [PATCH] Exynos4: added RTC device
: +s-reg_almhour = (value 0x3f); +break; +case ALMDAY: +s-reg_almday = (value 0x3f); +break; +case ALMMON: +s-reg_almmon = (value 0x1f); +break; +case ALMYEAR: +s-reg_almyear = (value 0x0fff); +break; + +case BCDSEC: +if (s-reg_rtccon RTC_ENABLE) { +s-current_tm.tm_sec = rtc_from_bcd(value, 2); +} +break; +case BDCMIN: +if (s-reg_rtccon RTC_ENABLE) { +s-current_tm.tm_min = rtc_from_bcd(value, 2); +} +break; +case BCDHOUR: +if (s-reg_rtccon RTC_ENABLE) { +s-current_tm.tm_hour = rtc_from_bcd(value, 2); +} +break; +case BCDDAYWEEK: +if (s-reg_rtccon RTC_ENABLE) { +s-current_tm.tm_wday = rtc_from_bcd(value, 2); +} +break; +case BCDDAY: +if (s-reg_rtccon RTC_ENABLE) { +s-current_tm.tm_mday = rtc_from_bcd(value, 2); +} +break; +case BCDMON: +if (s-reg_rtccon RTC_ENABLE) { +s-current_tm.tm_mon = rtc_from_bcd(value, 2) - 1; +} +break; +case BCDYEAR: +if (s-reg_rtccon RTC_ENABLE) { +s-current_tm.tm_year = rtc_from_bcd(value, 3); +} +break; + +default: +fprintf(stderr, +[exynos4210.rtc: bad write offset TARGET_FMT_plx ]\n, +offset); +break; + +} +} + +/* + * Set default values to timer fields and registers + */ +static void exynos4210_rtc_reset(DeviceState *d) +{ +Exynos4210RTCState *s = (Exynos4210RTCState *)d; + +struct tm tm; + +qemu_get_timedate(tm, 0); +s-current_tm = tm; + +DPRINTF(Get time from host: %d-%d-%d %2d:%02d:%02d\n, +s-current_tm.tm_year, s-current_tm.tm_mon, s-current_tm.tm_mday, +s-current_tm.tm_hour, s-current_tm.tm_min, s-current_tm.tm_sec); + +s-reg_intp = 0; +s-reg_rtccon = 0; +s-reg_ticcnt = 0; +s-reg_rtcalm = 0; +s-reg_almsec = 0; +s-reg_almmin = 0; +s-reg_almhour = 0; +s-reg_almday = 0; +s-reg_almmon = 0; +s-reg_almyear = 0; + +s-reg_curticcnt = 0; + +exynos4210_rtc_update_freq(s, s-reg_rtccon); +ptimer_stop(s-ptimer); +ptimer_stop(s-ptimer_1Hz); +} + +static const MemoryRegionOps exynos4210_rtc_ops = { +.read = exynos4210_rtc_read, +.write = exynos4210_rtc_write, +.endianness = DEVICE_NATIVE_ENDIAN, +}; + +/* + * RTC timer initialization + */ 
+static int exynos4210_rtc_init(SysBusDevice *dev) +{ +Exynos4210RTCState *s = FROM_SYSBUS(Exynos4210RTCState, dev); +QEMUBH *bh; + +bh = qemu_bh_new(exynos4210_rtc_tick, s); +s-ptimer = ptimer_init(bh); +ptimer_set_freq(s-ptimer, RTC_BASE_FREQ); +exynos4210_rtc_update_freq(s, 0); + +bh = qemu_bh_new(exynos4210_rtc_1Hz_tick, s); +s-ptimer_1Hz = ptimer_init(bh); +ptimer_set_freq(s-ptimer_1Hz, RTC_BASE_FREQ); + +sysbus_init_irq(dev,s-alm_irq); +sysbus_init_irq(dev,s-tick_irq); + +memory_region_init_io(s-iomem,exynos4210_rtc_ops, s, exynos4210-rtc, +EXYNOS4210_RTC_REG_MEM_SIZE); +sysbus_init_mmio(dev,s-iomem); + +return 0; +} + +static void exynos4210_rtc_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass); + +k-init = exynos4210_rtc_init; +dc-reset = exynos4210_rtc_reset; +dc-vmsd =vmstate_exynos4210_rtc_state; +} + +static const TypeInfo exynos4210_rtc_info = { +.name = exynos4210.rtc, +.parent= TYPE_SYS_BUS_DEVICE, +.instance_size = sizeof(Exynos4210RTCState), +.class_init= exynos4210_rtc_class_init, +}; + +static void exynos4210_rtc_register_types(void) +{ +type_register_static(exynos4210_rtc_info); +} + +type_init(exynos4210_rtc_register_types) Reviewed-by: Evgeny Voevodin e.voevo...@samsung.com -- Kind regards, Evgeny Voevodin, Leading Software Engineer, ASWG, Moscow RD center, Samsung Electronics e-mail: e.voevo...@samsung.com
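The BCD register writes in the quoted patch go through rtc_from_bcd(value, n), whose body is not part of this excerpt. A minimal sketch of what such a conversion typically looks like is given below; bcd_to_int is a name chosen here, and its behaviour (digit count as second argument, no validation of nibbles above 9) is an assumption rather than the patch's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the lowest `digits` BCD digits of `bcd` into a plain integer.
 * Each decimal digit occupies one 4-bit nibble, most significant first.
 * Nibbles above 9 are not rejected; they are simply added as-is. */
static int bcd_to_int(uint32_t bcd, int digits)
{
    int result = 0;

    assert(digits >= 1 && digits <= 8);
    for (int i = digits - 1; i >= 0; i--) {
        result = result * 10 + (int)((bcd >> (4 * i)) & 0xF);
    }
    return result;
}
```

Under these assumptions a seconds register holding 0x59 decodes to 59, and a three-digit year field holding 0x123 decodes to 123.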
Re: [Qemu-devel] [PATCH] Exynos4: added RTC device
On 25.06.2012 13:24, Andreas Färber wrote: Am 25.06.2012 09:55, schrieb Oleg Ogurtsov: Signed-off-by: Oleg Ogurtsovo.ogurt...@samsung.com --- hw/arm/Makefile.objs |1 + hw/exynos4210.c |8 + hw/exynos4210_rtc.c | 607 ++ 3 files changed, 616 insertions(+), 0 deletions(-) create mode 100644 hw/exynos4210_rtc.c This RTC like many other Exynos devices has no dependency on the CPU. I have a patch in preparation that moves such devices from hw/arm/Makefile.objs to hw/Makefile.objs. I don't object to this patch, not even minor style nits spotted, compliment, but if you have to respin for some reason, it would be nice if you could consider that improvement. These devices are SOC specific and this SOC is based on ARM only. Do we really need to move them? -- Kind regards, Evgeny Voevodin, Leading Software Engineer, ASWG, Moscow RD center, Samsung Electronics e-mail: e.voevo...@samsung.com
Re: [Qemu-devel] [PATCH] Exynos4: added RTC device
On 25.06.2012 16:00, Andreas Färber wrote: Am 25.06.2012 13:46, schrieb Evgeny Voevodin: On 25.06.2012 13:24, Andreas Färber wrote: Am 25.06.2012 09:55, schrieb Oleg Ogurtsov: Signed-off-by: Oleg Ogurtsovo.ogurt...@samsung.com --- hw/arm/Makefile.objs |1 + hw/exynos4210.c |8 + hw/exynos4210_rtc.c | 607 ++ 3 files changed, 616 insertions(+), 0 deletions(-) create mode 100644 hw/exynos4210_rtc.c This RTC like many other Exynos devices has no dependency on the CPU. I have a patch in preparation that moves such devices from hw/arm/Makefile.objs to hw/Makefile.objs. I don't object to this patch, not even minor style nits spotted, compliment, but if you have to respin for some reason, it would be nice if you could consider that improvement. These devices are SOC specific and this SOC is based on ARM only. Do we really need to move them? For one, they do not need to be rebuilt when cpu.h changes and they should get the usual device poisoning for proper modeling. For another, someone on IRC started work on an armeb-softmmu, for which we would probably not want to compile in the Exynos devices. Or if we do, we certainly don't want to compile everything twice (cf. xilinx). If devices are ARM-specific and need access to the CPU (e.g., machines, PICs) then according to Paolo they should be placed in hw/arm/ with the new scheme. I'm trying to stay away from moving other people's files around, but the Makefile changes are pretty non-intrusive and can go through arm-devs.next. Cheers, Andreas Oh, I see. Should we place this device to hw/Makefile.objs in v2? -- Kind regards, Evgeny Voevodin, Leading Software Engineer, ASWG, Moscow RD center, Samsung Electronics e-mail: e.voevo...@samsung.com
[Qemu-devel] [PATCH] ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.
From: Stanislav Vorobiov <s.vorob...@samsung.com>

After a long period of time the Linux kernel hung because ptimer_get_count may return 0 before the timer interrupt occurs, causing the FRC to jump back in time.

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
---
 hw/exynos4210_mct.c |    4
 1 file changed, 4 deletions(-)

diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
index 7474fcf..7a22b1f 100644
--- a/hw/exynos4210_mct.c
+++ b/hw/exynos4210_mct.c
@@ -376,10 +376,6 @@ static uint64_t exynos4210_gfrc_get_count(Exynos4210MCTGT *s)
 {
     uint64_t count = 0;
     count = ptimer_get_count(s->ptimer_frc);
-    if (!count) {
-        /* Timer event was generated and s->reg.cnt holds adequate value */
-        return s->reg.cnt;
-    }
     count = s->count - count;
     return s->reg.cnt + count;
 }
-- 
1.7.9.5
Re: [Qemu-devel] [PATCH] ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.
On 22.06.2012 11:56, Peter Crosthwaite wrote:
> Hi Evgeny,
>
> I'm just speculating here, but I recently ran into Linux hangs on Microblaze due to ptimer issues and I think you may be suffering the same base issue. The Microblaze timer (hw/xilinx_timer.c) has a similar implementation to the exynos (chained one-shot ptimer). Recently Peter Chubb put patch cf36b31db209a261ee3bc2737e788e1ced0a1bec through, which modified ptimer to not set excessively short periods. Problem is, that only works for periodic ptimers.

No, that's a different problem. MCT was developed before that commit landed, and it contains similar code to avoid such a situation. Our patch fixes a bug in the logic. Because ptimer uses BHs, there can be a situation where the ptimer has already stopped but the BH has not been called yet. If the target reads the counter at exactly that moment, the value it gets was calculated incorrectly.

> Anyways, you may find that changing your set_count() calls to set_limit (i.e. the function designed for periodic timers) calls works for you, without changing the logic of your device. Here's the change pattern:
>
> -ptimer_set_count(ptimer, count);
> +ptimer_set_limit(ptimer, count, 1);
>
> More permanently (and a question for the chief maintainers) can we look into ways of fixing ptimer properly?
>
> Regards, Peter
>
> On Fri, Jun 22, 2012 at 5:22 PM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:
>> From: Stanislav Vorobiov <s.vorob...@samsung.com>
>>
>> After some long period of time Linux kernel hanged due to ptimer_get_count may return 0 before timer interrupt occurs, thus, causing FRC to jump back in time
>>
>> Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
>
> Reviewed-by: Peter A. G. Crosthwaite <peter.crosthwa...@petalogix.com>

Thanks.

>> ---
>>  hw/exynos4210_mct.c |    4
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
>> index 7474fcf..7a22b1f 100644
>> --- a/hw/exynos4210_mct.c
>> +++ b/hw/exynos4210_mct.c
>> @@ -376,10 +376,6 @@ static uint64_t exynos4210_gfrc_get_count(Exynos4210MCTGT *s)
>>  {
>>      uint64_t count = 0;
>>      count = ptimer_get_count(s->ptimer_frc);
>> -    if (!count) {
>> -        /* Timer event was generated and s->reg.cnt holds adequate value */
>> -        return s->reg.cnt;
>> -    }
>>      count = s->count - count;
>>      return s->reg.cnt + count;
>>  }
>> --
>> 1.7.9.5

-- 
Kind regards, Evgeny Voevodin, Leading Software Engineer, ASWG, Moscow RD center, Samsung Electronics e-mail: e.voevo...@samsung.com
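The window described in the reply (the down-counter has reached zero, but the bottom half that accounts for it has not run yet) can be reproduced in a small model. This is a sketch under stated assumptions: FrcModel and the two read functions are invented for illustration and only mirror the arithmetic visible in the quoted diff.

```c
#include <stdint.h>

/* Toy model of the free-running counter built on a down-counting timer. */
typedef struct {
    uint64_t reg_cnt;    /* total accumulated up to the last reload */
    uint64_t load;       /* value the down-counter was loaded with */
    uint64_t remaining;  /* current value of the down-counter */
} FrcModel;

/* Old logic: a zero down-counter is taken to mean the handler already ran
 * and reg_cnt is up to date, which is not true inside the window. */
static uint64_t frc_read_old(const FrcModel *s)
{
    if (s->remaining == 0) {
        return s->reg_cnt;
    }
    return s->reg_cnt + (s->load - s->remaining);
}

/* New logic: always add the elapsed part of the current period. */
static uint64_t frc_read_new(const FrcModel *s)
{
    return s->reg_cnt + (s->load - s->remaining);
}
```

With reg_cnt = 1000 and load = 100, a read at remaining = 1 returns 1099. One tick later, before the handler has updated reg_cnt, the old logic returns 1000, so the guest sees time go backwards, while the new logic returns 1100.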
Re: [Qemu-devel] Virtio-pci issue
On 30.05.2012 11:56, Stefan Hajnoczi wrote:
> On Tue, May 29, 2012 at 4:48 AM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:
>> On 28.05.2012 16:37, Stefan Hajnoczi wrote:
>>> On Thu, May 24, 2012 at 4:18 AM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:
>>>> There is also another problem that I've faced: the ability to plug in
>>>> as many pci back-ends as the board wants. I mean that if the board has
>>>> to create a transport for each back-end, then the user has to know the
>>>> maximum number of transport instances created by the board. In the case
>>>> of the mmio transport I think that's correct behaviour, but for the pci
>>>> transport it seems not.
>>>
>>> Not sure I understand the problem. Can you rephrase it?
>>>
>>> Stefan
>>
>> Ok, I'll try )
>> As I see it, to connect a pci device to a board it should be enough to
>> specify -device ... on the command line. And in the direction the virtio
>> refactoring is moving, the board should create a transport pci device
>> for each back-end created by a -device ... option. So if we create more
>> back-ends with the -device option than transports created by the board,
>> there will be back-ends that have no corresponding transport device. As
>> a result, the user has to know the maximum number of transport instances
>> created by the board so as not to overrun it. In the case of mmio I think
>> that's normal, but not in the pci case. Am I right?
>
> The only limit to PCI devices should be the number of slots available.

Where is the number of slots defined?

> For convenience we could continue to have virtio-blk-pci,
> virtio-net-pci, etc. which actually just add a virtio-pci adapter and
> link it to a virtio device. Users that want full control can specify:
>
>   -device virtio-pci,id=virtio-pci.0 -device virtio-blk,transport=virtio-pci.0,...
>
> The board doesn't need to preallocate virtio-pci adapters.
>
> Stefan

You suggest that the transport device be created by the user... In that case the interface would differ from mmio, since for mmio the board has to specify the memory and irq mappings for the transport device.
-- Kind regards, Evgeny Voevodin, Leading Software Engineer, ASWG, Moscow RD center, Samsung Electronics e-mail: e.voevo...@samsung.com
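The concern about board-preallocated transports can be sketched as a toy model. Everything here is hypothetical (the Transport type, bind_backend() and the slot count are illustration only and do not correspond to QEMU code): the board owns a fixed array of transports, and each back-end requested with -device takes the first free one.

```c
#include <stddef.h>

/* Hypothetical transport slot preallocated by a board. */
typedef struct {
    const char *backend;  /* name of the bound back-end, NULL while free */
} Transport;

/* Bind a back-end to the first free transport.
 * Returns the slot index, or -1 when the board has no transport left. */
static int bind_backend(Transport *slots, size_t nslots, const char *backend)
{
    for (size_t i = 0; i < nslots; i++) {
        if (slots[i].backend == NULL) {
            slots[i].backend = backend;
            return (int)i;
        }
    }
    return -1;
}
```

With two preallocated transports, a third back-end gets -1 and is left without a transport, which is the overrun described above. Letting the user create the transport explicitly, as Stefan suggests for pci, removes the fixed array from this picture.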
Re: [Qemu-devel] Can we improve virtio data structures with QOM?
On 30.05.2012 20:05, Markus Armbruster wrote:
> Stefan Hajnoczi <stefa...@gmail.com> writes:
>> On Wed, May 30, 2012 at 1:01 PM, Markus Armbruster <arm...@redhat.com> wrote:
>>> Ordinary device models have a single state struct. The first member is
>>> a DeviceState or a specialization of DeviceState, e.g. a PCIDevice.
>>> Simple enough.
>>
>> I think Evgeny's virtio mmio patches change all this. In the recent
>> virtio-pci thread we were discussing how the virtio transport (mmio,
>> pci) and virtio devices (net, blk, etc) fit together.
>>
>> The email thread is "Virtio-pci issues" from Evgeny Voevodin
>> <e.voevo...@samsung.com>.
>
> Thanks for the pointer. It's been a couple of weeks. Evgeny, are you
> still pursuing this?

Yes, but lately we have had a lot of work on the Tizen project, so I delayed this work a bit. If anybody wants, I can send the latest patches so that you can continue the work or improve it, since I'm not sure I'll have time to continue before the 15th of June (but I'll try :).
Actually my work is based on Peter's virtio-mmio patch set http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01870.html, so maybe it's worth adding him to the list: peter.mayd...@linaro.org.

>> It probably makes sense to first merge Evgeny's virtio refactoring and
>> then ensure it's nicely mapped to QOM.
>
> Yes, no good attempting to do too much in one series. Nevertheless,
> having a sufficiently developed idea of the final state in mind helps.

--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com
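The "single state struct whose first member is the parent" layout Markus describes can be sketched in plain C. The types below are simplified, hypothetical stand-ins with a Demo prefix and are not QEMU's real definitions; the point is only that embedding the parent as the first member lets a pointer to the device state be converted to and from a pointer to the generic device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for a generic device, a PCI
 * specialization of it, and one concrete device model's state. */
typedef struct {
    const char *id;
} DemoDeviceState;

typedef struct {
    DemoDeviceState qdev;   /* parent embedded as the first member */
    int devfn;
} DemoPCIDevice;

typedef struct {
    DemoPCIDevice pci;      /* parent embedded as the first member */
    unsigned queue_count;
} DemoNicState;

/* The casts below rely on both parents sitting at offset 0. */
_Static_assert(offsetof(DemoPCIDevice, qdev) == 0, "parent must come first");
_Static_assert(offsetof(DemoNicState, pci) == 0, "parent must come first");

/* Upcast: device-specific state to generic device. */
static DemoDeviceState *demo_nic_to_device(DemoNicState *s)
{
    return &s->pci.qdev;
}

/* Downcast: generic device back to the device-specific state. */
static DemoNicState *device_to_demo_nic(DemoDeviceState *dev)
{
    return (DemoNicState *)dev;
}
```

A device split into a separate transport and back-end no longer fits this single-struct picture directly, which is presumably why the virtio refactoring raises the question in this thread.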