[Qemu-devel] [PATCH 0/2] Final TCG clean-up patches

2013-01-31 Thread Evgeny Voevodin

This set of patches moves the remaining global variables into tcg_ctx.
The second patch also introduces a new TBContext for translation blocks
and moves the translation block globals there. Placing tb_ctx inside
tcg_ctx gives a noticeable speed-up.
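
The layout described above can be sketched in a few lines of C. This is
only an illustration of embedding one context struct by value inside
another; the names Block, TBCtx, GenCtx, gen_ctx, tb_link and HASH_SIZE
are hypothetical stand-ins, not the identifiers used by the patches, and
the sketch makes no claim about why the real change is faster.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_SIZE 8   /* hypothetical; stands in for CODE_GEN_PHYS_HASH_SIZE */

typedef struct Block Block;          /* stands in for TranslationBlock */
struct Block {
    Block *phys_hash_next;
};

/* Translation-block state that used to live in separate globals. */
typedef struct TBCtx {
    Block *phys_hash[HASH_SIZE];
    int nb_tbs;
    int invalidated_flag;
} TBCtx;

/* Code-generation context with the TB context embedded by value. */
typedef struct GenCtx {
    unsigned char *code_gen_ptr;
    TBCtx tb_ctx;
} GenCtx;

static GenCtx gen_ctx;               /* single global, analogous to tcg_ctx */

/* Insert a block at the head of its hash bucket. */
static void tb_link(GenCtx *c, unsigned h, Block *tb)
{
    assert(tb != NULL);
    tb->phys_hash_next = c->tb_ctx.phys_hash[h % HASH_SIZE];
    c->tb_ctx.phys_hash[h % HASH_SIZE] = tb;
    c->tb_ctx.nb_tbs++;
}
```

Callers reach every field through the one global, e.g.
gen_ctx.tb_ctx.nb_tbs, which mirrors the tcg_ctx.tb_ctx.* accesses in
patch 2/2.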


After this patchset was applied,
I noticed a ~4-5% speed-up of code generation.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles are stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 662.4
max: 696
avg: 672.28
standard deviation: ~17 ~= 3.5%

Average cycles/op = 672 +- 17


After clean-up:
min: 635
max: 650.5
avg: 640.14
standard deviation: ~8 ~= 1.6%

Average cycles/op = 640 +- 8
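
As a quick cross-check of the quoted speed-up, the relative change can be
computed from the two averages above. The helper name speedup_percent is
hypothetical and exists only for this sketch.

```c
#include <assert.h>

/* Relative reduction in cycles/op, expressed in percent. */
static double speedup_percent(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

With before = 672.28 and after = 640.14 this gives roughly 4.8%, which is
in line with the ~4-5% figure stated above.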

Evgeny Voevodin (2):
  TCG: Final globals clean-up
  TCG: Move translation block variables to new context inside tcg_ctx: tb_ctx

 cpu-exec.c  |   18 +++--
 include/exec/exec-all.h |   27 +---
 linux-user/main.c   |6 +-
 tcg/tcg.c   |2 +-
 tcg/tcg.h   |   16 -
 translate-all.c |  173 +++
 6 files changed, 130 insertions(+), 112 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH 2/2] TCG: Move translation block variables to new context inside tcg_ctx: tb_ctx

2013-01-31 Thread Evgeny Voevodin
It is worth cleaning up the translation block variables and moving them
into one context, as was suggested by Swirl.
Also, if we use this context directly inside tcg_ctx, it
speeds up code generation a bit.

Signed-off-by: Evgeny Voevodin evgenyvoevo...@gmail.com
---
 cpu-exec.c  |   18 -
 include/exec/exec-all.h |   27 +
 linux-user/main.c   |6 +--
 tcg/tcg.h   |2 +
 translate-all.c |   96 +++
 5 files changed, 79 insertions(+), 70 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 19ebb4a..ff9a884 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -23,8 +23,6 @@
 #include "qemu/atomic.h"
 #include "sysemu/qtest.h"
 
-int tb_invalidated_flag;
-
 //#define CONFIG_DEBUG_EXEC
 
 bool qemu_cpu_has_work(CPUState *cpu)
@@ -90,13 +88,13 @@ static TranslationBlock *tb_find_slow(CPUArchState *env,
 tb_page_addr_t phys_pc, phys_page1;
 target_ulong virt_page2;
 
-tb_invalidated_flag = 0;
+tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
 
 /* find translated block using physical mappings */
 phys_pc = get_page_addr_code(env, pc);
 phys_page1 = phys_pc & TARGET_PAGE_MASK;
 h = tb_phys_hash_func(phys_pc);
-ptb1 = tb_phys_hash[h];
+ptb1 = tcg_ctx.tb_ctx.tb_phys_hash[h];
 for(;;) {
 tb = *ptb1;
 if (!tb)
@@ -128,8 +126,8 @@ static TranslationBlock *tb_find_slow(CPUArchState *env,
 /* Move the last found TB to the head of the list */
 if (likely(*ptb1)) {
 *ptb1 = tb->phys_hash_next;
-tb->phys_hash_next = tb_phys_hash[h];
-tb_phys_hash[h] = tb;
+tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
+tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
 }
 /* we add the TB in the virtual pc hash table */
 env->tb_jmp_cache[tb_jmp_cache_hash_func(pc)] = tb;
@@ -563,16 +561,16 @@ int cpu_exec(CPUArchState *env)
 #endif
 }
 #endif /* DEBUG_DISAS || CONFIG_DEBUG_EXEC */
-spin_lock(&tb_lock);
+spin_lock(&tcg_ctx.tb_ctx.tb_lock);
 tb = tb_find_fast(env);
 /* Note: we do it here to avoid a gcc bug on Mac OS X when
doing it in tb_find_slow */
-if (tb_invalidated_flag) {
+if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
 /* as some TB could have been invalidated because
of memory exceptions while generating the code, we
must recompute the hash index here */
 next_tb = 0;
-tb_invalidated_flag = 0;
+tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
 }
 #ifdef CONFIG_DEBUG_EXEC
 qemu_log_mask(CPU_LOG_EXEC, "Trace %p [" TARGET_FMT_lx "] %s\n",
@@ -585,7 +583,7 @@ int cpu_exec(CPUArchState *env)
 if (next_tb != 0 && tb->page_addr[1] == -1) {
 tb_add_jump((TranslationBlock *)(next_tb & ~3), next_tb & 3, tb);
 }
-spin_unlock(&tb_lock);
+spin_unlock(&tcg_ctx.tb_ctx.tb_lock);
 
 /* cpu_interrupt might be called while translating the
TB, but before it is linked into a potentially
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index d235ef8..f685c28 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -168,6 +168,25 @@ struct TranslationBlock {
 uint32_t icount;
 };
 
+#include "exec/spinlock.h"
+
+typedef struct TBContext TBContext;
+
+struct TBContext {
+
+TranslationBlock *tbs;
+TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
+int nb_tbs;
+/* any access to the tbs or the page table must use this lock */
+spinlock_t tb_lock;
+
+/* statistics */
+int tb_flush_count;
+int tb_phys_invalidate_count;
+
+int tb_invalidated_flag;
+};
+
 static inline unsigned int tb_jmp_cache_hash_page(target_ulong pc)
 {
 target_ulong tmp;
@@ -192,8 +211,6 @@ void tb_free(TranslationBlock *tb);
 void tb_flush(CPUArchState *env);
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
 
-extern TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
-
 #if defined(USE_DIRECT_JUMP)
 
 #if defined(CONFIG_TCG_INTERPRETER)
@@ -275,12 +292,6 @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
 }
 }
 
-#include "exec/spinlock.h"
-
-extern spinlock_t tb_lock;
-
-extern int tb_invalidated_flag;
-
 /* The return address may point to the start of the next instruction.
Subtracting one gets us the call instruction itself.  */
 #if defined(CONFIG_TCG_INTERPRETER)
diff --git a/linux-user/main.c b/linux-user/main.c
index 0181bc2..8f09abd 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -111,7 +111,7 @@ static int pending_cpus;
 /* Make sure everything is in a consistent state for calling fork().  */
 void fork_start(void)
 {
-pthread_mutex_lock(&tb_lock);
+pthread_mutex_lock

[Qemu-devel] [PATCH 1/2] TCG: Final globals clean-up

2013-01-31 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin evgenyvoevo...@gmail.com
---
 tcg/tcg.c   |2 +-
 tcg/tcg.h   |   14 ++--
 translate-all.c |   97 ---
 3 files changed, 61 insertions(+), 52 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 9275e37..c8a843e 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -263,7 +263,7 @@ void tcg_context_init(TCGContext *s)
 void tcg_prologue_init(TCGContext *s)
 {
 /* init global prologue and epilogue */
-s->code_buf = code_gen_prologue;
+s->code_buf = s->code_gen_prologue;
 s->code_ptr = s->code_buf;
 tcg_target_qemu_prologue(s);
 flush_icache_range((tcg_target_ulong)s->code_buf,
diff --git a/tcg/tcg.h b/tcg/tcg.h
index a427972..4086e98 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -462,6 +462,15 @@ struct TCGContext {
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
 
+/* Code generation */
+int code_gen_max_blocks;
+uint8_t *code_gen_prologue;
+uint8_t *code_gen_buffer;
+size_t code_gen_buffer_size;
+/* threshold to flush the translated code buffer */
+size_t code_gen_buffer_max_size;
+uint8_t *code_gen_ptr;
+
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
 /* labels info for qemu_ld/st IRs
The labels help to generate TLB miss case codes at the end of TB */
@@ -658,12 +667,11 @@ TCGv_i64 tcg_const_i64(int64_t val);
 TCGv_i32 tcg_const_local_i32(int32_t val);
 TCGv_i64 tcg_const_local_i64(int64_t val);
 
-extern uint8_t *code_gen_prologue;
-
 /* TCG targets may use a different definition of tcg_qemu_tb_exec. */
 #if !defined(tcg_qemu_tb_exec)
 # define tcg_qemu_tb_exec(env, tb_ptr) \
-((tcg_target_ulong (*)(void *, void *))code_gen_prologue)(env, tb_ptr)
+((tcg_target_ulong (*)(void *, void *))tcg_ctx.code_gen_prologue)(env, \
+  tb_ptr)
 #endif
 
 void tcg_register_jit(void *buf, size_t buf_size);
diff --git a/translate-all.c b/translate-all.c
index d367fc4..d666562 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -72,21 +72,13 @@
 
 #define SMC_BITMAP_USE_THRESHOLD 10
 
-/* Code generation and translation blocks */
+/* Translation blocks */
 static TranslationBlock *tbs;
-static int code_gen_max_blocks;
 TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
 static int nb_tbs;
 /* any access to the tbs or the page table must use this lock */
 spinlock_t tb_lock = SPIN_LOCK_UNLOCKED;
 
-uint8_t *code_gen_prologue;
-static uint8_t *code_gen_buffer;
-static size_t code_gen_buffer_size;
-/* threshold to flush the translated code buffer */
-static size_t code_gen_buffer_max_size;
-static uint8_t *code_gen_ptr;
-
 typedef struct PageDesc {
 /* list of TBs intersecting this ram page */
 TranslationBlock *first_tb;
@@ -514,7 +506,7 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
 if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
 tb_size = MAX_CODE_GEN_BUFFER_SIZE;
 }
-code_gen_buffer_size = tb_size;
+tcg_ctx.code_gen_buffer_size = tb_size;
 return tb_size;
 }
 
@@ -524,7 +516,7 @@ static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
 
 static inline void *alloc_code_gen_buffer(void)
 {
-map_exec(static_code_gen_buffer, code_gen_buffer_size);
+map_exec(static_code_gen_buffer, tcg_ctx.code_gen_buffer_size);
 return static_code_gen_buffer;
 }
 #elif defined(USE_MMAP)
@@ -547,8 +539,8 @@ static inline void *alloc_code_gen_buffer(void)
Leave the choice of exact location with the kernel.  */
 flags |= MAP_32BIT;
 /* Cannot expect to map more than 800MB in low memory.  */
-if (code_gen_buffer_size > 800u * 1024 * 1024) {
-code_gen_buffer_size = 800u * 1024 * 1024;
+if (tcg_ctx.code_gen_buffer_size > 800u * 1024 * 1024) {
+tcg_ctx.code_gen_buffer_size = 800u * 1024 * 1024;
 }
 # elif defined(__sparc__)
 start = 0x4000ul;
@@ -556,17 +548,17 @@ static inline void *alloc_code_gen_buffer(void)
 start = 0x9000ul;
 # endif
 
-buf = mmap((void *)start, code_gen_buffer_size,
+buf = mmap((void *)start, tcg_ctx.code_gen_buffer_size,
PROT_WRITE | PROT_READ | PROT_EXEC, flags, -1, 0);
 return buf == MAP_FAILED ? NULL : buf;
 }
 #else
 static inline void *alloc_code_gen_buffer(void)
 {
-void *buf = g_malloc(code_gen_buffer_size);
+void *buf = g_malloc(tcg_ctx.code_gen_buffer_size);
 
 if (buf) {
-map_exec(buf, code_gen_buffer_size);
+map_exec(buf, tcg_ctx.code_gen_buffer_size);
 }
 return buf;
 }
@@ -574,27 +566,30 @@ static inline void *alloc_code_gen_buffer(void)
 
 static inline void code_gen_alloc(size_t tb_size)
 {
-code_gen_buffer_size = size_code_gen_buffer(tb_size);
-code_gen_buffer = alloc_code_gen_buffer();
-if (code_gen_buffer == NULL) {
+tcg_ctx.code_gen_buffer_size = size_code_gen_buffer(tb_size);
+tcg_ctx.code_gen_buffer

Re: [Qemu-devel] [PATCH] exynos4210/mct: Avoid infinite loop on non incremental timers

2012-12-03 Thread Evgeny Voevodin

On 12/01/2012 09:08 PM, Jean-Christophe DUBOIS wrote:

Check for a 0 distance value to avoid an infinite loop when the
expired FRC timer was not programmed with auto-increment.

With this change the behavior is consistent with the same type
of code in the exynos4210_gfrc_restart() function in the same
file.

Linux seems to mostly use this timer with auto-increment,
which explains why it is not a problem most of the time.

However, other OSes might have a problem with this if they
don't use the auto-increment feature.

Signed-off-by: Jean-Christophe DUBOIS j...@tribudubois.net
---
 hw/exynos4210_mct.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
index e79cd6a..31a41d5 100644
--- a/hw/exynos4210_mct.c
+++ b/hw/exynos4210_mct.c
@@ -568,7 +568,7 @@ static void exynos4210_gfrc_event(void *opaque)
 /* Reload FRC to reach nearest comparator */
 s->g_timer.curr_comp = exynos4210_gcomp_find(s);
 distance = exynos4210_gcomp_get_distance(s, s->g_timer.curr_comp);
-if (distance > MCT_GT_COUNTER_STEP) {
+if ((distance > MCT_GT_COUNTER_STEP) || !distance) {


You don't need additional braces here.


distance = MCT_GT_COUNTER_STEP;
 }
 exynos4210_gfrc_set_count(&s->g_timer, distance);
--

1.7.9.5
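
To illustrate the inline remark about the extra parentheses in the patch
quoted above: in C, relational operators bind tighter than ||, so the
condition means the same with or without them. The sketch below is a
stand-alone model of the clamp, not the device code. It assumes the
comparison operator that the archive dropped is '>', and COUNTER_STEP is a
hypothetical value chosen for illustration; the real MCT_GT_COUNTER_STEP
is defined in hw/exynos4210_mct.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical step value, for illustration only. */
#define COUNTER_STEP 0x100000000ULL

/* Clamp the distance to the next comparator so that it is never zero
 * and never larger than one counter step. */
static uint64_t clamp_distance(uint64_t distance)
{
    /* '>' has higher precedence than '||', so no inner parentheses
     * are needed around the comparison. */
    if (distance > COUNTER_STEP || !distance) {
        distance = COUNTER_STEP;
    }
    /* A zero distance is what would make the timer re-fire forever. */
    assert(distance != 0);
    return distance;
}
```

A distance of 0 and a distance above the step both come back as
COUNTER_STEP, while any value in between is returned unchanged.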





This doesn't apply to current master; please rebase:

Applying: exynos4210/mct: Avoid infinite loop on non incremental timers
error: patch failed: hw/exynos4210_mct.c:568
error: hw/exynos4210_mct.c: patch does not apply


--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Centre,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [PATCH v2] exynos4210/mct: Avoid infinite loop on non incremental timers

2012-12-03 Thread Evgeny Voevodin

On 12/04/2012 02:55 AM, Jean-Christophe DUBOIS wrote:

Check for a 0 distance value to avoid an infinite loop when the
expired FRC timer was not programmed with auto-increment.

With this change the behavior is consistent with the same type
of code in the exynos4210_gfrc_restart() function in the same
file.

Linux seems to mostly use this timer with auto-increment,
which explains why it is not a problem most of the time.

However, other OSes might have a problem with this if they
don't use the auto-increment feature.

Signed-off-by: Jean-Christophe DUBOIS j...@tribudubois.net
---
  hw/exynos4210_mct.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
index e79cd6a..37dbda9 100644
--- a/hw/exynos4210_mct.c
+++ b/hw/exynos4210_mct.c
@@ -568,7 +568,7 @@ static void exynos4210_gfrc_event(void *opaque)
  /* Reload FRC to reach nearest comparator */
  s->g_timer.curr_comp = exynos4210_gcomp_find(s);
  distance = exynos4210_gcomp_get_distance(s, s->g_timer.curr_comp);
-if (distance > MCT_GT_COUNTER_STEP) {
+if (distance > MCT_GT_COUNTER_STEP || !distance) {
  distance = MCT_GT_COUNTER_STEP;
  }
  exynos4210_gfrc_set_count(&s->g_timer, distance);



Reviewed-by: Evgeny Voevodin e.voevo...@samsung.com

P.S.: Next time, please don't forget to CC the appropriate people so
that they don't miss your patch.

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] [PATCH 0/5] TCG global gen_opc_ arrays clean-up

2012-12-03 Thread Evgeny Voevodin

On 11/26/2012 08:19 AM, Evgeny Voevodin wrote:

On 11/21/2012 11:43 AM, Evgeny Voevodin wrote:

This set of patches moves these global variables to tcg_ctx:
gen_opc_instr_start
gen_opc_icount
gen_opc_pc

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied,
I noticed no speed-up or slow-down of code generation.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles are stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 655.5
max: 659.3
avg: 657.2
standard deviation: ~2 ~= 0.4%

Average cycles/op = 657 +- 2



After clean-up:
min: 654.6
max: 657.1
avg: 655.5
standard deviation: ~1 ~= 0.2%

Average cycles/op = 656 +- 1

Evgeny Voevodin (5):
   tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.
   TCG: Use gen_opc_pc from context instead of global variable.
   TCG: Use gen_opc_icount from context instead of global variable.
   TCG: Use gen_opc_instr_start from context instead of global variable.
   TCG: Remove unused global gen_opc_ arrays.

  exec-all.h|4 
  target-alpha/translate.c  |   12 ++--
  target-arm/translate.c|   12 ++--
  target-cris/translate.c   |   14 +++---
  target-i386/translate.c   |   19 ++-
  target-lm32/translate.c   |   12 ++--
  target-m68k/translate.c   |   12 ++--
  target-microblaze/translate.c |   12 ++--
  target-mips/translate.c   |   12 ++--
  target-openrisc/translate.c   |   12 ++--
  target-ppc/translate.c|   12 ++--
  target-s390x/translate.c  |   12 ++--
  target-sh4/translate.c|   12 ++--
  target-sparc/translate.c  |   12 ++--
  target-unicore32/translate.c  |   12 ++--
  target-xtensa/translate.c |   10 +-
  tcg/tcg.h |3 +++
  translate-all.c   |9 +++--
  18 files changed, 100 insertions(+), 103 deletions(-)



Ping?

+CC: Alexander Graf ag...@suse.de; Paul Brook p...@codesourcery.com



Ping??

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Centre,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [PATCH 0/5] TCG global gen_opc_ arrays clean-up

2012-11-25 Thread Evgeny Voevodin

On 11/21/2012 11:43 AM, Evgeny Voevodin wrote:

This set of patches moves these global variables to tcg_ctx:
gen_opc_instr_start
gen_opc_icount
gen_opc_pc

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied,
I noticed no speed-up or slow-down of code generation.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles are stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 655.5
max: 659.3
avg: 657.2
standard deviation: ~2 ~= 0.4%

Average cycles/op = 657 +- 2



After clean-up:
min: 654.6
max: 657.1
avg: 655.5
standard deviation: ~1 ~= 0.2%

Average cycles/op = 656 +- 1

Evgeny Voevodin (5):
   tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.
   TCG: Use gen_opc_pc from context instead of global variable.
   TCG: Use gen_opc_icount from context instead of global variable.
   TCG: Use gen_opc_instr_start from context instead of global variable.
   TCG: Remove unused global gen_opc_ arrays.

  exec-all.h|4 
  target-alpha/translate.c  |   12 ++--
  target-arm/translate.c|   12 ++--
  target-cris/translate.c   |   14 +++---
  target-i386/translate.c   |   19 ++-
  target-lm32/translate.c   |   12 ++--
  target-m68k/translate.c   |   12 ++--
  target-microblaze/translate.c |   12 ++--
  target-mips/translate.c   |   12 ++--
  target-openrisc/translate.c   |   12 ++--
  target-ppc/translate.c|   12 ++--
  target-s390x/translate.c  |   12 ++--
  target-sh4/translate.c|   12 ++--
  target-sparc/translate.c  |   12 ++--
  target-unicore32/translate.c  |   12 ++--
  target-xtensa/translate.c |   10 +-
  tcg/tcg.h |3 +++
  translate-all.c   |9 +++--
  18 files changed, 100 insertions(+), 103 deletions(-)



Ping?

+CC: Alexander Graf ag...@suse.de; Paul Brook p...@codesourcery.com

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Centre,
e-mail: e.voevo...@samsung.com




[Qemu-devel] [PATCH 4/5] TCG: Use gen_opc_instr_start from context instead of global variable.

2012-11-21 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |6 +++---
 target-arm/translate.c|6 +++---
 target-cris/translate.c   |6 +++---
 target-i386/translate.c   |8 
 target-lm32/translate.c   |6 +++---
 target-m68k/translate.c   |6 +++---
 target-microblaze/translate.c |6 +++---
 target-mips/translate.c   |6 +++---
 target-openrisc/translate.c   |6 +++---
 target-ppc/translate.c|6 +++---
 target-s390x/translate.c  |6 +++---
 target-sh4/translate.c|6 +++---
 target-sparc/translate.c  |6 +++---
 target-unicore32/translate.c  |6 +++---
 target-xtensa/translate.c |4 ++--
 translate-all.c   |3 ++-
 16 files changed, 47 insertions(+), 46 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 8b73fbb..71fe1a1 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3410,10 +3410,10 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 if (lj < j) {
 lj++;
 while (lj < j)
-gen_opc_instr_start[lj++] = 0;
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
 }
 tcg_ctx.gen_opc_pc[lj] = ctx.pc;
-gen_opc_instr_start[lj] = 1;
+tcg_ctx.gen_opc_instr_start[lj] = 1;
 tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
@@ -3468,7 +3468,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
-gen_opc_instr_start[lj++] = 0;
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
 } else {
 tb->size = ctx.pc - pc_start;
 tb->icount = num_insns;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 4695d8b..3cf3604 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9838,11 +9838,11 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 if (lj < j) {
 lj++;
 while (lj < j)
-gen_opc_instr_start[lj++] = 0;
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
 }
 tcg_ctx.gen_opc_pc[lj] = dc->pc;
 gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
-gen_opc_instr_start[lj] = 1;
+tcg_ctx.gen_opc_instr_start[lj] = 1;
 tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 
@@ -9977,7 +9977,7 @@ done_generating:
 j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
-gen_opc_instr_start[lj++] = 0;
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
 } else {
 tb->size = dc->pc - pc_start;
 tb->icount = num_insns;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 6ec8c3c..60bdc24 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3301,7 +3301,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 if (lj < j) {
 lj++;
 while (lj < j) {
-gen_opc_instr_start[lj++] = 0;
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
 }
 }
 if (dc->delayed_branch == 1) {
@@ -3309,7 +3309,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 } else {
 tcg_ctx.gen_opc_pc[lj] = dc->pc;
 }
-gen_opc_instr_start[lj] = 1;
+tcg_ctx.gen_opc_instr_start[lj] = 1;
 tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 
@@ -3439,7 +3439,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j) {
-gen_opc_instr_start[lj++] = 0;
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
 }
 } else {
 tb->size = dc->pc - pc_start;
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 80fb695..f394ea6 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7988,11 +7988,11 @@ static inline void gen_intermediate_code_internal(CPUX86State *env,
 if (lj < j) {
 lj++;
 while (lj < j)
-gen_opc_instr_start[lj++] = 0;
+tcg_ctx.gen_opc_instr_start[lj++] = 0;
 }
 tcg_ctx.gen_opc_pc[lj] = pc_ptr;
 gen_opc_cc_op[lj] = dc->cc_op;
-gen_opc_instr_start[lj] = 1;
+tcg_ctx.gen_opc_instr_start[lj] = 1;
 tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
@@ -8037,7 +8037,7 @@ static inline void

[Qemu-devel] [PATCH 5/5] TCG: Remove unused global gen_opc_ arrays.

2012-11-21 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 exec-all.h  |4 
 translate-all.c |4 
 2 files changed, 8 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 21aacda..b18d4ca 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -70,10 +70,6 @@ typedef struct TranslationBlock TranslationBlock;
 
 #define OPPARAM_BUF_SIZE (OPC_BUF_SIZE * MAX_OPC_PARAM)
 
-extern target_ulong gen_opc_pc[OPC_BUF_SIZE];
-extern uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-extern uint16_t gen_opc_icount[OPC_BUF_SIZE];
-
 #include "qemu-log.h"
 
 void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
diff --git a/translate-all.c b/translate-all.c
index 2f616bf..f22e3ee 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,10 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-target_ulong gen_opc_pc[OPC_BUF_SIZE];
-uint16_t gen_opc_icount[OPC_BUF_SIZE];
-uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-
 void cpu_gen_init(void)
 {
 tcg_context_init(&tcg_ctx);
-- 
1.7.9.5




[Qemu-devel] [PATCH 1/5] tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.

2012-11-20 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.h |3 +++
 1 file changed, 3 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 9481e35..f6e255f 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -455,6 +455,9 @@ struct TCGContext {
 
 uint16_t *gen_opc_ptr;
 TCGArg *gen_opparam_ptr;
+target_ulong gen_opc_pc[OPC_BUF_SIZE];
+uint16_t gen_opc_icount[OPC_BUF_SIZE];
+uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
 
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
 /* labels info for qemu_ld/st IRs
-- 
1.7.9.5




[Qemu-devel] [PATCH 3/5] TCG: Use gen_opc_icount from context instead of global variable.

2012-11-20 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |2 +-
 target-arm/translate.c|2 +-
 target-cris/translate.c   |2 +-
 target-i386/translate.c   |2 +-
 target-lm32/translate.c   |2 +-
 target-m68k/translate.c   |2 +-
 target-microblaze/translate.c |2 +-
 target-mips/translate.c   |2 +-
 target-openrisc/translate.c   |2 +-
 target-ppc/translate.c|2 +-
 target-s390x/translate.c  |2 +-
 target-sh4/translate.c|2 +-
 target-sparc/translate.c  |2 +-
 target-unicore32/translate.c  |2 +-
 target-xtensa/translate.c |2 +-
 translate-all.c   |2 +-
 16 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index bcde367..8b73fbb 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3414,7 +3414,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 }
 tcg_ctx.gen_opc_pc[lj] = ctx.pc;
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount[lj] = num_insns;
+tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
 gen_io_start();
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 8ea8bba..4695d8b 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9843,7 +9843,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 tcg_ctx.gen_opc_pc[lj] = dc->pc;
 gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount[lj] = num_insns;
+tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 
 if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 745cd7a..6ec8c3c 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3310,7 +3310,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 tcg_ctx.gen_opc_pc[lj] = dc->pc;
 }
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount[lj] = num_insns;
+tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 
 /* Pretty disas.  */
diff --git a/target-i386/translate.c b/target-i386/translate.c
index aea843c..80fb695 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7993,7 +7993,7 @@ static inline void gen_intermediate_code_internal(CPUX86State *env,
 tcg_ctx.gen_opc_pc[lj] = pc_ptr;
 gen_opc_cc_op[lj] = dc->cc_op;
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount[lj] = num_insns;
+tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
 gen_io_start();
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index fcafb06..4e029e0 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1056,7 +1056,7 @@ static void gen_intermediate_code_internal(CPULM32State *env,
 }
 tcg_ctx.gen_opc_pc[lj] = dc->pc;
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount[lj] = num_insns;
+tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 
 /* Pretty disas.  */
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 74772dd..0762085 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -3023,7 +3023,7 @@ gen_intermediate_code_internal(CPUM68KState *env, TranslationBlock *tb,
 }
 tcg_ctx.gen_opc_pc[lj] = dc->pc;
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount[lj] = num_insns;
+tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
 gen_io_start();
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 6803f73..d975756 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1792,7 +1792,7 @@ gen_intermediate_code_internal(CPUMBState *env, TranslationBlock *tb,
 }
 tcg_ctx.gen_opc_pc[lj] = dc->pc;
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount[lj] = num_insns;
+tcg_ctx.gen_opc_icount[lj] = num_insns;
 }
 
 /* Pretty disas.  */
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 17d5ece..81807cf 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -15559,7 +15559,7 @@ gen_intermediate_code_internal (CPUMIPSState *env, TranslationBlock *tb,
 gen_opc_hflags[lj] = ctx.hflags & MIPS_HFLAG_BMASK;
 gen_opc_btarget[lj] = ctx.btarget;
 gen_opc_instr_start[lj] = 1;
-gen_opc_icount

[Qemu-devel] [PATCH 2/5] TCG: Use gen_opc_pc from context instead of global variable.

2012-11-20 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |4 ++--
 target-arm/translate.c|4 ++--
 target-cris/translate.c   |6 +++---
 target-i386/translate.c   |9 +
 target-lm32/translate.c   |4 ++--
 target-m68k/translate.c   |4 ++--
 target-microblaze/translate.c |4 ++--
 target-mips/translate.c   |4 ++--
 target-openrisc/translate.c   |4 ++--
 target-ppc/translate.c|4 ++--
 target-s390x/translate.c  |4 ++--
 target-sh4/translate.c|4 ++--
 target-sparc/translate.c  |4 ++--
 target-unicore32/translate.c  |4 ++--
 target-xtensa/translate.c |4 ++--
 15 files changed, 34 insertions(+), 33 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 4045f78..bcde367 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3412,7 +3412,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 while (lj < j)
 gen_opc_instr_start[lj++] = 0;
 }
-gen_opc_pc[lj] = ctx.pc;
+tcg_ctx.gen_opc_pc[lj] = ctx.pc;
 gen_opc_instr_start[lj] = 1;
 gen_opc_icount[lj] = num_insns;
 }
@@ -3551,5 +3551,5 @@ CPUAlphaState * cpu_alpha_init (const char *cpu_model)
 
 void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb, int pc_pos)
 {
-env->pc = gen_opc_pc[pc_pos];
+env->pc = tcg_ctx.gen_opc_pc[pc_pos];
 }
diff --git a/target-arm/translate.c b/target-arm/translate.c
index c42110a..8ea8bba 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9840,7 +9840,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 while (lj < j)
 gen_opc_instr_start[lj++] = 0;
 }
-gen_opc_pc[lj] = dc->pc;
+tcg_ctx.gen_opc_pc[lj] = dc->pc;
 gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
 gen_opc_instr_start[lj] = 1;
 gen_opc_icount[lj] = num_insns;
@@ -10043,6 +10043,6 @@ void cpu_dump_state(CPUARMState *env, FILE *f, fprintf_function cpu_fprintf,
 
 void restore_state_to_opc(CPUARMState *env, TranslationBlock *tb, int pc_pos)
 {
-env->regs[15] = gen_opc_pc[pc_pos];
+env->regs[15] = tcg_ctx.gen_opc_pc[pc_pos];
 env->condexec_bits = gen_opc_condexec_bits[pc_pos];
 }
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 0b0e86d..745cd7a 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3305,9 +3305,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 }
 }
 if (dc->delayed_branch == 1) {
-gen_opc_pc[lj] = dc->ppc | 1;
+tcg_ctx.gen_opc_pc[lj] = dc->ppc | 1;
 } else {
-gen_opc_pc[lj] = dc->pc;
+tcg_ctx.gen_opc_pc[lj] = dc->pc;
 }
 gen_opc_instr_start[lj] = 1;
 gen_opc_icount[lj] = num_insns;
@@ -3621,5 +3621,5 @@ CRISCPU *cpu_cris_init(const char *cpu_model)
 
 void restore_state_to_opc(CPUCRISState *env, TranslationBlock *tb, int pc_pos)
 {
-env->pc = gen_opc_pc[pc_pos];
+env->pc = tcg_ctx.gen_opc_pc[pc_pos];
 }
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 8e676ba..aea843c 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7990,7 +7990,7 @@ static inline void gen_intermediate_code_internal(CPUX86State *env,
 while (lj < j)
 gen_opc_instr_start[lj++] = 0;
 }
-gen_opc_pc[lj] = pc_ptr;
+tcg_ctx.gen_opc_pc[lj] = pc_ptr;
 gen_opc_cc_op[lj] = dc->cc_op;
 gen_opc_instr_start[lj] = 1;
 gen_opc_icount[lj] = num_insns;
@@ -8081,15 +8081,16 @@ void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb, int pc_pos)
 qemu_log("RESTORE:\n");
 for(i = 0;i <= pc_pos; i++) {
 if (gen_opc_instr_start[i]) {
-qemu_log("0x%04x: " TARGET_FMT_lx "\n", i, gen_opc_pc[i]);
+qemu_log("0x%04x: " TARGET_FMT_lx "\n", i,
+tcg_ctx.gen_opc_pc[i]);
 }
 }
 qemu_log("pc_pos=0x%x eip=" TARGET_FMT_lx " cs_base=%x\n",
-pc_pos, gen_opc_pc[pc_pos] - tb->cs_base,
+pc_pos, tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base,
 (uint32_t)tb->cs_base);
 }
 #endif
-env->eip = gen_opc_pc[pc_pos] - tb->cs_base;
+env->eip = tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base;
 cc_op = gen_opc_cc_op[pc_pos];
 if (cc_op != CC_OP_DYNAMIC)
 env->cc_op = cc_op;
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index af98649..fcafb06 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1054,7 +1054,7 @@ static void gen_intermediate_code_internal(CPULM32State 
*env

[Qemu-devel] [PATCH 0/5] TCG global gen_opc_ arrays clean-up

2012-11-20 Thread Evgeny Voevodin
This set of patches moves global variables to tcg_ctx:
gen_opc_instr_start
gen_opc_icount
gen_opc_pc

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied,
I noticed no speed-up or slow-down of code generation.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles value is stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 655.5
max: 659.3
avg: 657.2
standard deviation: ~2 ~= 0.4%

Average cycles/op = 657 +- 2



After clean-up:
min: 654.6
max: 657.1
avg: 655.5
standard deviation: ~1 ~= 0.2%

Average cycles/op = 656 +- 1
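
The before/after comparison can be sketched as a check of whether the two
reported ranges overlap. This is only an illustration and not part of the
series: the measurement struct and the intervals_overlap() helper are
hypothetical names, and treating "mean +- one standard deviation" as the
range is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* One benchmark result: mean cycles/op and its standard deviation. */
struct measurement {
    double mean;
    double stddev;
};

/*
 * Return true if the ranges [mean - stddev, mean + stddev] of the two
 * measurements overlap, i.e. the difference is within the reported noise.
 */
static bool intervals_overlap(struct measurement a, struct measurement b)
{
    assert(a.stddev >= 0.0 && b.stddev >= 0.0);
    return a.mean - a.stddev <= b.mean + b.stddev &&
           b.mean - b.stddev <= a.mean + a.stddev;
}
```

With the rounded figures above (657 +- 2 before, 656 +- 1 after) the ranges
overlap, which is consistent with reporting no measurable change.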

Evgeny Voevodin (5):
  tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext.
  TCG: Use gen_opc_pc from context instead of global variable.
  TCG: Use gen_opc_icount from context instead of global variable.
  TCG: Use gen_opc_instr_start from context instead of global variable.
  TCG: Remove unused global gen_opc_ arrays.

 exec-all.h|4 
 target-alpha/translate.c  |   12 ++--
 target-arm/translate.c|   12 ++--
 target-cris/translate.c   |   14 +++---
 target-i386/translate.c   |   19 ++-
 target-lm32/translate.c   |   12 ++--
 target-m68k/translate.c   |   12 ++--
 target-microblaze/translate.c |   12 ++--
 target-mips/translate.c   |   12 ++--
 target-openrisc/translate.c   |   12 ++--
 target-ppc/translate.c|   12 ++--
 target-s390x/translate.c  |   12 ++--
 target-sh4/translate.c|   12 ++--
 target-sparc/translate.c  |   12 ++--
 target-unicore32/translate.c  |   12 ++--
 target-xtensa/translate.c |   10 +-
 tcg/tcg.h |3 +++
 translate-all.c   |9 +++--
 18 files changed, 100 insertions(+), 103 deletions(-)

-- 
1.7.9.5




Re: [Qemu-devel] [PATCH v6 0/7] TCG global variables clean-up

2012-11-15 Thread Evgeny Voevodin

On 11/12/2012 01:27 PM, Evgeny Voevodin wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
This is probably due to better data caching.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles value is stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%

Average cycles/op = 733 +- 2



After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%

Average cycles/op = 728 +- 3
Speed-up of TCG code generation = 0.7%

Changelog:
v5->v6:
Fixed broken patches.
Rebased.
v4->v5:
Rebased.
Fixed authorship.
All patches are reviewed-by Richard Henderson <r...@twiddle.net>
v3->v4:
Rebased.
Added target-cris/translate.c: Code style clean-up
v2->v3:
Removed tcg_cur_ctx since it causes a slow-down on gcc-4.5.
Rebased.
v1->v2:
Introduced a TCGContext *tcg_cur_ctx global for use in places where we have
no interface to pass a pointer to tcg_ctx.
Code style clean-up.

Evgeny Voevodin (7):
   target-cris/translate.c: Code style clean-up
   tcg/tcg.h: Duplicate global TCG variables in TCGContext
   TCG: Use gen_opc_ptr from context instead of global variable.
   TCG: Use gen_opparam_ptr from context instead of global variable.
   TCG: Use gen_opc_buf from context instead of global variable.
   TCG: Use gen_opparam_buf from context instead of global variable.
   TCG: Remove unused global variables

  gen-icount.h  |2 +-
  target-alpha/translate.c  |   10 +-
  target-arm/translate.c|   10 +-
  target-cris/translate.c   | 5040 +
  target-i386/translate.c   |   10 +-
  target-lm32/translate.c   |   13 +-
  target-m68k/translate.c   |   10 +-
  target-microblaze/translate.c |   13 +-
  target-mips/translate.c   |   11 +-
  target-openrisc/translate.c   |   13 +-
  target-ppc/translate.c|   11 +-
  target-s390x/translate.c  |   11 +-
  target-sh4/translate.c|   10 +-
  target-sparc/translate.c  |   10 +-
  target-unicore32/translate.c  |   10 +-
  target-xtensa/translate.c |8 +-
  tcg/optimize.c|   62 +-
  tcg/tcg-op.h  |  324 +--
  tcg/tcg.c |   85 +-
  tcg/tcg.h |   10 +-
  translate-all.c   |3 -
  21 files changed, 2859 insertions(+), 2817 deletions(-)



Ping?

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




[Qemu-devel] [PATCH v6 0/7] TCG global variables clean-up

2012-11-12 Thread Evgeny Voevodin
This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
This is probably due to better data caching.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles value is stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%

Average cycles/op = 733 +- 2



After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%

Average cycles/op = 728 +- 3
Speed-up of TCG code generation = 0.7%
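
The 0.7% figure follows from the two averages reported above. A minimal
sketch of that arithmetic, assuming the speed-up is defined as the relative
reduction in cycles/op (speedup_percent() is a hypothetical helper, not
something from the series):

```c
#include <assert.h>

/*
 * Relative reduction in cycles/op, in percent.
 * A positive result means the "after" build needs fewer cycles per op.
 */
static double speedup_percent(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

For the averages above, speedup_percent(733.0, 727.8) evaluates
(733.0 - 727.8) / 733.0 * 100, which is roughly 0.7.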

Changelog:
v5->v6:
Fixed broken patches.
Rebased.
v4->v5:
Rebased.
Fixed authorship.
All patches are reviewed-by Richard Henderson <r...@twiddle.net>
v3->v4:
Rebased.
Added target-cris/translate.c: Code style clean-up
v2->v3:
Removed tcg_cur_ctx since it causes a slow-down on gcc-4.5.
Rebased.
v1->v2:
Introduced a TCGContext *tcg_cur_ctx global for use in places where we have
no interface to pass a pointer to tcg_ctx.
Code style clean-up.
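
The v1->v2 and v2->v3 entries describe two ways of reaching the context:
through a global pointer (the dropped tcg_cur_ctx) or by naming the global
structure directly (tcg_ctx). A minimal sketch of the two access paths,
using hypothetical stand-in names (sketch_ctx, sketch_cur_ctx) rather than
the real QEMU definitions:

```c
#include <stdint.h>

/* Stand-in for TCGContext: an opcode buffer and a write pointer into it. */
struct sketch_ctx_t {
    uint16_t buf[64];
    uint16_t *ptr;
};

static struct sketch_ctx_t sketch_ctx;                    /* like tcg_ctx */
static struct sketch_ctx_t *sketch_cur_ctx = &sketch_ctx; /* like tcg_cur_ctx */

/* Reset the write pointer to the start of the buffer. */
static void sketch_reset(void)
{
    sketch_ctx.ptr = sketch_ctx.buf;
}

/* Direct access: the field sits at a fixed offset from a known symbol. */
static void emit_direct(uint16_t opc)
{
    *sketch_ctx.ptr++ = opc;
}

/* Indirect access: the global pointer is read first, then the field. */
static void emit_via_pointer(uint16_t opc)
{
    *sketch_cur_ctx->ptr++ = opc;
}
```

Both functions append to the same buffer. One plausible reason for the
slow-down noted for gcc-4.5 is the extra pointer read in the indirect form,
but that is an assumption; the thread only reports the measurement.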

Evgeny Voevodin (7):
  target-cris/translate.c: Code style clean-up
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.
  TCG: Remove unused global variables

 gen-icount.h  |2 +-
 target-alpha/translate.c  |   10 +-
 target-arm/translate.c|   10 +-
 target-cris/translate.c   | 5040 +
 target-i386/translate.c   |   10 +-
 target-lm32/translate.c   |   13 +-
 target-m68k/translate.c   |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c   |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c|   11 +-
 target-s390x/translate.c  |   11 +-
 target-sh4/translate.c|   10 +-
 target-sparc/translate.c  |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c |8 +-
 tcg/optimize.c|   62 +-
 tcg/tcg-op.h  |  324 +--
 tcg/tcg.c |   85 +-
 tcg/tcg.h |   10 +-
 translate-all.c   |3 -
 21 files changed, 2859 insertions(+), 2817 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH v6 3/7] TCG: Use gen_opc_ptr from context instead of global variable.

2012-11-12 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 target-alpha/translate.c  |8 ++---
 target-arm/translate.c|8 ++---
 target-cris/translate.c   |   10 +++---
 target-i386/translate.c   |8 ++---
 target-lm32/translate.c   |   10 +++---
 target-m68k/translate.c   |8 ++---
 target-microblaze/translate.c |   10 +++---
 target-mips/translate.c   |9 +++---
 target-openrisc/translate.c   |   10 +++---
 target-ppc/translate.c|9 +++---
 target-s390x/translate.c  |9 +++---
 target-sh4/translate.c|8 ++---
 target-sparc/translate.c  |8 ++---
 target-unicore32/translate.c  |8 ++---
 target-xtensa/translate.c |6 ++--
 tcg/tcg-op.h  |   70 -
 tcg/tcg.c |   16 +-
 17 files changed, 109 insertions(+), 106 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 8c4dd02..f160f83 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 or exhaust instruction count, stop generation.  */
 if (ret == NO_EXIT
 && ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0
-|| gen_opc_ptr >= gen_opc_end
+|| tcg_ctx.gen_opc_ptr >= gen_opc_end
 || num_insns >= max_insns
 || singlestep
 || env->singlestep_enabled)) {
@@ -3463,9 +3463,9 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 }
 
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 7d8f8e5..014f358 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9881,7 +9881,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
  * Also stop translation when a page boundary is reached.  This
  * ensures prefetch aborts occur at the right place.  */
 num_insns ++;
-} while (!dc->is_jmp && gen_opc_ptr < gen_opc_end &&
+} while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
  !env->singlestep_enabled &&
  !singlestep &&
  dc->pc < next_page_start &&
@@ -9962,7 +9962,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 
 done_generating:
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 023980e..02969d4 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 check_breakpoint(env, dc);
 
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j) {
@@ -3381,7 +3381,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 break;
 }
 } while (!dc->is_jmp && !dc->cpustate_changed
-&& gen_opc_ptr < gen_opc_end
+&& tcg_ctx.gen_opc_ptr < gen_opc_end
 && !singlestep
 && (dc->pc < next_page_start)
 && num_insns < max_insns);
@@ -3434,9 +3434,9 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 }
 }
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj

[Qemu-devel] [PATCH v6 5/7] TCG: Use gen_opc_buf from context instead of global variable.

2012-11-12 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 target-alpha/translate.c  |6 ++--
 target-arm/translate.c|6 ++--
 target-cris/translate.c   |8 +++---
 target-i386/translate.c   |6 ++--
 target-lm32/translate.c   |9 +++---
 target-m68k/translate.c   |6 ++--
 target-microblaze/translate.c |9 +++---
 target-mips/translate.c   |6 ++--
 target-openrisc/translate.c   |9 +++---
 target-ppc/translate.c|6 ++--
 target-s390x/translate.c  |6 ++--
 target-sh4/translate.c|6 ++--
 target-sparc/translate.c  |6 ++--
 target-unicore32/translate.c  |6 ++--
 target-xtensa/translate.c |4 +--
 tcg/optimize.c|   62 -
 tcg/tcg.c |   30 ++--
 17 files changed, 97 insertions(+), 94 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f160f83..4045f78 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 int max_insns;
 
 pc_start = tb->pc;
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 ctx.tb = tb;
 ctx.env = env;
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 014f358..c42110a 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->pc = pc_start;
@@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 02969d4..0b0e86d 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 dc->env = env;
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->ppc = pc_start;
@@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 check_breakpoint(env, dc);
 
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j) {
@@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j) {
 gen_opc_instr_start[lj++] = 0;
@@ -3452,7 +3452,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 log_target_disas(env, pc_start, dc->pc - pc_start,
  dc->env->pregs[PR_VR]);
 qemu_log("\nisize=%d osize=%td\n",
-dc->pc - pc_start, tcg_ctx.gen_opc_ptr - gen_opc_buf);
+dc->pc - pc_start, tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf);
 }
 #endif
 #endif
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 2658bf2..8e676ba 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7962,7 +7962,7 @@ static inline void gen_intermediate_code_internal(CPUX86State *env,
 cpu_ptr0 = tcg_temp_new_ptr

[Qemu-devel] [PATCH v6 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.

2012-11-12 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 gen-icount.h |2 +-
 tcg/tcg-op.h |  254 +-
 tcg/tcg.c|   36 -
 3 files changed, 146 insertions(+), 146 deletions(-)

diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..248cf5b 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -16,7 +16,7 @@ static inline void gen_icount_start(void)
 count = tcg_temp_local_new_i32();
 tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32));
 /* This is a horrid hack to allow fixing up the value later.  */
-icount_arg = gen_opparam_ptr + 1;
+icount_arg = tcg_ctx.gen_opparam_ptr + 1;
 tcg_gen_subi_i32(count, count, 0xdeadbeef);
 
 tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 9bc890f..0b3cb0b 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc)
 static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
 }
 
 static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
 }
 
 static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
 }
 
 static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
 }
 
 static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
 }
 
 static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
TCGv_i32 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
 }
 
 static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
TCGv_i64 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
 }
 
 static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
 TCGv_i32 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
 TCGv_i64 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void

[Qemu-devel] [PATCH v6 2/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext

2012-11-12 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index c2ae873..6ffec1d 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -450,6 +450,12 @@ struct TCGContext {
 int goto_tb_issue_mask;
 #endif
 
+uint16_t gen_opc_buf[OPC_BUF_SIZE];
+TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+uint16_t *gen_opc_ptr;
+TCGArg *gen_opparam_ptr;
+
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
 /* labels info for qemu_ld/st IRs
The labels help to generate TLB miss case codes at the end of TB */
-- 
1.7.9.5
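
The idea behind this patch, keeping the op buffers and their write pointers
together in one context structure, can be sketched in isolation. Everything
below is a hypothetical stand-in (sketch_context, sketch_func_start,
sketch_gen_op1 and the buffer sizes are invented for the example); it only
mirrors the shape of the fields added to TCGContext and is not the QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes only; the real OPC_BUF_SIZE values differ. */
enum { SKETCH_OPC_BUF_SIZE = 16, SKETCH_OPPARAM_BUF_SIZE = 64 };

typedef uintptr_t sketch_arg; /* stand-in for TCGArg */

/* Buffers and their write pointers live side by side in one structure. */
struct sketch_context {
    uint16_t opc_buf[SKETCH_OPC_BUF_SIZE];
    sketch_arg opparam_buf[SKETCH_OPPARAM_BUF_SIZE];
    uint16_t *opc_ptr;
    sketch_arg *opparam_ptr;
};

/* Point both write pointers at the start of their buffers. */
static void sketch_func_start(struct sketch_context *s)
{
    s->opc_ptr = s->opc_buf;
    s->opparam_ptr = s->opparam_buf;
}

/* Append one opcode with a single argument, with bounds checks. */
static void sketch_gen_op1(struct sketch_context *s, uint16_t opc,
                           sketch_arg arg1)
{
    assert(s->opc_ptr < s->opc_buf + SKETCH_OPC_BUF_SIZE);
    assert(s->opparam_ptr < s->opparam_buf + SKETCH_OPPARAM_BUF_SIZE);
    *s->opc_ptr++ = opc;
    *s->opparam_ptr++ = arg1;
}

/* Number of opcodes emitted so far. */
static size_t sketch_op_count(const struct sketch_context *s)
{
    return (size_t)(s->opc_ptr - s->opc_buf);
}
```

Because every function takes the context as a parameter, nothing in the
sketch depends on file-scope variables, which is the property the series
works towards for the real globals.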




[Qemu-devel] [PATCH v6 6/7] TCG: Use gen_opparam_buf from context instead of global variable.

2012-11-12 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a039001..ea0bd3a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
 s->gen_opc_ptr = s->gen_opc_buf;
-s->gen_opparam_ptr = gen_opparam_buf;
+s->gen_opparam_ptr = s->gen_opparam_buf;
 
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
 /* Initialize qemu_ld/st labels to assist code generation at the end of TB
@@ -897,7 +897,7 @@ void tcg_dump_ops(TCGContext *s)
 
 first_insn = 1;
 opc_ptr = s->gen_opc_buf;
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 while (opc_ptr < s->gen_opc_ptr) {
 c = *opc_ptr++;
 def = &tcg_op_defs[c];
@@ -1440,8 +1440,9 @@ static void tcg_liveness_analysis(TCGContext *s)
 op_index--;
 }
 
-if (args != gen_opparam_buf)
+if (args != s->gen_opparam_buf) {
 tcg_abort();
+}
 }
 #else
 /* dummy liveness analysis */
@@ -2222,7 +2223,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
 s->gen_opparam_ptr =
-tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2249,7 +2250,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 s->code_buf = gen_code_buf;
 s->code_ptr = gen_code_buf;
 
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 op_index = 0;
 
 for(;;) {
-- 
1.7.9.5




[Qemu-devel] [PATCH v6 7/7] TCG: Remove unused global variables

2012-11-12 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
 tcg/tcg.c   |4 
 tcg/tcg.h   |4 
 translate-all.c |3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index ea0bd3a..4f75696 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
 *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 6ffec1d..9481e35 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -465,10 +465,6 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
 
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH v5 5/7] TCG: Use gen_opc_buf from context instead of global variable.

2012-11-11 Thread Evgeny Voevodin

On 11/10/2012 04:39 PM, Blue Swirl wrote:

On Tue, Nov 6, 2012 at 4:41 AM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>
Reviewed-by: Richard Henderson <r...@twiddle.net>
---
  target-alpha/translate.c  |6 ++--
  target-arm/translate.c|6 ++--
  target-cris/translate.c   |9 +++---
  target-i386/translate.c   |6 ++--
  target-lm32/translate.c   |9 +++---
  target-m68k/translate.c   |6 ++--
  target-microblaze/translate.c |9 +++---
  target-mips/translate.c   |6 ++--
  target-openrisc/translate.c   |9 +++---
  target-ppc/translate.c|6 ++--
  target-s390x/translate.c  |6 ++--
  target-sh4/translate.c|6 ++--
  target-sparc/translate.c  |6 ++--
  target-unicore32/translate.c  |6 ++--
  target-xtensa/translate.c |4 +--
  tcg/optimize.c|   62 -
  tcg/tcg.c |   30 ++--
  17 files changed, 98 insertions(+), 94 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 6676cbf..91c761a 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
  int max_insns;

  pc_start = tb->pc;
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;

  ctx.tb = tb;
  ctx.env = env;
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
  }
  }
  if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
  if (lj < j) {
  lj++;
  while (lj < j)
@@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
  gen_icount_end(tb, num_insns);
  *tcg_ctx.gen_opc_ptr = INDEX_op_end;
  if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
  lj++;
  while (lj <= j)
  gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index ff5d294..0602b31 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,

  dc->tb = tb;

-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;

  dc->is_jmp = DISAS_NEXT;
  dc->pc = pc_start;
@@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
  }
  }
  if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
  if (lj < j) {
  lj++;
  while (lj < j)
@@ -9974,7 +9974,7 @@ done_generating:
  }
  #endif
  if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
  lj++;
  while (lj <= j)
  gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index e34288e..0adc07b 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
  dc->env = env;
  dc->tb = tb;

-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;

  dc->is_jmp = DISAS_NEXT;
  dc->ppc = pc_start;
@@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
  check_breakpoint(env, dc);

  if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
  if (lj < j) {
  lj++;
  while (lj < j) {
@@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
  gen_icount_end(tb, num_insns);
  *tcg_ctx.gen_opc_ptr = INDEX_op_end;
  if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
  lj++;
  while (lj <= j) {
  gen_opc_instr_start[lj++] = 0;
@@ -3452,7 +3452,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
  log_target_disas(pc_start, dc->pc - pc_start,
   dc->env->pregs[PR_VR]);
  qemu_log("\nisize=%d osize=%td\n",
-dc->pc - pc_start, gtcg_ctx.en_opc_ptr - gen_opc_buf);
+dc->pc - pc_start, gtcg_ctx.en_opc_ptr - tcg_ctx.gen_opc_buf);
+tcg_ctx.gen_opc_buf);


Broken patch:
/src/qemu/target-cris/translate.c:3456: error: statement with no effect
/src

Re: [Qemu-devel] [PATCH v5 0/7] TCG global variables clean-up

2012-11-08 Thread Evgeny Voevodin

On 11/06/2012 08:41 AM, Evgeny Voevodin wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
This is probably due to better data caching.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles value is stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%

Average cycles/op = 733 +- 2



After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%

Average cycles/op = 728 +- 3
Speed-up of TCG code generation = 0.7%

Changelog:
v4->v5:
Rebased.
Fixed authorship.
All patches are reviewed-by Richard Henderson <r...@twiddle.net>
v3->v4:
Rebased.
Added target-cris/translate.c: Code style clean-up
v2->v3:
Removed tcg_cur_ctx since it causes a slow-down on gcc-4.5.
Rebased.
v1->v2:
Introduced a TCGContext *tcg_cur_ctx global for use in places where we have
no interface to pass a pointer to tcg_ctx.
Code style clean-up.

Evgeny Voevodin (7):
   target-cris/translate.c: Code style clean-up
   tcg/tcg.h: Duplicate global TCG variables in TCGContext
   TCG: Use gen_opc_ptr from context instead of global variable.
   TCG: Use gen_opparam_ptr from context instead of global variable.
   TCG: Use gen_opc_buf from context instead of global variable.
   TCG: Use gen_opparam_buf from context instead of global variable.
   TCG: Remove unused global variables

  gen-icount.h  |2 +-
  target-alpha/translate.c  |   10 +-
  target-arm/translate.c|   10 +-
  target-cris/translate.c   | 5041 +
  target-i386/translate.c   |   10 +-
  target-lm32/translate.c   |   13 +-
  target-m68k/translate.c   |   10 +-
  target-microblaze/translate.c |   13 +-
  target-mips/translate.c   |   11 +-
  target-openrisc/translate.c   |   13 +-
  target-ppc/translate.c|   11 +-
  target-s390x/translate.c  |   11 +-
  target-sh4/translate.c|   10 +-
  target-sparc/translate.c  |   10 +-
  target-unicore32/translate.c  |   10 +-
  target-xtensa/translate.c |8 +-
  tcg/optimize.c|   62 +-
  tcg/tcg-op.h  |  324 +--
  tcg/tcg.c |   85 +-
  tcg/tcg.h |   10 +-
  translate-all.c   |3 -
  21 files changed, 2860 insertions(+), 2817 deletions(-)




Is anybody going to apply this before I have to rebase again?

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com



[Qemu-devel] [PATCH v5 0/7] TCG global variables clean-up

2012-11-05 Thread Evgeny Voevodin
This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
This is probably due to better data caching.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles value is stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%

Average cycles/op = 733 +- 2



After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%

Average cycles/op = 728 +- 3
Speed-up of TCG code generation = 0.7%
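
As a cross-check of the figure above, the speed-up can be recomputed from the two averages. This is only a sketch: the helper name speedup_percent is hypothetical and is not part of the series or of QEMU.

```c
#include <assert.h>

/* Relative speed-up in percent: how much lower the "after" cost is
 * compared with the "before" cost. speedup_percent is a hypothetical
 * helper used only for this illustration. */
static double speedup_percent(double before, double after)
{
    assert(before > 0.0); /* a zero or negative baseline makes no sense */
    return (before - after) / before * 100.0;
}
```

Calling speedup_percent(733.0, 727.8) with the averages reported above gives roughly 0.7, which matches the stated speed-up.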

Changelog:
v4-v5:
Rebased.
Fixed authorship.
All patches are reviewed-by Richard Henderson r...@twiddle.net
v3-v4:
Rebased.
Added target-cris/translate.c: Code style clean-up
v2-v3:
Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
Rebased.
v1-v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we don't
have an interface to pass pointer to tcg_ctx.
Code style clean-up
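
To make the v1-v2 and v2-v3 entries concrete, the two access styles look roughly like the sketch below. All names (DemoCtx, demo_ctx, demo_cur_ctx, emit_direct, emit_indirect) are hypothetical stand-ins. The changelog only reports that the tcg_cur_ctx variant was slower on gcc-4.5; the extra pointer load shown here is one possible explanation, not a confirmed one.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BUF_SIZE 16 /* hypothetical size, for illustration only */

typedef struct DemoCtx {
    uint16_t buf[DEMO_BUF_SIZE];
    uint16_t *ptr; /* write cursor into buf */
} DemoCtx;

static DemoCtx demo_ctx;                  /* global object, like tcg_ctx */
static DemoCtx *demo_cur_ctx = &demo_ctx; /* global pointer, like tcg_cur_ctx in v2 */

/* Direct form: the member is addressed relative to a fixed global object. */
static void emit_direct(uint16_t opc)
{
    assert(demo_ctx.ptr < demo_ctx.buf + DEMO_BUF_SIZE);
    *demo_ctx.ptr++ = opc;
}

/* Indirect form: the global pointer has to be loaded before the member
 * can be reached, which adds one dependent memory access. */
static void emit_indirect(uint16_t opc)
{
    assert(demo_cur_ctx->ptr < demo_cur_ctx->buf + DEMO_BUF_SIZE);
    *demo_cur_ctx->ptr++ = opc;
}
```

Both functions write to the same buffer once demo_ctx.ptr has been set to demo_ctx.buf; they differ only in how the context is reached.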

Evgeny Voevodin (7):
  target-cris/translate.c: Code style clean-up
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.
  TCG: Remove unused global variables

 gen-icount.h  |2 +-
 target-alpha/translate.c  |   10 +-
 target-arm/translate.c|   10 +-
 target-cris/translate.c   | 5041 +
 target-i386/translate.c   |   10 +-
 target-lm32/translate.c   |   13 +-
 target-m68k/translate.c   |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c   |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c|   11 +-
 target-s390x/translate.c  |   11 +-
 target-sh4/translate.c|   10 +-
 target-sparc/translate.c  |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c |8 +-
 tcg/optimize.c|   62 +-
 tcg/tcg-op.h  |  324 +--
 tcg/tcg.c |   85 +-
 tcg/tcg.h |   10 +-
 translate-all.c   |3 -
 21 files changed, 2860 insertions(+), 2817 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH v5 2/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext

2012-11-05 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Richard Henderson r...@twiddle.net
---
 tcg/tcg.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index c2ae873..6ffec1d 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -450,6 +450,12 @@ struct TCGContext {
 int goto_tb_issue_mask;
 #endif
 
+uint16_t gen_opc_buf[OPC_BUF_SIZE];
+TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+uint16_t *gen_opc_ptr;
+TCGArg *gen_opparam_ptr;
+
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
 /* labels info for qemu_ld/st IRs
The labels help to generate TLB miss case codes at the end of TB */
-- 
1.7.9.5
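
The pattern this series works towards can be sketched as follows: the opcode and parameter buffers and their write cursors live in one context structure instead of in separate globals. This is a simplified illustration, not QEMU code. DemoContext, DemoArg, the DEMO_* sizes and the demo_* functions are hypothetical, and unlike the real tcg_gen_op* helpers (which use the global tcg_ctx) the sketch passes the context explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; QEMU's OPC_BUF_SIZE and OPPARAM_BUF_SIZE are
 * defined elsewhere and are not reproduced here. */
#define DEMO_OPC_BUF_SIZE 64
#define DEMO_OPPARAM_BUF_SIZE 256

typedef uintptr_t DemoArg; /* stand-in for TCGArg */

/* Stand-in for TCGContext: buffers and their cursors are kept together. */
typedef struct DemoContext {
    uint16_t gen_opc_buf[DEMO_OPC_BUF_SIZE];
    DemoArg gen_opparam_buf[DEMO_OPPARAM_BUF_SIZE];
    uint16_t *gen_opc_ptr;
    DemoArg *gen_opparam_ptr;
} DemoContext;

/* Reset both cursors to the start of their buffers
 * (compare tcg_func_start in the series). */
static void demo_func_start(DemoContext *s)
{
    s->gen_opc_ptr = s->gen_opc_buf;
    s->gen_opparam_ptr = s->gen_opparam_buf;
}

/* Append one opcode with a single argument (compare tcg_gen_op1i). */
static void demo_gen_op1i(DemoContext *s, uint16_t opc, DemoArg arg1)
{
    assert(s->gen_opc_ptr < s->gen_opc_buf + DEMO_OPC_BUF_SIZE);
    assert(s->gen_opparam_ptr < s->gen_opparam_buf + DEMO_OPPARAM_BUF_SIZE);
    *s->gen_opc_ptr++ = opc;
    *s->gen_opparam_ptr++ = arg1;
}

/* Number of opcodes emitted since the last demo_func_start(). */
static size_t demo_op_count(const DemoContext *s)
{
    return (size_t)(s->gen_opc_ptr - s->gen_opc_buf);
}
```

The expression gen_opc_ptr - gen_opc_buf in demo_op_count is the same pointer difference that the target translators compute as j in the search_pc blocks.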




[Qemu-devel] [PATCH v5 6/7] TCG: Use gen_opparam_buf from context instead of global variable.

2012-11-05 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Richard Henderson r...@twiddle.net
---
 tcg/tcg.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index ea27bd4..d281af9 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
 s->gen_opc_ptr = s->gen_opc_buf;
-s->gen_opparam_ptr = gen_opparam_buf;
+s->gen_opparam_ptr = s->gen_opparam_buf;
 
 #if defined(CONFIG_QEMU_LDST_OPTIMIZATION) && defined(CONFIG_SOFTMMU)
 /* Initialize qemu_ld/st labels to assist code generation at the end of TB
@@ -897,7 +897,7 @@ void tcg_dump_ops(TCGContext *s)
 
 first_insn = 1;
 opc_ptr = s->gen_opc_buf;
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 while (opc_ptr < s->gen_opc_ptr) {
 c = *opc_ptr++;
 def = tcg_op_defs[c];
@@ -1440,8 +1440,9 @@ static void tcg_liveness_analysis(TCGContext *s)
 op_index--;
 }
 
-if (args != gen_opparam_buf)
+if (args != s->gen_opparam_buf) {
 tcg_abort();
+}
 }
 #else
 /* dummy liveness analysis */
@@ -,7 +2223,7 @@ static inline int tcg_gen_code_common(TCGContext *s, 
uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
 s->gen_opparam_ptr =
-tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2249,7 +2250,7 @@ static inline int tcg_gen_code_common(TCGContext *s, 
uint8_t *gen_code_buf,
 s->code_buf = gen_code_buf;
 s->code_ptr = gen_code_buf;
 
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 op_index = 0;
 
 for(;;) {
-- 
1.7.9.5




[Qemu-devel] [PATCH v5 5/7] TCG: Use gen_opc_buf from context instead of global variable.

2012-11-05 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Richard Henderson r...@twiddle.net
---
 target-alpha/translate.c  |6 ++--
 target-arm/translate.c|6 ++--
 target-cris/translate.c   |9 +++---
 target-i386/translate.c   |6 ++--
 target-lm32/translate.c   |9 +++---
 target-m68k/translate.c   |6 ++--
 target-microblaze/translate.c |9 +++---
 target-mips/translate.c   |6 ++--
 target-openrisc/translate.c   |9 +++---
 target-ppc/translate.c|6 ++--
 target-s390x/translate.c  |6 ++--
 target-sh4/translate.c|6 ++--
 target-sparc/translate.c  |6 ++--
 target-unicore32/translate.c  |6 ++--
 target-xtensa/translate.c |4 +--
 tcg/optimize.c|   62 -
 tcg/tcg.c |   30 ++--
 17 files changed, 98 insertions(+), 94 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 6676cbf..91c761a 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3373,7 +3373,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 int max_insns;
 
 pc_start = tb->pc;
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 ctx.tb = tb;
 ctx.env = env;
@@ -3406,7 +3406,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3465,7 +3465,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index ff5d294..0602b31 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9727,7 +9727,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->pc = pc_start;
@@ -9834,7 +9834,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index e34288e..0adc07b 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 dc->env = env;
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->ppc = pc_start;
@@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 check_breakpoint(env, dc);
 
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j) {
@@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j) {
 gen_opc_instr_start[lj++] = 0;
@@ -3452,7 +3452,8 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 log_target_disas(pc_start, dc->pc - pc_start,
  dc->env->pregs[PR_VR]);
 qemu_log("\nisize=%d osize=%td\n",
-dc->pc - pc_start, gtcg_ctx.en_opc_ptr - gen_opc_buf);
+dc->pc - pc_start, gtcg_ctx.en_opc_ptr - tcg_ctx.gen_opc_buf);
+tcg_ctx.gen_opc_buf);
 }
 #endif
 #endif
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 5f977d9..1563677 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7958,7 +7958,7 @@ static inline void 
gen_intermediate_code_internal(CPUX86State *env

[Qemu-devel] [PATCH v5 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.

2012-11-05 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Richard Henderson r...@twiddle.net
---
 gen-icount.h |2 +-
 tcg/tcg-op.h |  254 +-
 tcg/tcg.c|   36 -
 3 files changed, 146 insertions(+), 146 deletions(-)

diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..248cf5b 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -16,7 +16,7 @@ static inline void gen_icount_start(void)
 count = tcg_temp_local_new_i32();
 tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32));
 /* This is a horrid hack to allow fixing up the value later.  */
-icount_arg = gen_opparam_ptr + 1;
+icount_arg = tcg_ctx.gen_opparam_ptr + 1;
 tcg_gen_subi_i32(count, count, 0xdeadbeef);
 
 tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 9bc890f..0b3cb0b 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc)
 static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
 }
 
 static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
 }
 
 static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
 }
 
 static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
 }
 
 static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
 }
 
 static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
TCGv_i32 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
 }
 
 static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
TCGv_i64 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
 }
 
 static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
 TCGv_i32 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
 TCGv_i64 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void

[Qemu-devel] [PATCH v5 7/7] TCG: Remove unused global variables

2012-11-05 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Richard Henderson r...@twiddle.net
---
 tcg/tcg.c   |4 
 tcg/tcg.h   |4 
 translate-all.c |3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index d281af9..359be16 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
 *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 6ffec1d..9481e35 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -465,10 +465,6 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
 
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5




[Qemu-devel] [PATCH v5 3/7] TCG: Use gen_opc_ptr from context instead of global variable.

2012-11-05 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Richard Henderson r...@twiddle.net
---
 target-alpha/translate.c  |8 ++---
 target-arm/translate.c|8 ++---
 target-cris/translate.c   |   10 +++---
 target-i386/translate.c   |8 ++---
 target-lm32/translate.c   |   10 +++---
 target-m68k/translate.c   |8 ++---
 target-microblaze/translate.c |   10 +++---
 target-mips/translate.c   |9 +++---
 target-openrisc/translate.c   |   10 +++---
 target-ppc/translate.c|9 +++---
 target-s390x/translate.c  |9 +++---
 target-sh4/translate.c|8 ++---
 target-sparc/translate.c  |8 ++---
 target-unicore32/translate.c  |8 ++---
 target-xtensa/translate.c |6 ++--
 tcg/tcg-op.h  |   70 -
 tcg/tcg.c |   16 +-
 17 files changed, 109 insertions(+), 106 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f707d8d..6676cbf 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3406,7 +3406,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3432,7 +3432,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
or exhaust instruction count, stop generation.  */
 if (ret == NO_EXIT
 && ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0
-|| gen_opc_ptr >= gen_opc_end
+|| tcg_ctx.gen_opc_ptr >= gen_opc_end
 || num_insns >= max_insns
 || singlestep
 || env->singlestep_enabled)) {
@@ -3463,9 +3463,9 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 25433da..ff5d294 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9834,7 +9834,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9881,7 +9881,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
  * Also stop translation when a page boundary is reached.  This
  * ensures prefetch aborts occur at the right place.  */
 num_insns ++;
-} while (!dc->is_jmp && gen_opc_ptr < gen_opc_end &&
+} while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
 !env->singlestep_enabled &&
 !singlestep &&
 dc->pc < next_page_start &&
@@ -9962,7 +9962,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 
 done_generating:
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 27b82cf..e34288e 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 check_breakpoint(env, dc);
 
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j) {
@@ -3381,7 +3381,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 break;
 }
 } while (!dc->is_jmp && !dc->cpustate_changed
-&& gen_opc_ptr < gen_opc_end
+&& tcg_ctx.gen_opc_ptr < gen_opc_end
 && !singlestep
 && (dc->pc < next_page_start)
 && num_insns < max_insns);
@@ -3434,9 +3434,9 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 }
 }
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj

Re: [Qemu-devel] [PATCH v4 0/7] TCG global variables clean-up

2012-11-01 Thread Evgeny Voevodin

On 10/31/2012 11:01 PM, Richard Henderson wrote:

On 2012-10-31 16:19, Evgeny Voevodin wrote:

Evgeny (2):
   tcg/tcg.h: Duplicate global TCG variables in TCGContext
   TCG: Remove unused global variables

Evgeny Voevodin (5):
   target-cris/translate.c: Code style clean-up
   TCG: Use gen_opc_ptr from context instead of global variable.
   TCG: Use gen_opparam_ptr from context instead of global variable.
   TCG: Use gen_opc_buf from context instead of global variable.
   TCG: Use gen_opparam_buf from context instead of global variable.

  gen-icount.h  |2 +-
  target-alpha/translate.c  |   10 +-
  target-arm/translate.c|   10 +-
  target-cris/translate.c   | 5041 +
  target-i386/translate.c   |   10 +-
  target-lm32/translate.c   |   13 +-
  target-m68k/translate.c   |   10 +-
  target-microblaze/translate.c |   13 +-
  target-mips/translate.c   |   11 +-
  target-openrisc/translate.c   |   13 +-
  target-ppc/translate.c|   11 +-
  target-s390x/translate.c  |   11 +-
  target-sh4/translate.c|   10 +-
  target-sparc/translate.c  |   10 +-
  target-unicore32/translate.c  |   10 +-
  target-xtensa/translate.c |8 +-
  tcg/optimize.c|   62 +-
  tcg/tcg-op.h  |  324 +--
  tcg/tcg.c |   85 +-
  tcg/tcg.h |   10 +-
  translate-all.c   |3 -
  21 files changed, 2860 insertions(+), 2817 deletions(-)

Reviewed-by: Richard Henderson r...@twiddle.net


r~




Thanks. Will anybody apply this?

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [PATCH v3 0/6] TCG global variables clean-up

2012-10-30 Thread Evgeny Voevodin

On 10/30/2012 10:46 PM, Blue Swirl wrote:

On Mon, Oct 29, 2012 at 9:14 AM, Evgeny Voevodin e.voevo...@samsung.com wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Patches don't apply, please rebase.


OK. They applied cleanly when I sent them. I'll rebase.


  Also checkpatch.pl complains about tabs.


There are tabs everywhere in target-cris/translate.c.
Should I remove tabs only from the lines my patches touch, or from the whole file?
Actually, cleaning up tabs was outside the scope of this series...




Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
Probably, this is due to better data caching.

Here is the test procedure:
1. Boot Linux Kernel 5 times.
2. For each iteration, wait until the JIT cycles/op value has been stable for ~10 seconds
3. Write down the cycles/op

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%

Average cycles/op = 733 +- 2



After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%

Average cycles/op = 728 +- 3
Speed-up of TCG code generation = 0.7%

Changelog:
v2-v3:
Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
Rebased.
v1-v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we don't
have an interface to pass pointer to tcg_ctx.
Code style clean-up

Evgeny (2):
   tcg/tcg.h: Duplicate global TCG variables in TCGContext
   TCG: Remove unused global variables

Evgeny Voevodin (4):
   TCG: Use gen_opc_ptr from context instead of global variable.
   TCG: Use gen_opparam_ptr from context instead of global variable.
   TCG: Use gen_opc_buf from context instead of global variable.
   TCG: Use gen_opparam_buf from context instead of global variable.

  gen-icount.h  |2 +-
  target-alpha/translate.c  |   10 +-
  target-arm/translate.c|   10 +-
  target-cris/translate.c   |   13 +-
  target-i386/translate.c   |   10 +-
  target-lm32/translate.c   |   13 +-
  target-m68k/translate.c   |   10 +-
  target-microblaze/translate.c |   13 +-
  target-mips/translate.c   |   11 +-
  target-openrisc/translate.c   |   13 +-
  target-ppc/translate.c|   11 +-
  target-s390x/translate.c  |   11 +-
  target-sh4/translate.c|   10 +-
  target-sparc/translate.c  |   10 +-
  target-unicore32/translate.c  |   10 +-
  target-xtensa/translate.c |8 +-
  tcg/optimize.c|   62 
  tcg/tcg-op.h  |  324 -
  tcg/tcg.c |   85 ++-
  tcg/tcg.h |   10 +-
  translate-all.c   |3 -
  21 files changed, 326 insertions(+), 323 deletions(-)

--
1.7.9.5




--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




[Qemu-devel] [PATCH v4 3/7] TCG: Use gen_opc_ptr from context instead of global variable.

2012-10-30 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |8 ++---
 target-arm/translate.c|8 ++---
 target-cris/translate.c   |   10 +++---
 target-i386/translate.c   |8 ++---
 target-lm32/translate.c   |   10 +++---
 target-m68k/translate.c   |8 ++---
 target-microblaze/translate.c |   10 +++---
 target-mips/translate.c   |9 +++---
 target-openrisc/translate.c   |   10 +++---
 target-ppc/translate.c|9 +++---
 target-s390x/translate.c  |9 +++---
 target-sh4/translate.c|8 ++---
 target-sparc/translate.c  |8 ++---
 target-unicore32/translate.c  |8 ++---
 target-xtensa/translate.c |6 ++--
 tcg/tcg-op.h  |   70 -
 tcg/tcg.c |   16 +-
 17 files changed, 109 insertions(+), 106 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f707d8d..6676cbf 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3406,7 +3406,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3432,7 +3432,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
or exhaust instruction count, stop generation.  */
 if (ret == NO_EXIT
 && ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0
-|| gen_opc_ptr >= gen_opc_end
+|| tcg_ctx.gen_opc_ptr >= gen_opc_end
 || num_insns >= max_insns
 || singlestep
 || env->singlestep_enabled)) {
@@ -3463,9 +3463,9 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 25433da..ff5d294 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9834,7 +9834,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9881,7 +9881,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
  * Also stop translation when a page boundary is reached.  This
  * ensures prefetch aborts occur at the right place.  */
 num_insns ++;
-} while (!dc->is_jmp && gen_opc_ptr < gen_opc_end &&
+} while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
 !env->singlestep_enabled &&
 !singlestep &&
 dc->pc < next_page_start &&
@@ -9962,7 +9962,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 
 done_generating:
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 27b82cf..e34288e 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 check_breakpoint(env, dc);
 
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j) {
@@ -3381,7 +3381,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 break;
 }
 } while (!dc->is_jmp && !dc->cpustate_changed
-&& gen_opc_ptr < gen_opc_end
+&& tcg_ctx.gen_opc_ptr < gen_opc_end
 && !singlestep
 && (dc->pc < next_page_start)
 && num_insns < max_insns);
@@ -3434,9 +3434,9 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 }
 }
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j) {
 gen_opc_instr_start[lj

[Qemu-devel] [PATCH v4 0/7] TCG global variables clean-up

2012-10-30 Thread Evgeny Voevodin
This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
Probably, this is due to better data caching.

Here is the test procedure:
1. Boot Linux Kernel 5 times.
2. For each iteration, wait until the JIT cycles/op value has been stable for ~10 seconds
3. Write down the cycles/op

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%

Average cycles/op = 733 +- 2



After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%

Average cycles/op = 728 +- 3
Speed-up of TCG code generation = 0.7%

Changelog:
v3-v4:
Rebased.
Added target-cris/translate.c: Code style clean-up
v2-v3:
Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
Rebased.
v1-v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we don't
have an interface to pass pointer to tcg_ctx.
Code style clean-up

Evgeny (2):
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Remove unused global variables

Evgeny Voevodin (5):
  target-cris/translate.c: Code style clean-up
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.

 gen-icount.h  |2 +-
 target-alpha/translate.c  |   10 +-
 target-arm/translate.c|   10 +-
 target-cris/translate.c   | 5041 +
 target-i386/translate.c   |   10 +-
 target-lm32/translate.c   |   13 +-
 target-m68k/translate.c   |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c   |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c|   11 +-
 target-s390x/translate.c  |   11 +-
 target-sh4/translate.c|   10 +-
 target-sparc/translate.c  |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c |8 +-
 tcg/optimize.c|   62 +-
 tcg/tcg-op.h  |  324 +--
 tcg/tcg.c |   85 +-
 tcg/tcg.h |   10 +-
 translate-all.c   |3 -
 21 files changed, 2860 insertions(+), 2817 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH v4 2/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext

2012-10-30 Thread Evgeny Voevodin
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny e.voevo...@samsung.com
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index a6c9256..b229061 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -431,6 +431,12 @@ struct TCGContext {
 int temps_in_use;
 int goto_tb_issue_mask;
 #endif
+
+uint16_t gen_opc_buf[OPC_BUF_SIZE];
+TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+uint16_t *gen_opc_ptr;
+TCGArg *gen_opparam_ptr;
 };
 
 extern TCGContext tcg_ctx;
-- 
1.7.9.5




[Qemu-devel] [PATCH v4 6/7] TCG: Use gen_opparam_buf from context instead of global variable.

2012-10-30 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a7c3832..1fd1731 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
 s->gen_opc_ptr = s->gen_opc_buf;
-s->gen_opparam_ptr = gen_opparam_buf;
+s->gen_opparam_ptr = s->gen_opparam_buf;
 }
 
 static inline void tcg_temp_alloc(TCGContext *s, int n)
@@ -889,7 +889,7 @@ void tcg_dump_ops(TCGContext *s)
 
 first_insn = 1;
 opc_ptr = s->gen_opc_buf;
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 while (opc_ptr < s->gen_opc_ptr) {
 c = *opc_ptr++;
 def = tcg_op_defs[c];
@@ -1432,8 +1432,9 @@ static void tcg_liveness_analysis(TCGContext *s)
 op_index--;
 }
 
-if (args != gen_opparam_buf)
+if (args != s->gen_opparam_buf) {
 tcg_abort();
+}
 }
 #else
 /* dummy liveness analysis */
@@ -2214,7 +2215,7 @@ static inline int tcg_gen_code_common(TCGContext *s, 
uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
 s->gen_opparam_ptr =
-tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2241,7 +2242,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 s->code_buf = gen_code_buf;
 s->code_ptr = gen_code_buf;
 
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 op_index = 0;
 
 for(;;) {
-- 
1.7.9.5
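
The consistency check touched in tcg_liveness_analysis() above (abort if args does not end up at the start of gen_opparam_buf) can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: mini_nb_args is a made-up table of argument counts, and the function is not the QEMU implementation. The idea is that walking the opcode stream backwards while stepping the argument cursor back by each op's argument count must land exactly on the start of the argument buffer.

```c
#include <stddef.h>
#include <stdint.h>

typedef intptr_t MiniArg;

/* Hypothetical op table: opcode N consumes mini_nb_args[N] arguments.
 * Opcodes passed in must be valid indices into this table. */
static const int mini_nb_args[] = { 0, 1, 2, 3 };

/*
 * Walk the opcode stream backwards, moving the argument cursor back by
 * each op's argument count. Returns 1 if the cursor lands exactly on the
 * start of the argument buffer, 0 if the two streams are out of sync.
 */
static int mini_args_consistent(const uint16_t *opc_buf, size_t nb_ops,
                                const MiniArg *opparam_buf,
                                const MiniArg *opparam_end)
{
    const MiniArg *args = opparam_end;
    size_t i = nb_ops;

    while (i > 0) {
        int n;

        i--;
        n = mini_nb_args[opc_buf[i]];
        if (args - opparam_buf < n) {
            return 0;   /* would step before the start of the buffer */
        }
        args -= n;
    }
    return args == opparam_buf;
}
```

With both buffers owned by the same context, the start pointer used in this comparison comes from the structure that was passed in rather than from a separate global.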




[Qemu-devel] [PATCH v4 5/7] TCG: Use gen_opc_buf from context instead of global variable.

2012-10-30 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |6 ++--
 target-arm/translate.c|6 ++--
 target-cris/translate.c   |9 +++---
 target-i386/translate.c   |6 ++--
 target-lm32/translate.c   |9 +++---
 target-m68k/translate.c   |6 ++--
 target-microblaze/translate.c |9 +++---
 target-mips/translate.c   |6 ++--
 target-openrisc/translate.c   |9 +++---
 target-ppc/translate.c|6 ++--
 target-s390x/translate.c  |6 ++--
 target-sh4/translate.c|6 ++--
 target-sparc/translate.c  |6 ++--
 target-unicore32/translate.c  |6 ++--
 target-xtensa/translate.c |4 +--
 tcg/optimize.c|   62 -
 tcg/tcg.c |   30 ++--
 17 files changed, 98 insertions(+), 94 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 6676cbf..91c761a 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 int max_insns;
 
 pc_start = tb->pc;
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 ctx.tb = tb;
 ctx.env = env;
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index ff5d294..0602b31 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->pc = pc_start;
@@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index e34288e..0adc07b 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3232,7 +3232,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 dc->env = env;
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->ppc = pc_start;
@@ -3297,7 +3297,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 check_breakpoint(env, dc);
 
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j) {
@@ -3436,7 +3436,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j) {
 gen_opc_instr_start[lj++] = 0;
@@ -3452,7 +3452,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 log_target_disas(pc_start, dc->pc - pc_start,
  dc->env->pregs[PR_VR]);
 qemu_log("\nisize=%d osize=%td\n",
-dc->pc - pc_start, gtcg_ctx.en_opc_ptr - gen_opc_buf);
+dc->pc - pc_start, tcg_ctx.gen_opc_ptr -
+tcg_ctx.gen_opc_buf);
 }
 #endif
 #endif
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 5f977d9..1563677 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7958,7 +7958,7 @@ static inline void 
gen_intermediate_code_internal(CPUX86State *env,
 cpu_ptr0 = tcg_temp_new_ptr();
 cpu_ptr1

[Qemu-devel] [PATCH v4 7/7] TCG: Remove unused global variables

2012-10-30 Thread Evgeny Voevodin
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c   |4 
 tcg/tcg.h   |4 
 translate-all.c |3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 1fd1731..a9c9d6f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
 *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index b229061..c09c188 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -440,10 +440,6 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
 
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5




[Qemu-devel] [PATCH v4 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.

2012-10-30 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 gen-icount.h |2 +-
 tcg/tcg-op.h |  254 +-
 tcg/tcg.c|   36 -
 3 files changed, 146 insertions(+), 146 deletions(-)

diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..248cf5b 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -16,7 +16,7 @@ static inline void gen_icount_start(void)
 count = tcg_temp_local_new_i32();
 tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32));
 /* This is a horrid hack to allow fixing up the value later.  */
-icount_arg = gen_opparam_ptr + 1;
+icount_arg = tcg_ctx.gen_opparam_ptr + 1;
 tcg_gen_subi_i32(count, count, 0xdeadbeef);
 
 tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 9bc890f..0b3cb0b 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc)
 static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
 }
 
 static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
 }
 
 static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
 }
 
 static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
 }
 
 static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
 }
 
 static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
TCGv_i32 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
 }
 
 static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
TCGv_i64 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
 }
 
 static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
 TCGv_i32 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
 TCGv_i64 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void tcg_gen_ldst_op_i32(TCGOpcode opc, TCGv_i32 val

Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up

2012-10-29 Thread Evgeny Voevodin

On 10/27/2012 06:34 PM, Blue Swirl wrote:

On Fri, Oct 26, 2012 at 6:32 AM, Evgeny Voevodin e.voevo...@samsung.com wrote:

Today I did more precise testing using --enable-profiler.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. On each iteration, wait until the JIT cycles value has been stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results:

Before clean-up:
min: 731.9
max: 735.8
avg: 734.3
standard deviation: ~2 = 0.3%
Average cycles/op = 734 +- 2

After clean-up:
min: 747.2
max: 751.7
avg: 750.5
standard deviation: ~2 = 0.3%
Average cycles/op = 750 +- 2
Slow-down of TCG code generation = 2.2%


After clean-up with TCGContext *const tcg_cur_ctx:
min: 730.6
max: 733.2
avg: 728.7
standard deviation: ~2 = 0.3%
Average cycles/op = 729 +- 2
Slow-down of TCG code generation = 0%
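
The slow-down figure quoted for the plain clean-up is the relative change of the average cycles/op against the "before" average. A minimal sketch of that arithmetic, with percent_change being a name made up for this illustration and not part of the profiler:

```c
/*
 * Relative change of `after` against `before`, in percent.
 * For cycles/op a positive value means more cycles per op,
 * i.e. slower code generation.
 */
static double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For the averages above, percent_change(734.3, 750.5) comes out at roughly 2.2, matching the reported slow-down.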

I suggest defining tcg_cur_ctx as TCGContext *const.
Then we get rid of the TCG code generation slow-down and also
no longer use the global variables.
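
To make the two access styles under discussion concrete, here is a small single-file sketch. It is an assumption-laden illustration, not the code from the series: MiniContext, mini_ctx, mini_cur_ctx and the emit_* helpers are made-up names. One helper touches the context object directly, the other goes through a const pointer to it. Because a const pointer cannot be reseated, a compiler that can see its initializer may turn the second form into the first, but whether it does so depends on the compiler and on what is visible in the translation unit.

```c
#include <stdint.h>

typedef struct MiniContext {
    uint16_t opc_buf[16];
    uint16_t *opc_ptr;
} MiniContext;

/* The context object itself, accessed by name ("no pointer"). */
static MiniContext mini_ctx;

/* A const pointer to the same object; the pointer value never changes. */
static MiniContext *const mini_cur_ctx = &mini_ctx;

/* Emit one opcode by naming the context object directly. */
static void emit_direct(uint16_t opc)
{
    *mini_ctx.opc_ptr++ = opc;
}

/* Emit one opcode through the const pointer. */
static void emit_via_const_ptr(uint16_t opc)
{
    *mini_cur_ctx->opc_ptr++ = opc;
}
```

Both helpers advance the same cursor, so they can be mixed freely once mini_ctx.opc_ptr has been set to mini_ctx.opc_buf.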

How does this compare with the original version without pointers? I
think it may be safer to assume that that version gets optimized by
the compiler.


I did more testing with different gcc versions and different patch series:

gcc version   v1 clean-up, no pointer   v2 clean-up, const pointer   master
gcc-4.4       754.3                     752.1                        769.8
gcc-4.5       770.8                     779.8                        774.8
gcc-4.6       731.8                     729.8                        737


Conclusion:
 - The first clean-up series (without pointer) is faster than master in
   all cases, probably because data is cached more efficiently.
 - The second clean-up series (with constant pointer) is faster than
   master with gcc-4.4 and gcc-4.6. With gcc-4.5 it seems that the const
   pointer is not optimised as I assumed.

I think it's worth generating a third series without the pointer and
with the code clean-up from the second series included.

What do you think?



On 10/25/2012 10:45 AM, Evgeny Voevodin wrote:

Here are the results of tests before and after this patch series was
applied:

* EEMBC CoreMark (before -> after)
- Guest: Exynos4210 ARMv7, Linux (Custom buildroot image)
- Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
- Results: 1148.105626 -> 1161.440186 (+1.16%)

* nbench (before -> after)
- Guest: Exynos4210 ARMv7, Linux (Custom buildroot image)
- Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
- Results
  . MEMORY INDEX: 1.864 -> 1.862 (-0.11%)
  . INTEGER INDEX: 2.518 -> 2.523 (+0.2%)
  . FLOATING-POINT INDEX: 0.385 -> 0.394 (+2.34%)


These tests suggest that it even became faster :))

But I'm quite sceptical about such results.
nbench prints a warning if its results are not 95% statistically
accurate, so we can be sure that the nbench results are 95% accurate.
The differences shown above are clearly within that margin.
I don't know the accuracy of CoreMark.

So the main conclusion we can draw is that this patch series didn't
introduce any slow-down distinguishable from the measurement inaccuracy.

Is this enough?

On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on ARM.

I didn't notice any slow-down of kernel boot after this set was applied.

Changelog:
v1->v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where
we don't have an interface to pass pointer to tcg_ctx.

Evgeny (2):
tcg/tcg.h: Duplicate global TCG variables in TCGContext
TCG: Remove unused global variables

Evgeny Voevodin (5):
translate-all.c: Introduce TCGContext *tcg_cur_ctx
TCG: Use gen_opc_ptr from context instead of global variable.
TCG: Use gen_opparam_ptr from context instead of global variable.
TCG: Use gen_opc_buf from context instead of global variable.
TCG: Use gen_opparam_buf from context instead of global variable.

   gen-icount.h  |2 +-
   target-alpha/translate.c  |   10 +-
   target-arm/translate.c|   10 +-
   target-cris/translate.c   |   13 +-
   target-i386/translate.c   |   10 +-
   target-lm32/translate.c   |   13 +-
   target-m68k/translate.c   |   10 +-
   target-microblaze/translate.c |   13 +-
   target-mips/translate.c   |   11 +-
   target-openrisc/translate.c   |   13 +-
   target-ppc/translate.c|   11 +-
   target-s390x/translate.c  |   11 +-
   target-sh4/translate.c|   10 +-
   target-sparc/translate.c  |   10 +-
   target-unicore32/translate.c  |   10 +-
   target-xtensa/translate.c |8 +-
   tcg/optimize.c|   62 
   tcg/tcg-op.h  |  324
-
   tcg/tcg.c

[Qemu-devel] [PATCH v3 3/6] TCG: Use gen_opparam_ptr from context instead of global variable.

2012-10-29 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 gen-icount.h |2 +-
 tcg/tcg-op.h |  254 +-
 tcg/tcg.c|   36 -
 3 files changed, 146 insertions(+), 146 deletions(-)

diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..248cf5b 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -16,7 +16,7 @@ static inline void gen_icount_start(void)
 count = tcg_temp_local_new_i32();
 tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32));
 /* This is a horrid hack to allow fixing up the value later.  */
-icount_arg = gen_opparam_ptr + 1;
+icount_arg = tcg_ctx.gen_opparam_ptr + 1;
 tcg_gen_subi_i32(count, count, 0xdeadbeef);
 
 tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 50b1c62..d6daea4 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc)
 static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
 }
 
 static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
 }
 
 static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
 }
 
 static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
 }
 
 static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
 }
 
 static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
-*gen_opparam_ptr++ = arg2;
+*tcg_ctx.gen_opparam_ptr++ = arg1;
+*tcg_ctx.gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
TCGv_i32 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
 }
 
 static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
TCGv_i64 arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
 }
 
 static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
 TCGv_i32 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
 TCGv_i64 arg2, TCGArg arg3)
 {
 *tcg_ctx.gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_ctx.gen_opparam_ptr++ = arg3;
 }
 
 static inline void tcg_gen_ldst_op_i32(TCGOpcode opc, TCGv_i32 val

[Qemu-devel] [PATCH v3 5/6] TCG: Use gen_opparam_buf from context instead of global variable.

2012-10-29 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index c4e663b..f332463 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
 s->gen_opc_ptr = s->gen_opc_buf;
-s->gen_opparam_ptr = gen_opparam_buf;
+s->gen_opparam_ptr = s->gen_opparam_buf;
 }
 
 static inline void tcg_temp_alloc(TCGContext *s, int n)
@@ -885,7 +885,7 @@ void tcg_dump_ops(TCGContext *s)
 
 first_insn = 1;
 opc_ptr = s->gen_opc_buf;
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 while (opc_ptr < s->gen_opc_ptr) {
 c = *opc_ptr++;
 def = tcg_op_defs[c];
@@ -1409,8 +1409,9 @@ static void tcg_liveness_analysis(TCGContext *s)
 op_index--;
 }
 
-if (args != gen_opparam_buf)
+if (args != s->gen_opparam_buf) {
 tcg_abort();
+}
 }
 #else
 /* dummy liveness analysis */
@@ -2104,7 +2105,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
 s->gen_opparam_ptr =
-tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2131,7 +2132,7 @@ static inline int tcg_gen_code_common(TCGContext *s, uint8_t *gen_code_buf,
 s->code_buf = gen_code_buf;
 s->code_ptr = gen_code_buf;
 
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 op_index = 0;
 
 for(;;) {
-- 
1.7.9.5




[Qemu-devel] [PATCH v3 0/6] TCG global variables clean-up

2012-10-29 Thread Evgeny Voevodin
This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on Exynos4210 target.

After this patchset was applied, I noticed a 0.7% speed-up of code generation.
Probably, this is due to better data caching.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. On each iteration, wait until the JIT cycles value has been stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results (tested on gcc-4.6):

Before clean-up:
min: 731.5
max: 734.8
avg: 733.0
standard deviation: ~2 ~= 0.2%

Average cycles/op = 733 +- 2



After clean-up:
min: 725.0
max: 730.5
avg: 727.8
standard deviation: ~3 ~= 0.4%

Average cycles/op = 728 +- 3
Speed-up of TCG code generation = 0.7%

Changelog:
v2->v3:
Removed tcg_cur_ctx since it gives slow-down on gcc-4.5.
Rebased.
v1->v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we don't
have an interface to pass pointer to tcg_ctx.
Code style clean-up

Evgeny (2):
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Remove unused global variables

Evgeny Voevodin (4):
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.

 gen-icount.h  |2 +-
 target-alpha/translate.c  |   10 +-
 target-arm/translate.c|   10 +-
 target-cris/translate.c   |   13 +-
 target-i386/translate.c   |   10 +-
 target-lm32/translate.c   |   13 +-
 target-m68k/translate.c   |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c   |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c|   11 +-
 target-s390x/translate.c  |   11 +-
 target-sh4/translate.c|   10 +-
 target-sparc/translate.c  |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c |8 +-
 tcg/optimize.c|   62 
 tcg/tcg-op.h  |  324 -
 tcg/tcg.c |   85 ++-
 tcg/tcg.h |   10 +-
 translate-all.c   |3 -
 21 files changed, 326 insertions(+), 323 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH v3 6/6] TCG: Remove unused global variables

2012-10-29 Thread Evgeny Voevodin
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c   |4 
 tcg/tcg.h   |4 
 translate-all.c |3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index f332463..53bf109 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
 *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 43b4317..b1f4e49 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -431,10 +431,6 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
 
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..d9c2e57 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -33,9 +33,6 @@
 /* code generation context */
 TCGContext tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5




[Qemu-devel] [PATCH v3 4/6] TCG: Use gen_opc_buf from context instead of global variable.

2012-10-29 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |6 ++--
 target-arm/translate.c|6 ++--
 target-cris/translate.c   |9 +++---
 target-i386/translate.c   |6 ++--
 target-lm32/translate.c   |9 +++---
 target-m68k/translate.c   |6 ++--
 target-microblaze/translate.c |9 +++---
 target-mips/translate.c   |6 ++--
 target-openrisc/translate.c   |9 +++---
 target-ppc/translate.c|6 ++--
 target-s390x/translate.c  |6 ++--
 target-sh4/translate.c|6 ++--
 target-sparc/translate.c  |6 ++--
 target-unicore32/translate.c  |6 ++--
 target-xtensa/translate.c |4 +--
 tcg/optimize.c|   62 -
 tcg/tcg.c |   30 ++--
 17 files changed, 98 insertions(+), 94 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 6676cbf..91c761a 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3373,7 +3373,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 int max_insns;
 
 pc_start = tb->pc;
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 ctx.tb = tb;
 ctx.env = env;
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3465,7 +3465,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index ff5d294..0602b31 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9727,7 +9727,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->pc = pc_start;
@@ -9834,7 +9834,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 903907b..c54e3df 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3202,7 +3202,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 dc->env = env;
 dc->tb = tb;
 
-   gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+   gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->ppc = pc_start;
@@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 check_breakpoint(env, dc);
 
 if (search_pc) {
-   j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+   j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3401,7 +3401,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 gen_icount_end(tb, num_insns);
 *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-   j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
+   j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
@@ -3416,7 +3416,8 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
 log_target_disas(pc_start, dc->pc - pc_start,
  dc->env->pregs[PR_VR]);
 qemu_log("\nisize=%d osize=%td\n",
-   dc->pc - pc_start, tcg_ctx.gen_opc_ptr - gen_opc_buf);
+   dc->pc - pc_start, tcg_ctx.gen_opc_ptr -
+   tcg_ctx.gen_opc_buf);
}
 #endif
 #endif
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 5f977d9..1563677 100644
--- a/target-i386/translate.c
+++ b

[Qemu-devel] [PATCH v3 1/6] tcg/tcg.h: Duplicate global TCG variables in TCGContext

2012-10-29 Thread Evgeny Voevodin
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny e.voevo...@samsung.com
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 45e94f5..43b4317 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -422,6 +422,12 @@ struct TCGContext {
 int temps_in_use;
 int goto_tb_issue_mask;
 #endif
+
+uint16_t gen_opc_buf[OPC_BUF_SIZE];
+TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+uint16_t *gen_opc_ptr;
+TCGArg *gen_opparam_ptr;
 };
 
 extern TCGContext tcg_ctx;
-- 
1.7.9.5




[Qemu-devel] [PATCH v3 2/6] TCG: Use gen_opc_ptr from context instead of global variable.

2012-10-29 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |8 ++---
 target-arm/translate.c|8 ++---
 target-cris/translate.c   |   10 +++---
 target-i386/translate.c   |8 ++---
 target-lm32/translate.c   |   10 +++---
 target-m68k/translate.c   |8 ++---
 target-microblaze/translate.c |   10 +++---
 target-mips/translate.c   |9 +++---
 target-openrisc/translate.c   |   10 +++---
 target-ppc/translate.c|9 +++---
 target-s390x/translate.c  |9 +++---
 target-sh4/translate.c|8 ++---
 target-sparc/translate.c  |8 ++---
 target-unicore32/translate.c  |8 ++---
 target-xtensa/translate.c |6 ++--
 tcg/tcg-op.h  |   70 -
 tcg/tcg.c |   16 +-
 17 files changed, 109 insertions(+), 106 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f707d8d..6676cbf 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3406,7 +3406,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
 or exhaust instruction count, stop generation.  */
 if (ret == NO_EXIT
 && ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0
-|| gen_opc_ptr >= gen_opc_end
+|| tcg_ctx.gen_opc_ptr >= gen_opc_end
 || num_insns >= max_insns
 || singlestep
 || env->singlestep_enabled)) {
@@ -3463,9 +3463,9 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 25433da..ff5d294 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9834,7 +9834,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9881,7 +9881,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
  * Also stop translation when a page boundary is reached.  This
  * ensures prefetch aborts occur at the right place.  */
 num_insns ++;
-} while (!dc->is_jmp && gen_opc_ptr < gen_opc_end &&
+} while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
 !env->singlestep_enabled &&
 !singlestep &&
 dc->pc < next_page_start &&
@@ -9962,7 +9962,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 
 done_generating:
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -9974,7 +9974,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 755de65..903907b 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
check_breakpoint(env, dc);
 
if (search_pc) {
-   j = gen_opc_ptr - gen_opc_buf;
+   j = tcg_ctx.gen_opc_ptr - gen_opc_buf;
if (lj < j) {
lj++;
while (lj < j)
@@ -3348,7 +3348,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
if (!(tb->pc & 1) && env->singlestep_enabled)
break;
} while (!dc->is_jmp && !dc->cpustate_changed
- && gen_opc_ptr < gen_opc_end
+ && tcg_ctx.gen_opc_ptr < gen_opc_end
 && !singlestep
 && (dc->pc < next_page_start)
 && num_insns < max_insns);
@@ -3399,9 +3399,9 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
}
}
 gen_icount_end(tb, num_insns);
-   *gen_opc_ptr = INDEX_op_end;
+   *tcg_ctx.gen_opc_ptr = INDEX_op_end

Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up

2012-10-26 Thread Evgeny Voevodin

Today I ran more precise tests using --enable-profiler.

Here is the test procedure:
1. Boot the Linux kernel 5 times.
2. For each iteration, wait until the JIT cycles value has been stable for ~10 seconds.
3. Write down the cycles/op.

Here are the results:

Before clean-up:
min: 731.9
max: 735.8
avg: 734.3
standard deviation: ~2 = 0.3%
Average cycles/op = 734 +- 2

After clean-up:
min: 747.2
max: 751.7
avg: 750.5
standard deviation: ~2 = 0.3%
Average cycles/op = 750 +- 2
Slow-down of TCG code generation = 2.2%


After clean-up with TCGContext *const tcg_cur_ctx:
min: 730.6
max: 733.2
avg: 728.7
standard deviation: ~2 = 0.3%
Average cycles/op = 729 +- 2
Slow-down of TCG code generation = 0%
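
For reference, the slow-down figures above follow from the averages as
(after - before) / before. Below is a minimal C sketch of that arithmetic,
using only the averages quoted in this message. The function names are made
up for illustration and are not part of QEMU.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change of `after` with respect to `before`, in percent.
 * A positive value means more cycles/op, i.e. slower code generation. */
static double slowdown_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Print both comparisons, using the averages quoted in this message. */
static void print_slowdowns(void)
{
    const double before = 734.3;       /* avg before clean-up */
    const double after = 750.5;        /* avg after clean-up */
    const double after_const = 728.7;  /* avg with TCGContext *const */

    printf("plain pointer: %+.1f%%\n", slowdown_percent(before, after));
    printf("const pointer: %+.1f%%\n", slowdown_percent(before, after_const));
}
```

With these inputs the plain-pointer variant comes out at roughly +2.2%, and
the *const variant comes out slightly negative, which is presumably why it is
reported as a 0% slow-down.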

I suggest defining tcg_cur_ctx as TCGContext *const.
That gets rid of the TCG code generation slow-down, and we
still no longer use the global variables.
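
To illustrate the suggestion, here is a minimal, self-contained sketch of the
two declarations side by side. All Sketch*/sketch_* names and the buffer size
are hypothetical stand-ins, not the real TCG definitions. The assumption is
that a *const pointer can never be reassigned, so the compiler does not have
to reload it before each access and may treat it as the address of the context
itself. Whether a given compiler actually does so is not verified here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_OPC_BUF_SIZE 16   /* hypothetical size, for the sketch only */

typedef struct SketchContext {
    uint16_t gen_opc_buf[SKETCH_OPC_BUF_SIZE];
    uint16_t *gen_opc_ptr;
} SketchContext;

static SketchContext sketch_ctx;

/* Variant 1: plain pointer. It could be reassigned at run time, so an
 * access through it generally has to read the pointer value first. */
static SketchContext *sketch_cur_ctx = &sketch_ctx;

/* Variant 2: const pointer. The pointer itself never changes, only the
 * structure it points to does. */
static SketchContext *const sketch_cur_ctx_const = &sketch_ctx;

/* Reset the write pointer to the start of the opcode buffer. */
static void sketch_func_start(void)
{
    sketch_ctx.gen_opc_ptr = sketch_ctx.gen_opc_buf;
}

/* Emit one opcode through the plain pointer. */
static void sketch_gen_op_plain(uint16_t opc)
{
    assert(sketch_cur_ctx->gen_opc_ptr <
           sketch_cur_ctx->gen_opc_buf + SKETCH_OPC_BUF_SIZE);
    *sketch_cur_ctx->gen_opc_ptr++ = opc;
}

/* Emit one opcode through the const pointer. */
static void sketch_gen_op_const(uint16_t opc)
{
    assert(sketch_cur_ctx_const->gen_opc_ptr <
           sketch_cur_ctx_const->gen_opc_buf + SKETCH_OPC_BUF_SIZE);
    *sketch_cur_ctx_const->gen_opc_ptr++ = opc;
}
```

Both helpers write into the same context; the only difference is how the
pointer to it is declared.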

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up

2012-10-25 Thread Evgeny Voevodin
Here are the results of tests before and after this patch series was 
applied:


* EEMBC CoreMark (before -> after)
  - Guest: Exynos4210 ARMv7, Linux (Custom buildroot image)
  - Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
  - Results: 1148.105626 -> 1161.440186 (+1.16%)

* nbench (before -> after)
  - Guest: Exynos4210 ARMv7, Linux (Custom buildroot image)
  - Host: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz, 4GB RAM, Linux
  - Results
    . MEMORY INDEX: 1.864 -> 1.862 (-0.11%)
    . INTEGER INDEX: 2.518 -> 2.523 (+0.2%)
    . FLOATING-POINT INDEX: 0.385 -> 0.394 (+2.34%)


Those tests show that it became even faster :))

But I'm quite sceptical about these results.
The thing is that nbench prints a warning if its results are
not 95% statistically accurate.

So we can be sure that the nbench results are 95% accurate.
And it's obvious that the differences shown above are within this accuracy.
I don't know the accuracy of CoreMark.

So the main conclusion we can draw is that this patch series didn't
introduce any slow-down larger than the inaccuracy of the measurement.


Is this enough?

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up

2012-10-24 Thread Evgeny Voevodin

Any other comments on the patches?
I didn't see a consensus. Do we need a pointer to the TCG context?
As I said before, I didn't notice any slow-down with it.

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up

2012-10-24 Thread Evgeny Voevodin

On 10/25/2012 07:17 AM, 陳韋任 (Wei-Ren Chen) wrote:

On Thu, Oct 25, 2012 at 07:06:37AM +0400, Evgeny Voevodin wrote:

Any other comments on the patches?
I didn't get the consensus. Do we need a pointer to tcg context?
As I said before, I didn't notice any slow-down with it.

On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on ARM.

I didn't notice any slow-down of kernel boot after this set was applied.

   Would you like to try running some benchmarks after the kernel boots? Like
Yeongkyoon Lee did for his qemu_ld/qemu_st work [1]: EEMBC, nbench,
or even SPEC. ;)

Regards,
chenwj

[1] http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03630.html



Sure.

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com





[Qemu-devel] [PATCH v2 7/7] TCG: Remove unused global variables

2012-10-23 Thread Evgeny Voevodin
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c   |4 
 tcg/tcg.h   |4 
 translate-all.c |3 ---
 3 files changed, 11 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index f332463..53bf109 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,10 +96,6 @@ const size_t tcg_op_defs_max = ARRAY_SIZE(tcg_op_defs);
 static TCGRegSet tcg_target_available_regs[2];
 static TCGRegSet tcg_target_call_clobber_regs;
 
-/* XXX: move that inside the context */
-uint16_t *gen_opc_ptr;
-TCGArg *gen_opparam_ptr;
-
 static inline void tcg_out8(TCGContext *s, uint8_t v)
 {
 *s->code_ptr++ = v;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index d326b36..19426f9 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -432,10 +432,6 @@ struct TCGContext {
 
 extern TCGContext tcg_ctx;
 extern TCGContext *tcg_cur_ctx;
-extern uint16_t *gen_opc_ptr;
-extern TCGArg *gen_opparam_ptr;
-extern uint16_t gen_opc_buf[];
-extern TCGArg gen_opparam_buf[];
 
 /* pool based memory allocation */
 
diff --git a/translate-all.c b/translate-all.c
index ccdcddf..3351012 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -34,9 +34,6 @@
 TCGContext tcg_ctx;
 TCGContext *tcg_cur_ctx = &tcg_ctx;
 
-uint16_t gen_opc_buf[OPC_BUF_SIZE];
-TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
 target_ulong gen_opc_pc[OPC_BUF_SIZE];
 uint16_t gen_opc_icount[OPC_BUF_SIZE];
 uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-- 
1.7.9.5




[Qemu-devel] [PATCH v2 1/7] tcg/tcg.h: Duplicate global TCG variables in TCGContext

2012-10-23 Thread Evgeny Voevodin
From: Evgeny e.voevo...@samsung.com

Signed-off-by: Evgeny e.voevo...@samsung.com
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.h |6 ++
 1 file changed, 6 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 45e94f5..43b4317 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -422,6 +422,12 @@ struct TCGContext {
 int temps_in_use;
 int goto_tb_issue_mask;
 #endif
+
+uint16_t gen_opc_buf[OPC_BUF_SIZE];
+TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+uint16_t *gen_opc_ptr;
+TCGArg *gen_opparam_ptr;
 };
 
 extern TCGContext tcg_ctx;
-- 
1.7.9.5




[Qemu-devel] [PATCH v2 3/7] TCG: Use gen_opc_ptr from context instead of global variable.

2012-10-23 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |8 ++---
 target-arm/translate.c|8 ++---
 target-cris/translate.c   |   10 +++---
 target-i386/translate.c   |8 ++---
 target-lm32/translate.c   |   10 +++---
 target-m68k/translate.c   |8 ++---
 target-microblaze/translate.c |   10 +++---
 target-mips/translate.c   |9 +++---
 target-openrisc/translate.c   |   10 +++---
 target-ppc/translate.c|9 +++---
 target-s390x/translate.c  |9 +++---
 target-sh4/translate.c|8 ++---
 target-sparc/translate.c  |8 ++---
 target-unicore32/translate.c  |8 ++---
 target-xtensa/translate.c |6 ++--
 tcg/tcg-op.h  |   70 -
 tcg/tcg.c |   16 +-
 17 files changed, 109 insertions(+), 106 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index f707d8d..751d457 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3406,7 +3406,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3432,7 +3432,7 @@ static inline void gen_intermediate_code_internal(CPUAlphaState *env,
or exhaust instruction count, stop generation.  */
 if (ret == NO_EXIT
 && ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0
-|| gen_opc_ptr >= gen_opc_end
+|| tcg_cur_ctx->gen_opc_ptr >= gen_opc_end
 || num_insns >= max_insns
 || singlestep
 || env->singlestep_enabled)) {
@@ -3463,9 +3463,9 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_cur_ctx->gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index daccb15..41b1671 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9825,7 +9825,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9872,7 +9872,7 @@ static inline void gen_intermediate_code_internal(CPUARMState *env,
  * Also stop translation when a page boundary is reached.  This
  * ensures prefetch aborts occur at the right place.  */
 num_insns ++;
-} while (!dc->is_jmp && gen_opc_ptr < gen_opc_end &&
+} while (!dc->is_jmp && tcg_cur_ctx->gen_opc_ptr < gen_opc_end &&
 !env->singlestep_enabled &&
 !singlestep &&
 dc->pc < next_page_start &&
@@ -9953,7 +9953,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 
 done_generating:
 gen_icount_end(tb, num_insns);
-*gen_opc_ptr = INDEX_op_end;
+*tcg_cur_ctx->gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -9965,7 +9965,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 755de65..9bec8b5 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
check_breakpoint(env, dc);
 
if (search_pc) {
-   j = gen_opc_ptr - gen_opc_buf;
+   j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
if (lj < j) {
lj++;
while (lj < j)
@@ -3348,7 +3348,7 @@ gen_intermediate_code_internal(CPUCRISState *env, TranslationBlock *tb,
if (!(tb->pc & 1) && env->singlestep_enabled)
break;
} while (!dc->is_jmp && !dc->cpustate_changed
- && gen_opc_ptr < gen_opc_end
+ && tcg_cur_ctx->gen_opc_ptr < gen_opc_end
 && !singlestep
 && (dc->pc < next_page_start)
 && num_insns < max_insns);
@@ -3399,9 +3399,9 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
}
}
 gen_icount_end(tb, num_insns);
-   *gen_opc_ptr = INDEX_op_end;
+   *tcg_cur_ctx

[Qemu-devel] [PATCH v2 4/7] TCG: Use gen_opparam_ptr from context instead of global variable.

2012-10-23 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 gen-icount.h |2 +-
 tcg/tcg-op.h |  254 +-
 tcg/tcg.c|   36 -
 3 files changed, 146 insertions(+), 146 deletions(-)

diff --git a/gen-icount.h b/gen-icount.h
index 430cb44..be0bd7e 100644
--- a/gen-icount.h
+++ b/gen-icount.h
@@ -16,7 +16,7 @@ static inline void gen_icount_start(void)
 count = tcg_temp_local_new_i32();
 tcg_gen_ld_i32(count, cpu_env, offsetof(CPUArchState, icount_decr.u32));
 /* This is a horrid hack to allow fixing up the value later.  */
-icount_arg = gen_opparam_ptr + 1;
+icount_arg = tcg_cur_ctx->gen_opparam_ptr + 1;
 tcg_gen_subi_i32(count, count, 0xdeadbeef);
 
 tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 4793909..39ce9de 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -33,230 +33,230 @@ static inline void tcg_gen_op0(TCGOpcode opc)
 static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg1);
 }
 
 static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg1);
 }
 
 static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
+*tcg_cur_ctx->gen_opparam_ptr++ = arg1;
 }
 
 static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg2);
 }
 
 static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg2);
 }
 
 static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = arg2;
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = arg1;
-*gen_opparam_ptr++ = arg2;
+*tcg_cur_ctx->gen_opparam_ptr++ = arg1;
+*tcg_cur_ctx->gen_opparam_ptr++ = arg2;
 }
 
 static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
TCGv_i32 arg3)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg3);
 }
 
 static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
TCGv_i64 arg3)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg3);
 }
 
 static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
 TCGv_i32 arg2, TCGArg arg3)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+*tcg_cur_ctx->gen_opparam_ptr++ = arg3;
 }
 
 static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
 TCGv_i64 arg2, TCGArg arg3)
 {
 *tcg_cur_ctx->gen_opc_ptr++ = opc;
-*gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-*gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-*gen_opparam_ptr++ = arg3;
+*tcg_cur_ctx->gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+*tcg_cur_ctx-gen_opparam_ptr

[Qemu-devel] [PATCH v2 5/7] TCG: Use gen_opc_buf from context instead of global variable.

2012-10-23 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 target-alpha/translate.c  |6 ++--
 target-arm/translate.c|6 ++--
 target-cris/translate.c   |9 +++---
 target-i386/translate.c   |6 ++--
 target-lm32/translate.c   |9 +++---
 target-m68k/translate.c   |6 ++--
 target-microblaze/translate.c |9 +++---
 target-mips/translate.c   |6 ++--
 target-openrisc/translate.c   |9 +++---
 target-ppc/translate.c|6 ++--
 target-s390x/translate.c  |6 ++--
 target-sh4/translate.c|6 ++--
 target-sparc/translate.c  |6 ++--
 target-unicore32/translate.c  |6 ++--
 target-xtensa/translate.c |4 +--
 tcg/optimize.c|   62 -
 tcg/tcg.c |   30 ++--
 17 files changed, 98 insertions(+), 94 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 751d457..7454ebd 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3373,7 +3373,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 int max_insns;
 
 pc_start = tb-pc;
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_cur_ctx->gen_opc_buf + OPC_MAX_SIZE;
 
 ctx.tb = tb;
 ctx.env = env;
@@ -3406,7 +3406,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 }
 }
 if (search_pc) {
-j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - tcg_cur_ctx->gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -3465,7 +3465,7 @@ static inline void 
gen_intermediate_code_internal(CPUAlphaState *env,
 gen_icount_end(tb, num_insns);
 *tcg_cur_ctx->gen_opc_ptr = INDEX_op_end;
 if (search_pc) {
-j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - tcg_cur_ctx->gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 41b1671..3c82a0d 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9718,7 +9718,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 
 dc->tb = tb;
 
-gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+gen_opc_end = tcg_cur_ctx->gen_opc_buf + OPC_MAX_SIZE;
 
 dc->is_jmp = DISAS_NEXT;
 dc->pc = pc_start;
@@ -9825,7 +9825,7 @@ static inline void 
gen_intermediate_code_internal(CPUARMState *env,
 }
 }
 if (search_pc) {
-j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - tcg_cur_ctx->gen_opc_buf;
 if (lj < j) {
 lj++;
 while (lj < j)
@@ -9965,7 +9965,7 @@ done_generating:
 }
 #endif
 if (search_pc) {
-j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
+j = tcg_cur_ctx->gen_opc_ptr - tcg_cur_ctx->gen_opc_buf;
 lj++;
 while (lj <= j)
 gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 9bec8b5..587df16 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3202,7 +3202,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
dc->env = env;
dc->tb = tb;
 
-   gen_opc_end = gen_opc_buf + OPC_MAX_SIZE;
+   gen_opc_end = tcg_cur_ctx->gen_opc_buf + OPC_MAX_SIZE;
 
dc->is_jmp = DISAS_NEXT;
dc->ppc = pc_start;
@@ -3266,7 +3266,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
check_breakpoint(env, dc);
 
if (search_pc) {
-   j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
+   j = tcg_cur_ctx->gen_opc_ptr - tcg_cur_ctx->gen_opc_buf;
if (lj < j) {
lj++;
while (lj < j)
@@ -3401,7 +3401,7 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
 gen_icount_end(tb, num_insns);
*tcg_cur_ctx->gen_opc_ptr = INDEX_op_end;
if (search_pc) {
-   j = tcg_cur_ctx->gen_opc_ptr - gen_opc_buf;
+   j = tcg_cur_ctx->gen_opc_ptr - tcg_cur_ctx->gen_opc_buf;
lj++;
while (lj <= j)
gen_opc_instr_start[lj++] = 0;
@@ -3416,7 +3416,8 @@ gen_intermediate_code_internal(CPUCRISState *env, 
TranslationBlock *tb,
log_target_disas(pc_start, dc->pc - pc_start,
  dc->env->pregs[PR_VR]);
qemu_log("\nisize=%d osize=%td\n",
-   dc->pc - pc_start, tcg_cur_ctx->gen_opc_ptr - gen_opc_buf);
+   dc->pc - pc_start, tcg_cur_ctx->gen_opc_ptr -
+   tcg_cur_ctx->gen_opc_buf);
}
 #endif
 #endif
diff --git a/target-i386

[Qemu-devel] [PATCH v2 6/7] TCG: Use gen_opparam_buf from context instead of global variable.

2012-10-23 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index c4e663b..f332463 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -298,7 +298,7 @@ void tcg_func_start(TCGContext *s)
 #endif
 
 s->gen_opc_ptr = s->gen_opc_buf;
-s->gen_opparam_ptr = gen_opparam_buf;
+s->gen_opparam_ptr = s->gen_opparam_buf;
 }
 
 static inline void tcg_temp_alloc(TCGContext *s, int n)
@@ -885,7 +885,7 @@ void tcg_dump_ops(TCGContext *s)
 
 first_insn = 1;
 opc_ptr = s->gen_opc_buf;
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 while (opc_ptr < s->gen_opc_ptr) {
 c = *opc_ptr++;
 def = &tcg_op_defs[c];
@@ -1409,8 +1409,9 @@ static void tcg_liveness_analysis(TCGContext *s)
 op_index--;
 }
 
-if (args != gen_opparam_buf)
+if (args != s->gen_opparam_buf) {
 tcg_abort();
+}
 }
 #else
 /* dummy liveness analysis */
@@ -2104,7 +2105,7 @@ static inline int tcg_gen_code_common(TCGContext *s, 
uint8_t *gen_code_buf,
 
 #ifdef USE_TCG_OPTIMIZATIONS
 s->gen_opparam_ptr =
-tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2131,7 +2132,7 @@ static inline int tcg_gen_code_common(TCGContext *s, 
uint8_t *gen_code_buf,
 s->code_buf = gen_code_buf;
 s->code_ptr = gen_code_buf;
 
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
 op_index = 0;
 
 for(;;) {
-- 
1.7.9.5




[Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up

2012-10-23 Thread Evgeny Voevodin
This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on ARM.

I didn't notice any slow-down of kernel boot after this set was applied.

Changelog:
v1->v2:
Introduced a TCGContext *tcg_cur_ctx global for use in those places where we
don't have an interface to pass a pointer to tcg_ctx.
Code style clean-up
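
As a rough illustration of what the series does (a sketch only: the Demo*
names and buffer sizes are invented, and only the field names mirror the
variables listed above), the buffers and their write pointers become members
of a context structure, and a start function resets the pointers, much like
tcg_func_start() does for the real TCGContext.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_OPC_BUF_SIZE 8        /* hypothetical sizes for the sketch */
#define DEMO_OPPARAM_BUF_SIZE 32

typedef uintptr_t DemoArg;         /* stand-in for TCGArg */

/* Former globals, now grouped in one context structure. */
typedef struct DemoContext {
    uint16_t gen_opc_buf[DEMO_OPC_BUF_SIZE];
    DemoArg gen_opparam_buf[DEMO_OPPARAM_BUF_SIZE];
    uint16_t *gen_opc_ptr;
    DemoArg *gen_opparam_ptr;
} DemoContext;

/* Point both write pointers at the start of their buffers. */
static void demo_func_start(DemoContext *s)
{
    s->gen_opc_ptr = s->gen_opc_buf;
    s->gen_opparam_ptr = s->gen_opparam_buf;
}

/* Append one opcode with two arguments to the given context. */
static void demo_gen_op2(DemoContext *s, uint16_t opc, DemoArg a1, DemoArg a2)
{
    assert(s->gen_opc_ptr < s->gen_opc_buf + DEMO_OPC_BUF_SIZE);
    assert(s->gen_opparam_ptr + 2 <=
           s->gen_opparam_buf + DEMO_OPPARAM_BUF_SIZE);
    *s->gen_opc_ptr++ = opc;
    *s->gen_opparam_ptr++ = a1;
    *s->gen_opparam_ptr++ = a2;
}

/* Number of opcodes emitted since the last demo_func_start(). */
static size_t demo_op_count(const DemoContext *s)
{
    return (size_t)(s->gen_opc_ptr - s->gen_opc_buf);
}
```

One consequence of this layout is that two contexts no longer share any
state: emitting into one leaves the other untouched.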

Evgeny (2):
  tcg/tcg.h: Duplicate global TCG variables in TCGContext
  TCG: Remove unused global variables

Evgeny Voevodin (5):
  translate-all.c: Introduce TCGContext *tcg_cur_ctx
  TCG: Use gen_opc_ptr from context instead of global variable.
  TCG: Use gen_opparam_ptr from context instead of global variable.
  TCG: Use gen_opc_buf from context instead of global variable.
  TCG: Use gen_opparam_buf from context instead of global variable.

 gen-icount.h  |2 +-
 target-alpha/translate.c  |   10 +-
 target-arm/translate.c|   10 +-
 target-cris/translate.c   |   13 +-
 target-i386/translate.c   |   10 +-
 target-lm32/translate.c   |   13 +-
 target-m68k/translate.c   |   10 +-
 target-microblaze/translate.c |   13 +-
 target-mips/translate.c   |   11 +-
 target-openrisc/translate.c   |   13 +-
 target-ppc/translate.c|   11 +-
 target-s390x/translate.c  |   11 +-
 target-sh4/translate.c|   10 +-
 target-sparc/translate.c  |   10 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c |8 +-
 tcg/optimize.c|   62 
 tcg/tcg-op.h  |  324 -
 tcg/tcg.c |   85 ++-
 tcg/tcg.h |   11 +-
 translate-all.c   |4 +-
 21 files changed, 328 insertions(+), 323 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH v2 2/7] translate-all.c: Introduce TCGContext *tcg_cur_ctx

2012-10-23 Thread Evgeny Voevodin
We will use this pointer from functions where we don't have an
interface to pass tcg_ctx as a parameter.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 tcg/tcg.h   |1 +
 translate-all.c |1 +
 2 files changed, 2 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 43b4317..d326b36 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -431,6 +431,7 @@ struct TCGContext {
 };
 
 extern TCGContext tcg_ctx;
+extern TCGContext *tcg_cur_ctx;
 extern uint16_t *gen_opc_ptr;
 extern TCGArg *gen_opparam_ptr;
 extern uint16_t gen_opc_buf[];
diff --git a/translate-all.c b/translate-all.c
index 5bd2d37..ccdcddf 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -32,6 +32,7 @@
 
 /* code generation context */
 TCGContext tcg_ctx;
+TCGContext *tcg_cur_ctx = &tcg_ctx;
 
 uint16_t gen_opc_buf[OPC_BUF_SIZE];
 TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH v2 0/7] TCG global variables clean-up

2012-10-23 Thread Evgeny Voevodin

On 10/23/2012 10:21 AM, Evgeny Voevodin wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Build tested for all targets.
Execution tested on ARM.

I didn't notice any slow-down of kernel boot after this set was applied.

Changelog:
v1-v2:
Introduced TCGContext *tcg_cur_ctx global to use in those places where we don't
have an interface to pass pointer to tcg_ctx.
Code style clean-up

Evgeny (2):
   tcg/tcg.h: Duplicate global TCG variables in TCGContext
   TCG: Remove unused global variables



It seems that I cherry-picked commits that were made before I correctly
set a user name. Hope I don't need to generate v3 because of that.


Evgeny Voevodin (5):
   translate-all.c: Introduce TCGContext *tcg_cur_ctx
   TCG: Use gen_opc_ptr from context instead of global variable.
   TCG: Use gen_opparam_ptr from context instead of global variable.
   TCG: Use gen_opc_buf from context instead of global variable.
   TCG: Use gen_opparam_buf from context instead of global variable.

  gen-icount.h  |2 +-
  target-alpha/translate.c  |   10 +-
  target-arm/translate.c|   10 +-
  target-cris/translate.c   |   13 +-
  target-i386/translate.c   |   10 +-
  target-lm32/translate.c   |   13 +-
  target-m68k/translate.c   |   10 +-
  target-microblaze/translate.c |   13 +-
  target-mips/translate.c   |   11 +-
  target-openrisc/translate.c   |   13 +-
  target-ppc/translate.c|   11 +-
  target-s390x/translate.c  |   11 +-
  target-sh4/translate.c|   10 +-
  target-sparc/translate.c  |   10 +-
  target-unicore32/translate.c  |   10 +-
  target-xtensa/translate.c |8 +-
  tcg/optimize.c|   62 
  tcg/tcg-op.h  |  324 -
  tcg/tcg.c |   85 ++-
  tcg/tcg.h |   11 +-
  translate-all.c   |4 +-
  21 files changed, 328 insertions(+), 323 deletions(-)




--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] [PATCH v2 2/7] translate-all.c: Introduce TCGContext *tcg_cur_ctx

2012-10-23 Thread Evgeny Voevodin

On 10/24/2012 01:18 AM, Richard Henderson wrote:

On 2012-10-23 16:21, Evgeny Voevodin wrote:

We will use this pointer from functions where we don't have an
interface to pass tcg_ctx as a parameter.

I don't think this is worthwhile.  It'll just make the whole thing slower,
passing around unnecessary pointers.


r~



1. I didn't notice any slow-down of the kernel boot process. Maybe it's
worth running more tests with self-modifying code, but I don't think so, because
2. The most intensive usage of tcg_cur_ctx is in the tcg/tcg-op.h functions.
If we look carefully at them, we will see that there are only a few functions
for which a single extra pointer dereference could lead to any significant
slow-down. These are the functions that perform just one or two operations
and exit. And we should keep in mind that there is only a single dereference
of the pointer, since it is then kept in a register for further operations.

Of course some slow-down should be present, but I found it negligible
(actually I didn't find it at all).
If there are some common tests for TCG generation speed, I can try to run
them and report the results.

Also, we can declare tcg_cur_ctx as const, and in that case I guess that
dereferencing tcg_cur_ctx should not lead to any slow-down.

Alternatively, I can drop tcg_cur_ctx and use tcg_ctx.xxx instead, as in the
first series.
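
To make the const variant concrete, here is a minimal sketch. It assumes the declaration would take the form "TCGContext *const"; the names CDemoContext, cdemo_ctx, cdemo_cur_ctx and the helper functions are hypothetical and not part of the series. A const pointer with a visible initializer can never be retargeted, so a compiler is free (though not required) to resolve accesses through it to the same address it would use for direct access to the global.

```c
#include <assert.h>
#include <stdint.h>

#define CDEMO_BUF_SIZE 16

typedef struct CDemoContext {
    uint16_t gen_opc_buf[CDEMO_BUF_SIZE];
    uint16_t *gen_opc_ptr;
} CDemoContext;

CDemoContext cdemo_ctx;

/*
 * Const pointer, not pointer-to-const: the context stays writable,
 * but the pointer itself can never be made to point elsewhere.
 */
CDemoContext *const cdemo_cur_ctx = &cdemo_ctx;

/* Direct access to the global, as in the first series (tcg_ctx.xxx). */
void cdemo_reset_direct(void)
{
    cdemo_ctx.gen_opc_ptr = cdemo_ctx.gen_opc_buf;
}

/* Access through the const pointer, as in v2 with the const qualifier. */
void cdemo_emit_via_pointer(uint16_t opc)
{
    assert(cdemo_cur_ctx->gen_opc_ptr <
           cdemo_cur_ctx->gen_opc_buf + CDEMO_BUF_SIZE);
    *cdemo_cur_ctx->gen_opc_ptr++ = opc;
}

/* Both access paths refer to the same object. */
int cdemo_same_object(void)
{
    return cdemo_cur_ctx == &cdemo_ctx;
}
```

Whether the generated code is really identical depends on the compiler and optimization level, so this would still need to be measured.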


What about the rest of the patches?

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [PATCH 0/6] *** TCG global variables clean-up ***

2012-10-21 Thread Evgeny Voevodin

On 10/19/2012 09:55 PM, Blue Swirl wrote:

On Fri, Oct 19, 2012 at 12:42 PM, Evgeny e.voevo...@samsung.com wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Where it was possible I used s->...
Where we don't have an interface to pass a pointer to tcg_ctx, I used
tcg_ctx.xxx
since it is a global variable too.

Maybe a pointer should be added so that the references become
tcg_ctx_ptr->xxx. This would incur unnecessary pointer dereference
penalties though.


I thought about using a TCGContext *tcg_cur_ctx, but I decided that we
don't need this until we want to use multiple TCG contexts.
If we would like to introduce such a pointer, then I'd prefer to hold it
in CPUArchState inside CPU_COMMON and initialize it in
qemu_tcg_cpu_thread_fn for each CPU:
env->tcg_cur_ctx = &tcg_ctx;
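
A minimal sketch of that per-CPU idea, under stated assumptions: PcDemoCPUState stands in for CPUArchState, PcDemoContext for TCGContext, and pcdemo_cpu_thread_init for whatever qemu_tcg_cpu_thread_fn would do at start-up. All of these names, and the nb_ops counter, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for TCGContext; nb_ops is only here to have something to touch. */
typedef struct PcDemoContext {
    int nb_ops;
} PcDemoContext;

/* Stand-in for CPUArchState with a hypothetical tcg_cur_ctx field. */
typedef struct PcDemoCPUState {
    int cpu_index;
    PcDemoContext *tcg_cur_ctx;
} PcDemoCPUState;

/* The single shared context that exists today. */
PcDemoContext pcdemo_ctx;

/* What a per-CPU thread function might do once when the CPU starts. */
void pcdemo_cpu_thread_init(PcDemoCPUState *env, int cpu_index)
{
    assert(env != NULL);
    env->cpu_index = cpu_index;
    env->tcg_cur_ctx = &pcdemo_ctx;
}

/* Code that has env at hand reaches the context without naming the global. */
void pcdemo_count_op(PcDemoCPUState *env)
{
    assert(env != NULL && env->tcg_cur_ctx != NULL);
    env->tcg_cur_ctx->nb_ops++;
}
```

With this layout every CPU initially points at the same context, and moving to one context per CPU would only change the initialization. It helps only code that actually has env available.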





Build tested for all targets.
Execution tested on ARM.

I didn't notice any slow-down of kernel boot after this set was applied.

Evgeny (6):
   tcg/tcg.h: Duplicate global TCG variables in TCGContext
   TCG: Use gen_opc_ptr from context instead of global variable.
   TCG: Use gen_opparam_ptr from context instead of global variable.
   TCG: Use gen_opc_buf from context instead of global variable.
   TCG: Use gen_opparam_buf from context instead of global variable.
   TCG: Remove unused global variables

  gen-icount.h  |2 +-
  target-alpha/translate.c  |   10 +-
  target-arm/translate.c|   12 +-
  target-cris/translate.c   |   12 +-
  target-i386/translate.c   |   12 +-
  target-lm32/translate.c   |   12 +-
  target-m68k/translate.c   |   10 +-
  target-microblaze/translate.c |   12 +-
  target-mips/translate.c   |   10 +-
  target-openrisc/translate.c   |   12 +-
  target-ppc/translate.c|   10 +-
  target-s390x/translate.c  |   10 +-
  target-sh4/translate.c|   10 +-
  target-sparc/translate.c  |   10 +-
  target-unicore32/translate.c  |   12 +-
  target-xtensa/translate.c |8 +-
  tcg/optimize.c|   62 
  tcg/tcg-op.h  |  324 -
  tcg/tcg.c |   84 +--
  tcg/tcg.h |   10 +-
  translate-all.c   |3 -
  21 files changed, 321 insertions(+), 326 deletions(-)

--
1.7.9.5




--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [PATCH 5/6] TCG: Use gen_opparam_buf from context instead of global variable.

2012-10-21 Thread Evgeny Voevodin

On 10/19/2012 09:53 PM, Blue Swirl wrote:

On Fri, Oct 19, 2012 at 12:42 PM, Evgeny e.voevo...@samsung.com wrote:

Signed-off-by: Evgeny e.voevo...@samsung.com
---
  tcg/tcg.c |   10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 3da1d83..77b15a0 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -302,7 +302,7 @@ void tcg_func_start(TCGContext *s)
  #endif

  s->gen_opc_ptr = s->gen_opc_buf;
-s->gen_opparam_ptr = gen_opparam_buf;
+s->gen_opparam_ptr = s->gen_opparam_buf;
  }

  static inline void tcg_temp_alloc(TCGContext *s, int n)
@@ -889,7 +889,7 @@ void tcg_dump_ops(TCGContext *s)

  first_insn = 1;
  opc_ptr = s->gen_opc_buf;
-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
  while (opc_ptr < s->gen_opc_ptr) {
  c = *opc_ptr++;
  def = &tcg_op_defs[c];
@@ -1413,7 +1413,7 @@ static void tcg_liveness_analysis(TCGContext *s)
  op_index--;
  }

-if (args != gen_opparam_buf)
+if (args != s->gen_opparam_buf)

Please add braces.


Ok.
Maybe I should introduce a little code clean-up in the scope of my patches?
I mean, remove tabs and so on... Or would that be better as a
separate patch?





  tcg_abort();
  }
  #else
@@ -2108,7 +2108,7 @@ static inline int tcg_gen_code_common(TCGContext *s, 
uint8_t *gen_code_buf,

  #ifdef USE_TCG_OPTIMIZATIONS
  s->gen_opparam_ptr =
-tcg_optimize(s, s->gen_opc_ptr, gen_opparam_buf, tcg_op_defs);
+tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
  #endif

  #ifdef CONFIG_PROFILER
@@ -2135,7 +2135,7 @@ static inline int tcg_gen_code_common(TCGContext *s, 
uint8_t *gen_code_buf,
  s->code_buf = gen_code_buf;
  s->code_ptr = gen_code_buf;

-args = gen_opparam_buf;
+args = s->gen_opparam_buf;
  op_index = 0;

  for(;;) {
--
1.7.9.5




--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [PATCH 0/6] *** TCG global variables clean-up ***

2012-10-21 Thread Evgeny Voevodin

On 10/22/2012 07:40 AM, Evgeny Voevodin wrote:

On 10/19/2012 09:55 PM, Blue Swirl wrote:

On Fri, Oct 19, 2012 at 12:42 PM, Evgeny e.voevo...@samsung.com wrote:

This set of patches moves global variables to tcg_ctx:
gen_opc_ptr
gen_opparam_ptr
gen_opc_buf
gen_opparam_buf

Where it was possible I used s->...
Where we don't have an interface to pass a pointer to tcg_ctx, I used
tcg_ctx.xxx
since it is a global variable too.

Maybe a pointer should be added so that the references become
tcg_ctx_ptr->xxx. This would incur unnecessary pointer dereference
penalties though.


I thought about using a TCGContext *tcg_cur_ctx, but I decided that we
don't need this until we want to use multiple TCG contexts.
If we would like to introduce such a pointer, then I'd prefer to hold it
in CPUArchState inside CPU_COMMON and initialize it in
qemu_tcg_cpu_thread_fn for each CPU:
env->tcg_cur_ctx = &tcg_ctx;


Oh, I just found out that we don't have access to CPUArchState from
tcg-op.h, so we can't get rid of the global variable there until we
change the interfaces of all the functions...







Build tested for all targets.
Execution tested on ARM.

I didn't notice any slow-down of kernel boot after this set was applied.

Evgeny (6):
   tcg/tcg.h: Duplicate global TCG variables in TCGContext
   TCG: Use gen_opc_ptr from context instead of global variable.
   TCG: Use gen_opparam_ptr from context instead of global variable.
   TCG: Use gen_opc_buf from context instead of global variable.
   TCG: Use gen_opparam_buf from context instead of global variable.
   TCG: Remove unused global variables

  gen-icount.h  |2 +-
  target-alpha/translate.c  |   10 +-
  target-arm/translate.c|   12 +-
  target-cris/translate.c   |   12 +-
  target-i386/translate.c   |   12 +-
  target-lm32/translate.c   |   12 +-
  target-m68k/translate.c   |   10 +-
  target-microblaze/translate.c |   12 +-
  target-mips/translate.c   |   10 +-
  target-openrisc/translate.c   |   12 +-
  target-ppc/translate.c|   10 +-
  target-s390x/translate.c  |   10 +-
  target-sh4/translate.c|   10 +-
  target-sparc/translate.c  |   10 +-
  target-unicore32/translate.c  |   12 +-
  target-xtensa/translate.c |8 +-
  tcg/optimize.c|   62 
  tcg/tcg-op.h  |  324 -
  tcg/tcg.c |   84 +--
  tcg/tcg.h |   10 +-
  translate-all.c   |3 -
  21 files changed, 321 insertions(+), 326 deletions(-)

--
1.7.9.5







--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] Building QEMU with multiple CPU targets.

2012-10-08 Thread Evgeny Voevodin

On 10/08/2012 02:54 PM, Peter Maydell wrote:

On 8 October 2012 07:39, Peter Crosthwaite
peter.crosthwa...@petalogix.com wrote:

I'm currently investigating the possibility of building QEMU with
multiple CPU architectures active concurrently. That is, I have a
binary with both a target-arm and a target-microblaze and wish to run
them as a heterogeneous multiprocessor platform.

Given the recent QOM development in making CPUs just another object,
shouldn't it be possible with a bit of Makefile and configure rework to
build qemu-system-arm+microblaze and then create machine models
instantiating both CPU types?

Are the major complications here from either a Make or QOM perspective?


I certainly think this would be a nice feature to have, but I suspect
the makefile/QOM bits are probably the easy parts :-)

At the moment things like the translated code cache are basically
globals and would need to be moved to be per-CPU. Also there are
still various settings that are compile time which would need to
become runtime (though we just got rid of the 'size of physical
address type' one at least).


Has anybody started this work?
I'm interested in the localization of TCG per CPU.



-- PMM




--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] [PATCH] hw/arm_gic.c: Fix improper DPRINTF output.

2012-10-04 Thread Evgeny Voevodin

On 10/02/2012 07:50 AM, Evgeny Voevodin wrote:

s->cpu_enabled is an array, so s->cpu_enabled ? "En" : "Dis" returns
"En" always. We should use s->cpu_enabled[cpu] here.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
  hw/arm_gic.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm_gic.c b/hw/arm_gic.c
index 55871fa..4024dae 100644
--- a/hw/arm_gic.c
+++ b/hw/arm_gic.c
@@ -566,7 +566,7 @@ static void gic_cpu_write(gic_state *s, int cpu, int 
offset, uint32_t value)
  switch (offset) {
  case 0x00: /* Control */
  s->cpu_enabled[cpu] = (value & 1);
-DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled ? "En" : "Dis");
+DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled[cpu] ? "En" : "Dis");
  break;
  case 0x04: /* Priority mask */
  s->priority_mask[cpu] = (value & 0xff);


Did anybody pick this up?

--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




[Qemu-devel] [PATCH] hw/arm_gic.c: Fix improper DPRINTF output.

2012-10-01 Thread Evgeny Voevodin
s->cpu_enabled is an array, so s->cpu_enabled ? "En" : "Dis" returns
"En" always. We should use s->cpu_enabled[cpu] here.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/arm_gic.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm_gic.c b/hw/arm_gic.c
index 55871fa..4024dae 100644
--- a/hw/arm_gic.c
+++ b/hw/arm_gic.c
@@ -566,7 +566,7 @@ static void gic_cpu_write(gic_state *s, int cpu, int 
offset, uint32_t value)
 switch (offset) {
 case 0x00: /* Control */
 s->cpu_enabled[cpu] = (value & 1);
-DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled ? "En" : "Dis");
+DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled[cpu] ? "En" : "Dis");
 break;
 case 0x04: /* Priority mask */
 s->priority_mask[cpu] = (value & 0xff);
-- 
1.7.9.5




[Qemu-devel] [PATCH] hw/arm_gic.c: Fix improper DPRINTF output.

2012-10-01 Thread Evgeny Voevodin
s->cpu_enabled is an array, so s->cpu_enabled ? "En" : "Dis" returns
"En" always. We should use s->cpu_enabled[cpu] here.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/arm_gic.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm_gic.c b/hw/arm_gic.c
index 55871fa..4024dae 100644
--- a/hw/arm_gic.c
+++ b/hw/arm_gic.c
@@ -566,7 +566,7 @@ static void gic_cpu_write(gic_state *s, int cpu, int 
offset, uint32_t value)
 switch (offset) {
 case 0x00: /* Control */
 s->cpu_enabled[cpu] = (value & 1);
-DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled ? "En" : "Dis");
+DPRINTF("CPU %d %sabled\n", cpu, s->cpu_enabled[cpu] ? "En" : "Dis");
 break;
 case 0x04: /* Priority mask */
 s->priority_mask[cpu] = (value & 0xff);
-- 
1.7.9.5
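
The bug fixed here is easy to reproduce outside QEMU. The following is a minimal sketch with made-up names (GDemoState, gdemo_prefix_buggy, gdemo_prefix_fixed, GDEMO_NCPU); only the cpu_enabled array mirrors the field in the patch. In a condition, an array name decays to a pointer to its first element, and that pointer is never NULL, so the test is always true regardless of the array contents.

```c
#include <assert.h>

/* Hypothetical CPU count for the sketch. */
#define GDEMO_NCPU 4

typedef struct GDemoState {
    int cpu_enabled[GDEMO_NCPU];
} GDemoState;

/* Buggy form: tests the address of the array, not any of its elements. */
const char *gdemo_prefix_buggy(const GDemoState *s)
{
    /* This is what the bare array name turns into inside a condition. */
    const int *decayed = s->cpu_enabled;

    return decayed ? "En" : "Dis";
}

/* Fixed form: tests the element that belongs to the given CPU. */
const char *gdemo_prefix_fixed(const GDemoState *s, int cpu)
{
    assert(cpu >= 0 && cpu < GDEMO_NCPU);
    return s->cpu_enabled[cpu] ? "En" : "Dis";
}
```

With an all-zero state the buggy helper still yields "En", while the fixed helper yields "Dis" until the element for that CPU is set.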




Re: [Qemu-devel] [RFC v2 00/12] Virtio-mmio refactoring.

2012-09-27 Thread Evgeny Voevodin

On 09/27/2012 09:31 PM, KONRAD Frédéric wrote:

Hi,

We actually want to add virtio models for ARM, so these patches
are really helpful.

We will try them and start looking at the issues.

Any feedback?



Ok. Feel free.


On 17/09/2012 12:00, Evgeny Voevodin wrote:

You can find the previous RFC at
http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03665.html
Yes, a long time ago...
Since I'm not sure when I'll be able to continue on this,
I'm publishing this work as is.
In this patchset I tried to split virtio-xxx-pci devices into
virtio-pci + virtio-xxx (blk, net, serial,...). Also, a virtio-mmio
transport is introduced based on Peter's work, which is accessible
here: http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01870.html

The main idea was to let users specify
-device virtio-pci,id=virtio-pci.0
-device virtio-blk,transport=virtio-pci.0,...
  and
-device virtio-mmio,id=virtio-mmio.0
-device virtio-blk,transport=virtio-mmio.0,...

I created virtio-pci and virtio-mmio transport devices and tried to enclose
the back-end functionality in virtio-blk, virtio-net, etc. When a transport
device is initialized, it creates a bus to which a back-end device can be
connected. Each back-end device is implemented in the corresponding source
file. As for the PCI transport, I temporarily placed it in a new
virtio-pci-new.c file so as not to break the functionality of the
virtio-xxx-pci devices that are still present.

Known issues to be resolved:
1. On creation of a back-end we need to determine somehow whether props were
explicitly set by the user.
2. A back-end device can't be initialized if no free bus has been created by
a transport, so you can't specify
-device virtio-blk,transport=virtio-pci.0,...
-device virtio-pci,id=virtio-pci.0
3. Implement virtio-xxx-devices such that they just create virtio-pci
and virtio-xxx devices during initialization.
4. Refactor all remaining back-ends, since I have only tried blk, net,
serial and balloon.
5. Refactor s390
6. Further?
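
The ordering restriction in issue 2 follows directly from a lookup-by-id scheme. The sketch below is not the code from this series; TDemoTransport, tdemo_registry, tdemo_register_transport, tdemo_find_transport and tdemo_backend_init are hypothetical names, and a fixed-size array stands in for the real list of transports. It only assumes that a transport registers itself when it is initialized and that a back-end resolves its transport id at its own initialization time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on registered transports. */
#define TDEMO_MAX 8

typedef struct TDemoTransport {
    const char *id;
} TDemoTransport;

static TDemoTransport tdemo_registry[TDEMO_MAX];
static size_t tdemo_count;

/* Called when a transport device (e.g. id=virtio-pci.0) is initialized. */
void tdemo_register_transport(const char *id)
{
    assert(id != NULL);
    assert(tdemo_count < TDEMO_MAX);
    tdemo_registry[tdemo_count++].id = id;
}

/* Exact-match lookup; returns NULL if no such transport exists (yet). */
TDemoTransport *tdemo_find_transport(const char *id)
{
    size_t i;

    assert(id != NULL);
    for (i = 0; i < tdemo_count; i++) {
        if (strcmp(tdemo_registry[i].id, id) == 0) {
            return &tdemo_registry[i];
        }
    }
    return NULL;
}

/* Back-end initialization: 0 on success, -1 if the transport is missing. */
int tdemo_backend_init(const char *transport_id)
{
    return tdemo_find_transport(transport_id) != NULL ? 0 : -1;
}
```

Initializing the back-end before its transport fails, and the same call succeeds once the transport has been registered. Lifting the restriction would need some form of deferred resolution, which this sketch does not attempt.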

Evgeny Voevodin (9):
   Virtio: Add transport bindings.
   hw/qdev-properties.c: Add transport property.
   hw/pci.c: Make pci_add_option_rom global visible
   hw/virtio-serial-bus.c: Add virtio-serial device.
   hw/virtio-balloon.c: Add virtio-balloon device.
   hw/virtio-net.c: Add virtio-net device.
   hw/virtio-blk.c: Add virtio-blk device.
   hw/virtio-pci-new.c: Add VirtIOPCI device.
   hw/exynos4210.c: Create two virtio-mmio transport instances.

Peter Maydell (3):
   virtio: Add support for guest setting of queue size
   virtio: Support transports which can specify the vring alignment
   Add MMIO based virtio transport

  hw/Makefile.objs   |3 +
  hw/exynos4210.c|   13 +
  hw/pci.c   |3 +-
  hw/pci.h   |2 +
  hw/qdev-properties.c   |   29 ++
  hw/qdev.h  |3 +
  hw/virtio-balloon.c|   42 +++
  hw/virtio-balloon.h|9 +
  hw/virtio-blk.c|   65 
  hw/virtio-blk.h|   15 +
  hw/virtio-mmio.c   |  400 +
  hw/virtio-net.c|   59 +++
  hw/virtio-net.h|   16 +
  hw/virtio-pci-new.c|  925 


  hw/virtio-pci.h|   18 +
  hw/virtio-serial-bus.c |   44 +++
  hw/virtio-serial.h |   11 +
  hw/virtio-transport.c  |  147 
  hw/virtio-transport.h  |   74 
  hw/virtio.c|   20 +-
  hw/virtio.h|2 +
  21 files changed, 1896 insertions(+), 4 deletions(-)
  create mode 100644 hw/virtio-mmio.c
  create mode 100644 hw/virtio-pci-new.c
  create mode 100644 hw/virtio-transport.c
  create mode 100644 hw/virtio-transport.h







--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com





[Qemu-devel] [RFC v2 08/12] hw/virtio-balloon.c: Add virtio-balloon device.

2012-09-17 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio-balloon.c |   42 ++
 hw/virtio-balloon.h |9 +
 2 files changed, 51 insertions(+)

diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
index dd1a650..d6fe2aa 100644
--- a/hw/virtio-balloon.c
+++ b/hw/virtio-balloon.c
@@ -20,6 +20,7 @@
 #include "cpu.h"
 #include "balloon.h"
 #include "virtio-balloon.h"
+#include "virtio-transport.h"
 #include "kvm.h"
 #include "exec-memory.h"
 
@@ -272,3 +273,44 @@ void virtio_balloon_exit(VirtIODevice *vdev)
 unregister_savevm(s->qdev, "virtio-balloon", s);
 virtio_cleanup(vdev);
 }
+
+/* VirtIOBaloon Device */
+
+static int virtio_balloondev_init(DeviceState *dev)
+{
+VirtIODevice *vdev;
+VirtIOBaloonState *s = VIRTIO_BALLOON_FROM_QDEV(dev);
+vdev = virtio_balloon_init(dev);
+if (!vdev) {
+return -1;
+}
+
+assert(s->trl != NULL);
+
+return virtio_call_backend_init_cb(dev, s->trl, vdev);
+}
+
+static Property virtio_balloon_properties[] = {
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_balloon_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+dc->init = virtio_balloondev_init;
+dc->props = virtio_balloon_properties;
+}
+
+static TypeInfo virtio_balloon_info = {
+.name = "virtio-balloon",
+.parent = TYPE_DEVICE,
+.instance_size = sizeof(VirtIOBaloonState),
+.class_init = virtio_balloon_class_init,
+};
+
+static void virtio_baloon_register_types(void)
+{
+type_register_static(&virtio_balloon_info);
+}
+
+type_init(virtio_baloon_register_types)
diff --git a/hw/virtio-balloon.h b/hw/virtio-balloon.h
index 73300dd..b925186 100644
--- a/hw/virtio-balloon.h
+++ b/hw/virtio-balloon.h
@@ -15,8 +15,10 @@
 #ifndef _QEMU_VIRTIO_BALLOON_H
 #define _QEMU_VIRTIO_BALLOON_H
 
+#include "sysbus.h"
 #include "virtio.h"
 #include "pci.h"
+#include "virtio-transport.h"
 
 /* from Linux's linux/virtio_balloon.h */
 
@@ -52,4 +54,11 @@ typedef struct VirtIOBalloonStat {
 uint64_t val;
 } QEMU_PACKED VirtIOBalloonStat;
 
+typedef struct {
+DeviceState qdev;
+VirtIOTransportLink *trl;
+} VirtIOBaloonState;
+
+#define VIRTIO_BALLOON_FROM_QDEV(dev) DO_UPCAST(VirtIOBaloonState, qdev, dev)
+
 #endif
-- 
1.7.9.5




[Qemu-devel] [RFC v2 01/12] virtio: Add support for guest setting of queue size

2012-09-17 Thread Evgeny Voevodin
From: Peter Maydell peter.mayd...@linaro.org

The MMIO virtio transport spec allows the guest to tell the host how
large the queue size is. Add virtio_queue_set_num() function which
implements this in the QEMU common virtio support code.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio.c |6 ++
 hw/virtio.h |1 +
 2 files changed, 7 insertions(+)

diff --git a/hw/virtio.c b/hw/virtio.c
index 209c763..5334326 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -628,6 +628,12 @@ target_phys_addr_t virtio_queue_get_addr(VirtIODevice 
*vdev, int n)
 return vdev->vq[n].pa;
 }
 
+void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
+{
+vdev->vq[n].vring.num = num;
+virtqueue_init(&vdev->vq[n]);
+}
+
 int virtio_queue_get_num(VirtIODevice *vdev, int n)
 {
 return vdev->vq[n].vring.num;
diff --git a/hw/virtio.h b/hw/virtio.h
index 7a4f564..eb9953f 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -177,6 +177,7 @@ void virtio_config_writew(VirtIODevice *vdev, uint32_t 
addr, uint32_t data);
 void virtio_config_writel(VirtIODevice *vdev, uint32_t addr, uint32_t data);
 void virtio_queue_set_addr(VirtIODevice *vdev, int n, target_phys_addr_t addr);
 target_phys_addr_t virtio_queue_get_addr(VirtIODevice *vdev, int n);
+void virtio_queue_set_num(VirtIODevice *vdev, int n, int num);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
 void virtio_queue_notify(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
-- 
1.7.9.5




[Qemu-devel] [RFC v2 03/12] Virtio: Add transport bindings.

2012-09-17 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/Makefile.objs  |1 +
 hw/virtio-transport.c |  147 +
 hw/virtio-transport.h |   74 +
 3 files changed, 222 insertions(+)
 create mode 100644 hw/virtio-transport.c
 create mode 100644 hw/virtio-transport.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 6dfebd2..db4c16d 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -2,6 +2,7 @@ hw-obj-y = usb/ ide/
 hw-obj-y += loader.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
+hw-obj-$(CONFIG_VIRTIO) += virtio-transport.o
 hw-obj-y += fw_cfg.o
 hw-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o
 hw-obj-$(CONFIG_PCI) += msix.o msi.o
diff --git a/hw/virtio-transport.c b/hw/virtio-transport.c
new file mode 100644
index 0000000..76360ba
--- /dev/null
+++ b/hw/virtio-transport.c
@@ -0,0 +1,147 @@
+/*
+ * Virtio transport bindings
+ *
+ * Copyright (c) 2011 - 2012 Samsung Electronics Co., Ltd.
+ *
+ * Author:
+ *  Evgeny Voevodin e.voevo...@samsung.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "virtio-transport.h"
+
+#define VIRTIO_TRANSPORT_BUS "virtio-transport"
+
+static QTAILQ_HEAD(, VirtIOTransportLink) transport_links =
+QTAILQ_HEAD_INITIALIZER(transport_links);
+
+/*
+ * Find transport device by its ID.
+ */
+VirtIOTransportLink* virtio_find_transport(const char *name)
+{
+VirtIOTransportLink *trl;
+
+assert(name != NULL);
+
+QTAILQ_FOREACH(trl, &transport_links, sibling) {
+if (trl->tr->id != NULL) {
+if (!strcmp(name, trl->tr->id)) {
+return trl;
+}
+}
+}
+
+return NULL;
+}
+
+/*
+ * Count transport devices by ID.
+ */
+uint32_t virtio_count_transports(const char *name)
+{
+VirtIOTransportLink *trl;
+uint32_t i = 0;
+
+QTAILQ_FOREACH(trl, &transport_links, sibling) {
+if (name == NULL) {
+i++;
+continue;
+}
+
+if (trl->tr->id != NULL) {
+if (!strncmp(name, trl->tr->id,strlen(name))) {
+i++;
+}
+}
+}
+return i;
+}
+
+/*
+ * Initialize new transport device
+ */
+char* virtio_init_transport(DeviceState *dev, VirtIOTransportLink **trl,
+const char* name, virtio_backend_init_cb cb)
+{
+VirtIOTransportLink *link = g_malloc0(sizeof(VirtIOTransportLink));
+char *buf;
+size_t len;
+uint32_t i;
+
+assert(dev != NULL);
+assert(name != NULL);
+assert(trl != NULL);
+
+i = virtio_count_transports(name);
+len = strlen(name) + 16;
+buf = g_malloc(len);
+snprintf(buf, len, "%s.%d", name, i);
+qbus_create(TYPE_VIRTIO_BUS, dev, buf);
+
+/* Add new transport */
+QTAILQ_INSERT_TAIL(&transport_links, link, sibling);
+link->tr = dev;
+link->cb = cb;
+// TODO: Add a link property
+*trl = link;
+return buf;
+}
+
+/*
+ * Unplug back-end from system bus and plug it into transport bus.
+ */
+void virtio_plug_into_transport(DeviceState *dev, VirtIOTransportLink *trl)
+{
+BusChild *kid;
+
+/* Unplug back-end from system bus */
+QTAILQ_FOREACH(kid, &qdev_get_parent_bus(dev)->children, sibling) {
+if (kid->child == dev) {
+QTAILQ_REMOVE(&qdev_get_parent_bus(dev)->children, kid, sibling);
+break;
+}
+}
+
+/* Plug back-end into transport's bus */
+qdev_set_parent_bus(dev, QLIST_FIRST(&trl->tr->child_bus));
+
+}
+
+/*
+ * Execute call-back on back-end initialization.
+ * Performs initialization of MMIO or PCI transport.
+ */
+int virtio_call_backend_init_cb(DeviceState *dev, VirtIOTransportLink *trl,
+VirtIODevice *vdev)
+{
+if (trl->cb) {
+return trl->cb(dev, vdev, trl);
+}
+
+return 0;
+}
+
+static const TypeInfo virtio_bus_info = {
+.name = TYPE_VIRTIO_BUS,
+.parent = TYPE_BUS,
+.instance_size = sizeof(BusState),
+};
+
+static void virtio_register_types(void)
+{
+type_register_static(&virtio_bus_info);
+}
+
+type_init(virtio_register_types)
diff --git a/hw/virtio-transport.h b/hw/virtio-transport.h
new file mode 100644
index 0000000..43200dc
--- /dev/null
+++ b/hw/virtio-transport.h
@@ -0,0 +1,74 @@
+/*
+ * Virtio transport header
+ *
+ * Copyright (c) 2011 - 2012 Samsung Electronics Co., Ltd.
+ *
+ * Author:
+ *  Evgeny Voevodin e.voevo...@samsung.com

[Qemu-devel] [RFC v2 02/12] virtio: Support transports which can specify the vring alignment

2012-09-17 Thread Evgeny Voevodin
From: Peter Maydell peter.mayd...@linaro.org

Support virtio transports which can specify the vring alignment
(ie where the guest communicates this to the host) by providing
a new virtio_queue_set_align() function. (The default alignment
remains as before.)

FIXME save/load support for this new field!

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio.c |   14 --
 hw/virtio.h |1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index 5334326..4f47d4e 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -19,7 +19,9 @@
 #include "qemu-barrier.h"
 
 /* The alignment to use between consumer and producer parts of vring.
- * x86 pagesize again. */
+ * x86 pagesize again. This is the default, used by transports like PCI
+ * which don't provide a means for the guest to tell the host the alignment.
+ */
 #define VIRTIO_PCI_VRING_ALIGN 4096
 
 typedef struct VRingDesc
@@ -53,6 +55,7 @@ typedef struct VRingUsed
 typedef struct VRing
 {
 unsigned int num;
+unsigned int align;
 target_phys_addr_t desc;
 target_phys_addr_t avail;
 target_phys_addr_t used;
@@ -90,7 +93,7 @@ static void virtqueue_init(VirtQueue *vq)
 vq->vring.avail = pa + vq->vring.num * sizeof(VRingDesc);
 vq->vring.used = vring_align(vq->vring.avail +
  offsetof(VRingAvail, ring[vq->vring.num]),
- VIRTIO_PCI_VRING_ALIGN);
+ vq->vring.align);
 }
 
 static inline uint64_t vring_desc_addr(target_phys_addr_t desc_pa, int i)
@@ -646,6 +649,12 @@ int virtio_queue_get_id(VirtQueue *vq)
 return vq - &vdev->vq[0];
 }
 
+void virtio_queue_set_align(VirtIODevice *vdev, int n, int align)
+{
+vdev->vq[n].vring.align = align;
+virtqueue_init(&vdev->vq[n]);
+}
+
 void virtio_queue_notify_vq(VirtQueue *vq)
 {
 if (vq->vring.desc) {
@@ -686,6 +695,7 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int 
queue_size,
 abort();
 
 vdev->vq[i].vring.num = queue_size;
+vdev->vq[i].vring.align = VIRTIO_PCI_VRING_ALIGN;
 vdev->vq[i].handle_output = handle_output;
 
 return &vdev->vq[i];
diff --git a/hw/virtio.h b/hw/virtio.h
index eb9953f..3f16367 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -179,6 +179,7 @@ void virtio_queue_set_addr(VirtIODevice *vdev, int n, 
target_phys_addr_t addr);
 target_phys_addr_t virtio_queue_get_addr(VirtIODevice *vdev, int n);
 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
+void virtio_queue_set_align(VirtIODevice *vdev, int n, int align);
 void virtio_queue_notify(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
 void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector);
-- 
1.7.9.5
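
To show what the new alignment field changes, here is a minimal sketch of the ring layout computed in virtqueue_init(), written as standalone code. It assumes the usual split-ring sizes (16-byte descriptors, an available ring with a 4-byte header followed by 2-byte entries) and a power-of-two alignment; VDemoLayout, vdemo_align_up, vdemo_layout and the VDEMO_* constants are hypothetical names, not QEMU API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes of the split-ring structures. */
#define VDEMO_DESC_SIZE        16u /* addr(8) + len(4) + flags(2) + next(2) */
#define VDEMO_AVAIL_HDR_SIZE    4u /* flags(2) + idx(2) */
#define VDEMO_AVAIL_ENTRY_SIZE  2u /* one uint16_t ring entry */

typedef struct VDemoLayout {
    uint64_t desc;   /* start of the descriptor table */
    uint64_t avail;  /* start of the available ring */
    uint64_t used;   /* start of the used ring, aligned */
} VDemoLayout;

/* Round addr up to a multiple of align; align must be a power of two. */
uint64_t vdemo_align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Same arithmetic as virtqueue_init(), with num and align as inputs. */
VDemoLayout vdemo_layout(uint64_t pa, unsigned num, unsigned align)
{
    VDemoLayout l;
    uint64_t avail_end;

    l.desc = pa;
    l.avail = pa + (uint64_t)num * VDEMO_DESC_SIZE;
    avail_end = l.avail + VDEMO_AVAIL_HDR_SIZE
                + (uint64_t)num * VDEMO_AVAIL_ENTRY_SIZE;
    l.used = vdemo_align_up(avail_end, align);
    return l;
}
```

For a 256-entry queue at address 0, the default 4096-byte alignment places the used ring at offset 8192, while a guest-chosen alignment of 64 places it at 4672. This is why the ring has to be re-initialized whenever the guest changes either the queue size or the alignment.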




[Qemu-devel] [RFC v2 06/12] Add MMIO based virtio transport

2012-09-17 Thread Evgeny Voevodin
From: Peter Maydell peter.mayd...@linaro.org

Add support for the generic MMIO based virtio transport.

This patch is a modified version of the patch by
Peter Maydell peter.mayd...@linaro.org. The changes introduce a
virtio-mmio bridge device which provides the virtio-mmio bus. A
virtio-mmio-transport device is connected to this bus and in turn provides
the virtio-transport bus. Virtio backends can then be connected to that
bus.

Also this patch includes some fixes for bugs spotted by
Ying-Shiuan Pan ys...@itri.org.tw.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com

Conflicts:

Makefile.objs
---
 hw/Makefile.objs |1 +
 hw/virtio-mmio.c |  400 ++
 2 files changed, 401 insertions(+)
 create mode 100644 hw/virtio-mmio.c

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index db4c16d..0c112af 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -3,6 +3,7 @@ hw-obj-y += loader.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-transport.o
+hw-obj-$(CONFIG_VIRTIO) += virtio-mmio.o
 hw-obj-y += fw_cfg.o
 hw-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bridge_dev.o
 hw-obj-$(CONFIG_PCI) += msix.o msi.o
diff --git a/hw/virtio-mmio.c b/hw/virtio-mmio.c
new file mode 100644
index 0000000..88e5d9f
--- /dev/null
+++ b/hw/virtio-mmio.c
@@ -0,0 +1,400 @@
+/*
+ * Virtio MMIO bindings
+ *
+ * Copyright (c) 2011 Linaro Limited
+ *
+ * Authors:
+ *  Peter Maydell peter.mayd...@linaro.org
+ *  Evgeny Voevodin e.voevo...@samsung.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+/* TODO:
+ *  * save/load support
+ *  * test net, serial, balloon
+ */
+
+#include "sysbus.h"
+#include "virtio.h"
+#include "virtio-transport.h"
+#include "virtio-blk.h"
+#include "virtio-net.h"
+#include "virtio-serial.h"
+#include "host-utils.h"
+
+/* #define DEBUG_VIRTIO_MMIO */
+
+#ifdef DEBUG_VIRTIO_MMIO
+
+#define DPRINTF(fmt, ...) \
+do { printf("virtio_mmio: " fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while (0)
+#endif
+
+/* Memory mapped register offsets */
+#define VIRTIO_MMIO_MAGIC 0x0
+#define VIRTIO_MMIO_VERSION 0x4
+#define VIRTIO_MMIO_DEVICEID 0x8
+#define VIRTIO_MMIO_VENDORID 0xc
+#define VIRTIO_MMIO_HOSTFEATURES 0x10
+#define VIRTIO_MMIO_HOSTFEATURESSEL 0x14
+#define VIRTIO_MMIO_GUESTFEATURES 0x20
+#define VIRTIO_MMIO_GUESTFEATURESSEL 0x24
+#define VIRTIO_MMIO_GUESTPAGESIZE 0x28
+#define VIRTIO_MMIO_QUEUESEL 0x30
+#define VIRTIO_MMIO_QUEUENUMMAX 0x34
+#define VIRTIO_MMIO_QUEUENUM 0x38
+#define VIRTIO_MMIO_QUEUEALIGN 0x3c
+#define VIRTIO_MMIO_QUEUEPFN 0x40
+#define VIRTIO_MMIO_QUEUENOTIFY 0x50
+#define VIRTIO_MMIO_INTERRUPTSTATUS 0x60
+#define VIRTIO_MMIO_INTERRUPTACK 0x64
+#define VIRTIO_MMIO_STATUS 0x70
+/* Device specific config space starts here */
+#define VIRTIO_MMIO_CONFIG 0x100
+
+#define VIRT_MAGIC 0x74726976 /* 'virt' */
+#define VIRT_VERSION 1
+#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
+
+enum VIRTIO_MMIO_MAPPINGS {
+VIRTIO_MMIO_IOMAP,
+VIRTIO_MMIO_IOMEM,
+};
+
+typedef struct {
+SysBusDevice busdev;
+VirtIODevice *vdev;
+VirtIOTransportLink *trl;
+
+MemoryRegion iomap; /* hold base address */
+MemoryRegion iomem; /* hold io funcs */
+MemoryRegion alias;
+qemu_irq irq;
+uint32_t int_enable;
+uint32_t host_features;
+uint32_t host_features_sel;
+uint32_t guest_features_sel;
+uint32_t guest_page_shift;
+} VirtIOMMIO;
+
+static uint64_t virtio_mmio_read(void *opaque, target_phys_addr_t offset,
+                                 unsigned size)
+{
+    VirtIOMMIO *s = (VirtIOMMIO *)opaque;
+    VirtIODevice *vdev = s->vdev;
+    DPRINTF("virtio_mmio_read offset 0x%x\n", (int)offset);
+    if (offset >= VIRTIO_MMIO_CONFIG) {
+        offset -= VIRTIO_MMIO_CONFIG;
+        switch (size) {
+        case 1:
+            return virtio_config_readb(vdev, offset);
+        case 2:
+            return virtio_config_readw(vdev, offset);
+        case 4:
+            return virtio_config_readl(vdev, offset);
+        default:
+            abort();
+        }
+    }
+    if (size != 4) {
+        DPRINTF("wrong size access to register!\n");
+        return 0;
+    }
+    switch (offset) {
+    case VIRTIO_MMIO_MAGIC:
+        return VIRT_MAGIC;
+    case VIRTIO_MMIO_VERSION:
+        return VIRT_VERSION;
+    case VIRTIO_MMIO_DEVICEID:
+        return vdev->device_id;
+    case

[Qemu-devel] [RFC v2 07/12] hw/virtio-serial-bus.c: Add virtio-serial device.

2012-09-17 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio-serial-bus.c |   44 
 hw/virtio-serial.h |   11 +++
 2 files changed, 55 insertions(+)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 82073f5..699a485 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -24,6 +24,7 @@
 #include "sysbus.h"
 #include "trace.h"
 #include "virtio-serial.h"
+#include "virtio-transport.h"
 
 /* The virtio-serial bus on top of which the ports will ride as devices */
 struct VirtIOSerialBus {
@@ -1014,3 +1015,46 @@ static void virtio_serial_register_types(void)
 }
 
 type_init(virtio_serial_register_types)
+
+/******************** VirtIOSer Device ********************/
+
+static int virtio_serialdev_init(DeviceState *dev)
+{
+    VirtIODevice *vdev;
+    VirtIOSerState *s = VIRTIO_SERIAL_FROM_QDEV(dev);
+    vdev = virtio_serial_init(dev, &s->serial);
+    if (!vdev) {
+        return -1;
+    }
+
+    assert(s->trl != NULL);
+
+    return virtio_call_backend_init_cb(dev, s->trl, vdev);
+}
+
+static Property virtio_serial_properties[] = {
+    DEFINE_PROP_UINT32("max_ports", VirtIOSerState,
+                       serial.max_virtserial_ports, 31),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_serial_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    dc->init = virtio_serialdev_init;
+    dc->props = virtio_serial_properties;
+}
+
+static TypeInfo virtio_serial_info = {
+    .name = "virtio-serial",
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(VirtIOSerState),
+    .class_init = virtio_serial_class_init,
+};
+
+static void virtio_ser_register_types(void)
+{
+    type_register_static(&virtio_serial_info);
+}
+
+type_init(virtio_ser_register_types)
diff --git a/hw/virtio-serial.h b/hw/virtio-serial.h
index 16e3982..c6b916a 100644
--- a/hw/virtio-serial.h
+++ b/hw/virtio-serial.h
@@ -15,8 +15,10 @@
 #ifndef _QEMU_VIRTIO_SERIAL_H
 #define _QEMU_VIRTIO_SERIAL_H
 
+#include "sysbus.h"
 #include "qdev.h"
 #include "virtio.h"
+#include "virtio-transport.h"
 
 /* == Interface shared between the guest kernel and qemu == */
 
@@ -173,6 +175,15 @@ struct VirtIOSerialPort {
 bool throttled;
 };
 
+typedef struct {
+DeviceState qdev;
+/* virtio-serial */
+virtio_serial_conf serial;
+VirtIOTransportLink *trl;
+} VirtIOSerState;
+
+#define VIRTIO_SERIAL_FROM_QDEV(dev) DO_UPCAST(VirtIOSerState, qdev, dev)
+
 /* Interface to the virtio-serial bus */
 
 /*
-- 
1.7.9.5
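
The VIRTIO_SERIAL_FROM_QDEV macro above depends on the DeviceState being embedded in the back-end state struct. Below is a minimal standalone sketch of that upcast pattern. It uses simplified stand-in types: Dev, SerState, UPCAST and ser_from_dev are hypothetical names for illustration, not QEMU's definitions, and the real DO_UPCAST may differ in detail.

```c
#include <stddef.h>

/* Stand-in for DeviceState; hypothetical, not the QEMU type. */
typedef struct Dev {
    int id;
} Dev;

/* Stand-in for a back-end state struct that embeds the generic device. */
typedef struct SerState {
    Dev qdev;            /* embedded generic part */
    unsigned max_ports;  /* back-end specific field */
} SerState;

/* container_of-style upcast: step back from the member to its container. */
#define UPCAST(type, field, ptr) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* Recover the back-end state from a pointer to its embedded device. */
static SerState *ser_from_dev(Dev *dev)
{
    return UPCAST(SerState, qdev, dev);
}
```

Because the generic part is embedded rather than pointed to, the conversion is pure pointer arithmetic and needs no lookup table.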




[Qemu-devel] [RFC v2 09/12] hw/virtio-net.c: Add virtio-net device.

2012-09-17 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio-net.c |   59 +++
 hw/virtio-net.h |   16 +++
 2 files changed, 75 insertions(+)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index b1998b2..b7cfb1c 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -13,6 +13,8 @@
 
 #include "iov.h"
 #include "virtio.h"
+#include "virtio-transport.h"
+#include "virtio-pci.h"
 #include "net.h"
 #include "net/checksum.h"
 #include "net/tap.h"
@@ -1080,3 +1082,60 @@ void virtio_net_exit(VirtIODevice *vdev)
     qemu_del_net_client(&n->nic->nc);
     virtio_cleanup(&n->vdev);
 }
+
+/******************** VirtIONet Device ********************/
+
+static int virtio_netdev_init(DeviceState *dev)
+{
+    VirtIODevice *vdev;
+    VirtIONetState *s = VIRTIO_NET_FROM_QDEV(dev);
+
+    assert(s->trl != NULL);
+
+    vdev = virtio_net_init(dev, &s->nic, &s->net);
+
+    /* Pass default host_features to transport */
+    s->trl->host_features = s->host_features;
+
+    if (virtio_call_backend_init_cb(dev, s->trl, vdev) != 0) {
+        return -1;
+    }
+
+    /* Binding should be ready here, let's get final features */
+    if (vdev->binding->get_features) {
+        s->host_features = vdev->binding->get_features(vdev->binding_opaque);
+    }
+    return 0;
+}
+
+static Property virtio_net_properties[] = {
+    DEFINE_VIRTIO_NET_FEATURES(VirtIONetState, host_features),
+    DEFINE_NIC_PROPERTIES(VirtIONetState, nic),
+    DEFINE_PROP_UINT32("x-txtimer", VirtIONetState, net.txtimer,
+                       TX_TIMER_INTERVAL),
+    DEFINE_PROP_INT32("x-txburst", VirtIONetState, net.txburst, TX_BURST),
+    DEFINE_PROP_STRING("tx", VirtIONetState, net.tx),
+    DEFINE_PROP_TRANSPORT("transport", VirtIONetState, trl),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_net_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    dc->init = virtio_netdev_init;
+    dc->props = virtio_net_properties;
+}
+
+static TypeInfo virtio_net_info = {
+    .name = "virtio-net",
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(VirtIONetState),
+    .class_init = virtio_net_class_init,
+};
+
+static void virtio_net_register_types(void)
+{
+    type_register_static(&virtio_net_info);
+}
+
+type_init(virtio_net_register_types)
diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index 36aa463..8dd49d3 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -14,7 +14,9 @@
 #ifndef _QEMU_VIRTIO_NET_H
 #define _QEMU_VIRTIO_NET_H
 
+#include "sysbus.h"
 #include "virtio.h"
+#include "virtio-transport.h"
 #include "net.h"
 #include "pci.h"
 
@@ -187,4 +189,18 @@ struct virtio_net_ctrl_mac {
         DEFINE_PROP_BIT("ctrl_rx", _state, _field, VIRTIO_NET_F_CTRL_RX, true), \
         DEFINE_PROP_BIT("ctrl_vlan", _state, _field, VIRTIO_NET_F_CTRL_VLAN, true), \
         DEFINE_PROP_BIT("ctrl_rx_extra", _state, _field, VIRTIO_NET_F_CTRL_RX_EXTRA, true)
+
+typedef struct {
+DeviceState qdev;
+/* virtio-net */
+NICConf nic;
+virtio_net_conf net;
+
+uint32_t host_features;
+
+VirtIOTransportLink *trl;
+} VirtIONetState;
+
+#define VIRTIO_NET_FROM_QDEV(dev) DO_UPCAST(VirtIONetState, qdev, dev)
+
 #endif
-- 
1.7.9.5
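
The init function in this patch hands its default host_features to the transport and then, once the binding is in place, reads the final feature set back through the optional get_features hook. Here is a minimal sketch of that order of operations. Link, Binding, transport_get_features, negotiate and the feature bit 24 are all hypothetical stand-ins, not the types or values used by the patch.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical transport link: carries features between the two halves. */
typedef struct Link {
    uint32_t host_features;
} Link;

/* Hypothetical binding with an optional get_features hook. */
typedef struct Binding {
    uint32_t (*get_features)(void *opaque);
} Binding;

/* Example hook: the transport adds one feature bit of its own. */
static uint32_t transport_get_features(void *opaque)
{
    Link *l = opaque;
    return l->host_features | (1u << 24);
}

/* Publish the defaults first, then ask the binding (if it provides the
 * hook) for the final feature set; otherwise the defaults stand. */
static uint32_t negotiate(Link *l, const Binding *b, uint32_t defaults)
{
    l->host_features = defaults;
    if (b->get_features) {
        return b->get_features(l);
    }
    return defaults;
}
```

The NULL check on the hook matches the patch: a transport that has nothing to add simply leaves get_features unset.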




[Qemu-devel] [RFC v2 11/12] hw/virtio-pci-new.c: Add VirtIOPCI device.

2012-09-17 Thread Evgeny Voevodin
This commit adds the VirtIOPCI device implementation, which is temporarily
kept in the virtio-pci-new.c file. We need this file until the virtio-xxx-pci
devices in hw/virtio-pci.c are reimplemented so that they
just create virtio-pci and virtio-xxx devices during initialization.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/Makefile.objs|1 +
 hw/virtio-pci-new.c |  925 +++
 hw/virtio-pci.h |   18 +
 3 files changed, 944 insertions(+)
 create mode 100644 hw/virtio-pci-new.c

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 0c112af..e5bda7f 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -2,6 +2,7 @@ hw-obj-y = usb/ ide/
 hw-obj-y += loader.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-console.o
 hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
+hw-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci-new.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-transport.o
 hw-obj-$(CONFIG_VIRTIO) += virtio-mmio.o
 hw-obj-y += fw_cfg.o
diff --git a/hw/virtio-pci-new.c b/hw/virtio-pci-new.c
new file mode 100644
index 000..c1650b5
--- /dev/null
+++ b/hw/virtio-pci-new.c
@@ -0,0 +1,925 @@
+/*
+ * Virtio PCI Bindings
+ *
+ * Copyright IBM, Corp. 2007
+ * Copyright (c) 2009 CodeSourcery
+ *
+ * Authors:
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *  Paul Brook        p...@codesourcery.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ */
+
+#include <inttypes.h>
+
+#include "virtio.h"
+#include "virtio-transport.h"
+#include "virtio-blk.h"
+#include "virtio-net.h"
+#include "virtio-serial.h"
+#include "virtio-scsi.h"
+#include "virtio-balloon.h"
+#include "pci.h"
+#include "qemu-error.h"
+#include "msi.h"
+#include "msix.h"
+#include "net.h"
+#include "loader.h"
+#include "kvm.h"
+#include "blockdev.h"
+#include "virtio-pci.h"
+#include "range.h"
+
+/* from Linux's linux/virtio_pci.h */
+
+/* A 32-bit r/o bitmask of the features supported by the host */
+#define VIRTIO_PCI_HOST_FEATURES        0
+
+/* A 32-bit r/w bitmask of features activated by the guest */
+#define VIRTIO_PCI_GUEST_FEATURES       4
+
+/* A 32-bit r/w PFN for the currently selected queue */
+#define VIRTIO_PCI_QUEUE_PFN            8
+
+/* A 16-bit r/o queue size for the currently selected queue */
+#define VIRTIO_PCI_QUEUE_NUM            12
+
+/* A 16-bit r/w queue selector */
+#define VIRTIO_PCI_QUEUE_SEL            14
+
+/* A 16-bit r/w queue notifier */
+#define VIRTIO_PCI_QUEUE_NOTIFY         16
+
+/* An 8-bit device status register.  */
+#define VIRTIO_PCI_STATUS               18
+
+/* An 8-bit r/o interrupt status register.  Reading the value will return the
+ * current contents of the ISR and will also clear it.  This is effectively
+ * a read-and-acknowledge. */
+#define VIRTIO_PCI_ISR                  19
+
+/* MSI-X registers: only enabled if MSI-X is enabled. */
+/* A 16-bit vector for configuration changes. */
+#define VIRTIO_MSI_CONFIG_VECTOR        20
+/* A 16-bit vector for selected queue notifications. */
+#define VIRTIO_MSI_QUEUE_VECTOR         22
+
+/* Config space size */
+#define VIRTIO_PCI_CONFIG_NOMSI         20
+#define VIRTIO_PCI_CONFIG_MSI           24
+#define VIRTIO_PCI_REGION_SIZE(dev)     (msix_present(dev) ? \
+                                         VIRTIO_PCI_CONFIG_MSI : \
+                                         VIRTIO_PCI_CONFIG_NOMSI)
+
+/* The remaining space is defined by each driver as the per-driver
+ * configuration space */
+#define VIRTIO_PCI_CONFIG(dev)          (msix_enabled(dev) ? \
+                                         VIRTIO_PCI_CONFIG_MSI : \
+                                         VIRTIO_PCI_CONFIG_NOMSI)
+
+/* How many bits to shift physical queue address written to QUEUE_PFN.
+ * 12 is historical, and due to x86 page size. */
+#define VIRTIO_PCI_QUEUE_ADDR_SHIFT    12
+
+/* Flags track per-device state like workarounds for quirks in older guests. */
+#define VIRTIO_PCI_FLAG_BUS_MASTER_BUG  (1 << 0)
+
+/* QEMU doesn't strictly need write barriers since everything runs in
+ * lock-step.  We'll leave the calls to wmb() in though to make it obvious for
+ * KVM or if kqemu gets SMP support.
+ */
+#define wmb() do { } while (0)
+
+/* HACK for virtio to determine if it's running a big endian guest */
+bool virtio_is_big_endian(void);
+
+/* virtio device */
+
+static void virtio_pci_notify(void *opaque, uint16_t vector)
+{
+    VirtIOPCI *s = opaque;
+    if (msix_enabled(&s->pci_dev)) {
+        msix_notify(&s->pci_dev, vector);
+    }
+    else {
+        qemu_set_irq(s->pci_dev.irq[0], s->vdev->isr & 1);
+    }
+}
+
+static void virtio_pci_save_config(void * opaque, QEMUFile *f)
+{
+    VirtIOPCI *s = opaque;
+    pci_device_save(&s->pci_dev, f);
+    msix_save(&s->pci_dev, f);
+    if (msix_present(&s->pci_dev)) {
+        qemu_put_be16(f, s->vdev->config_vector);
+    }
+}
+
+static

[Qemu-devel] [RFC v2 10/12] hw/virtio-blk.c: Add virtio-blk device.

2012-09-17 Thread Evgeny Voevodin
Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/virtio-blk.c |   65 +++
 hw/virtio-blk.h |   15 +
 2 files changed, 80 insertions(+)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 6f6d172..0a23352 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -16,6 +16,8 @@
 #include "trace.h"
 #include "hw/block-common.h"
 #include "blockdev.h"
+#include "virtio-transport.h"
+#include "virtio-pci.h"
 #include "virtio-blk.h"
 #include "scsi-defs.h"
 #ifdef __linux__
@@ -665,3 +667,66 @@ void virtio_blk_exit(VirtIODevice *vdev)
     blockdev_mark_auto_del(s->bs);
     virtio_cleanup(vdev);
 }
+
+/******************** VirtIOBlk Device ********************/
+
+static int virtio_blkdev_init(DeviceState *dev)
+{
+    VirtIODevice *vdev;
+    VirtIOBlockState *s = VIRTIO_BLK_FROM_QDEV(dev);
+
+    assert(s->trl != NULL);
+
+    vdev = virtio_blk_init(dev, &s->blk);
+    if (!vdev) {
+        return -1;
+    }
+
+    /* Pass default host_features to transport */
+    s->trl->host_features = s->host_features;
+
+    if (virtio_call_backend_init_cb(dev, s->trl, vdev) != 0) {
+        return -1;
+    }
+
+    /* Binding should be ready here, let's get final features */
+    if (vdev->binding->get_features) {
+        s->host_features = vdev->binding->get_features(vdev->binding_opaque);
+    }
+    return 0;
+}
+
+static Property virtio_blkdev_properties[] = {
+    DEFINE_BLOCK_PROPERTIES(VirtIOBlockState, blk.conf),
+    DEFINE_BLOCK_CHS_PROPERTIES(VirtIOBlockState, blk.conf),
+    DEFINE_PROP_STRING("serial", VirtIOBlockState, blk.serial),
+#ifdef __linux__
+    DEFINE_PROP_BIT("scsi", VirtIOBlockState, blk.scsi, 0, true),
+#endif
+    DEFINE_PROP_BIT("config-wce", VirtIOBlockState, blk.config_wce, 0, true),
+    DEFINE_VIRTIO_BLK_FEATURES(VirtIOBlockState, host_features),
+
+    DEFINE_PROP_TRANSPORT("transport", VirtIOBlockState, trl),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void virtio_blkdev_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    dc->init = virtio_blkdev_init;
+    dc->props = virtio_blkdev_properties;
+}
+
+static TypeInfo virtio_blkdev_info = {
+    .name = "virtio-blk",
+    .parent = TYPE_DEVICE,
+    .instance_size = sizeof(VirtIOBlockState),
+    .class_init = virtio_blkdev_class_init,
+};
+
+static void virtio_blk_register_types(void)
+{
+    type_register_static(&virtio_blkdev_info);
+}
+
+type_init(virtio_blk_register_types)
diff --git a/hw/virtio-blk.h b/hw/virtio-blk.h
index f0740d0..0886818 100644
--- a/hw/virtio-blk.h
+++ b/hw/virtio-blk.h
@@ -14,7 +14,9 @@
 #ifndef _QEMU_VIRTIO_BLK_H
 #define _QEMU_VIRTIO_BLK_H
 
+#include "sysbus.h"
 #include "virtio.h"
+#include "virtio-transport.h"
 #include "hw/block-common.h"
 
 /* from Linux's linux/virtio_blk.h */
@@ -111,4 +113,17 @@ struct VirtIOBlkConf
         DEFINE_VIRTIO_COMMON_FEATURES(_state, _field), \
         DEFINE_PROP_BIT("config-wce", _state, _field, VIRTIO_BLK_F_CONFIG_WCE, true)
 
+
+typedef struct {
+DeviceState qdev;
+/* virtio-blk */
+VirtIOBlkConf blk;
+
+uint32_t host_features;
+
+VirtIOTransportLink *trl;
+} VirtIOBlockState;
+
+#define VIRTIO_BLK_FROM_QDEV(dev) DO_UPCAST(VirtIOBlockState, qdev, dev)
+
 #endif
-- 
1.7.9.5




[Qemu-devel] [RFC v2 00/12] Virtio-mmio refactoring.

2012-09-17 Thread Evgeny Voevodin
The previous RFC can be found at
http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03665.html
Yes, that was a long time ago...
Since I'm not sure when I'll be able to continue on this,
I'm publishing this work as is.
In this patchset I tried to split the virtio-xxx-pci devices into
virtio-pci + virtio-xxx (blk, net, serial, ...). A virtio-mmio
transport is also introduced, based on Peter's work, which is available
here: http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01870.html

The main idea was to let users specify
-device virtio-pci,id=virtio-pci.0
-device virtio-blk,transport=virtio-pci.0,...
 
and

-device virtio-mmio,id=virtio-mmio.0
-device virtio-blk,transport=virtio-mmio.0,...

I created virtio-pci and virtio-mmio transport devices and tried to enclose
the back-end functionality in virtio-blk, virtio-net, etc. When a transport
device is initialized, it creates a bus to which a back-end device
can be connected. Each back-end device is implemented in its corresponding source
file. As for the PCI transport, I temporarily placed it in a new virtio-pci-new.c
file so as not to break the still-present virtio-xxx-pci devices.

Known issues to be resolved:
1. On creation of a back-end we need some way to determine whether props were
explicitly set by the user.
2. A back-end device can't be initialized if no free bus has been created by a
transport, so you can't specify
-device virtio-blk,transport=virtio-pci.0,...
-device virtio-pci,id=virtio-pci.0
3. Implement the virtio-xxx-pci devices such that they just create virtio-pci and
virtio-xxx devices during initialization.
4. Refactor all remaining back-ends, since I only tried blk, net, serial and
balloon.
5. Refactor s390.
6. Further?

Evgeny Voevodin (9):
  Virtio: Add transport bindings.
  hw/qdev-properties.c: Add transport property.
  hw/pci.c: Make pci_add_option_rom global visible
  hw/virtio-serial-bus.c: Add virtio-serial device.
  hw/virtio-balloon.c: Add virtio-balloon device.
  hw/virtio-net.c: Add virtio-net device.
  hw/virtio-blk.c: Add virtio-blk device.
  hw/virtio-pci-new.c: Add VirtIOPCI device.
  hw/exynos4210.c: Create two virtio-mmio transport instances.

Peter Maydell (3):
  virtio: Add support for guest setting of queue size
  virtio: Support transports which can specify the vring alignment
  Add MMIO based virtio transport

 hw/Makefile.objs   |3 +
 hw/exynos4210.c|   13 +
 hw/pci.c   |3 +-
 hw/pci.h   |2 +
 hw/qdev-properties.c   |   29 ++
 hw/qdev.h  |3 +
 hw/virtio-balloon.c|   42 +++
 hw/virtio-balloon.h|9 +
 hw/virtio-blk.c|   65 
 hw/virtio-blk.h|   15 +
 hw/virtio-mmio.c   |  400 +
 hw/virtio-net.c|   59 +++
 hw/virtio-net.h|   16 +
 hw/virtio-pci-new.c|  925 
 hw/virtio-pci.h|   18 +
 hw/virtio-serial-bus.c |   44 +++
 hw/virtio-serial.h |   11 +
 hw/virtio-transport.c  |  147 
 hw/virtio-transport.h  |   74 
 hw/virtio.c|   20 +-
 hw/virtio.h|2 +
 21 files changed, 1896 insertions(+), 4 deletions(-)
 create mode 100644 hw/virtio-mmio.c
 create mode 100644 hw/virtio-pci-new.c
 create mode 100644 hw/virtio-transport.c
 create mode 100644 hw/virtio-transport.h

-- 
1.7.9.5




[Qemu-devel] [RFC v2 12/12] hw/exynos4210.c: Create two virtio-mmio transport instances.

2012-09-17 Thread Evgeny Voevodin
NB: This is for test purposes only.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/exynos4210.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/exynos4210.c b/hw/exynos4210.c
index 00d4db8..70fcdd6 100644
--- a/hw/exynos4210.c
+++ b/hw/exynos4210.c
@@ -26,6 +26,7 @@
 #include "sysbus.h"
 #include "arm-misc.h"
 #include "loader.h"
+#include "virtio-transport.h"
 #include "exynos4210.h"
 
 #define EXYNOS4210_CHIPID_ADDR 0x1000
@@ -72,6 +73,10 @@
 /* Display controllers (FIMD) */
 #define EXYNOS4210_FIMD0_BASE_ADDR  0x11C0
 
+/* VirtIO MMIO */
+#define EXYNOS4210_VIRTIO_MMIO0_BASE_ADDR   0x10AD
+#define EXYNOS4210_VIRTIO_MMIO1_BASE_ADDR   0x10AC
+
 static uint8_t chipid_and_omr[] = { 0x11, 0x02, 0x21, 0x43,
 0x09, 0x00, 0x00, 0x00 };
 
@@ -334,5 +339,13 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
             s->irq_table[exynos4210_get_irq(11, 2)],
             NULL);
 
+    sysbus_create_simple(VIRTIO_MMIO,
+                         EXYNOS4210_VIRTIO_MMIO0_BASE_ADDR,
+                         s->irq_table[exynos4210_get_irq(37, 3)]);
+
+    sysbus_create_simple(VIRTIO_MMIO,
+                         EXYNOS4210_VIRTIO_MMIO1_BASE_ADDR,
+                         s->irq_table[exynos4210_get_irq(37, 2)]);
+
 return s;
 }
-- 
1.7.9.5




[Qemu-devel] [RFC v2 05/12] hw/pci.c: Make pci_add_option_rom global visible

2012-09-17 Thread Evgeny Voevodin
We need to use this function to load the ROM for the virtio-net back-end.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/pci.c |3 +--
 hw/pci.h |2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index f855cf3..bba69ef 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -77,7 +77,6 @@ static const TypeInfo pci_bus_info = {
 static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num);
 static void pci_update_mappings(PCIDevice *d);
 static void pci_set_irq(void *opaque, int irq_num, int level);
-static int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom);
 static void pci_del_option_rom(PCIDevice *pdev);
 
 static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET;
@@ -1733,7 +1732,7 @@ static void pci_patch_ids(PCIDevice *pdev, uint8_t *ptr, int size)
 }
 
 /* Add an option rom for the device */
-static int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom)
+int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom)
 {
 int size;
 char *path;
diff --git a/hw/pci.h b/hw/pci.h
index 4b6ab3d..5f47618 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -274,6 +274,8 @@ void pci_register_bar(PCIDevice *pci_dev, int region_num,
   uint8_t attr, MemoryRegion *memory);
 pcibus_t pci_get_bar_addr(PCIDevice *pci_dev, int region_num);
 
+int pci_add_option_rom(PCIDevice *pdev, bool is_default_rom);
+
 int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
uint8_t offset, uint8_t size);
 
-- 
1.7.9.5




[Qemu-devel] [RFC v2 04/12] hw/qdev-properties.c: Add transport property.

2012-09-17 Thread Evgeny Voevodin
Virtio back-end devices can be plugged into either transport:
VIRTIO_PCI or VIRTIO_MMIO. To choose the desired
transport, every back-end state struct has a "transport"
property. By specifying -device virtio-blk-pci the user chooses the
VIRTIO_PCI transport and the transport property is set automatically.
But to give the user full control, the transport property
must also be settable from the command line:

-device virtio-pci,id=virtio-pci.0
-device virtio-blk,transport=virtio-pci.0,...

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/qdev-properties.c |   29 +
 hw/qdev.h|3 +++
 2 files changed, 32 insertions(+)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index 8aca0d4..4226a02 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -2,6 +2,7 @@
 #include "qdev.h"
 #include "qerror.h"
 #include "blockdev.h"
+#include "virtio-transport.h"
 #include "hw/block-common.h"
 #include "net/hub.h"
 
@@ -526,6 +527,34 @@ PropertyInfo qdev_prop_drive = {
 .release = release_drive,
 };
 
+/* --- virtio transport --- */
+
+static int parse_transport(DeviceState *dev, const char *str, void **ptr)
+{
+    VirtIOTransportLink *trl;
+
+    trl = virtio_find_transport(str);
+
+    if (trl == NULL) {
+        return -ENOENT;
+    }
+
+    *ptr = trl;
+
+    return 0;
+}
+
+static void set_transport(Object *obj, Visitor *v, void *opaque,
+                          const char *name, Error **errp)
+{
+    set_pointer(obj, v, opaque, parse_transport, name, errp);
+}
+
+PropertyInfo qdev_prop_transport = {
+    .name  = "transport",
+    .set   = set_transport,
+};
+
 /* --- character device --- */
 
 static int parse_chr(DeviceState *dev, const char *str, void **ptr)
diff --git a/hw/qdev.h b/hw/qdev.h
index d699194..bd6aa6e 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -233,6 +233,7 @@ extern PropertyInfo qdev_prop_macaddr;
 extern PropertyInfo qdev_prop_losttickpolicy;
 extern PropertyInfo qdev_prop_bios_chs_trans;
 extern PropertyInfo qdev_prop_drive;
+extern PropertyInfo qdev_prop_transport;
 extern PropertyInfo qdev_prop_netdev;
 extern PropertyInfo qdev_prop_vlan;
 extern PropertyInfo qdev_prop_pci_devfn;
@@ -294,6 +295,8 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
 DEFINE_PROP(_n, _s, _f, qdev_prop_vlan, NetClientState*)
 #define DEFINE_PROP_DRIVE(_n, _s, _f) \
 DEFINE_PROP(_n, _s, _f, qdev_prop_drive, BlockDriverState *)
+#define DEFINE_PROP_TRANSPORT(_n, _s, _f) \
+DEFINE_PROP(_n, _s, _f, qdev_prop_transport, VirtIOTransportLink *)
 #define DEFINE_PROP_MACADDR(_n, _s, _f) \
 DEFINE_PROP(_n, _s, _f, qdev_prop_macaddr, MACAddr)
 #define DEFINE_PROP_LOSTTICKPOLICY(_n, _s, _f, _d) \
-- 
1.7.9.5




Re: [Qemu-devel] [RFC v2 04/12] hw/qdev-properties.c: Add transport property.

2012-09-17 Thread Evgeny Voevodin

On 09/17/2012 04:42 PM, Paolo Bonzini wrote:

On 17/09/2012 12:00, Evgeny Voevodin wrote:

Virtio back-end devices can be plugged into both transports:
VIRTIO_PCI and VIRTIO_MMIO. In order to choose the desired
transport we have a property transport in every back-end
state struct. By specifying -device virtio-blk-pci user chooses
VIRTIO_PCI transport and transport property is set automatically.
But in order to provide full control to user we need to have
transport property available to be set through command line:

-device virtio-pci,id=virtio-pci.0
-device virtio-blk,transport=virtio-pci.0,...

What's the difference between this and bus? i.e.

   -device virtio-pci,id=virtio-pci-0
   -device virtio-blk,bus=virtio-pci-0.0,...

Paolo



The difference is that for the transport I used a linked list, like, say,
bdrv_states in block.c.

It's much simpler than using buses. I was also planning to use a link.
In this approach buses are used only to reflect the hierarchy of devices in
the emulator manager.
And yes, the cover letter contains quite misleading information, because
attaching to a transport
is based on a list of links, not on buses. Sorry, I forgot that when I
wrote the cover letter.
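
A list-of-links lookup of the kind described here can be sketched as follows. This is only an illustration under stated assumptions: Link, link_register, link_find and parse_link are hypothetical names, and the real VirtIOTransportLink and virtio_find_transport() live in hw/virtio-transport.c, which is not shown in this message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical link node; not the real VirtIOTransportLink. */
typedef struct Link {
    const char *name;
    struct Link *next;
} Link;

static Link *links;  /* head of the global list */

/* Register a transport so back-ends can find it by name later. */
static void link_register(Link *l)
{
    l->next = links;
    links = l;
}

/* Walk the list; NULL means no transport with that name exists yet. */
static Link *link_find(const char *name)
{
    Link *l;
    for (l = links; l != NULL; l = l->next) {
        if (strcmp(l->name, name) == 0) {
            return l;
        }
    }
    return NULL;
}

/* Same shape as parse_transport(): a failed lookup becomes -ENOENT. */
static int parse_link(const char *str, Link **out)
{
    Link *l = link_find(str);
    if (l == NULL) {
        return -ENOENT;
    }
    *out = l;
    return 0;
}
```

A lookup by name can only succeed after the transport has registered itself, which is why the order of the -device options matters in this scheme.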


--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




[Qemu-devel] [Bug 1036987] [NEW] compilation error due to bug in savevm.c

2012-08-15 Thread Evgeny Voevodin
Public bug reported:

Since

302dfbeb21fc5154c24ca50d296e865a3778c7da

Add xbzrle_encode_buffer and xbzrle_decode_buffer functions

For performance we are encoding long word at a time.
For nzrun we use long-word-at-a-time NULL-detection tricks from strcmp():
using ((lword - 0x0101010101010101)  (~lword)  0x8080808080808080) test
to find out if any byte in the long word is zero.

Signed-off-by: Benoit Hudzia benoit.hud...@sap.com
Signed-off-by: Petter Svard pett...@cs.umu.se
Signed-off-by: Aidan Shribman aidan.shrib...@sap.com
Signed-off-by: Orit Wasserman owass...@redhat.com
Signed-off-by: Eric Blake ebl...@redhat.com

Reviewed-by: Luiz Capitulino lcapitul...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com

commit arrived in the master branch, I can't compile qemu at all:

savevm.c:2476:13: error: overflow in implicit constant conversion
[-Werror=overflow]
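
For reference, the zero-byte test quoted from the commit message can be written as a small standalone function. This is a sketch on a fixed 64-bit word rather than the `long` used in savevm.c, and has_zero_byte is a hypothetical name chosen for illustration.

```c
#include <stdint.h>

/* Returns 1 if any of the eight bytes in w is zero, 0 otherwise.
 * Subtracting 0x01 from every byte sets the high bit of a byte that was
 * zero (via the borrow); masking with ~w discards bytes whose high bit
 * was already set in the input. */
static int has_zero_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}
```

The constants are 64 bits wide, which is why assigning them to a 32-bit `long` triggers the overflow diagnostic reported above.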

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1036987




[Qemu-devel] [PATCH] savevm.c: Fix compilation error on 32bit host.

2012-08-15 Thread Evgeny Voevodin
Casting of 0x0101010101010101ULL to long will truncate it to 32
bits on 32bit hosts, and won't truncate on 64bit hosts.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 savevm.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/savevm.c b/savevm.c
index 0ea10c9..9ab4d83 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2473,7 +2473,7 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
         /* word at a time for speed, use of 32-bit long okay */
         if (!res) {
             /* truncation to 32-bit long okay */
-            long mask = 0x0101010101010101ULL;
+            long mask = (long)0x0101010101010101ULL;
             while (i < slen) {
                 xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
                 if ((xor - mask) & ~xor & (mask << 7)) {
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH] savevm.c: Fix compilation error on 32bit host.

2012-08-15 Thread Evgeny Voevodin

On 08/15/2012 01:30 PM, Peter Maydell wrote:

On 15 August 2012 10:10, Evgeny Voevodin e.voevo...@samsung.com wrote:

Casting of 0x0101010101010101ULL to long will truncate it to 32
bits on 32bit hosts, and won't truncate on 64bit hosts.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com


Dup of http://patchwork.ozlabs.org/patch/177217/ I'm afraid.

-- PMM




Don't be afraid, it's true. I didn't see it in the mailing list and
didn't know that the bug is already fixed there.

--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] [PATCH 08/23] exynos4: Suppress unused default drives

2012-08-09 Thread Evgeny Voevodin

On 09.08.2012 17:31, Markus Armbruster wrote:

Cc: Evgeny Voevodin e.voevo...@samsung.com
Cc: Maksim Kozlov m.koz...@samsung.com
Cc: Igor Mitsyanko i.mitsya...@samsung.com
Cc: Dmitry Solodkiy d.solod...@samsung.com

Suppress default floppy, CD-ROM and SD card drives for machines nuri,
smdkc210.

Signed-off-by: Markus Armbruster arm...@redhat.com
---
  hw/exynos4_boards.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/hw/exynos4_boards.c b/hw/exynos4_boards.c
index 4bb0a60..1512c27 100644
--- a/hw/exynos4_boards.c
+++ b/hw/exynos4_boards.c
@@ -160,12 +160,18 @@ static QEMUMachine exynos4_machines[EXYNOS4_NUM_OF_BOARDS] = {
          .desc = "Samsung NURI board (Exynos4210)",
          .init = nuri_init,
          .max_cpus = EXYNOS4210_NCPUS,
+        .no_floppy = 1,
+        .no_cdrom = 1,
+        .no_sdcard = 1,
      },
      [EXYNOS4_BOARD_SMDKC210] = {
          .name = "smdkc210",
          .desc = "Samsung SMDKC210 board (Exynos4210)",
          .init = smdkc210_init,
          .max_cpus = EXYNOS4210_NCPUS,
+        .no_floppy = 1,
+        .no_cdrom = 1,
+        .no_sdcard = 1,
      },
  };
  


Recently I looked in the monitor and wondered how to get rid of them on
our boards )


Acked-by: Evgeny Voevodin e.voevo...@samsung.com


--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group,
Samsung Moscow Research Center,
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] [RFC][PATCH v2 4/4] configure: add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization

2012-07-05 Thread Evgeny Voevodin

On 05.07.2012 17:55, Andreas Färber wrote:

On 05.07.2012 15:23, Yeongkyoon Lee wrote:

Add an option --enable-ldst-optimization to enable 
CONFIG_QEMU_LDST_OPTIMIZATION macro for TCG qemu_ld/st optimization. It only works with 
CONFIG_SOFTMMU and doesn't work with CONFIG_TCG_PASS_AREG0.

Signed-off-by: Yeongkyoon Lee yeongkyoon@samsung.com
---
  configure |   15 +++
  1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 9f071b7..2b364cc 100755
--- a/configure
+++ b/configure

[...]

@@ -3463,6 +3466,11 @@ echo "EXESUF=$EXESUF" >> $config_host_mak
  echo "LIBS_QGA+=$libs_qga" >> $config_host_mak
  echo "POD2MAN=$POD2MAN" >> $config_host_mak
  
+if [ "$ldst_optimization" = "yes" -a "$cpu" != "i386" -a "$cpu" != "x86_64" ] ; then
+  echo "ERROR: qemu_ld/st optimization is only available on i386 or x86_64 hosts"
+  exit 1
+fi

[snip]

I assume that Samsung is interested in optimizing the Exynos emulation.


Nope ) Originally it's from x86 Tizen emulator )


I think there was already a patchset posted converting target-arm to
CONFIG_PASS_TCG_AREG0, only with some slowdowns to be investigated...
What is the obstacle for supporting AREG0 mode in your optimization?

Regards,
Andreas


+
 # generate list of library paths for linker script
 
 $ld --verbose -v 2> /dev/null | grep SEARCH_DIR > ${config_host_ld}

@@ -3696,11 +3704,18 @@ fi
 symlink $source_path/Makefile.target $target_dir/Makefile
 
 
+target_ldst_optimization=$ldst_optimization
+
 case "$target_arch2" in
   alpha | sparc* | xtensa* | ppc*)
     echo "CONFIG_TCG_PASS_AREG0=y" >> $config_target_mak
+    # qemu_ld/st optimization is not available with CONFIG_TCG_PASS_AREG0
+    target_ldst_optimization=no
   ;;
 esac
+if [ "$target_ldst_optimization" = "yes" -a "$target_softmmu" = "yes" ] ; then
+  echo "CONFIG_QEMU_LDST_OPTIMIZATION=y" >> $config_target_mak
+fi
 
 echo "TARGET_SHORT_ALIGNMENT=$target_short_alignment" >> $config_target_mak
 echo "TARGET_INT_ALIGNMENT=$target_int_alignment" >> $config_target_mak



--
Kind regards,
Evgeny Voevodin,
Technical Leader,
Mobile Group, SMRC, Samsung Electronics
e-mail: e.voevo...@samsung.com





[Qemu-devel] [PATCH 2/3] hw/exynos4210_pwm.c: Fix STOP status in tick handler.

2012-06-26 Thread Evgeny Voevodin
The START/STOP bit was not cleared correctly.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/exynos4210_pwm.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/exynos4210_pwm.c b/hw/exynos4210_pwm.c
index 6243e59..0c22828 100644
--- a/hw/exynos4210_pwm.c
+++ b/hw/exynos4210_pwm.c
@@ -200,7 +200,7 @@ static void exynos4210_pwm_tick(void *opaque)
         ptimer_run(p->timer[id].ptimer, 1);
     } else {
         /* stop timer, set status to STOP, see Basic Timer Operation */
-        p->reg_tcon = ~TCON_TIMER_START(id);
+        p->reg_tcon &= ~TCON_TIMER_START(id);
         ptimer_stop(p->timer[id].ptimer);
     }
 }
-- 
1.7.9.5
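
For readers following the archive: the one-line fix above appears to replace a plain assignment of the inverted mask with a masked clear. The following is a minimal sketch of the difference, not code from the patch. DEMO_TIMER_START() and both helper functions are hypothetical, and the bit layout is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only: one START bit per timer,
 * four bits apart. The real TCON_TIMER_START() is defined in
 * hw/exynos4210_pwm.c and may differ. */
#define DEMO_TIMER_START(id) (1u << ((id) * 4))

/* Plain assignment: every bit except the timer's START bit ends up set,
 * regardless of the previous register contents. */
static uint32_t demo_stop_by_assign(uint32_t tcon, unsigned id)
{
    assert(id < 5);
    tcon = ~DEMO_TIMER_START(id);
    return tcon;
}

/* Masked clear: only the timer's START bit is cleared, the rest is kept. */
static uint32_t demo_stop_by_mask(uint32_t tcon, unsigned id)
{
    assert(id < 5);
    tcon &= ~DEMO_TIMER_START(id);
    return tcon;
}
```

With timers 0 and 1 marked as started (0x11 in this layout), stopping timer 1 through the masked clear leaves 0x01, while the plain assignment leaves 0xffffffef and so corrupts the state of every other timer.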




[Qemu-devel] [PATCH 0/3] ARM: Exynos4210 bugfixes

2012-06-26 Thread Evgeny Voevodin
The first patch is already on the list:
http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg03717.html
It fixes a critical bug in the MCT that hangs Linux kernel v3.0, so I
included it in this patch set.

The second patch fixes the setting of the STOP status bit in the PWM
(not critical, but only because the latest kernels use the MCT as clock source).

The third patch fixes a misleading initialization of the IROM mirror.


Evgeny Voevodin (2):
  hw/exynos4210_pwm.c: Fix STOP status in tick handler.
  hw/exynos4210.c: Fix misleading initialization of IROM mirror

Stanislav Vorobiov (1):
  ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.

 hw/exynos4210.c |2 +-
 hw/exynos4210_mct.c |4 
 hw/exynos4210_pwm.c |2 +-
 3 files changed, 2 insertions(+), 6 deletions(-)

-- 
1.7.9.5




[Qemu-devel] [PATCH 3/3] hw/exynos4210.c: Fix misleading initialization of IROM mirror

2012-06-26 Thread Evgeny Voevodin
We want to mirror the whole IROM, so we should pass zero instead of
EXYNOS4210_IROM_BASE_ADDR (although it is zero as well), since
memory_region_init_alias takes an offset within the original
region as its argument.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/exynos4210.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/exynos4210.c b/hw/exynos4210.c
index 9c20b3f..80a00b9 100644
--- a/hw/exynos4210.c
+++ b/hw/exynos4210.c
@@ -216,7 +216,7 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
     /* mirror of iROM */
     memory_region_init_alias(&s->irom_alias_mem, "exynos4210.irom_alias",
                              &s->irom_mem,
-                             EXYNOS4210_IROM_BASE_ADDR,
+                             0,
                              EXYNOS4210_IROM_SIZE);
     memory_region_set_readonly(&s->irom_alias_mem, true);
     memory_region_add_subregion(system_mem, EXYNOS4210_IROM_MIRROR_BASE_ADDR,
-- 
1.7.9.5
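
The commit message above turns on the point that the alias offset is relative to the start of the aliased region, not a guest physical address. A minimal sketch of that translation follows. It is a simplified model and not the QEMU memory API: DemoAlias and both helpers are hypothetical, and the non-zero base address in the usage note is an invented value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of an alias window into another region. */
typedef struct {
    uint64_t offset;  /* start of the window, relative to the target region */
    uint64_t size;    /* length of the window in bytes */
} DemoAlias;

/* Translate an address relative to the alias into an offset in the target. */
static uint64_t demo_alias_translate(const DemoAlias *a, uint64_t addr)
{
    assert(addr < a->size);
    return a->offset + addr;
}

/* An alias is only meaningful if its window lies inside the target region. */
static bool demo_alias_in_bounds(const DemoAlias *a, uint64_t target_size)
{
    return a->offset <= target_size && a->size <= target_size - a->offset;
}
```

With offset 0 and a window as large as the target, the alias mirrors the whole region. If a SoC placed its IROM at a non-zero base, say 0x02000000, and that base were passed as the offset, the window would start far beyond the end of the IROM region. That is why the patch passes a literal zero even though the constant happens to be zero on Exynos4210.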




[Qemu-devel] [PATCH 1/3] ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.

2012-06-26 Thread Evgeny Voevodin
From: Stanislav Vorobiov s.vorob...@samsung.com

After a long period of time the Linux kernel hung because
ptimer_get_count may return 0 before the timer interrupt occurs,
causing the FRC to jump back in time.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/exynos4210_mct.c |4 
 1 file changed, 4 deletions(-)

diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
index 7474fcf..7a22b1f 100644
--- a/hw/exynos4210_mct.c
+++ b/hw/exynos4210_mct.c
@@ -376,10 +376,6 @@ static uint64_t exynos4210_gfrc_get_count(Exynos4210MCTGT *s)
 {
     uint64_t count = 0;
     count = ptimer_get_count(s->ptimer_frc);
-    if (!count) {
-        /* Timer event was generated and s->reg.cnt holds adequate value */
-        return s->reg.cnt;
-    }
     count = s->count - count;
     return s->reg.cnt + count;
 }
-- 
1.7.9.5
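
To make the failure mode concrete, here is a minimal sketch of the read path before and after the change. It is a simplified model under stated assumptions, not the device code: DemoFRC and both functions are hypothetical, and the remaining down-counter value is passed in as a parameter instead of being read from a ptimer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the FRC state. The field names follow the
 * patch, but this is not the QEMU structure. */
typedef struct {
    uint64_t reg_cnt;  /* guest-visible count as of the last reload */
    uint64_t count;    /* value the down-counter was loaded with */
} DemoFRC;

/* Old logic: a remaining value of 0 is taken to mean that the tick handler
 * has already run and advanced reg_cnt. */
static uint64_t demo_frc_read_old(const DemoFRC *s, uint64_t remaining)
{
    assert(remaining <= s->count);
    if (!remaining) {
        return s->reg_cnt;
    }
    return s->reg_cnt + (s->count - remaining);
}

/* New logic: always add the elapsed part of the current period. */
static uint64_t demo_frc_read_new(const DemoFRC *s, uint64_t remaining)
{
    assert(remaining <= s->count);
    return s->reg_cnt + (s->count - remaining);
}
```

If the counter has reached 0 but the handler has not yet updated reg_cnt, the old read returns reg_cnt alone, which is smaller than the value returned one tick earlier. The new read returns reg_cnt plus the full period, so the guest never sees the counter go backwards.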




Re: [Qemu-devel] [PATCH] Exynos4: added RTC device

2012-06-25 Thread Evgeny Voevodin
:
+        s->reg_almhour = (value & 0x3f);
+        break;
+    case ALMDAY:
+        s->reg_almday = (value & 0x3f);
+        break;
+    case ALMMON:
+        s->reg_almmon = (value & 0x1f);
+        break;
+    case ALMYEAR:
+        s->reg_almyear = (value & 0x0fff);
+        break;
+
+    case BCDSEC:
+        if (s->reg_rtccon & RTC_ENABLE) {
+            s->current_tm.tm_sec = rtc_from_bcd(value, 2);
+        }
+        break;
+    case BDCMIN:
+        if (s->reg_rtccon & RTC_ENABLE) {
+            s->current_tm.tm_min = rtc_from_bcd(value, 2);
+        }
+        break;
+    case BCDHOUR:
+        if (s->reg_rtccon & RTC_ENABLE) {
+            s->current_tm.tm_hour = rtc_from_bcd(value, 2);
+        }
+        break;
+    case BCDDAYWEEK:
+        if (s->reg_rtccon & RTC_ENABLE) {
+            s->current_tm.tm_wday = rtc_from_bcd(value, 2);
+        }
+        break;
+    case BCDDAY:
+        if (s->reg_rtccon & RTC_ENABLE) {
+            s->current_tm.tm_mday = rtc_from_bcd(value, 2);
+        }
+        break;
+    case BCDMON:
+        if (s->reg_rtccon & RTC_ENABLE) {
+            s->current_tm.tm_mon = rtc_from_bcd(value, 2) - 1;
+        }
+        break;
+    case BCDYEAR:
+        if (s->reg_rtccon & RTC_ENABLE) {
+            s->current_tm.tm_year = rtc_from_bcd(value, 3);
+        }
+        break;
+
+    default:
+        fprintf(stderr,
+                "[exynos4210.rtc: bad write offset " TARGET_FMT_plx "]\n",
+                offset);
+        break;
+
+    }
+}
+
+/*
+ * Set default values to timer fields and registers
+ */
+static void exynos4210_rtc_reset(DeviceState *d)
+{
+    Exynos4210RTCState *s = (Exynos4210RTCState *)d;
+
+    struct tm tm;
+
+    qemu_get_timedate(&tm, 0);
+    s->current_tm = tm;
+
+    DPRINTF("Get time from host: %d-%d-%d %2d:%02d:%02d\n",
+            s->current_tm.tm_year, s->current_tm.tm_mon, s->current_tm.tm_mday,
+            s->current_tm.tm_hour, s->current_tm.tm_min, s->current_tm.tm_sec);
+
+    s->reg_intp = 0;
+    s->reg_rtccon = 0;
+    s->reg_ticcnt = 0;
+    s->reg_rtcalm = 0;
+    s->reg_almsec = 0;
+    s->reg_almmin = 0;
+    s->reg_almhour = 0;
+    s->reg_almday = 0;
+    s->reg_almmon = 0;
+    s->reg_almyear = 0;
+
+    s->reg_curticcnt = 0;
+
+    exynos4210_rtc_update_freq(s, s->reg_rtccon);
+    ptimer_stop(s->ptimer);
+    ptimer_stop(s->ptimer_1Hz);
+}
+
+static const MemoryRegionOps exynos4210_rtc_ops = {
+    .read = exynos4210_rtc_read,
+    .write = exynos4210_rtc_write,
+    .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+/*
+ * RTC timer initialization
+ */
+static int exynos4210_rtc_init(SysBusDevice *dev)
+{
+    Exynos4210RTCState *s = FROM_SYSBUS(Exynos4210RTCState, dev);
+    QEMUBH *bh;
+
+    bh = qemu_bh_new(exynos4210_rtc_tick, s);
+    s->ptimer = ptimer_init(bh);
+    ptimer_set_freq(s->ptimer, RTC_BASE_FREQ);
+    exynos4210_rtc_update_freq(s, 0);
+
+    bh = qemu_bh_new(exynos4210_rtc_1Hz_tick, s);
+    s->ptimer_1Hz = ptimer_init(bh);
+    ptimer_set_freq(s->ptimer_1Hz, RTC_BASE_FREQ);
+
+    sysbus_init_irq(dev, &s->alm_irq);
+    sysbus_init_irq(dev, &s->tick_irq);
+
+    memory_region_init_io(&s->iomem, &exynos4210_rtc_ops, s, "exynos4210-rtc",
+            EXYNOS4210_RTC_REG_MEM_SIZE);
+    sysbus_init_mmio(dev, &s->iomem);
+
+    return 0;
+}
+
+static void exynos4210_rtc_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+
+    k->init = exynos4210_rtc_init;
+    dc->reset = exynos4210_rtc_reset;
+    dc->vmsd = &vmstate_exynos4210_rtc_state;
+}
+
+static const TypeInfo exynos4210_rtc_info = {
+    .name          = "exynos4210.rtc",
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(Exynos4210RTCState),
+    .class_init    = exynos4210_rtc_class_init,
+};
+
+static void exynos4210_rtc_register_types(void)
+{
+    type_register_static(&exynos4210_rtc_info);
+}
+
+type_init(exynos4210_rtc_register_types)
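
The BCD* register writes in the quoted patch convert guest-written BCD values into struct tm fields. The body of rtc_from_bcd() is not shown in this excerpt, so the following is only a sketch of what such a conversion typically looks like. demo_from_bcd() is a hypothetical helper, and the reading of its second argument as a digit count is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, NOT the patch's rtc_from_bcd(): decode the lowest
 * `digits` BCD nibbles of `value` into a plain integer. */
static unsigned demo_from_bcd(uint32_t value, unsigned digits)
{
    unsigned result = 0;
    unsigned scale = 1;

    assert(digits <= 8);
    for (unsigned i = 0; i < digits; i++) {
        unsigned nibble = (value >> (4 * i)) & 0xf;  /* one decimal digit */
        result += nibble * scale;
        scale *= 10;
    }
    return result;
}
```

Under this reading, a write of 0x59 to a two-digit field such as seconds decodes to 59, and a three-digit field such as the year decodes 0x123 to 123.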


Reviewed-by: Evgeny Voevodin e.voevo...@samsung.com

--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com



Re: [Qemu-devel] [PATCH] Exynos4: added RTC device

2012-06-25 Thread Evgeny Voevodin

On 25.06.2012 13:24, Andreas Färber wrote:

Am 25.06.2012 09:55, schrieb Oleg Ogurtsov:

Signed-off-by: Oleg Ogurtsov <o.ogurt...@samsung.com>
---
  hw/arm/Makefile.objs |1 +
  hw/exynos4210.c  |8 +
  hw/exynos4210_rtc.c  |  607 ++
  3 files changed, 616 insertions(+), 0 deletions(-)
  create mode 100644 hw/exynos4210_rtc.c

This RTC like many other Exynos devices has no dependency on the CPU. I
have a patch in preparation that moves such devices from
hw/arm/Makefile.objs to hw/Makefile.objs.
I don't object to this patch, not even minor style nits spotted,
compliment, but if you have to respin for some reason, it would be nice
if you could consider that improvement.


These devices are SOC specific and this SOC is based on ARM only.
Do we really need to move them?

--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com





Re: [Qemu-devel] [PATCH] Exynos4: added RTC device

2012-06-25 Thread Evgeny Voevodin

On 25.06.2012 16:00, Andreas Färber wrote:

Am 25.06.2012 13:46, schrieb Evgeny Voevodin:

On 25.06.2012 13:24, Andreas Färber wrote:

Am 25.06.2012 09:55, schrieb Oleg Ogurtsov:

Signed-off-by: Oleg Ogurtsov <o.ogurt...@samsung.com>
---
   hw/arm/Makefile.objs |1 +
   hw/exynos4210.c  |8 +
   hw/exynos4210_rtc.c  |  607
++
   3 files changed, 616 insertions(+), 0 deletions(-)
   create mode 100644 hw/exynos4210_rtc.c

This RTC like many other Exynos devices has no dependency on the CPU. I
have a patch in preparation that moves such devices from
hw/arm/Makefile.objs to hw/Makefile.objs.
I don't object to this patch, not even minor style nits spotted,
compliment, but if you have to respin for some reason, it would be nice
if you could consider that improvement.

These devices are SOC specific and this SOC is based on ARM only.
Do we really need to move them?

For one, they do not need to be rebuilt when cpu.h changes and they
should get the usual device poisoning for proper modeling.
For another, someone on IRC started work on an armeb-softmmu, for which
we would probably not want to compile in the Exynos devices. Or if we
do, we certainly don't want to compile everything twice (cf. xilinx).

If devices are ARM-specific and need access to the CPU (e.g., machines,
PICs) then according to Paolo they should be placed in hw/arm/ with the
new scheme. I'm trying to stay away from moving other people's files
around, but the Makefile changes are pretty non-intrusive and can go
through arm-devs.next.

Cheers,
Andreas



Oh, I see. Should we put this device in hw/Makefile.objs in v2?

--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com





[Qemu-devel] [PATCH] ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.

2012-06-22 Thread Evgeny Voevodin
From: Stanislav Vorobiov s.vorob...@samsung.com

After a long period of time the Linux kernel hung because
ptimer_get_count may return 0 before the timer interrupt occurs,
causing the FRC to jump back in time.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/exynos4210_mct.c |4 
 1 file changed, 4 deletions(-)

diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
index 7474fcf..7a22b1f 100644
--- a/hw/exynos4210_mct.c
+++ b/hw/exynos4210_mct.c
@@ -376,10 +376,6 @@ static uint64_t exynos4210_gfrc_get_count(Exynos4210MCTGT *s)
 {
     uint64_t count = 0;
     count = ptimer_get_count(s->ptimer_frc);
-    if (!count) {
-        /* Timer event was generated and s->reg.cnt holds adequate value */
-        return s->reg.cnt;
-    }
     count = s->count - count;
     return s->reg.cnt + count;
 }
-- 
1.7.9.5




Re: [Qemu-devel] [PATCH] ARM: hw/exynos4210_mct.c: Fix a bug which hangs Linux kernel.

2012-06-22 Thread Evgeny Voevodin

On 22.06.2012 11:56, Peter Crosthwaite wrote:

Hi Evgeny,

I'm just speculating here, but I recently ran into Linux hangs on
Microblaze due to ptimer issues and I think you may be suffering the
same base issue.

The Microblaze timer (hw/xilinx_timer.c) has a similar implementation
to the exynos (chained one-shot ptimer). Recently Peter Chubb put
patch cf36b31db209a261ee3bc2737e788e1ced0a1bec through, which modified
ptimer to not set excessively short periods. Problem is, that only
works for periodic ptimers.


No, that's a different problem. The MCT was developed before this commit
landed, and it contains similar code to avoid that situation. Our patch
fixes a bug in the logic.

The thing is that, since ptimer uses BHs, the ptimer can already be stopped
while its BH has not been called yet. If the target reads the counter at
exactly that moment, it gets an incorrectly calculated value.


Anyways, you may find that changing your set_count() calls to
set_limit (i.e. the function designed for periodic timers) calls works
for you, without changing the logic of your device. Here's the change
pattern:

-ptimer_set_count(ptimer, count);
+ptimer_set_limit(ptimer, count, 1);

More permanently (and a question for the chief maintainers) can we
look into ways of fixing ptimer properly?

Regards,
Peter

On Fri, Jun 22, 2012 at 5:22 PM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:

From: Stanislav Vorobiov <s.vorob...@samsung.com>

After some long period of time Linux kernel hanged due to
ptimer_get_count may return 0 before timer interrupt occurs,
thus, causing FRC to jump back in time

Signed-off-by: Evgeny Voevodin <e.voevo...@samsung.com>

Reviewed-by: Peter A. G. Crosthwaite <peter.crosthwa...@petalogix.com>


Thanks.


---
  hw/exynos4210_mct.c |4 
  1 file changed, 4 deletions(-)

diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
index 7474fcf..7a22b1f 100644
--- a/hw/exynos4210_mct.c
+++ b/hw/exynos4210_mct.c
@@ -376,10 +376,6 @@ static uint64_t exynos4210_gfrc_get_count(Exynos4210MCTGT *s)
 {
     uint64_t count = 0;
     count = ptimer_get_count(s->ptimer_frc);
-    if (!count) {
-        /* Timer event was generated and s->reg.cnt holds adequate value */
-        return s->reg.cnt;
-    }
     count = s->count - count;
     return s->reg.cnt + count;
 }
--
1.7.9.5





--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] Virtio-pci issue

2012-05-30 Thread Evgeny Voevodin

On 30.05.2012 11:56, Stefan Hajnoczi wrote:

On Tue, May 29, 2012 at 4:48 AM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:

On 28.05.2012 16:37, Stefan Hajnoczi wrote:

On Thu, May 24, 2012 at 4:18 AM, Evgeny Voevodin <e.voevo...@samsung.com> wrote:

There is also another problem that I've run into: the ability to plug in
as many PCI back-ends as the board wants.
I mean that if the board has to create a transport for each back-end, then
the user has to know the maximum number of transport instances created by
the board. For the mmio transport I think that's correct behaviour, but
for the pci transport it doesn't seem to be.

Not sure I understand the problem.  Can you rephrase it?

Stefan


Ok, I'll try :)
As I see it, to connect a PCI device to a board it should be enough to
specify -device ... on the command line.
But the way the virtio refactoring is going, the board has to create a
transport PCI device for each back-end created with -device ....
So if we create more back-ends with -device than the board has created
transports, some back-ends will have no corresponding transport device.
As a result, the user has to know the maximum number of transport instances
created by the board in order not to exceed it.
For mmio I think that's normal, but not for pci. Am I right?

The only limit to PCI devices should be the number of slots available.


Where is the number of slots defined?


For convenience we could continue to have virtio-blk-pci,
virtio-net-pci, etc which actually just add a virtio-pci adapter and
link it to a virtio device.

Users that want full control can specify:
   -device virtio-pci,id=virtio-pci.0
   -device virtio-blk,transport=virtio-pci.0,...

The board doesn't need to preallocate virtio-pci adapters.

Stefan



You suggest that the transport device be created by the user...
In that case the interface would differ from mmio, since for mmio the
board has to specify the memory and IRQ mappings for the transport device.

--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com




Re: [Qemu-devel] Can we improve virtio data structures with QOM?

2012-05-30 Thread Evgeny Voevodin

On 30.05.2012 20:05, Markus Armbruster wrote:

Stefan Hajnoczi <stefa...@gmail.com> writes:


On Wed, May 30, 2012 at 1:01 PM, Markus Armbruster <arm...@redhat.com> wrote:

Ordinary device models have a single state struct.  The first member is
a DeviceState or a specialization of DeviceState, e.g. a PCIDevice.
Simple enough.


I think Evgeny's virtio mmio patches change all this.  In the recent
virtio-pci thread we were discussing how the virtio transport (mmio,
pci) and virtio devices (net, blk, etc) fit together.  The email
thread is "Virtio-pci issues" from Evgeny Voevodin
<e.voevo...@samsung.com>.


Thanks for the pointer.

It's been a couple of weeks.  Evgeny, are you still pursuing this?



Yes, but lately we have had a lot of work on the Tizen project, so I
delayed this work a bit. If anybody wants, I can send the latest patches
so that you can continue or improve the work, since I'm not sure I'll
have time to continue before the 15th of June (but I'll try :). Actually
my work is based on Peter's virtio-mmio patch set
http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01870.html
so maybe it's worth adding him to the list: peter.mayd...@linaro.org.


It probably makes sense to first merge Evgeny's virtio refactoring and
then ensure it's nicely mapped to QOM.


Yes, no good attempting to do too much in one series.  Nevertheless,
having a sufficiently developed idea of the final state in mind helps.





--
Kind regards,
Evgeny Voevodin,
Leading Software Engineer,
ASWG, Moscow RD center, Samsung Electronics
e-mail: e.voevo...@samsung.com



