date:20160926

Re: [Qemu-devel] [PATCH v3 1/3] qom: make base type user-creatable abstract

2016-09-26 Thread Daniel P. Berrange

On Mon, Sep 26, 2016 at 06:16:25PM +0800, Lin Ma wrote:
> Signed-off-by: Lin Ma 
> ---
>  qom/object_interfaces.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
> index bf59846..9288242 100644
> --- a/qom/object_interfaces.c
> +++ b/qom/object_interfaces.c
> @@ -217,6 +217,7 @@ static void register_types(void)
>  static const TypeInfo uc_interface_info = {
>  .name  = TYPE_USER_CREATABLE,
>  .parent= TYPE_INTERFACE,
> +.abstract  = true,
>  .class_size = sizeof(UserCreatableClass),
>  };

This doesn't make any conceptual sense. UserCreatable is an interface and
by definition all interfaces are abstract.

Were you trying to fix some particular real bug here? If so, we almost
certainly need a different fix to what's suggested here, because QOM
should automatically treat all interfaces as abstract by their very
nature.
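
To illustrate the point, here is a minimal, self-contained sketch of a type
registry that treats anything descending from the interface root as abstract
without needing an explicit flag. The SketchType struct and the sketch_*
functions are hypothetical names invented for this example; they are not
QEMU's actual QOM code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified type record; not QEMU's real TypeInfo. */
typedef struct SketchType {
    const char *name;
    const char *parent;   /* NULL for a root type */
    bool abstract;        /* explicit flag, like the one the patch sets */
} SketchType;

static const SketchType sketch_types[] = {
    { "object",         NULL,        true  },
    { "interface",      NULL,        true  },
    { "user-creatable", "interface", false },  /* no explicit flag set */
    { "memory-backend", "object",    false },
};

/* Look a type up by name; returns NULL if it is not registered. */
static const SketchType *sketch_find(const char *name)
{
    size_t n = sizeof(sketch_types) / sizeof(sketch_types[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(sketch_types[i].name, name) == 0) {
            return &sketch_types[i];
        }
    }
    return NULL;
}

/* True if the type is "interface" itself or descends from it. */
static bool sketch_is_interface(const SketchType *t)
{
    while (t) {
        if (strcmp(t->name, "interface") == 0) {
            return true;
        }
        t = t->parent ? sketch_find(t->parent) : NULL;
    }
    return false;
}

/* A type cannot be instantiated if it is flagged abstract or is an interface. */
static bool sketch_is_abstract(const char *name)
{
    const SketchType *t = sketch_find(name);

    return t && (t->abstract || sketch_is_interface(t));
}
```

With a check of this shape, "user-creatable" is reported as abstract even
though its own flag is false, which is the behaviour argued for above.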

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [PATCH v3 3/3] object: Add 'help' option for all available backends and properties

2016-09-26 Thread Daniel P. Berrange

On Mon, Sep 26, 2016 at 06:16:27PM +0800, Lin Ma wrote:
> '-object help' prints available user creatable backends.
> '-object $typename,help' prints relevant properties.
> 
> Signed-off-by: Lin Ma 
> ---
>  backends/hostmem.c  |  4 
>  crypto/secret.c |  4 
>  crypto/tlscreds.c   |  4 
>  include/qom/object_interfaces.h |  2 ++
>  net/filter.c|  4 
>  qemu-options.hx |  7 +-
>  qom/object_interfaces.c | 48 
> +
>  vl.c|  5 +
>  8 files changed, 77 insertions(+), 1 deletion(-)
> 
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index b7a208d..eea9dce 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -261,6 +261,10 @@ static void host_memory_backend_init(Object *obj)
>   HostMemPolicy_lookup,
>   host_memory_backend_get_policy,
>   host_memory_backend_set_policy, NULL);
> +object_property_set_description(obj, "policy",
> +"Data format: one of "
> +HostMemPolicy_value_str,
> +&error_abort);
>  }
>

Adding descriptions to properties should be done separately from
your implementation of help printing, as they're independent concepts.


Regards,
Daniel

Re: [Qemu-devel] [PULL 17/33] block: Accept device model name for x-blockdev-remove-medium

2016-09-26 Thread Paolo Bonzini



On 22/09/2016 18:29, Kevin Wolf wrote:
> -qmp_x_blockdev_remove_medium(device, &err);
> +qmp_x_blockdev_remove_medium(true, device, false, NULL, errp);
>  if (err) {
>  error_propagate(errp, err);
>  goto fail;

Bug: &err was changed to errp, so err is always NULL.
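
A minimal sketch of why the check can never fire, using stand-in types rather
than QEMU's real Error API. Every name here (the Error struct layout and the
sketch_* functions) is hypothetical and only mirrors the shape of the quoted
hunk.

```c
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object, kept minimal. */
typedef struct Error { const char *msg; } Error;

static Error sketch_failure = { "medium removal failed" };

/* Callee: reports failure through the Error ** it is handed. */
static void sketch_remove_medium(int should_fail, Error **errp)
{
    if (should_fail && errp) {
        *errp = &sketch_failure;
    }
}

/* Move a local error into the caller's slot, if the caller gave one. */
static void sketch_propagate(Error **dst, Error *local)
{
    if (dst && local) {
        *dst = local;
    }
}

/* Buggy shape: errp is passed through, then an untouched local is tested. */
static int sketch_caller_buggy(int should_fail, Error **errp)
{
    Error *err = NULL;

    sketch_remove_medium(should_fail, errp);
    if (err) {              /* never true: err was not given to the callee */
        sketch_propagate(errp, err);
        return -1;
    }
    return 0;               /* a failure is not noticed here */
}

/* Fixed shape: &err is passed, so the check below can see the failure. */
static int sketch_caller_fixed(int should_fail, Error **errp)
{
    Error *err = NULL;

    sketch_remove_medium(should_fail, &err);
    if (err) {
        sketch_propagate(errp, err);
        return -1;
    }
    return 0;
}
```

In the buggy shape the caller's errp may still be set by the callee, but the
function itself returns success and skips its failure path.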

Paolo

Re: [Qemu-devel] [PULL 23/36] cadence_gem: Add queue support

2016-09-26 Thread Paolo Bonzini



On 22/09/2016 19:22, Peter Maydell wrote:
> +case GEM_RECEIVE_Q1_PTR ... GEM_RECEIVE_Q15_PTR:
> +s->rx_desc_addr[offset - GEM_RECEIVE_Q1_PTR + 1] = val;
> +break;

MAX_PRIORITY_QUEUES is still 8, so this can cause an out-of-bounds write
in s->rx_desc_addr (and likewise for s->tx_addr).
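
As a rough sketch of the kind of guard that would avoid this, assuming the
array is sized by MAX_PRIORITY_QUEUES: the register offsets, the struct and
the function name below are made up for illustration, and only the value 8
and the index expression come from the quoted code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values; only the queue count of 8 comes from the review. */
#define SKETCH_MAX_PRIORITY_QUEUES 8
#define SKETCH_RECEIVE_Q1_PTR      0x120u  /* assumed register index */
#define SKETCH_RECEIVE_Q15_PTR     (SKETCH_RECEIVE_Q1_PTR + 14u)

typedef struct SketchGemState {
    uint32_t rx_desc_addr[SKETCH_MAX_PRIORITY_QUEUES];
} SketchGemState;

/*
 * Store a queue pointer only if the derived index fits the array.
 * Returns false when the write is rejected.
 */
static bool sketch_write_rx_q_ptr(SketchGemState *s, uint32_t offset,
                                  uint32_t val)
{
    uint32_t queue;

    if (offset < SKETCH_RECEIVE_Q1_PTR || offset > SKETCH_RECEIVE_Q15_PTR) {
        return false;
    }
    queue = offset - SKETCH_RECEIVE_Q1_PTR + 1;   /* ranges over 1..15 */
    if (queue >= SKETCH_MAX_PRIORITY_QUEUES) {
        return false;   /* index 8..15 would write past the array */
    }
    s->rx_desc_addr[queue] = val;
    return true;
}
```

The alternative, of course, is to raise the array size so that all fifteen
extra queues fit; the sketch only shows the check.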

Paolo

Re: [Qemu-devel] [PATCH v2 07/14] pc: apic_common: extend APIC ID property to 32bit

2016-09-26 Thread Paolo Bonzini



On 26/09/2016 13:10, Igor Mammedov wrote:
> On Thu, 22 Sep 2016 18:16:47 +0200
> Paolo Bonzini  wrote:
> 
>> On 22/09/2016 18:00, Igor Mammedov wrote:
>>>> Why not just return initial_apic_id?  This is the meaning the property
>>>> had before your patch.
>>>
>>> initial_apic_id is immutable but 'id' could be changed at runtime by guest 
>>> in xAPIC mode
>>> so returned value depends on xAPIC/x2APIC mode  
>>
>> Understood, but this is just a possibly poorly-named property.  "id"
>> (e.g. from info qtree as opposed to info lapic) used to be the initial
>> APIC ID always, even in x2APIC mode.
> 
> 'info qtree' doesn't show CPUs anymore (since ICC bus has been removed),
> but if it did, it would show the effective APIC ID. The same applies to
> reading the property value with qom-get.

Oh, thanks for correcting me then.

>> Not a big deal, but thought I'd mention it since you can keep using
>> static properties.
>
> changing the initial APIC ID from the guest probably wouldn't work anyway
> and would break somewhere else, so we could just continue to ignore
> it and use static properties for now if you prefer.

No big deal---I'll let other reviewers chime in.

Paolo

[Qemu-devel] [PATCH 08/18] target-riscv: Add Atomic Instructions

2016-09-26 Thread Sagar Karandikar

Signed-off-by: Sagar Karandikar 
---
 target-riscv/translate.c | 154 +++
 1 file changed, 154 insertions(+)

diff --git a/target-riscv/translate.c b/target-riscv/translate.c
index 767cdbe..af82eab 100644
--- a/target-riscv/translate.c
+++ b/target-riscv/translate.c
@@ -641,6 +641,157 @@ static inline void gen_fp_store(DisasContext *ctx, uint32_t opc, int rs1,
 tcg_temp_free(t1);
 }
 
+static inline void gen_atomic(DisasContext *ctx, uint32_t opc,
+  int rd, int rs1, int rs2)
+{
+/* TODO: handle aq, rl bits? - for now just get rid of them: */
+opc = MASK_OP_ATOMIC_NO_AQ_RL(opc);
+TCGv source1, source2, dat;
+TCGLabel *j = gen_new_label();
+TCGLabel *done = gen_new_label();
+source1 = tcg_temp_local_new();
+source2 = tcg_temp_local_new();
+dat = tcg_temp_local_new();
+gen_get_gpr(source1, rs1);
+gen_get_gpr(source2, rs2);
+
+switch (opc) {
+/* all currently implemented as non-atomics */
+case OPC_RISC_LR_W:
+/* put addr in load_res */
+tcg_gen_mov_tl(load_res, source1);
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+break;
+case OPC_RISC_SC_W:
+tcg_gen_brcond_tl(TCG_COND_NE, load_res, source1, j);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+tcg_gen_movi_tl(dat, 0); /*success */
+tcg_gen_br(done);
+gen_set_label(j);
+tcg_gen_movi_tl(dat, 1); /*fail */
+gen_set_label(done);
+break;
+case OPC_RISC_AMOSWAP_W:
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+break;
+case OPC_RISC_AMOADD_W:
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+tcg_gen_add_tl(source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+break;
+case OPC_RISC_AMOXOR_W:
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+tcg_gen_xor_tl(source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+break;
+case OPC_RISC_AMOAND_W:
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+tcg_gen_and_tl(source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+break;
+case OPC_RISC_AMOOR_W:
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+tcg_gen_or_tl(source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+break;
+case OPC_RISC_AMOMIN_W:
+tcg_gen_ext32s_tl(source2, source2); /* since comparing */
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+tcg_gen_movcond_tl(TCG_COND_LT, source2, dat, source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+break;
+case OPC_RISC_AMOMAX_W:
+tcg_gen_ext32s_tl(source2, source2); /* since comparing */
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TESL | MO_ALIGN);
+tcg_gen_movcond_tl(TCG_COND_GT, source2, dat, source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+break;
+case OPC_RISC_AMOMINU_W:
+tcg_gen_ext32u_tl(source2, source2); /* since comparing */
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+tcg_gen_movcond_tl(TCG_COND_LTU, source2, dat, source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+tcg_gen_ext32s_tl(dat, dat); /* since load was TEUL */
+break;
+case OPC_RISC_AMOMAXU_W:
+tcg_gen_ext32u_tl(source2, source2); /* since comparing */
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+tcg_gen_movcond_tl(TCG_COND_GTU, source2, dat, source2, dat, source2);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEUL | MO_ALIGN);
+tcg_gen_ext32s_tl(dat, dat); /* since load was TEUL */
+break;
+#if defined(TARGET_RISCV64)
+case OPC_RISC_LR_D:
+/* put addr in load_res */
+tcg_gen_mov_tl(load_res, source1);
+tcg_gen_qemu_ld_tl(dat, source1, ctx->mem_idx, MO_TEQ | MO_ALIGN);
+break;
+case OPC_RISC_SC_D:
+tcg_gen_brcond_tl(TCG_COND_NE, load_res, source1, j);
+tcg_gen_qemu_st_tl(source2, source1, ctx->mem_idx, MO_TEQ | MO_ALIGN);
+tcg_gen_movi_tl(dat, 0); /* success */
+tcg_gen_br(done);
+gen_set_label(j);
+tcg_gen_movi_tl(dat, 1); /* fail */
+gen_set_label(done);
+break;
+case OPC_RISC_AMOSWAP_D:
+tcg_gen_qemu_ld_tl(dat,

[Qemu-devel] [PATCH 17/18] target-riscv: Add support for Host-Target Interface (HTIF) Devices

2016-09-26 Thread Sagar Karandikar

HTIF devices are currently used for the console and signaling test
completion for tests in riscv-tests. These are not part of any
RISC-V standard and will be phased out once better device support is
available.

Signed-off-by: Sagar Karandikar 
---
 hw/riscv/Makefile.objs   |   2 +
 hw/riscv/htif/elf_symb.c | 286 +
 hw/riscv/htif/elf_symb.h |  80 
 hw/riscv/htif/htif.c | 423 +++
 include/hw/riscv/htif/htif.h |  61 +++
 5 files changed, 852 insertions(+)
 create mode 100644 hw/riscv/htif/elf_symb.c
 create mode 100644 hw/riscv/htif/elf_symb.h
 create mode 100644 hw/riscv/htif/htif.c
 create mode 100644 include/hw/riscv/htif/htif.h

diff --git a/hw/riscv/Makefile.objs b/hw/riscv/Makefile.objs
index 79b4553..d830e5d 100644
--- a/hw/riscv/Makefile.objs
+++ b/hw/riscv/Makefile.objs
@@ -1 +1,3 @@
 obj-y += riscv_rtc.o
+obj-y += htif/elf_symb.o
+obj-y += htif/htif.o
diff --git a/hw/riscv/htif/elf_symb.c b/hw/riscv/htif/elf_symb.c
new file mode 100644
index 000..dc8efd6
--- /dev/null
+++ b/hw/riscv/htif/elf_symb.c
@@ -0,0 +1,286 @@
+/*
+ * elf.c - A simple package for manipulating symbol tables in elf binaries.
+ *
+ * Taken from
+ * https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15213-f03/www/
+ * ftrace/elf.c
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "elf_symb.h"
+
+/*
+ * elf_open - Map a binary into the address space and extract the
+ * locations of the static and dynamic symbol tables and their string
+ * tables. Return this information in a Elf object file handle that will
+ * be passed to all of the other elf functions.
+ */
+Elf_obj *elf_open(const char *filename)
+{
+int i, fd;
+struct stat sbuf;
+Elf_obj *ep;
+Elf64_Shdr *shdr;
+
+ep = g_new(Elf_obj, 1);
+
+/* Do some consistency checks on the binary */
+fd = open(filename, O_RDONLY);
+if (fd == -1) {
+fprintf(stderr, "Can't open \"%s\": %s\n", filename, strerror(errno));
+exit(1);
+}
+if (fstat(fd, &sbuf) == -1) {
+fprintf(stderr, "Can't stat \"%s\": %s\n", filename, strerror(errno));
+exit(1);
+}
+if (sbuf.st_size < sizeof(Elf64_Ehdr)) {
+fprintf(stderr, "\"%s\" is not an ELF binary object\n", filename);
+exit(1);
+}
+
+/* It looks OK, so map the Elf binary into our address space */
+ep->mlen = sbuf.st_size;
+ep->maddr = mmap(NULL, ep->mlen, PROT_READ, MAP_SHARED, fd, 0);
+if (ep->maddr == (void *)-1) {
+fprintf(stderr, "Can't mmap \"%s\": %s\n", filename, strerror(errno));
+exit(1);
+}
+close(fd);
+
+/* The Elf binary begins with the Elf header */
+ep->ehdr = ep->maddr;
+
+/* Make sure that this is an Elf binary */
+/*if (strncmp(ep->ehdr->e_ident, ELFMAG, SELFMAG)) {
+fprintf(stderr, "\"%s\" is not an ELF binary object\n", filename);
+exit(1);
+}*/
+
+/*
+ * Find the static and dynamic symbol tables and their string
+ * tables in the mapped binary. The sh_link field in symbol
+ * table section headers gives the section index of the string
+ * table for that symbol table.
+ */
+shdr = (Elf64_Shdr *)(ep->maddr + ep->ehdr->e_shoff);
+for (i = 0; i < ep->ehdr->e_shnum; i++) {
+if (shdr[i].sh_type == SHT_SYMTAB) {   /* Static symbol table */
+ep->symtab = (Elf64_Sym *)(ep->maddr + shdr[i].sh_offset);
+ep->symtab_end = (Elf64_Sym *)((char *)ep->symtab +
+ shdr[i].sh_size);
+ep->strtab = (char *)(ep->maddr + shdr[shdr[i].sh_link].sh_offset);
+}
+if (shdr[i].sh_type == SHT_DYNSYM) {   /* Dynamic symbol table */
+ep->dsymtab = (Elf64_Sym *)(ep->maddr + shdr[i].sh_offset);
+ep->dsymtab_end = (Elf64_Sym *)((char *)ep->dsymtab +
+  shdr[i].sh_size);
+ep->dstrtab = (char *)(ep->maddr + shdr[shdr[i].sh_link].sh_offset);
+}
+}
+return ep;
+}
+
+/*
+ * elf_open32 - Map a binary into the address space and extract the
+ * locations of the static and dynamic symbol tables and their string
+ * tables. Return this information in a Elf object file handle that will
+ * be passed to all of the other elf functions.
+ */
+Elf_obj32 *elf_open32(const char *filename)
+{
+int i, fd;
+struct stat sbuf;
+Elf_obj32 *ep;
+Elf32_Shdr *shdr;
+
+ep = g_new(Elf_obj32, 1);
+
+/* Do some consistency checks on the binary */
+fd = open(filename, O_RDONLY);
+if (fd == -1) {
+fprintf(stderr, "Can't open \"%s\": %s\n", filename, strerror(errno));
+exit(1);
+}
+if (fstat(fd, &sbuf) == -1) {
+fprintf(stderr, "Can't stat \"%s\": %s\n", filename, strerror(errno));
+exit(1);
+}
+if (sbuf.st_size < sizeof(Elf32_Ehdr)) {
+

[Qemu-devel] [PATCH 12/18] target-riscv: Add system instructions

2016-09-26 Thread Sagar Karandikar

System instructions, stubs for csr read/write, necessary helpers

Signed-off-by: Sagar Karandikar 
---
 target-riscv/helper.h|  11 
 target-riscv/op_helper.c | 144 +++
 target-riscv/translate.c | 119 +++
 3 files changed, 274 insertions(+)

diff --git a/target-riscv/helper.h b/target-riscv/helper.h
index eeb1caf..a87a0ba 100644
--- a/target-riscv/helper.h
+++ b/target-riscv/helper.h
@@ -74,3 +74,14 @@ DEF_HELPER_FLAGS_3(fcvt_d_l, TCG_CALL_NO_RWG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(fcvt_d_lu, TCG_CALL_NO_RWG, i64, env, i64, i64)
 #endif
 DEF_HELPER_FLAGS_2(fclass_d, TCG_CALL_NO_RWG, tl, env, i64)
+
+/* Special functions */
+#ifndef CONFIG_USER_ONLY
+DEF_HELPER_4(csrrw, tl, env, tl, tl, tl)
+DEF_HELPER_5(csrrs, tl, env, tl, tl, tl, tl)
+DEF_HELPER_5(csrrc, tl, env, tl, tl, tl, tl)
+DEF_HELPER_2(sret, tl, env, tl)
+DEF_HELPER_2(mret, tl, env, tl)
+DEF_HELPER_1(tlb_flush, void, env)
+DEF_HELPER_1(fence_i, void, env)
+#endif /* !CONFIG_USER_ONLY */
diff --git a/target-riscv/op_helper.c b/target-riscv/op_helper.c
index 1a7fb18..ee51f02 100644
--- a/target-riscv/op_helper.c
+++ b/target-riscv/op_helper.c
@@ -24,6 +24,21 @@
 #include "qemu/host-utils.h"
 #include "exec/helper-proto.h"
 
+int validate_priv(target_ulong priv)
+{
+return priv == PRV_U || priv == PRV_S || priv == PRV_M;
+}
+
+void set_privilege(CPURISCVState *env, target_ulong newpriv)
+{
+if (!validate_priv(newpriv)) {
+printf("INVALID PRIV SET\n");
+exit(1);
+}
+helper_tlb_flush(env);
+env->priv = newpriv;
+}
+
 /* Exceptions processing helpers */
 static inline void QEMU_NORETURN do_raise_exception_err(CPURISCVState *env,
   uint32_t exception, uintptr_t pc)
@@ -60,7 +75,136 @@ target_ulong helper_mulhsu(CPURISCVState *env, target_ulong arg1,
 }
 #endif
 
+/*
+ * Handle writes to CSRs and any resulting special behavior
+ *
+ * Adapted from Spike's processor_t::set_csr
+ */
+inline void csr_write_helper(CPURISCVState *env, target_ulong val_to_write,
+target_ulong csrno)
+{
+}
+
+/*
+ * Handle reads to CSRs and any resulting special behavior
+ *
+ * Adapted from Spike's processor_t::get_csr
+ */
+inline target_ulong csr_read_helper(CPURISCVState *env, target_ulong csrno)
+{
+return 0;
+}
+
+/*
+ * Check that CSR access is allowed.
+ *
+ * Adapted from Spike's decode.h:validate_csr
+ */
+void validate_csr(CPURISCVState *env, uint64_t which, uint64_t write,
+uint64_t new_pc) {
+unsigned csr_priv = get_field((which), 0x300);
+unsigned csr_read_only = get_field((which), 0xC00) == 3;
+if (((write) && csr_read_only) || (env->priv < csr_priv)) {
+do_raise_exception_err(env, RISCV_EXCP_ILLEGAL_INST, new_pc);
+}
+return;
+}
+
+target_ulong helper_csrrw(CPURISCVState *env, target_ulong src,
+target_ulong csr, target_ulong new_pc)
+{
+validate_csr(env, csr, 1, new_pc);
+uint64_t csr_backup = csr_read_helper(env, csr);
+csr_write_helper(env, src, csr);
+return csr_backup;
+}
+
+target_ulong helper_csrrs(CPURISCVState *env, target_ulong src,
+target_ulong csr, target_ulong new_pc, target_ulong rs1_pass)
+{
+validate_csr(env, csr, rs1_pass != 0, new_pc);
+uint64_t csr_backup = csr_read_helper(env, csr);
+if (rs1_pass != 0) {
+csr_write_helper(env, src | csr_backup, csr);
+}
+return csr_backup;
+}
+
+target_ulong helper_csrrc(CPURISCVState *env, target_ulong src,
+target_ulong csr, target_ulong new_pc, target_ulong rs1_pass) {
+validate_csr(env, csr, rs1_pass != 0, new_pc);
+uint64_t csr_backup = csr_read_helper(env, csr);
+if (rs1_pass != 0) {
+csr_write_helper(env, (~src) & csr_backup, csr);
+}
+return csr_backup;
+}
+
+target_ulong helper_sret(CPURISCVState *env, target_ulong cpu_pc_deb)
+{
+if (!(env->priv >= PRV_S)) {
+helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+}
+
+target_ulong retpc = env->csr[CSR_SEPC];
+if (retpc & 0x3) {
+helper_raise_exception(env, RISCV_EXCP_INST_ADDR_MIS);
+}
+
+target_ulong mstatus = env->csr[CSR_MSTATUS];
+target_ulong prev_priv = get_field(mstatus, MSTATUS_SPP);
+mstatus = set_field(mstatus, MSTATUS_UIE << prev_priv,
+get_field(mstatus, MSTATUS_SPIE));
+mstatus = set_field(mstatus, MSTATUS_SPIE, 0);
+mstatus = set_field(mstatus, MSTATUS_SPP, PRV_U);
+set_privilege(env, prev_priv);
+csr_write_helper(env, mstatus, CSR_MSTATUS);
+
+return retpc;
+}
+
+target_ulong helper_mret(CPURISCVState *env, target_ulong cpu_pc_deb)
+{
+if (!(env->priv >= PRV_M)) {
+helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+}
+
+target_ulong retpc = env->csr[CSR_MEPC];
+if (retpc & 0x3) {
+helper_raise_exception(env, RISCV_EXCP_INST_ADDR_MIS);
+}
+
+target_ulong mstatus =

[Qemu-devel] [PATCH 04/18] target-riscv: Add framework for instruction decode

2016-09-26 Thread Sagar Karandikar

Body of decode_opc with LUI, AUIPC, JAL instructions
Decode table in instmap.h

Signed-off-by: Sagar Karandikar 
---
 target-riscv/instmap.h   | 328 +++
 target-riscv/translate.c |  64 +
 2 files changed, 392 insertions(+)
 create mode 100644 target-riscv/instmap.h

diff --git a/target-riscv/instmap.h b/target-riscv/instmap.h
new file mode 100644
index 000..24f53c3
--- /dev/null
+++ b/target-riscv/instmap.h
@@ -0,0 +1,328 @@
+/*
+ * RISC-V emulation for qemu: Instruction decode helpers
+ *
+ * Author: Sagar Karandikar, sag...@eecs.berkeley.edu
+ *
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define MASK_OP_MAJOR(op)  (op & 0x7F)
+enum {
+/* rv32i, rv64i, rv32m */
+OPC_RISC_LUI= (0x37),
+OPC_RISC_AUIPC  = (0x17),
+OPC_RISC_JAL= (0x6F),
+OPC_RISC_JALR   = (0x67),
+OPC_RISC_BRANCH = (0x63),
+OPC_RISC_LOAD   = (0x03),
+OPC_RISC_STORE  = (0x23),
+OPC_RISC_ARITH_IMM  = (0x13),
+OPC_RISC_ARITH  = (0x33),
+OPC_RISC_FENCE  = (0x0F),
+OPC_RISC_SYSTEM = (0x73),
+
+/* rv64i, rv64m */
+OPC_RISC_ARITH_IMM_W = (0x1B),
+OPC_RISC_ARITH_W = (0x3B),
+
+/* rv32a, rv64a */
+OPC_RISC_ATOMIC = (0x2F),
+
+/* floating point */
+OPC_RISC_FP_LOAD = (0x7),
+OPC_RISC_FP_STORE = (0x27),
+
+OPC_RISC_FMADD = (0x43),
+OPC_RISC_FMSUB = (0x47),
+OPC_RISC_FNMSUB = (0x4B),
+OPC_RISC_FNMADD = (0x4F),
+
+OPC_RISC_FP_ARITH = (0x53),
+};
+
+#define MASK_OP_ARITH(op)   (MASK_OP_MAJOR(op) | (op & ((0x7 << 12) | \
+(0x7F << 25))))
+enum {
+OPC_RISC_ADD   = OPC_RISC_ARITH | (0x0 << 12) | (0x00 << 25),
+OPC_RISC_SUB   = OPC_RISC_ARITH | (0x0 << 12) | (0x20 << 25),
+OPC_RISC_SLL   = OPC_RISC_ARITH | (0x1 << 12) | (0x00 << 25),
+OPC_RISC_SLT   = OPC_RISC_ARITH | (0x2 << 12) | (0x00 << 25),
+OPC_RISC_SLTU  = OPC_RISC_ARITH | (0x3 << 12) | (0x00 << 25),
+OPC_RISC_XOR   = OPC_RISC_ARITH | (0x4 << 12) | (0x00 << 25),
+OPC_RISC_SRL   = OPC_RISC_ARITH | (0x5 << 12) | (0x00 << 25),
+OPC_RISC_SRA   = OPC_RISC_ARITH | (0x5 << 12) | (0x20 << 25),
+OPC_RISC_OR= OPC_RISC_ARITH | (0x6 << 12) | (0x00 << 25),
+OPC_RISC_AND   = OPC_RISC_ARITH | (0x7 << 12) | (0x00 << 25),
+
+/* RV64M */
+OPC_RISC_MUL= OPC_RISC_ARITH | (0x0 << 12) | (0x01 << 25),
+OPC_RISC_MULH   = OPC_RISC_ARITH | (0x1 << 12) | (0x01 << 25),
+OPC_RISC_MULHSU = OPC_RISC_ARITH | (0x2 << 12) | (0x01 << 25),
+OPC_RISC_MULHU  = OPC_RISC_ARITH | (0x3 << 12) | (0x01 << 25),
+
+OPC_RISC_DIV= OPC_RISC_ARITH | (0x4 << 12) | (0x01 << 25),
+OPC_RISC_DIVU   = OPC_RISC_ARITH | (0x5 << 12) | (0x01 << 25),
+OPC_RISC_REM= OPC_RISC_ARITH | (0x6 << 12) | (0x01 << 25),
+OPC_RISC_REMU   = OPC_RISC_ARITH | (0x7 << 12) | (0x01 << 25),
+};
+
+
+#define MASK_OP_ARITH_IMM(op)   (MASK_OP_MAJOR(op) | (op & (0x7 << 12)))
+enum {
+OPC_RISC_ADDI   = OPC_RISC_ARITH_IMM | (0x0 << 12),
+OPC_RISC_SLTI   = OPC_RISC_ARITH_IMM | (0x2 << 12),
+OPC_RISC_SLTIU  = OPC_RISC_ARITH_IMM | (0x3 << 12),
+OPC_RISC_XORI   = OPC_RISC_ARITH_IMM | (0x4 << 12),
+OPC_RISC_ORI= OPC_RISC_ARITH_IMM | (0x6 << 12),
+OPC_RISC_ANDI   = OPC_RISC_ARITH_IMM | (0x7 << 12),
+OPC_RISC_SLLI   = OPC_RISC_ARITH_IMM | (0x1 << 12), /* additional part of
+   IMM */
+OPC_RISC_SHIFT_RIGHT_I = OPC_RISC_ARITH_IMM | (0x5 << 12) /* SRAI, SRLI */
+};
+
+#define MASK_OP_BRANCH(op) (MASK_OP_MAJOR(op) | (op & (0x7 << 12)))
+enum {
+OPC_RISC_BEQ  = OPC_RISC_BRANCH  | (0x0  << 12),
+OPC_RISC_BNE  = OPC_RISC_BRANCH  | (0x1  << 12),
+OPC_RISC_BLT  = OPC_RISC_BRANCH  | (0x4  << 12),
+OPC_RISC_BGE  = OPC_RISC_BRANCH  | (0x5  << 12),
+OPC_RISC_BLTU = OPC_RISC_BRANCH  | (0x6  << 12),
+OPC_RISC_BGEU = OPC_RISC_BRANCH  | (0x7  << 12)
+};
+
+enum {
+OPC_RISC_ADDIW   = OPC_RISC_ARITH_IMM_W | (0x0 << 12),
+OPC_RISC_SLLIW   = OPC_RISC_ARITH_IMM_W | (0x1 << 12), /* additional part of
+  IMM */
+OPC_RISC_SHIFT_RIGHT_IW = OPC_RISC_ARITH_IMM_W | (0x5 << 12) /* SRAI, SRLI
+  */
+};
+
+enum {
+

[Qemu-devel] [PATCH 2/2] xen: add qemu device for each pvusb backend

2016-09-26 Thread Juergen Gross

In order to be able to specify to which pvusb controller a new pvusb
device should be added we need a qemu device for each pvusb controller
with an associated id.

Add such a device when a new controller is requested and attach the
usb bus of that controller to the new device. Any device connected to
that controller can now specify the bus and port directly via its
properties.

Signed-off-by: Juergen Gross 
---
 hw/usb/xen-usb.c | 81 +++-
 1 file changed, 68 insertions(+), 13 deletions(-)

diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
index 174d715..439d104 100644
--- a/hw/usb/xen-usb.c
+++ b/hw/usb/xen-usb.c
@@ -29,6 +29,7 @@
 #include "hw/usb.h"
 #include "hw/xen/xen_backend.h"
 #include "monitor/qdev.h"
+#include "qapi/error.h"
 #include "qapi/qmp/qbool.h"
 #include "qapi/qmp/qint.h"
 #include "qapi/qmp/qstring.h"
@@ -47,12 +48,16 @@
 struct timeval tv;  \
 \
 gettimeofday(&tv, NULL);\
-xen_be_printf(xendev, lvl, "%8ld.%06ld xen-usb(%s):" fmt,   \
+xen_be_printf(xendev, 0, "%8ld.%06ld xen-usb(%s):" fmt,   \
   tv.tv_sec, tv.tv_usec, __func__, ##args); \
 }
 #define TR_BUS(xendev, fmt, args...) TR(xendev, 2, fmt, ##args)
 #define TR_REQ(xendev, fmt, args...) TR(xendev, 3, fmt, ##args)
 
+#define TYPE_USBBACK                "xen-pvusb"
+#define USBBACK_DEVICE(obj) \
+ OBJECT_CHECK(USBBACKDevice, (obj), TYPE_USBBACK)
+
 #define USBBACK_MAXPORTS        USBIF_PIPE_PORT_MASK
 #define USB_DEV_ADDR_SIZE   (USBIF_PIPE_DEV_MASK + 1)
 
@@ -67,6 +72,7 @@ struct usbif_ctrlrequest {
 
 struct usbback_info;
 struct usbback_req;
+struct USBBACKDevice;
 
 struct usbback_stub {
 USBDevice *dev;
@@ -101,6 +107,8 @@ struct usbback_hotplug {
 
 struct usbback_info {
 struct XenDevice xendev;  /* must be first */
+char id[24];
+struct USBBACKDevice *dev;
 USBBus   bus;
 void *urb_sring;
 void *conn_sring;
@@ -116,6 +124,10 @@ struct usbback_info {
 QEMUBH   *bh;
 };
 
+typedef struct USBBACKDevice {
+DeviceState qdev;
+} USBBACKDevice;
+
 static struct usbback_req *usbback_get_req(struct usbback_info *usbif)
 {
 struct usbback_req *usbback_req;
@@ -712,15 +724,10 @@ static void usbback_portid_detach(struct usbback_info *usbif, unsigned port)
 
 static void usbback_portid_remove(struct usbback_info *usbif, unsigned port)
 {
-USBPort *p;
-
 if (!usbif->ports[port - 1].dev) {
 return;
 }
 
-p = &(usbif->ports[port - 1].port);
-snprintf(p->path, sizeof(p->path), "%d", 99);
-
 object_unparent(OBJECT(usbif->ports[port - 1].dev));
 usbif->ports[port - 1].dev = NULL;
 usbback_portid_detach(usbif, port);
@@ -733,10 +740,10 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
 {
 unsigned speed;
 char *portname;
-USBPort *p;
 Error *local_err = NULL;
 QDict *qdict;
 QemuOpts *opts;
+char tmp[32];
 
 if (usbif->ports[port - 1].dev) {
 return;
@@ -749,11 +756,14 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
 return;
 }
 portname++;
-p = &(usbif->ports[port - 1].port);
-snprintf(p->path, sizeof(p->path), "%s", portname);
 
 qdict = qdict_new();
 qdict_put(qdict, "driver", qstring_from_str("usb-host"));
+snprintf(tmp, sizeof(tmp), "%s.0", usbif->id);
+qdict_put(qdict, "bus", qstring_from_str(tmp));
+snprintf(tmp, sizeof(tmp), "%s-%u", usbif->id, port);
+qdict_put(qdict, "id", qstring_from_str(tmp));
+qdict_put(qdict, "port", qint_from_int(port));
 qdict_put(qdict, "hostbus", qint_from_int(atoi(busid)));
 qdict_put(qdict, "hostport", qstring_from_str(portname));
 opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict, &local_err);
@@ -765,7 +775,6 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
 goto err;
 }
 QDECREF(qdict);
-snprintf(p->path, sizeof(p->path), "%d", port);
 speed = usbif->ports[port - 1].dev->speed;
 switch (speed) {
 case USB_SPEED_LOW:
@@ -799,7 +808,6 @@ static void usbback_portid_add(struct usbback_info *usbif, unsigned port,
 
 err:
 QDECREF(qdict);
-snprintf(p->path, sizeof(p->path), "%d", 99);
 xen_be_printf(&usbif->xendev, 0, "device %s could not be opened\n", busid);
 }
 
@@ -1009,16 +1017,36 @@ static void usbback_alloc(struct XenDevice *xendev)
 struct usbback_info *usbif;
 USBPort *p;
 unsigned int i, max_grants;
+Error *local_err = NULL;
+QDict *qdict;
+QemuOpts *opts;
 
 usbif = container_of(xendev, struct usbback_info, xendev);
 
-usb_bus_new(&usbif->bus, sizeof(usbif->bus), &xen_usb_bus_ops, xen_sysdev);

Re: [Qemu-devel] [PATCH 12/18] target-riscv: Add system instructions

2016-09-26 Thread Paolo Bonzini

On 26/09/2016 14:38, Bastian Koppelmann wrote:
> On 09/26/2016 02:21 PM, Paolo Bonzini wrote:
>>
>>
>> On 26/09/2016 12:56, Sagar Karandikar wrote:
>>> +#ifndef CONFIG_USER_ONLY
>>> +DEF_HELPER_4(csrrw, tl, env, tl, tl, tl)
>>> +DEF_HELPER_5(csrrs, tl, env, tl, tl, tl, tl)
>>> +DEF_HELPER_5(csrrc, tl, env, tl, tl, tl, tl)
>>> +DEF_HELPER_2(sret, tl, env, tl)
>>> +DEF_HELPER_2(mret, tl, env, tl)
>>> +DEF_HELPER_1(tlb_flush, void, env)
>>> +DEF_HELPER_1(fence_i, void, env)
>>> +#endif /* !CONFIG_USER_ONLY */
>>
>> The system emulation spec is still in flux, I think we should only add
>> user-mode emulation for now.
>>
> 
> Hi Paolo,
> 
> by user-mode emulation you still mean softmmu and not linux-user, right?
> So just drop the system instructions for now.

I don't think that's possible; all RISC-V machines include at least
M-mode, whose precise definition requires the privileged interface
specification which hasn't been finalized yet.  So only linux-user is
stable enough.

In fact, based on some recent discussions on the RISC-V isa-dev mailing
list, it looks like some memory protection features _beyond_ the
privileged interface specification are in practice required to secure
M-mode from the supervisor.  I'm not sure what's the point in defining a
separate mandatory M-mode (supervisor mode cannot even enable paging
without help from M-mode, on the other hand a processor that only has M-
and U-modes cannot enable paging) but not providing the tools to
actually enforce privilege separation for it.

All in all, while I'm happy that the RISC-V project uses QEMU for
development, I don't think that the privileged interface specification
is mature enough for inclusion in QEMU.  It's very different for
linux-user user-mode emulation of course, it's great to have that upstream.

Thanks,

Paolo

[Qemu-devel] [PATCH 0/2] Xen pvUSB correction

2016-09-26 Thread Juergen Gross

Trying to use pvUSB in a Xen guest with a qemu emulated USB controller
will crash qemu as it tries to attach a pvUSB device to the emulated
controller.

This can be avoided by adding a unique id to each pvUSB controller which
can be used when attaching the pvUSB device. In order to make this
possible the pvUSB controller has to be a hotpluggable qemu device.
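
A small sketch of how such an id can be turned into the names used when
attaching a device. The "<id>.0" bus name and "<id>-<port>" device id formats
follow patch 2/2 of this series; the helper names and the sample id in the
usage note are hypothetical.

```c
#include <stdio.h>

/*
 * Format the bus name a device has to reference in order to land on the
 * controller with the given id. Returns snprintf's result.
 */
static int sketch_pvusb_bus_name(char *buf, size_t len, const char *ctrl_id)
{
    return snprintf(buf, len, "%s.0", ctrl_id);
}

/* Format a per-port device id that is unique within one controller. */
static int sketch_pvusb_device_id(char *buf, size_t len, const char *ctrl_id,
                                  unsigned port)
{
    return snprintf(buf, len, "%s-%u", ctrl_id, port);
}
```

For a controller id of "ctrl0" and port 3 this yields the bus "ctrl0.0" and
the device id "ctrl0-3", so the device can no longer end up on an emulated
controller by accident.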

Juergen Gross (2):
  xen: add an own bus for xen backend devices
  xen: add qemu device for each pvusb backend

 hw/usb/xen-usb.c | 81 +---
 hw/xen/xen_backend.c | 19 +--
 include/hw/xen/xen_backend.h |  4 +++
 3 files changed, 88 insertions(+), 16 deletions(-)

-- 
2.6.6

Re: [Qemu-devel] write_zeroes/trim on the whole disk

2016-09-26 Thread Paolo Bonzini



On 24/09/2016 14:27, Vladimir Sementsov-Ogievskiy wrote:
> On 24.09.2016 15:06, Vladimir Sementsov-Ogievskiy wrote:
>> On 24.09.2016 00:21, Wouter Verhelst wrote:
>>> On Fri, Sep 23, 2016 at 02:00:06PM -0500, Eric Blake wrote:
>>>> My preference would be a new flag to the existing commands, with
>>>> explicit documentation that 0 offset and 0 length must be used with
>>>> that flag, when requesting a full-device wipe.
>>> Alternatively, what about a flag that says "if you use this flag, the
>>> size should be left-shifted by X bits before processing"? That allows
>>> you to do TRIM or WRITE_ZEROES on much larger chunks, without being
>>> limited to "whole disk" commands. We should probably make it an illegal
>>> flag for any command that actually sends data over the wire, though.
>>
>> Note: if disk size is not aligned to X we will have to send request
>> larger than the disk size to clear the whole disk.
> 
> Also, in this case, which realization of bdrv interface in qemu would be
> most appropriate? Similar flag (in this case X must be defined in some
> very transparent way, as a constant of 64k for example), or flag
> BDRV_REQ_WHOLE_DISK, or separate .bdrv_zero_all and .bdrv_discard_all ?

This makes nice sense.

It also matches the SANITIZE command from the SCSI command set.

Paolo

Re: [Qemu-devel] [PATCH] proto: add 'shift' extension.

2016-09-26 Thread Paolo Bonzini



On 26/09/2016 14:46, Vladimir Sementsov-Ogievskiy wrote:
> This extension allows big requests for TRIM and WRITE_ZEROES through
> special 'shift' parameter, which means that offset and length should be
> shifted left by several bits.
> 
> This is needed for efficient clearing large regions of the disk (up to
> the whole disk).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> This mentions the WRITE_ZEROES command, which is only an experimental
> extension for now.
> 
> To discuss:
> NBD_OPT_SHIFT Data. It can be reduced to 8 bits actually... Are some
>reserved bits needed here?
> 
>  doc/proto.md | 19 ++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/proto.md b/doc/proto.md
> index 2de3a6a..6fd1b16 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -682,6 +682,8 @@ The field has the following format:
>experimental `WRITE_ZEROES` 
> [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).
>  - bit 7, `NBD_FLAG_SEND_DF`: defined by the experimental `STRUCTURED_REPLY`
>
> [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md).
> +- bit 8, `NBD_FLAG_SEND_SHIFT` : exposes support for `NBD_CMD_FLAG_SHIFT` and
> +  `NBD_OPT_SHIFT`
>  
>  Clients SHOULD ignore unknown flags.
>  
> @@ -765,6 +767,15 @@ of the newstyle negotiation.
>  
>  Defined by the experimental `INFO` 
> [extension](https://github.com/yoe/nbd/blob/extension-info/doc/proto.md).
>  
> +- `NBD_OPT_SHIFT` (10)
> +
> +Defines shift used to calculate request offset and length if
> +`NBD_CMD_FLAG_SHIFT` is set.
> +
> +Data:
> +
> +- 32 bits, shift (unsigned); Must not be larger than 32.
> +
>   Option reply types
>  
>  These values are used in the "reply type" field, sent by the server
> @@ -872,7 +883,13 @@ valid may depend on negotiation during the handshake 
> phase.
>
> [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).
>  - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
>
> [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md).
> -
> +- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and
> +  `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request
> +  *length* and *offset* left by N bits, where N is defined by `NBD_OPT_SHIFT`
> +  option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is
> +  not specified. If after shift `(offset + length)` exceeds disk size, length
> +  should be reduced to `(<disk size> - offset)`. However, `(offset + length)`
> +  must not exceed disk size by more than `(1 << N) - 1`.
>  
>   Request types
>  
> 

This is very ad hoc.  Can we instead have a block size common to all
commands?  Block devices in practice have one, in fact that's why
they're called block devices...

Paolo

[Qemu-devel] [PATCH] block: Fix error path in qmp_blockdev_change_medium()

2016-09-26 Thread Kevin Wolf

Commit 00949bab incorrectly changed one instance of &err into errp while
touching the line. Change it back.

Signed-off-by: Kevin Wolf 
---
 blockdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 29c6561..62d0dd0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2614,7 +2614,7 @@ void qmp_blockdev_change_medium(bool has_device, const 
char *device,
 error_free(err);
 err = NULL;
 
-qmp_x_blockdev_remove_medium(has_device, device, has_id, id, errp);
+qmp_x_blockdev_remove_medium(has_device, device, has_id, id, &err);
 if (err) {
 error_propagate(errp, err);
 goto fail;
-- 
1.8.3.1

Re: [Qemu-devel] [PULL 17/33] block: Accept device model name for x-blockdev-remove-medium

2016-09-26 Thread Kevin Wolf

Am 26.09.2016 um 12:59 hat Paolo Bonzini geschrieben:
> On 22/09/2016 18:29, Kevin Wolf wrote:
> > -qmp_x_blockdev_remove_medium(device, &err);
> > +qmp_x_blockdev_remove_medium(true, device, false, NULL, errp);
> >  if (err) {
> >  error_propagate(errp, err);
> >  goto fail;
> 
> Bug: &err changed to errp, so err is always NULL.

&err vs. errp is kind of hard to spot in a diff. Maybe a good reason to
stick with "local_err" rather than "err" where we introduce new local
error variables.

Anyway, patch sent.

Kevin
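
The hazard discussed in this thread can be shown in isolation. What follows
is a minimal sketch, assuming a toy Error type: demo_error_set(),
demo_error_propagate(), demo_remove_medium() and demo_change_medium() are
hypothetical stand-ins and not QEMU's real Error API. It only illustrates why
the callee has to be handed the address of the local error variable: the
check that follows the call looks at that local variable and nothing else.

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy error object; an assumption for illustration, not QEMU's Error. */
typedef struct Error {
    char msg[64];
} Error;

/* Store a new error in *errp if the caller asked for one. */
static void demo_error_set(Error **errp, const char *msg)
{
    Error *e;

    if (!errp) {
        return;
    }
    e = malloc(sizeof(*e));
    if (!e) {
        abort();
    }
    snprintf(e->msg, sizeof(e->msg), "%s", msg);
    *errp = e;
}

/* Hand a local error over to the caller, or drop it if nobody wants it. */
static void demo_error_propagate(Error **dst, Error *local)
{
    if (!local) {
        return;
    }
    if (dst && !*dst) {
        *dst = local;
    } else {
        free(local);
    }
}

/* Hypothetical callee that can fail on request. */
static void demo_remove_medium(int fail, Error **errp)
{
    if (fail) {
        demo_error_set(errp, "remove failed");
    }
}

/* Returns 0 on success, -1 if the removal step failed. */
int demo_change_medium(int fail, Error **errp)
{
    Error *local_err = NULL;

    /* Pass &local_err here. Passing errp instead would leave local_err
     * NULL forever and the check below could never trigger. */
    demo_remove_medium(fail, &local_err);
    if (local_err) {
        demo_error_propagate(errp, local_err);
        return -1;
    }
    return 0;
}
```

Naming the variable local_err, as suggested above, makes a stray errp in the
call stand out more clearly in a diff.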

[Qemu-devel] [PATCH 1/2] xen: add an own bus for xen backend devices

2016-09-26 Thread Juergen Gross

Add a bus for Xen backend devices in order to be able to establish a
dedicated device path for pluggable devices.

Signed-off-by: Juergen Gross 
---
 hw/xen/xen_backend.c | 19 ---
 include/hw/xen/xen_backend.h |  4 
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 69a2388..687adf4 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -29,13 +29,13 @@
 #include "hw/sysbus.h"
 #include "sysemu/char.h"
 #include "qemu/log.h"
+#include "qapi/error.h"
 #include "hw/xen/xen_backend.h"
 
 #include 
 
-#define TYPE_XENSYSDEV "xensysdev"
-
 DeviceState *xen_sysdev;
+BusState *xen_sysbus;
 
 /* - */
 
@@ -750,6 +750,8 @@ int xen_be_init(void)
 
 xen_sysdev = qdev_create(NULL, TYPE_XENSYSDEV);
 qdev_init_nofail(xen_sysdev);
+xen_sysbus = qbus_create(TYPE_XENSYSBUS, DEVICE(xen_sysdev), "xen-sysbus");
+qbus_set_bus_hotplug_handler(xen_sysbus, &error_abort);
 
 return 0;
 
@@ -862,6 +864,15 @@ void xen_be_printf(struct XenDevice *xendev, int 
msg_level, const char *fmt, ...
 qemu_log_flush();
 }
 
+static const TypeInfo xensysbus_info = {
+.name   = TYPE_XENSYSBUS,
+.parent = TYPE_BUS,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_HOTPLUG_HANDLER },
+{ }
+}
+};
+
 static int xen_sysdev_init(SysBusDevice *dev)
 {
 return 0;
@@ -878,6 +889,7 @@ static void xen_sysdev_class_init(ObjectClass *klass, void 
*data)
 
 k->init = xen_sysdev_init;
 dc->props = xen_sysdev_properties;
+dc->bus_type = TYPE_XENSYSBUS;
 }
 
 static const TypeInfo xensysdev_info = {
@@ -889,7 +901,8 @@ static const TypeInfo xensysdev_info = {
 
 static void xenbe_register_types(void)
 {
+type_register_static(&xensysbus_info);
 type_register_static(&xensysdev_info);
 }
 
-type_init(xenbe_register_types);
+type_init(xenbe_register_types)
diff --git a/include/hw/xen/xen_backend.h b/include/hw/xen/xen_backend.h
index 0df282a..4087231 100644
--- a/include/hw/xen/xen_backend.h
+++ b/include/hw/xen/xen_backend.h
@@ -54,6 +54,9 @@ struct XenDevice {
 QTAILQ_ENTRY(XenDevice) next;
 };
 
+#define TYPE_XENSYSDEV "xensysdev"
+#define TYPE_XENSYSBUS "xen-sysbus"
+
 /* - */
 
 /* variables */
@@ -62,6 +65,7 @@ extern xenforeignmemory_handle *xen_fmem;
 extern struct xs_handle *xenstore;
 extern const char *xen_protocol;
 extern DeviceState *xen_sysdev;
+extern BusState *xen_sysbus;
 
 /* xenstore helper functions */
 int xenstore_mkdir(char *path, int p);
-- 
2.6.6

[Qemu-devel] [PATCH] proto: add 'shift' extension.

2016-09-26 Thread Vladimir Sementsov-Ogievskiy

This extension allows big requests for TRIM and WRITE_ZEROES through
special 'shift' parameter, which means that offset and length should be
shifted left by several bits.

This is needed for efficient clearing large regions of the disk (up to
the whole disk).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

The WRITE_ZEROES command mentioned here is only an experimental
extension for now.

To discuss:
NBD_OPT_SHIFT Data. It can be reduced to 8 bits actually... Are some
   reserved bits needed here?

 doc/proto.md | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/doc/proto.md b/doc/proto.md
index 2de3a6a..6fd1b16 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -682,6 +682,8 @@ The field has the following format:
   experimental `WRITE_ZEROES` 
[extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).
 - bit 7, `NBD_FLAG_SEND_DF`: defined by the experimental `STRUCTURED_REPLY`
   
[extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md).
+- bit 8, `NBD_FLAG_SEND_SHIFT` : exposes support for `NBD_CMD_FLAG_SHIFT` and
+  `NBD_OPT_SHIFT`
 
 Clients SHOULD ignore unknown flags.
 
@@ -765,6 +767,15 @@ of the newstyle negotiation.
 
 Defined by the experimental `INFO` 
[extension](https://github.com/yoe/nbd/blob/extension-info/doc/proto.md).
 
+- `NBD_OPT_SHIFT` (10)
+
+Defines shift used to calculate request offset and length if
+`NBD_CMD_FLAG_SHIFT` is set.
+
+Data:
+
+- 32 bits, shift (unsigned); Must not be larger than 32.
+
  Option reply types
 
 These values are used in the "reply type" field, sent by the server
@@ -872,7 +883,13 @@ valid may depend on negotiation during the handshake phase.
   
[extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).
 - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
   
[extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md).
-
+- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and
+  `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request
+  *length* and *offset* left by N bits, where N is defined by `NBD_OPT_SHIFT`
+  option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is
+  not specified. If after shift `(offset + length)` exceeds disk size, length
+  should be reduced to `(<disk size> - offset)`. However, `(offset + length)`
+  must not exceed disk size by more than `(1 << N) - 1`.
 
  Request types
 
-- 
1.8.3.1
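
To make the arithmetic in the NBD_CMD_FLAG_SHIFT text concrete, here is a
rough sketch of how a server might expand and clamp a shifted request. The
helper name nbd_demo_apply_shift() and its signature are hypothetical and do
not come from any NBD implementation. It assumes the usual 32-bit request
length, and rejecting a shifted offset that lies past the end of the disk is
an assumption of this sketch, since the proposed text does not spell that
case out.

```c
#include <stdint.h>

/*
 * Hypothetical helper illustrating the proposed NBD_CMD_FLAG_SHIFT rules.
 * Returns 0 and fills *out_offset / *out_length on success, -1 if the
 * request is not acceptable under the rules quoted in the patch.
 */
int nbd_demo_apply_shift(uint64_t offset, uint32_t length, unsigned shift,
                         uint64_t disk_size,
                         uint64_t *out_offset, uint64_t *out_length)
{
    uint64_t off, len, end, slack;

    if (shift > 32) {
        return -1;              /* "Must not be larger than 32" */
    }
    if (shift > 0 && (offset >> (64 - shift)) != 0) {
        return -1;              /* shifted offset would overflow 64 bits */
    }
    off = offset << shift;
    len = (uint64_t)length << shift;    /* 32-bit value, shift <= 32: fits */

    /* Assumption: an offset past the end of the disk is an error. */
    if (off > disk_size || len > UINT64_MAX - off) {
        return -1;
    }
    end = off + len;
    if (end > disk_size) {
        slack = (UINT64_C(1) << shift) - 1;
        if (end - disk_size > slack) {
            return -1;          /* overshoots by more than (1 << N) - 1 */
        }
        len = disk_size - off;  /* reduce length to <disk size> - offset */
    }
    *out_offset = off;
    *out_length = len;
    return 0;
}
```

With the default shift of 16 and a 100000-byte disk, a request of offset 0
and length 2 covers the whole disk after clamping, while length 3 overshoots
by more than 65535 bytes and is rejected.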

Re: [Qemu-devel] vhost-user-test failure

2016-09-26 Thread Maxime Coquelin


Hi,

On 09/26/2016 02:13 PM, Eduardo Habkost wrote:

On Sun, Sep 25, 2016 at 04:55:53PM -0400, Marc-André Lureau wrote:

Hi

- Original Message -

This time with Marc-André in cc:...

On 09/23/2016 07:40 PM, Maxime Coquelin wrote:



On 09/23/2016 05:41 PM, Michael S. Tsirkin wrote:

On Fri, Sep 23, 2016 at 12:36:12PM -0300, Eduardo Habkost wrote:

Hi,

I hit a weird vhost-user-test failure on travis-ci recently, on a
branch where I didn't touch any vhost-related code. From a quick
look at the code, it looks like the vhost-user code is unhappy to
see a disconnected socket.

I wasn't able to reproduce it. It seems to be a hard to reproduce
race between vhost-user code and socket reconnection.

The failure can be seen at:

https://travis-ci.org/ehabkost/qemu-hacks/jobs/162077239


Maxime looked at something similar. Any idea?

No, not really.
Marc-André contributed a lot to these tests, I add him in cc: in case
he has an idea.

I will have a look in the mean time.



I am unable to reproduce locally (over 500x iterations), and I
have no clue what's going on: the warnings there aren't the
problem (that's the main reason why we use the subprocess, to
silence those). Do you have a local reproducer or is it only on
travis? Afaik, there are no other reports of this test failing,
are you sure it's not related to changes on your branch?


I don't have a local reproducer, I could only see it once on
travis-ci. Maybe it is not possible to reproduce it if the
machine isn't loaded enough to make the right thread/process be
delayed.


I'm also trying to reproduce it.
Interestingly, launching the test with strace, I reproduce another
problem systematically:
$> strace -o /tmp/vut -ff ./tests/vhost-user-test
/x86_64/vhost-user/read-guest-mem: OK
/x86_64/vhost-user/migrate: Vhost user backend fails to broadcast fake RARP
OK
/x86_64/vhost-user/reconnect: OK

I'll try to load the CPU randomly when executing the test.

Regards,
Maxime

Re: [Qemu-devel] [PATCH v5 7/9] block: don't make snapshots for filters

2016-09-26 Thread Kevin Wolf

Am 26.09.2016 um 11:51 hat Pavel Dovgalyuk geschrieben:
> > From: Kevin Wolf [mailto:kw...@redhat.com]
> > Am 26.09.2016 um 10:08 hat Pavel Dovgalyuk geschrieben:
> > > This patch disables snapshotting for block driver filters.
> > > It is needed, because snapshots should be created
> > > in underlying disk images, not in filters itself.
> > >
> > > Signed-off-by: Pavel Dovgalyuk 
> > 
> > But that's exactly what the existing code implements? If a driver
> > doesn't provide .bdrv_snapshot_goto, the request is redirected to
> > bs->file.
> > 
> > >  block/snapshot.c |3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/block/snapshot.c b/block/snapshot.c
> > > index bf5c2ca..8998b8b 100644
> > > --- a/block/snapshot.c
> > > +++ b/block/snapshot.c
> > > @@ -184,6 +184,9 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
> > >  if (!drv) {
> > >  return -ENOMEDIUM;
> > >  }
> > > +if (drv->is_filter) {
> > > +return 0;
> > > +}
> > 
> > This, on the other hand, doesn't redirect the request, but silently
> > ignores it. That is, loading the snapshot will apparently succeed, but
> > it wouldn't actually load anything and the disk would stay in its
> > current state.
> 
> In my use case bdrv_all_goto_snapshot iterates all block drivers, including
> filters and disk images. Therefore skipping goto for images is ok.

Hm, this can happen today indeed.

Originally, we only called bdrv_goto_snapshot() for all _top level_
BDSes, and this is still what you normally get. However, if you
explicitly create a BDS (e.g. with its own -drive option), it is
considered a top level BDS without actually being top level for the
guest, and therefore the snapshotting function is called for it.

Of course, this is highly inefficient because the goto_snapshot request
is passed by the filter driver and then called another time for the
lower node, effectively loading the snapshot a second time.

On the other hand if you use a single -drive option to create both the
qcow2 BDS and the blkreplay filter, we do need to pass down the
goto_snapshot request because it won't be called for the qcow2 layer
otherwise.

I'm not completely sure yet what the right behaviour would be here.

Kevin
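
The three behaviours discussed above can be compared on a toy node chain.
This is only a sketch: DemoNode, demo_goto_redirect() and
demo_goto_skip_filters() are hypothetical names, not QEMU types or functions,
and the real bdrv_snapshot_goto() does considerably more. It shows that the
early "return 0" for filters leaves the lower node untouched, that
redirection reaches it, and that calling the redirecting variant on both the
filter and the lower node loads the snapshot twice.

```c
#include <stddef.h>

/* Toy node; the struct and its fields are illustrative, not QEMU types. */
typedef struct DemoNode {
    int has_snapshot_goto;      /* driver implements its own handler */
    int is_filter;              /* node is a filter driver */
    int loads;                  /* how often a snapshot was loaded here */
    struct DemoNode *file;      /* child node, or NULL */
} DemoNode;

/* Redirecting variant: a driver without a handler forwards to its child. */
int demo_goto_redirect(DemoNode *n)
{
    if (n == NULL) {
        return -1;
    }
    if (n->has_snapshot_goto) {
        n->loads++;
        return 0;
    }
    return demo_goto_redirect(n->file);
}

/* Variant from the patch under review: filters report success and stop. */
int demo_goto_skip_filters(DemoNode *n)
{
    if (n == NULL) {
        return -1;
    }
    if (n->is_filter) {
        return 0;               /* nothing is loaded below this node */
    }
    return demo_goto_redirect(n);
}
```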

Re: [Qemu-devel] [PATCH] usb: ehci: fix memory leak in ehci_process_itd

2016-09-26 Thread 李强

Ping!

2016-09-19 10:48 GMT+08:00 Li Qiang :

> From: Li Qiang 
>
> While processing isochronous transfer descriptors (iTD), if the page
> select (PG) field value is out of bounds it will return. In this
> situation the ehci's sg list is not freed, thus leading to a memory
> leak issue. This patch avoids this.
>
> Signed-off-by: Li Qiang 
> ---
>  hw/usb/hcd-ehci.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
> index b093db7..f4ece9a 100644
> --- a/hw/usb/hcd-ehci.c
> +++ b/hw/usb/hcd-ehci.c
> @@ -1426,6 +1426,7 @@ static int ehci_process_itd(EHCIState *ehci,
>  if (off + len > 4096) {
>  /* transfer crosses page border */
>  if (pg == 6) {
> +qemu_sglist_destroy(&ehci->isgl);
>  return -1;  /* avoid page pg + 1 */
>  }
>  ptr2 = (itd->bufptr[pg + 1] & ITD_BUFPTR_MASK);
> --
> 1.8.3.1
>
>

Re: [Qemu-devel] [PATCH 2/2] build-sys: put glib_cflags in QEMU_CFLAGS

2016-09-26 Thread Paolo Bonzini



On 25/09/2016 22:57, Marc-André Lureau wrote:
> This way, overriding CFLAGS on make command line keeps glib-cflags
> and doesn't break the build.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  configure | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index c831600..5412d4f 100755
> --- a/configure
> +++ b/configure
> @@ -2933,7 +2933,7 @@ for i in $glib_modules; do
>  if $pkg_config --atleast-version=$glib_req_ver $i; then
>  glib_cflags=$($pkg_config --cflags $i)
>  glib_libs=$($pkg_config --libs $i)
> -CFLAGS="$glib_cflags $CFLAGS"
> +QEMU_CFLAGS="$glib_cflags $QEMU_CFLAGS"
>  LIBS="$glib_libs $LIBS"
>  libs_qga="$glib_libs $libs_qga"
>  else
> 

Queued both, thanks.

Paolo

Re: [Qemu-devel] How does a guest OS differentiate between a Reboot/Shutdown ACPI event

2016-09-26 Thread Paolo Bonzini



On 26/09/2016 04:38, Srinivasan J wrote:
> 
> I have Ubuntu 14.04.1 (ubuntu-14.04.1-server-amd64.iso) guest running
> in a KVM host. The host is running Ubuntu 16.04. I'm trying to find
> out how Ubuntu 14.04.1 differentiates between virsh shutdown and virsh
> reboot commands issued in the host. I see that in both cases the ACPI
> event seen at the guest are exactly same. The guest however correctly
> shuts down on issuing "virsh shutdown" and correctly reboots on
> issuing "virsh reboot".

In the case of "virsh reboot", libvirt restarts the guest as soon as the
shutdown is complete.

Paolo

Re: [Qemu-devel] [PATCH 0/3] RDMA error handling

2016-09-26 Thread Dr. David Alan Gilbert

* Michael R. Hines (mrhi...@digitalocean.com) wrote:
> Reviewed-by: Michael R. Hines 
> 
> (By the way, I no longer work for IBM and no longer have direct access to 
> RDMA hardware. If someone is willing to let me login to something that does 
> in the future, I don't mind debugging things. I just don't have any hardware 
> of my own anymore to debug, and the last time I tried to use software RDMA it 
> was an unpleasurable experience.)

Thanks; I did hear a rumour that SoftRoCE was going to go upstream, but
it doesn't seem to have happened yet.

Dave

> 
> /*
>  * Michael R. Hines
>  * Senior Engineer, DigitalOcean.
>  */
> 
> On 09/23/2016 02:14 PM, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> > lp: https://bugs.launchpad.net/qemu/+bug/1545052
> > 
> > The RDMA code tends to hang if the destination dies
> > in the wrong place;  this series doesn't completely fix
> > that, but in cases where the destination knows there's
> > been an error, it makes sure it tells the source and
> > that cleans up quickly.
> > If the destination just dies, then the source still hangs
> > and I still need to look at better ways to fix that.
> > 
> > Dave
> > 
> > Dr. David Alan Gilbert (3):
> >migration/rdma: Pass qemu_file errors across link
> >migration: Make failed migration load set file error
> >migration/rdma: Don't flag an error when we've been told about one
> > 
> >   migration/rdma.c   |  9 -
> >   migration/savevm.c | 19 ---
> >   2 files changed, 20 insertions(+), 8 deletions(-)
> > 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH 10/16] docs: include formal model for TCG exclusive sections

2016-09-26 Thread Alex Bennée


Paolo Bonzini  writes:

> Signed-off-by: Paolo Bonzini 
> ---
>  docs/tcg-exclusive.promela | 176 
> +
>  1 file changed, 176 insertions(+)
>  create mode 100644 docs/tcg-exclusive.promela
>
> diff --git a/docs/tcg-exclusive.promela b/docs/tcg-exclusive.promela
> new file mode 100644
> index 000..360edcd
> --- /dev/null
> +++ b/docs/tcg-exclusive.promela
> @@ -0,0 +1,176 @@
> +/*
> + * This model describes the implementation of exclusive sections in
> + * cpus-common.c (start_exclusive, end_exclusive, cpu_exec_start,
> + * cpu_exec_end).
> + *
> + * Author: Paolo Bonzini 
> + *
> + * This file is in the public domain.  If you really want a license,
> + * the WTFPL will do.
> + *
> + * To verify it:
> + * spin -a docs/event.promela
> + * ./a.out -a
> + *
> + * Tunable processor macros: N_CPUS, N_EXCLUSIVE, N_CYCLES, TEST_EXPENSIVE.
> + */

I made some comments on the comments when this was part of the other patch:

  > + *
  > + * Author: Paolo Bonzini 
  > + *
  > + * This file is in the public domain.  If you really want a license,
  > + * the WTFPL will do.
  > + *
  > + * To verify it:
  > + * spin -a docs/event.promela

  wrong docs name

  > + * ./a.out -a

  Which version of spin did you run? I grabbed the latest src release
  (http://spinroot.com/spin/Src/src645.tar.gz) and had to manually build
  the output:

  ~/src/spin/Src6.4.5/spin -a docs/tcg-exclusive.promela
  gcc pan.c
  ../a.out

  > + *
  > + * Tunable processor macros: N_CPUS, N_EXCLUSIVE, N_CYCLES, USE_MUTEX,
  > + *   TEST_EXPENSIVE.
  > + */

  How do you pass these? I tried:

  ~/src/spin/Src6.4.5/spin -a docs/tcg-exclusive.promela -DN_CPUS=4
  ~/src/spin/Src6.4.5/spin -a docs/tcg-exclusive.promela -DN_CPUS 4

  without any joy.


> +
> +// Define the missing parameters for the model
> +#ifndef N_CPUS
> +#define N_CPUS 2
> +#warning defaulting to 2 CPU processes
> +#endif
> +
> +// the expensive test is not so expensive for <= 3 CPUs
> +#if N_CPUS <= 3
> +#define TEST_EXPENSIVE
> +#endif
> +
> +#ifndef N_EXCLUSIVE
> +# if !defined N_CYCLES || N_CYCLES <= 1 || defined TEST_EXPENSIVE
> +#  define N_EXCLUSIVE 2
> +#  warning defaulting to 2 concurrent exclusive sections
> +# else
> +#  define N_EXCLUSIVE 1
> +#  warning defaulting to 1 concurrent exclusive sections
> +# endif
> +#endif
> +#ifndef N_CYCLES
> +# if N_EXCLUSIVE <= 1 || defined TEST_EXPENSIVE
> +#  define N_CYCLES2
> +#  warning defaulting to 2 CPU cycles
> +# else
> +#  define N_CYCLES1
> +#  warning defaulting to 1 CPU cycles
> +# endif
> +#endif
> +
> +
> +// synchronization primitives.  condition variables require a
> +// process-local "cond_t saved;" variable.
> +
> +#define mutex_t  byte
> +#define MUTEX_LOCK(m)atomic { m == 0 -> m = 1 }
> +#define MUTEX_UNLOCK(m)  m = 0
> +
> +#define cond_t   int
> +#define COND_WAIT(c, m)  {  \
> +   saved = c;   \
> +   MUTEX_UNLOCK(m); \
> +   c != saved -> MUTEX_LOCK(m); \
> + }
> +#define COND_BROADCAST(c)c++
> +
> +// this is the logic from cpus-common.c
> +
> +mutex_t mutex;
> +cond_t exclusive_cond;
> +cond_t exclusive_resume;
> +byte pending_cpus;
> +
> +byte running[N_CPUS];
> +byte has_waiter[N_CPUS];
> +
> +#define exclusive_idle()  \
> +  do  \
> +  :: pending_cpus -> COND_WAIT(exclusive_resume, mutex);  \
> +  :: else -> break;   \
> +  od
> +
> +#define start_exclusive() \
> +MUTEX_LOCK(mutex);\
> +exclusive_idle(); \
> +pending_cpus = 1; \
> +  \
> +i = 0;\
> +do\
> +   :: i < N_CPUS -> { \
> +   if \
> +  :: running[i] -> has_waiter[i] = 1; pending_cpus++; \
> +  :: else   -> skip;  \
> +   fi;\
> +   i++;   \
> +   }  \
> +   :: else -> break;  \
> +od;

Re: [Qemu-devel] [PATCH 13/16] cpus-common: simplify locking for start_exclusive/end_exclusive

2016-09-26 Thread Alex Bennée


Paolo Bonzini  writes:

> It is not necessary to hold qemu_cpu_list_mutex throughout the
> exclusive section, because no other exclusive section can run
> while pending_cpus != 0.
>
> exclusive_idle() is called in cpu_exec_start(), and that prevents
> any CPUs created after start_exclusive() from entering cpu_exec()
> during an exclusive section.
>
> Signed-off-by: Paolo Bonzini 

Reviewed-by: Alex Bennée 

> ---
>  cpus-common.c  | 11 ---
>  docs/tcg-exclusive.promela |  4 +++-
>  include/qom/cpu.h  |  4 
>  3 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/cpus-common.c b/cpus-common.c
> index 80aaf9b..429652c 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -171,8 +171,7 @@ static inline void exclusive_idle(void)
>  }
>
>  /* Start an exclusive operation.
> -   Must only be called from outside cpu_exec, takes
> -   qemu_cpu_list_lock.   */
> +   Must only be called from outside cpu_exec.  */
>  void start_exclusive(void)
>  {
>  CPUState *other_cpu;
> @@ -191,11 +190,17 @@ void start_exclusive(void)
>  while (pending_cpus > 1) {
>  qemu_cond_wait(&exclusive_cond, &qemu_cpu_list_lock);
>  }
> +
> +/* Can release mutex, no one will enter another exclusive
> + * section until end_exclusive resets pending_cpus to 0.
> + */
> +qemu_mutex_unlock(&qemu_cpu_list_lock);
>  }
>
> -/* Finish an exclusive operation.  Releases qemu_cpu_list_lock.  */
> +/* Finish an exclusive operation.  */
>  void end_exclusive(void)
>  {
> +qemu_mutex_lock(&qemu_cpu_list_lock);
>  pending_cpus = 0;
>  qemu_cond_broadcast(&exclusive_resume);
>  qemu_mutex_unlock(&qemu_cpu_list_lock);
> diff --git a/docs/tcg-exclusive.promela b/docs/tcg-exclusive.promela
> index 9e7d9e3..a8896e5 100644
> --- a/docs/tcg-exclusive.promela
> +++ b/docs/tcg-exclusive.promela
> @@ -97,9 +97,11 @@ byte has_waiter[N_CPUS];
>  do\
>:: pending_cpus > 1 -> COND_WAIT(exclusive_cond, mutex);\
>:: else -> break;   \
> -od
> +od;   \
> +MUTEX_UNLOCK(mutex);
>
>  #define end_exclusive()   \
> +MUTEX_LOCK(mutex);\
>  pending_cpus = 0; \
>  COND_BROADCAST(exclusive_resume); \
>  MUTEX_UNLOCK(mutex);
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index f872614..934c07a 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -846,9 +846,6 @@ void cpu_exec_end(CPUState *cpu);
>   * cpu_exec are exited immediately.  CPUs that call cpu_exec_start
>   * during the exclusive section go to sleep until this CPU calls
>   * end_exclusive.
> - *
> - * Returns with the CPU list lock taken (which nests outside all
> - * other locks except the BQL).
>   */
>  void start_exclusive(void);
>
> @@ -856,7 +853,6 @@ void start_exclusive(void);
>   * end_exclusive:
>   *
>   * Concludes an exclusive execution section started by start_exclusive.
> - * Releases the CPU list lock.
>   */
>  void end_exclusive(void);


--
Alex Bennée

Re: [Qemu-devel] [PATCH 1/4] target-cris: Do not dump cpu state with -d in_asm

2016-09-26 Thread Alex Bennée


Richard Henderson  writes:

> Dumping cpu state is what -d cpu is for.
>
> Cc: Edgar E. Iglesias 
> Signed-off-by: Richard Henderson 

Reviewed-by: Alex Bennée 

> ---
>  target-cris/translate.c | 25 ++---
>  1 file changed, 2 insertions(+), 23 deletions(-)
>
> diff --git a/target-cris/translate.c b/target-cris/translate.c
> index f4a8d7d..9de26af 100644
> --- a/target-cris/translate.c
> +++ b/target-cris/translate.c
> @@ -3135,29 +3135,6 @@ void gen_intermediate_code(CPUCRISState *env, struct 
> TranslationBlock *tb)
>
>  dc->cpustate_changed = 0;
>
> -if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
> -qemu_log(
> -"pc=%x %x flg=%" PRIx64 " bt=%x ds=%u ccs=%x\n"
> -"pid=%x usp=%x\n"
> -"%x.%x.%x.%x\n"
> -"%x.%x.%x.%x\n"
> -"%x.%x.%x.%x\n"
> -"%x.%x.%x.%x\n",
> -dc->pc, dc->ppc,
> -(uint64_t)tb->flags,
> -env->btarget, (unsigned)tb->flags & 7,
> -env->pregs[PR_CCS],
> -env->pregs[PR_PID], env->pregs[PR_USP],
> -env->regs[0], env->regs[1], env->regs[2], env->regs[3],
> -env->regs[4], env->regs[5], env->regs[6], env->regs[7],
> -env->regs[8], env->regs[9],
> -env->regs[10], env->regs[11],
> -env->regs[12], env->regs[13],
> -env->regs[14], env->regs[15]);
> -qemu_log("--\n");
> -qemu_log("IN: %s\n", lookup_symbol(pc_start));
> -}
> -
>  next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
>  num_insns = 0;
>  max_insns = tb->cflags & CF_COUNT_MASK;
> @@ -3313,6 +3290,8 @@ void gen_intermediate_code(CPUCRISState *env, struct 
> TranslationBlock *tb)
>  #if !DISAS_CRIS
>  if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
>  && qemu_log_in_addr_range(pc_start)) {
> +qemu_log("--\n");
> +qemu_log("IN: %s\n", lookup_symbol(pc_start));
>  log_target_disas(cs, pc_start, dc->pc - pc_start,
>   env->pregs[PR_VR]);
>  qemu_log("\nisize=%d osize=%d\n",


--
Alex Bennée

Re: [Qemu-devel] [PATCH 1/3] virtio: add virtio_detach_element()

2016-09-26 Thread Greg Kurz

On Mon, 19 Sep 2016 14:28:03 +0100
Stefan Hajnoczi  wrote:

> During device reset or similar situations a VirtQueueElement needs to be
> freed without pushing it onto the used ring or rewinding the virtqueue.
> Extract a new function to do this.
> 
> Later patches add virtio_detach_element() calls to existing device so
> that scatter-gather lists are unmapped and vq->inuse goes back to zero
> during device reset.  Currently some devices don't bother and simply
> call g_free(elem) which is not a clean way to throw away a
> VirtQueueElement.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---

FWIW

Acked-by: Greg Kurz 

>  hw/virtio/virtio.c | 27 +--
>  include/hw/virtio/virtio.h |  2 ++
>  2 files changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index fcf3358..adcef45 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -264,12 +264,35 @@ static void virtqueue_unmap_sg(VirtQueue *vq, const 
> VirtQueueElement *elem,
>0, elem->out_sg[i].iov_len);
>  }
>  
> +/* virtqueue_detach_element:
> + * @vq: The #VirtQueue
> + * @elem: The #VirtQueueElement
> + * @len: number of bytes written
> + *
> + * Detach the element from the virtqueue.  This function is suitable for 
> device
> + * reset or other situations where a #VirtQueueElement is simply freed and 
> will
> + * not be pushed or discarded.
> + */
> +void virtqueue_detach_element(VirtQueue *vq, const VirtQueueElement *elem,
> +  unsigned int len)
> +{
> +vq->inuse--;
> +virtqueue_unmap_sg(vq, elem, len);
> +}
> +
> +/* virtqueue_discard:
> + * @vq: The #VirtQueue
> + * @elem: The #VirtQueueElement
> + * @len: number of bytes written
> + *
> + * Pretend the most recent element wasn't popped from the virtqueue.  The 
> next
> + * call to virtqueue_pop() will refetch the element.
> + */
>  void virtqueue_discard(VirtQueue *vq, const VirtQueueElement *elem,
> unsigned int len)
>  {
>  vq->last_avail_idx--;
> -vq->inuse--;
> -virtqueue_unmap_sg(vq, elem, len);
> +virtqueue_detach_element(vq, elem, len);
>  }
>  
>  /* virtqueue_rewind:
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index f05559d..ad1e2d6 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -152,6 +152,8 @@ void *virtqueue_alloc_element(size_t sz, unsigned 
> out_num, unsigned in_num);
>  void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
>  unsigned int len);
>  void virtqueue_flush(VirtQueue *vq, unsigned int count);
> +void virtqueue_detach_element(VirtQueue *vq, const VirtQueueElement *elem,
> +  unsigned int len);
>  void virtqueue_discard(VirtQueue *vq, const VirtQueueElement *elem,
> unsigned int len);
>  bool virtqueue_rewind(VirtQueue *vq, unsigned int num);
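
The difference between the two helpers in this patch can be sketched on a toy
queue. DemoVirtQueue and the demo_* functions below are hypothetical and only
mirror the field names from the diff; the scatter-gather unmapping is reduced
to a counter. Detaching drops the in-use count and unmaps, while discarding
additionally rewinds last_avail_idx so that the next pop refetches the
element.

```c
/* Toy queue; field names mirror the patch, the struct is illustrative. */
typedef struct DemoVirtQueue {
    unsigned int inuse;             /* elements popped but not yet returned */
    unsigned short last_avail_idx;  /* next available-ring index to pop */
    unsigned int unmapped;          /* counts calls to the unmap step */
} DemoVirtQueue;

/* Stand-in for unmapping the element's scatter-gather lists. */
static void demo_unmap_sg(DemoVirtQueue *vq)
{
    vq->unmapped++;
}

/* Throw the element away: no used-ring push, no rewind. */
void demo_detach_element(DemoVirtQueue *vq)
{
    vq->inuse--;
    demo_unmap_sg(vq);
}

/* Pretend the most recent element was never popped. */
void demo_discard(DemoVirtQueue *vq)
{
    vq->last_avail_idx--;
    demo_detach_element(vq);
}
```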

[Qemu-devel] [PATCH v3 8/9] virtio-scsi: convert virtio_scsi_bad_req() to use virtio_error()

2016-09-26 Thread Greg Kurz

The virtio_scsi_bad_req() function is called when a guest sends a
request with missing or ill-sized headers. This generally happens
when the virtio_scsi_parse_req() function returns an error.

With this patch, virtio_scsi_bad_req() will mark the device as broken,
detach the request from the virtqueue and free it, instead of forcing
QEMU to exit.

In nearly all locations where virtio_scsi_bad_req() is called, the only
thing to do next is to return to the caller.

The virtio_scsi_handle_cmd_req_prepare() function is an exception though.

It is called in a loop by virtio_scsi_handle_cmd_vq() and passed requests
freshly popped from a cmd virtqueue; virtio_scsi_handle_cmd_req_prepare()
does some sanity checks on the request and returns a boolean flag to
indicate whether the request should be queued or not. In the latter case,
virtio_scsi_handle_cmd_req_prepare() has detected a non-fatal error and
sent a response back to the guest.

We have now a new condition to take into account: the device is broken
and should stop all processing.

The return value of virtio_scsi_handle_cmd_req_prepare() is hence changed
to an int. A return value of zero means that the request should be queued.
Other non-fatal error cases where the request shouldn't be queued return
a negative errno (values are vaguely inspired by the error condition, but
the only goal here is to discriminate the case we're interested in).

And finally, if virtio_scsi_bad_req() was called, -EINVAL is returned. In
this case, virtio_scsi_handle_cmd_vq() detaches and frees already queued
requests, instead of submitting them.

Signed-off-by: Greg Kurz 
---
v3: - detach and free element in virtio_scsi_bad_req()
- detach and free all queued requests in virtio_scsi_handle_cmd_vq()
- updated changelog
---
 hw/scsi/virtio-scsi.c |   44 +++-
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index e596b6474131..fca23185a7fd 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -81,10 +81,11 @@ static void virtio_scsi_complete_req(VirtIOSCSIReq *req)
 virtio_scsi_free_req(req);
 }
 
-static void virtio_scsi_bad_req(void)
+static void virtio_scsi_bad_req(VirtIOSCSIReq *req)
 {
-error_report("wrong size for virtio-scsi headers");
-exit(1);
+virtio_error(VIRTIO_DEVICE(req->dev), "wrong size for virtio-scsi 
headers");
+virtqueue_detach_element(req->vq, &req->elem, 0);
+virtio_scsi_free_req(req);
 }
 
 static size_t qemu_sgl_concat(VirtIOSCSIReq *req, struct iovec *iov,
@@ -387,7 +388,7 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, 
VirtIOSCSIReq *req)
 
 if (iov_to_buf(req->elem.out_sg, req->elem.out_num, 0,
 &type, sizeof(type)) < sizeof(type)) {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
 return;
 }
 
@@ -395,7 +396,8 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, 
VirtIOSCSIReq *req)
 if (type == VIRTIO_SCSI_T_TMF) {
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlTMFReq),
 sizeof(VirtIOSCSICtrlTMFResp)) < 0) {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
+return;
 } else {
 r = virtio_scsi_do_tmf(s, req);
 }
@@ -404,7 +406,8 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, 
VirtIOSCSIReq *req)
type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlANReq),
 sizeof(VirtIOSCSICtrlANResp)) < 0) {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
+return;
 } else {
 req->resp.an.event_actual = 0;
 req->resp.an.response = VIRTIO_SCSI_S_OK;
@@ -521,7 +524,7 @@ static void virtio_scsi_fail_cmd_req(VirtIOSCSIReq *req)
 virtio_scsi_complete_cmd_req(req);
 }
 
-static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
+static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
 {
 VirtIOSCSICommon *vs = &s->parent_obj;
 SCSIDevice *d;
@@ -532,17 +535,18 @@ static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req
 if (rc < 0) {
 if (rc == -ENOTSUP) {
 virtio_scsi_fail_cmd_req(req);
+return -ENOTSUP;
 } else {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
+return -EINVAL;
 }
-return false;
 }
 
 d = virtio_scsi_device_find(s, req->req.cmd.lun);
 if (!d) {
 req->resp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
 virtio_scsi_complete_cmd_req(req);
-return false;
+return -ENOENT;
 }
 virtio_scsi_ctx_check(s, d);
 req->sreq = scsi_req_new(d, req->req.cmd.tag,
@@ -554,7 +558,7 @@ static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req

[Qemu-devel] [PULL 08/27] colo-compare: introduce packet comparison thread

2016-09-26 Thread Jason Wang

From: Zhang Chen 

If the primary packet is the same as the secondary packet,
we send the primary packet and drop the secondary
packet; otherwise we notify the COLO frame to do a checkpoint.
If a primary packet arrives but the secondary packet does not,
then after REGULAR_PACKET_CHECK_MS milliseconds we treat
the primary packet as an old packet and do a checkpoint.
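
The two checks described above can be sketched in isolation as below. This
is a simplified illustration, not the code from the patch: packets_match()
and packet_is_old() are hypothetical names, and raw buffers stand in for the
Packet structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Two packets match only if they have the same length and the same bytes. */
static bool packets_match(const void *pri, size_t pri_len,
                          const void *sec, size_t sec_len)
{
    assert(pri != NULL && sec != NULL);
    return pri_len == sec_len && memcmp(pri, sec, pri_len) == 0;
}

/* A primary packet counts as old once it has waited longer than check_ms. */
static bool packet_is_old(int64_t now_ms, int64_t creation_ms,
                          int64_t check_ms)
{
    return (now_ms - creation_ms) > check_ms;
}
```

A match means the primary packet can be forwarded and the secondary copy
dropped; a mismatch or an old unmatched primary packet would lead to a
checkpoint.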

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
Signed-off-by: Jason Wang 
---
 net/colo-compare.c | 233 +
 net/colo.c |   1 +
 net/colo.h |   3 +
 trace-events   |   2 +
 4 files changed, 239 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 231654c..645126e 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -33,8 +33,12 @@
 #define COLO_COMPARE(obj) \
 OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+#define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
+/* TODO: Should be configurable */
+#define REGULAR_PACKET_CHECK_MS 3000
+
 /*
   + CompareState ++
   |   |
@@ -76,6 +80,11 @@ typedef struct CompareState {
 GQueue conn_list;
 /* hashtable to save connection */
 GHashTable *connection_track_table;
+/* compare thread, a thread for each NIC */
+QemuThread thread;
+/* Timer used on the primary to find packets that are never matched */
+QEMUTimer *timer;
+QemuMutex timer_check_lock;
 } CompareState;
 
 typedef struct CompareClass {
@@ -148,6 +157,118 @@ static int packet_enqueue(CompareState *s, int mode)
 return 0;
 }
 
+/*
+ * The IP packets sent by primary and secondary
+ * will be compared in here
+ * TODO support ip fragment, Out-Of-Order
+ * return:0  means packet same
+ *> 0 || < 0 means packet different
+ */
+static int colo_packet_compare(Packet *ppkt, Packet *spkt)
+{
+trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
+   inet_ntoa(ppkt->ip->ip_dst), spkt->size,
+   inet_ntoa(spkt->ip->ip_src),
+   inet_ntoa(spkt->ip->ip_dst));
+
+if (ppkt->size == spkt->size) {
+return memcmp(ppkt->data, spkt->data, spkt->size);
+} else {
+return -1;
+}
+}
+
+static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
+{
+trace_colo_compare_main("compare all");
+return colo_packet_compare(ppkt, spkt);
+}
+
+static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
+{
+int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+
+if ((now - pkt->creation_ms) > (*check_time)) {
+trace_colo_old_packet_check_found(pkt->creation_ms);
+return 0;
+} else {
+return 1;
+}
+}
+
+static void colo_old_packet_check_one_conn(void *opaque,
+   void *user_data)
+{
+Connection *conn = opaque;
+GList *result = NULL;
+int64_t check_time = REGULAR_PACKET_CHECK_MS;
+
+result = g_queue_find_custom(&conn->primary_list,
+ &check_time,
+ (GCompareFunc)colo_old_packet_check_one);
+
+if (result) {
+/* do checkpoint will flush old packet */
+/* TODO: colo_notify_checkpoint();*/
+}
+}
+
+/*
+ * Look for old packets that the secondary hasn't matched,
+ * if we have some then we have to checkpoint to wake
+ * the secondary up.
+ */
+static void colo_old_packet_check(void *opaque)
+{
+CompareState *s = opaque;
+
+g_queue_foreach(&s->conn_list, colo_old_packet_check_one_conn, NULL);
+}
+
+/*
+ * Called from the compare thread on the primary
+ * for compare connection
+ */
+static void colo_compare_connection(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+GList *result = NULL;
+int ret;
+
+while (!g_queue_is_empty(&conn->primary_list) &&
+   !g_queue_is_empty(&conn->secondary_list)) {
+qemu_mutex_lock(&s->timer_check_lock);
+pkt = g_queue_pop_tail(&conn->primary_list);
+qemu_mutex_unlock(&s->timer_check_lock);
+result = g_queue_find_custom(&conn->secondary_list,
+  pkt, (GCompareFunc)colo_packet_compare_all);
+
+if (result) {
+ret = compare_chr_send(s->chr_out, pkt->data, pkt->size);
+if (ret < 0) {
+error_report("colo_send_primary_packet failed");
+}
+trace_colo_compare_main("packet same and release packet");
+g_queue_remove(&conn->secondary_list, result->data);
+packet_destroy(pkt, NULL);
+} else {
+/*
+ * If one packet arrive late, the secondary_list or
+ * primary_list will be empty, so we can't compare it
+ * until next comparison.
+ */
+

[Qemu-devel] [PATCH 18/18] target-riscv: Add generic test board, activate target

2016-09-26 Thread Sagar Karandikar

Signed-off-by: Sagar Karandikar 
---
 configure   |   6 +
 default-configs/riscv32-softmmu.mak |  38 ++
 default-configs/riscv64-softmmu.mak |  38 ++
 hw/riscv/Makefile.objs  |   2 +
 hw/riscv/riscv_board.c  | 264 
 hw/riscv/riscv_int.c|  67 +
 6 files changed, 415 insertions(+)
 create mode 100644 default-configs/riscv32-softmmu.mak
 create mode 100644 default-configs/riscv64-softmmu.mak
 create mode 100644 hw/riscv/riscv_board.c
 create mode 100644 hw/riscv/riscv_int.c

diff --git a/configure b/configure
index 8fa62ad..e3381b8 100755
--- a/configure
+++ b/configure
@@ -5667,6 +5667,12 @@ case "$target_name" in
 TARGET_BASE_ARCH=mips
 echo "TARGET_ABI_MIPSN64=y" >> $config_target_mak
   ;;
+  riscv32)
+TARGET_BASE_ARCH=riscv
+  ;;
+  riscv64)
+TARGET_BASE_ARCH=riscv
+  ;;
   moxie)
   ;;
   or32)
diff --git a/default-configs/riscv32-softmmu.mak 
b/default-configs/riscv32-softmmu.mak
new file mode 100644
index 000..c8b7fa1
--- /dev/null
+++ b/default-configs/riscv32-softmmu.mak
@@ -0,0 +1,38 @@
+# Default configuration for riscv-softmmu
+
+#include pci.mak
+#include sound.mak
+#include usb.mak
+#CONFIG_ESP=y
+#CONFIG_VGA=y
+#CONFIG_VGA_PCI=y
+#CONFIG_VGA_ISA=y
+#CONFIG_VGA_ISA_MM=y
+#CONFIG_VGA_CIRRUS=y
+#CONFIG_VMWARE_VGA=y
+CONFIG_SERIAL=y
+#CONFIG_PARALLEL=y
+#CONFIG_I8254=y
+#CONFIG_PCSPK=y
+#CONFIG_PCKBD=y
+#CONFIG_FDC=y
+#CONFIG_ACPI=y
+#CONFIG_APM=y
+#CONFIG_I8257=y
+#CONFIG_PIIX4=y
+#CONFIG_IDE_ISA=y
+#CONFIG_IDE_PIIX=y
+#CONFIG_NE2000_ISA=y
+#CONFIG_RC4030=y
+#CONFIG_DP8393X=y
+#CONFIG_DS1225Y=y
+#CONFIG_MIPSNET=y
+#CONFIG_PFLASH_CFI01=y
+#CONFIG_G364FB=y
+CONFIG_I8259=y
+#CONFIG_JAZZ_LED=y
+#CONFIG_MC146818RTC=y
+#CONFIG_VT82C686=y
+#CONFIG_ISA_TESTDEV=y
+#CONFIG_EMPTY_SLOT=y
+CONFIG_VIRTIO=y
diff --git a/default-configs/riscv64-softmmu.mak b/default-configs/riscv64-softmmu.mak
new file mode 100644
index 000..c8b7fa1
--- /dev/null
+++ b/default-configs/riscv64-softmmu.mak
@@ -0,0 +1,38 @@
+# Default configuration for riscv-softmmu
+
+#include pci.mak
+#include sound.mak
+#include usb.mak
+#CONFIG_ESP=y
+#CONFIG_VGA=y
+#CONFIG_VGA_PCI=y
+#CONFIG_VGA_ISA=y
+#CONFIG_VGA_ISA_MM=y
+#CONFIG_VGA_CIRRUS=y
+#CONFIG_VMWARE_VGA=y
+CONFIG_SERIAL=y
+#CONFIG_PARALLEL=y
+#CONFIG_I8254=y
+#CONFIG_PCSPK=y
+#CONFIG_PCKBD=y
+#CONFIG_FDC=y
+#CONFIG_ACPI=y
+#CONFIG_APM=y
+#CONFIG_I8257=y
+#CONFIG_PIIX4=y
+#CONFIG_IDE_ISA=y
+#CONFIG_IDE_PIIX=y
+#CONFIG_NE2000_ISA=y
+#CONFIG_RC4030=y
+#CONFIG_DP8393X=y
+#CONFIG_DS1225Y=y
+#CONFIG_MIPSNET=y
+#CONFIG_PFLASH_CFI01=y
+#CONFIG_G364FB=y
+CONFIG_I8259=y
+#CONFIG_JAZZ_LED=y
+#CONFIG_MC146818RTC=y
+#CONFIG_VT82C686=y
+#CONFIG_ISA_TESTDEV=y
+#CONFIG_EMPTY_SLOT=y
+CONFIG_VIRTIO=y
diff --git a/hw/riscv/Makefile.objs b/hw/riscv/Makefile.objs
index d830e5d..de6017e 100644
--- a/hw/riscv/Makefile.objs
+++ b/hw/riscv/Makefile.objs
@@ -1,3 +1,5 @@
 obj-y += riscv_rtc.o
+obj-y += riscv_int.o
 obj-y += htif/elf_symb.o
 obj-y += htif/htif.o
+obj-y += riscv_board.o
diff --git a/hw/riscv/riscv_board.c b/hw/riscv/riscv_board.c
new file mode 100644
index 000..1c136e2
--- /dev/null
+++ b/hw/riscv/riscv_board.c
@@ -0,0 +1,264 @@
+/*
+ * QEMU RISC-V Generic Board Support
+ *
+ * Author: Sagar Karandikar, sag...@eecs.berkeley.edu
+ *
+ * This provides a RISC-V Board with the following devices:
+ *
+ * 0) HTIF Test Pass/Fail Reporting (no syscall proxy)
+ * 1) HTIF Console
+ *
+ * These are created by htif_mm_init below.
+ *
+ * This board currently uses a hardcoded devicetree that indicates one hart.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/hw.h"
+#include "hw/char/serial.h"
+#include "hw/riscv/htif/htif.h"
+#include "hw/riscv/riscv_rtc.h"
+#include "hw/boards.h"
+#include "hw/riscv/cpudevs.h"
+#include "sysemu/char.h"
+#include "sysemu/arch_init.h"

[Qemu-devel] [PATCH 13/18] target-riscv: Add CSR read/write helpers

2016-09-26 Thread Sagar Karandikar

Signed-off-by: Sagar Karandikar 
---
 target-riscv/op_helper.c | 324 +++
 1 file changed, 324 insertions(+)

diff --git a/target-riscv/op_helper.c b/target-riscv/op_helper.c
index ee51f02..8449d1b 100644
--- a/target-riscv/op_helper.c
+++ b/target-riscv/op_helper.c
@@ -39,6 +39,11 @@ void set_privilege(CPURISCVState *env, target_ulong newpriv)
 env->priv = newpriv;
 }
 
+static int validate_vm(target_ulong vm)
+{
+return vm == VM_SV32 || vm == VM_SV39 || vm == VM_SV48 || vm == VM_MBARE;
+}
+
 /* Exceptions processing helpers */
 static inline void QEMU_NORETURN do_raise_exception_err(CPURISCVState *env,
   uint32_t exception, uintptr_t pc)
@@ -83,6 +88,180 @@ target_ulong helper_mulhsu(CPURISCVState *env, target_ulong arg1,
 inline void csr_write_helper(CPURISCVState *env, target_ulong val_to_write,
 target_ulong csrno)
 {
+#ifdef RISCV_DEBUG_PRINT
+fprintf(stderr, "Write CSR reg: 0x" TARGET_FMT_lx "\n", csrno);
+fprintf(stderr, "Write CSR val: 0x" TARGET_FMT_lx "\n", val_to_write);
+#endif
+
+uint64_t delegable_ints = MIP_SSIP | MIP_STIP | MIP_SEIP | (1 << IRQ_COP);
+uint64_t all_ints = delegable_ints | MIP_MSIP | MIP_MTIP;
+
+switch (csrno) {
+case CSR_FFLAGS:
+env->csr[CSR_MSTATUS] |= MSTATUS_FS | MSTATUS64_SD;
+env->csr[CSR_FFLAGS] = val_to_write & (FSR_AEXC >> FSR_AEXC_SHIFT);
+break;
+case CSR_FRM:
+env->csr[CSR_MSTATUS] |= MSTATUS_FS | MSTATUS64_SD;
+env->csr[CSR_FRM] = val_to_write & (FSR_RD >> FSR_RD_SHIFT);
+break;
+case CSR_FCSR:
+env->csr[CSR_MSTATUS] |= MSTATUS_FS | MSTATUS64_SD;
+env->csr[CSR_FFLAGS] = (val_to_write & FSR_AEXC) >> FSR_AEXC_SHIFT;
+env->csr[CSR_FRM] = (val_to_write & FSR_RD) >> FSR_RD_SHIFT;
+break;
+case CSR_MSTATUS: {
+target_ulong mstatus = env->csr[CSR_MSTATUS];
+if ((val_to_write ^ mstatus) &
+(MSTATUS_VM | MSTATUS_MPP | MSTATUS_MPRV | MSTATUS_PUM |
+ MSTATUS_MXR)) {
+helper_tlb_flush(env);
+}
+
+/* no extension support */
+target_ulong mask = MSTATUS_SIE | MSTATUS_SPIE | MSTATUS_MIE
+| MSTATUS_MPIE | MSTATUS_SPP | MSTATUS_FS | MSTATUS_MPRV
+| MSTATUS_PUM | MSTATUS_MXR;
+
+if (validate_vm(get_field(val_to_write, MSTATUS_VM))) {
+mask |= MSTATUS_VM;
+}
+if (validate_priv(get_field(val_to_write, MSTATUS_MPP))) {
+mask |= MSTATUS_MPP;
+}
+
+mstatus = (mstatus & ~mask) | (val_to_write & mask);
+
+int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
+dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
+mstatus = set_field(mstatus, MSTATUS64_SD, dirty);
+env->csr[CSR_MSTATUS] = mstatus;
+break;
+}
+case CSR_MIP: {
+target_ulong mask = MIP_SSIP | MIP_STIP;
+env->csr[CSR_MIP] = (env->csr[CSR_MIP] & ~mask) |
+(val_to_write & mask);
+if (env->csr[CSR_MIP] & MIP_SSIP) {
+qemu_irq_raise(SSIP_IRQ);
+} else {
+qemu_irq_lower(SSIP_IRQ);
+}
+if (env->csr[CSR_MIP] & MIP_STIP) {
+qemu_irq_raise(STIP_IRQ);
+} else {
+qemu_irq_lower(STIP_IRQ);
+}
+if (env->csr[CSR_MIP] & MIP_MSIP) {
+qemu_irq_raise(MSIP_IRQ);
+} else {
+qemu_irq_lower(MSIP_IRQ);
+}
+break;
+}
+case CSR_MIE: {
+env->csr[CSR_MIE] = (env->csr[CSR_MIE] & ~all_ints) |
+(val_to_write & all_ints);
+break;
+}
+case CSR_MIDELEG:
+env->csr[CSR_MIDELEG] = (env->csr[CSR_MIDELEG] & ~delegable_ints)
+| (val_to_write & delegable_ints);
+break;
+case CSR_MEDELEG: {
+target_ulong mask = 0;
+mask |= 1ULL << (RISCV_EXCP_INST_ADDR_MIS);
+mask |= 1ULL << (RISCV_EXCP_INST_ACCESS_FAULT);
+mask |= 1ULL << (RISCV_EXCP_ILLEGAL_INST);
+mask |= 1ULL << (RISCV_EXCP_BREAKPOINT);
+mask |= 1ULL << (RISCV_EXCP_LOAD_ADDR_MIS);
+mask |= 1ULL << (RISCV_EXCP_LOAD_ACCESS_FAULT);
+mask |= 1ULL << (RISCV_EXCP_STORE_AMO_ADDR_MIS);
+mask |= 1ULL << (RISCV_EXCP_STORE_AMO_ACCESS_FAULT);
+mask |= 1ULL << (RISCV_EXCP_U_ECALL);
+mask |= 1ULL << (RISCV_EXCP_S_ECALL);
+mask |= 1ULL << (RISCV_EXCP_H_ECALL);
+mask |= 1ULL << (RISCV_EXCP_M_ECALL);
+env->csr[CSR_MEDELEG] = (env->csr[CSR_MEDELEG] & ~mask)
+| (val_to_write & mask);
+break;
+}
+case CSR_MUCOUNTEREN:
+env->csr[CSR_MUCOUNTEREN] = val_to_write & 7;
+break;
+case CSR_MSCOUNTEREN:
+env->csr[CSR_MSCOUNTEREN] = val_to_write & 7;
+break;
+case CSR_SSTATUS: {
+target_ulong ms = env->csr[CSR_MSTATUS];

[Qemu-devel] [PATCH 02/18] target-riscv: Add RISC-V Target stubs inside target-riscv/

2016-09-26 Thread Sagar Karandikar

Signed-off-by: Sagar Karandikar 
---
 target-riscv/Makefile.objs |   1 +
 target-riscv/cpu.c | 154 ++
 target-riscv/cpu.h | 497 +
 target-riscv/helper.c  |  59 ++
 target-riscv/helper.h  |   0
 target-riscv/op_helper.c   |  44 
 target-riscv/translate.c   | 131 
 7 files changed, 886 insertions(+)
 create mode 100644 target-riscv/Makefile.objs
 create mode 100644 target-riscv/cpu.c
 create mode 100644 target-riscv/cpu.h
 create mode 100644 target-riscv/helper.c
 create mode 100644 target-riscv/helper.h
 create mode 100644 target-riscv/op_helper.c
 create mode 100644 target-riscv/translate.c

diff --git a/target-riscv/Makefile.objs b/target-riscv/Makefile.objs
new file mode 100644
index 000..cb448a8
--- /dev/null
+++ b/target-riscv/Makefile.objs
@@ -0,0 +1 @@
+obj-y += translate.o op_helper.o helper.o cpu.o
diff --git a/target-riscv/cpu.c b/target-riscv/cpu.c
new file mode 100644
index 000..0923a75
--- /dev/null
+++ b/target-riscv/cpu.c
@@ -0,0 +1,154 @@
+/*
+ * QEMU RISC-V CPU
+ *
+ * Author: Sagar Karandikar, sag...@eecs.berkeley.edu
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "cpu.h"
+#include "qemu-common.h"
+#include "migration/vmstate.h"
+
+static void riscv_cpu_set_pc(CPUState *cs, vaddr value)
+{
+RISCVCPU *cpu = RISCV_CPU(cs);
+CPURISCVState *env = &cpu->env;
+env->PC = value;
+}
+
+static void riscv_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+{
+RISCVCPU *cpu = RISCV_CPU(cs);
+CPURISCVState *env = &cpu->env;
+env->PC = tb->pc;
+}
+
+static bool riscv_cpu_has_work(CPUState *cs)
+{
+RISCVCPU *cpu = RISCV_CPU(cs);
+CPURISCVState *env = &cpu->env;
+bool has_work = false;
+
+if (cs->interrupt_request & CPU_INTERRUPT_HARD) {
+int interruptno = cpu_riscv_hw_interrupts_pending(env);
+if (interruptno + 1) {
+has_work = true;
+}
+}
+
+return has_work;
+}
+
+static void riscv_cpu_reset(CPUState *s)
+{
+RISCVCPU *cpu = RISCV_CPU(s);
+RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
+CPURISCVState *env = &cpu->env;
+CPUState *cs = CPU(cpu);
+
+mcc->parent_reset(s);
+tlb_flush(s, 1);
+
+env->priv = PRV_M;
+env->PC = DEFAULT_RSTVEC;
+env->csr[CSR_MTVEC] = DEFAULT_MTVEC;
+cs->exception_index = EXCP_NONE;
+}
+
+static void riscv_cpu_realizefn(DeviceState *dev, Error **errp)
+{
+CPUState *cs = CPU(dev);
+RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
+
+cpu_reset(cs);
+qemu_init_vcpu(cs);
+
+mcc->parent_realize(dev, errp);
+}
+
+static void riscv_cpu_initfn(Object *obj)
+{
+CPUState *cs = CPU(obj);
+RISCVCPU *cpu = RISCV_CPU(obj);
+CPURISCVState *env = &cpu->env;
+
+cs->env_ptr = env;
+cpu_exec_init(cs, &error_abort);
+
+if (tcg_enabled()) {
+riscv_tcg_init();
+}
+}
+
+static const VMStateDescription vmstate_riscv_cpu = {
+.name = "cpu",
+.unmigratable = 1,
+};
+
+static void riscv_cpu_class_init(ObjectClass *c, void *data)
+{
+RISCVCPUClass *mcc = RISCV_CPU_CLASS(c);
+CPUClass *cc = CPU_CLASS(c);
+DeviceClass *dc = DEVICE_CLASS(c);
+
+mcc->parent_realize = dc->realize;
+dc->realize = riscv_cpu_realizefn;
+
+mcc->parent_reset = cc->reset;
+cc->reset = riscv_cpu_reset;
+
+cc->has_work = riscv_cpu_has_work;
+cc->do_interrupt = riscv_cpu_do_interrupt;
+cc->cpu_exec_interrupt = riscv_cpu_exec_interrupt;
+cc->dump_state = riscv_cpu_dump_state;
+cc->set_pc = riscv_cpu_set_pc;
+cc->synchronize_from_tb = riscv_cpu_synchronize_from_tb;
+#ifdef CONFIG_USER_ONLY
+cc->handle_mmu_fault = riscv_cpu_handle_mmu_fault;
+#else
+cc->do_unassigned_access = riscv_cpu_unassigned_access;
+cc->do_unaligned_access = riscv_cpu_do_unaligned_access;
+cc->get_phys_page_debug = riscv_cpu_get_phys_page_debug;
+#endif
+/* For now, mark unmigratable: */
+cc->vmsd = &vmstate_riscv_cpu;
+
+/*
+ * Reason: riscv_cpu_initfn() calls cpu_exec_init(), which saves
+ * the object in cpus -> dangling pointer after final
+ * object_unref().
+ */
+dc->cannot_destroy_with_object_finalize_yet = true;
+}
+
+static const TypeInfo riscv_cpu_type_info = {
+.name

[Qemu-devel] [PATCH 09/18] target-riscv: Add FMADD, FMSUB, FNMADD, FNMSUB Instructions,

2016-09-26 Thread Sagar Karandikar

Along with FP helper infrastructure, changes to softfloat-specialize

Signed-off-by: Sagar Karandikar 
---
 fpu/softfloat-specialize.h |   7 ++-
 target-riscv/Makefile.objs |   2 +-
 target-riscv/fpu_helper.c  | 151 +
 target-riscv/helper.h  |  10 +++
 target-riscv/translate.c   | 105 +++
 5 files changed, 271 insertions(+), 4 deletions(-)
 create mode 100644 target-riscv/fpu_helper.c

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index f5aed72..fa5986d 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -114,7 +114,8 @@ float32 float32_default_nan(float_status *status)
 #if defined(TARGET_SPARC)
 return const_float32(0x7FFF);
 #elif defined(TARGET_PPC) || defined(TARGET_ARM) || defined(TARGET_ALPHA) || \
-  defined(TARGET_XTENSA) || defined(TARGET_S390X) || defined(TARGET_TRICORE)
+  defined(TARGET_XTENSA) || defined(TARGET_S390X) || \
+  defined(TARGET_TRICORE) || defined(TARGET_RISCV)
 return const_float32(0x7FC0);
 #else
 if (status->snan_bit_is_one) {
@@ -137,7 +138,7 @@ float64 float64_default_nan(float_status *status)
 #if defined(TARGET_SPARC)
 return const_float64(LIT64(0x7FFF));
 #elif defined(TARGET_PPC) || defined(TARGET_ARM) || defined(TARGET_ALPHA) || \
-  defined(TARGET_S390X)
+  defined(TARGET_S390X) || defined(TARGET_RISCV)
 return const_float64(LIT64(0x7FF8));
 #else
 if (status->snan_bit_is_one) {
@@ -181,7 +182,7 @@ float128 float128_default_nan(float_status *status)
 r.high = LIT64(0x7FFF7FFF);
 } else {
 r.low = LIT64(0x);
-#if defined(TARGET_S390X)
+#if defined(TARGET_S390X) || defined(TARGET_RISCV)
 r.high = LIT64(0x7FFF8000);
 #else
 r.high = LIT64(0x8000);
diff --git a/target-riscv/Makefile.objs b/target-riscv/Makefile.objs
index cb448a8..0149732 100644
--- a/target-riscv/Makefile.objs
+++ b/target-riscv/Makefile.objs
@@ -1 +1 @@
-obj-y += translate.o op_helper.o helper.o cpu.o
+obj-y += translate.o op_helper.o helper.o cpu.o fpu_helper.o
diff --git a/target-riscv/fpu_helper.c b/target-riscv/fpu_helper.c
new file mode 100644
index 000..9023d10
--- /dev/null
+++ b/target-riscv/fpu_helper.c
@@ -0,0 +1,151 @@
+/*
+ * RISC-V FPU Emulation Helpers for QEMU.
+ *
+ * Author: Sagar Karandikar, sag...@eecs.berkeley.edu
+ *
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include "cpu.h"
+#include "qemu/host-utils.h"
+#include "exec/helper-proto.h"
+
+/* convert RISC-V rounding mode to IEEE library numbers */
+unsigned int ieee_rm[] = {
+float_round_nearest_even,
+float_round_to_zero,
+float_round_down,
+float_round_up,
+float_round_ties_away
+};
+
+/* obtain rm value to use in computation
+ * as the last step, convert rm codes to what the softfloat library expects
+ * Adapted from Spike's decode.h:RM
+ */
+#define RM ({ \
+if (rm == 7) {\
+rm = env->csr[CSR_FRM];   \
+} \
+if (rm > 4) { \
+helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); \
+} \
+ieee_rm[rm]; })
+
+/* convert softfloat library flag numbers to RISC-V */
+unsigned int softfloat_flags_to_riscv(unsigned int flag)
+{
+switch (flag) {
+case float_flag_inexact:
+return 1;
+case float_flag_underflow:
+return 2;
+case float_flag_overflow:
+return 4;
+case float_flag_divbyzero:
+return 8;
+case float_flag_invalid:
+return 16;
+default:
+return 0;
+}
+}
+
+/* adapted from Spike's decode.h:set_fp_exceptions */
+#define set_fp_exceptions() do { \
+env->csr[CSR_FFLAGS] |= softfloat_flags_to_riscv(get_float_exception_flags(\
+&env->fp_status)); \
+set_float_exception_flags(0, &env->fp_status); \
+} while (0)
+
+uint64_t helper_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+uint64_t frs3, uint64_t rm)
+{
+

[Qemu-devel] [PATCH V8 6/6] coroutine: reduce stack size to 60kB

2016-09-26 Thread Peter Lieven

Evaluation with the recently introduced maximum stack usage monitoring revealed
that the actual used stack size was never above 4kB, so allocating a 1MB stack
for each coroutine wastes a lot of memory. Reduce the stack size to
60kB, which should still give enough headroom. The guard page added
in qemu_alloc_stack will catch a potential stack overflow introduced
by this commit. The 60kB + guard page will result in an allocation of
64kB per coroutine on systems where a page is 4kB.
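
The arithmetic behind the 64kB figure can be checked with a small sketch.
The helper name coroutine_alloc_size() is hypothetical, and the sketch
assumes the OS minimum stack size does not raise the request above 60kB.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes mapped for one coroutine: the usable stack plus one guard page. */
static size_t coroutine_alloc_size(size_t stack_size, size_t page_size)
{
    /* Assumes the stack size is already a multiple of the page size. */
    assert(page_size > 0 && stack_size % page_size == 0);
    return stack_size + page_size;
}
```

With a stack of 61440 bytes (60 * 1024) and a 4096-byte page this gives
65536 bytes, that is 64kB, compared with 1048576 bytes for the old
(1 << 20) stack.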

Signed-off-by: Peter Lieven 
---
 include/qemu/coroutine_int.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index 14d4f1d..be14260 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -28,7 +28,7 @@
 #include "qemu/queue.h"
 #include "qemu/coroutine.h"
 
-#define COROUTINE_STACK_SIZE (1 << 20)
+#define COROUTINE_STACK_SIZE 61440
 
 typedef enum {
 COROUTINE_YIELD = 1,
-- 
1.9.1

[Qemu-devel] [PATCH 11/18] target-riscv: Add Double Precision Floating-Point Instructions

2016-09-26 Thread Sagar Karandikar

Signed-off-by: Sagar Karandikar 
---
 target-riscv/fpu_helper.c | 225 ++
 target-riscv/helper.h |  30 +++
 target-riscv/translate.c  | 135 
 3 files changed, 390 insertions(+)

diff --git a/target-riscv/fpu_helper.c b/target-riscv/fpu_helper.c
index 8d33fa1..b3d443e 100644
--- a/target-riscv/fpu_helper.c
+++ b/target-riscv/fpu_helper.c
@@ -355,3 +355,228 @@ target_ulong helper_fclass_s(CPURISCVState *env, uint64_t frs1)
 frs1 = float32_classify(frs1, &env->fp_status);
 return frs1;
 }
+
+uint64_t helper_fadd_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float64_add(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fsub_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float64_sub(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fmul_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float64_mul(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fdiv_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float64_div(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fsgnj_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = (frs1 & ~INT64_MIN) | (frs2 & INT64_MIN);
+return frs1;
+}
+
+uint64_t helper_fsgnjn_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = (frs1 & ~INT64_MIN) | ((~frs2) & INT64_MIN);
+return frs1;
+}
+
+uint64_t helper_fsgnjx_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = frs1 ^ (frs2 & INT64_MIN);
+return frs1;
+}
+
+uint64_t helper_fmin_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float64_is_any_nan(frs2) ||
+   float64_lt_quiet(frs1, frs2, &env->fp_status) ? frs1 : frs2;
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fmax_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float64_is_any_nan(frs2) ||
+   float64_le_quiet(frs2, frs1, &env->fp_status) ? frs1 : frs2;
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fcvt_s_d(CPURISCVState *env, uint64_t rs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+rs1 = float64_to_float32(rs1, &env->fp_status);
+set_fp_exceptions();
+return rs1;
+}
+
+uint64_t helper_fcvt_d_s(CPURISCVState *env, uint64_t rs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+rs1 = float32_to_float64(rs1, &env->fp_status);
+set_fp_exceptions();
+return rs1;
+}
+
+uint64_t helper_fsqrt_d(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float64_sqrt(frs1, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_fle_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float64_le(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_flt_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float64_lt(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_feq_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float64_eq(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_fcvt_w_d(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = (int64_t)((int32_t)float64_to_int32(frs1, &env->fp_status));
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_fcvt_wu_d(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = (int64_t)((int32_t)float64_to_uint32(frs1, &env->fp_status));
+set_fp_exceptions();
+return frs1;
+}
+
+#if defined(TARGET_RISCV64)
+uint64_t helper_fcvt_l_d(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float64_to_int64(frs1, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fcvt_lu_d(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float64_to_uint64(frs1, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+#endif
+
+uint64_t helper_fcvt_d_w(CPURISCVState *env, target_ulong rs1, uint64_t rm)
+{
+uint64_t res;
+set_float_rounding_mode(RM, &env->fp_status);
+res = int32_to_float64((int32_t)rs1, &env->fp_status);
+set_fp_exceptions();
+return res;
+}
+
+uint64_t helper_fcvt_d_wu(CPURISCVState

[Qemu-devel] [PATCH V8 3/6] coroutine-ucontext: use helper for allocating stack memory

2016-09-26 Thread Peter Lieven

Signed-off-by: Peter Lieven 
---
 util/coroutine-ucontext.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 31254ab..6621f3f 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -34,6 +34,7 @@
 typedef struct {
 Coroutine base;
 void *stack;
+size_t stack_size;
 sigjmp_buf env;
 
 #ifdef CONFIG_VALGRIND_H
@@ -82,7 +83,6 @@ static void coroutine_trampoline(int i0, int i1)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineUContext *co;
 ucontext_t old_uc, uc;
 sigjmp_buf old_env;
@@ -101,17 +101,18 @@ Coroutine *qemu_coroutine_new(void)
 }
 
 co = g_malloc0(sizeof(*co));
-co->stack = g_malloc(stack_size);
+co->stack_size = COROUTINE_STACK_SIZE;
+co->stack = qemu_alloc_stack(&co->stack_size);
 co->base.entry_arg = &old_env; /* stash away our jmp_buf */
 
 uc.uc_link = &old_uc;
 uc.uc_stack.ss_sp = co->stack;
-uc.uc_stack.ss_size = stack_size;
+uc.uc_stack.ss_size = co->stack_size;
 uc.uc_stack.ss_flags = 0;
 
 #ifdef CONFIG_VALGRIND_H
 co->valgrind_stack_id =
-VALGRIND_STACK_REGISTER(co->stack, co->stack + stack_size);
+VALGRIND_STACK_REGISTER(co->stack, co->stack + co->stack_size);
 #endif
 
 arg.p = co;
@@ -149,7 +150,7 @@ void qemu_coroutine_delete(Coroutine *co_)
 valgrind_stack_deregister(co);
 #endif
 
-g_free(co->stack);
+qemu_free_stack(co->stack, co->stack_size);
 g_free(co);
 }
 
-- 
1.9.1

Re: [Qemu-devel] [PATCH] virtio-serial: virtio console emergency write support

2016-09-26 Thread Amit Shah

Hi,

On (Wed) 21 Sep 2016 [14:17:48], Sascha Silbe wrote:
> Add support for the virtio 1.0 "emergency write"
> (VIRTIO_CONSOLE_F_EMERG_WRITE) feature. This is useful for early guest
> debugging and might be used in cases where the guest crashes so badly
> that it cannot use virtqueues.
> 
> Disabled for compatibility machines to avoid exposing a new feature to
> existing guests.
> 
> Reviewed-by: Cornelia Huck 
> Signed-off-by: Sascha Silbe 

This looks fine, but can you split the patch - adding
find_first_connected_console(), set_config, and then finally enabling
the emergency_write feature?

Thanks,


Amit

[Qemu-devel] [PATCH V8 1/6] oslib-posix: add helpers for stack alloc and free

2016-09-26 Thread Peter Lieven

The allocated stack will be adjusted to the minimum stack size supported
by the OS and rounded up to a multiple of the system page size.
Additionally, an architecture-dependent guard page is added to the stack
to catch stack overflows.
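
The size adjustment described above can be sketched as pure arithmetic,
without any actual mmap() call. The function names below are hypothetical
and the minimum stack size is passed in as a parameter instead of being
queried from the OS, so this is an illustration rather than the helper
itself.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t n, size_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/* Usable stack size after applying the OS minimum and page rounding. */
static size_t usable_stack_size(size_t requested, size_t min_stack,
                                size_t page_size)
{
    size_t sz = requested > min_stack ? requested : min_stack;
    return round_up(sz, page_size);
}

/* Bytes to map in total: the usable stack plus one guard page. */
static size_t stack_alloc_size(size_t requested, size_t min_stack,
                               size_t page_size)
{
    return usable_stack_size(requested, min_stack, page_size) + page_size;
}
```

The caller gets the adjusted usable size back, which is why the helper
takes a pointer to the size and the same value is later needed for freeing.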

Signed-off-by: Peter Lieven 
---
 include/sysemu/os-posix.h | 27 +++
 util/oslib-posix.c| 43 +++
 2 files changed, 70 insertions(+)

diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h
index 9c7dfdf..4a0f493 100644
--- a/include/sysemu/os-posix.h
+++ b/include/sysemu/os-posix.h
@@ -60,4 +60,31 @@ int qemu_utimens(const char *path, const qemu_timespec *times);
 
 bool is_daemonized(void);
 
+/**
+ * qemu_alloc_stack:
+ * @sz: pointer to a size_t holding the requested stack size
+ *
+ * Allocate memory that can be used as a stack, for instance for
+ * coroutines. If the memory cannot be allocated, this function
+ * will abort (like g_malloc()). This function also inserts an
+ * additional guard page to catch a potential stack overflow.
+ * Note that the useable stack memory can be greater than the
+ * requested stack size due to alignment and minimal stack size
+ * restrictions. In this case the value of sz is adjusted.
+ *
+ * The allocated stack must be freed with qemu_free_stack().
+ *
+ * Returns: pointer to (the lowest address of) the stack memory.
+ */
+void *qemu_alloc_stack(size_t *sz);
+
+/**
+ * qemu_free_stack:
+ * @stack: stack to free
+ * @sz: size of stack in bytes
+ *
+ * Free a stack allocated via qemu_alloc_stack().
+ */
+void qemu_free_stack(void *stack, size_t sz);
+
 #endif
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index f2d4e9e..7d053b8 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -499,3 +499,46 @@ pid_t qemu_fork(Error **errp)
 }
 return pid;
 }
+
+void *qemu_alloc_stack(size_t *sz)
+{
+void *ptr, *guardpage;
+size_t pagesz = getpagesize();
+size_t allocsz;
+#ifdef _SC_THREAD_STACK_MIN
+/* avoid stacks smaller than _SC_THREAD_STACK_MIN */
+long min_stack_sz = sysconf(_SC_THREAD_STACK_MIN);
+*sz = MAX(MAX(min_stack_sz, 0), *sz);
+#endif
+/* adjust stack size to a multiple of the page size */
+*sz = ROUND_UP(*sz, pagesz);
+/* allocate one extra page for the guard page */
+allocsz = *sz + getpagesize();
+
+ptr = mmap(NULL, allocsz, PROT_READ | PROT_WRITE,
+   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+if (ptr == MAP_FAILED) {
+abort();
+}
+
+#if defined(HOST_IA64)
+/* separate register stack */
+guardpage = ptr + (((allocsz - pagesz) / 2) & ~pagesz);
+#elif defined(HOST_HPPA)
+/* stack grows up */
+guardpage = ptr + allocsz - pagesz;
+#else
+/* stack grows down */
+guardpage = ptr;
+#endif
+if (mprotect(guardpage, pagesz, PROT_NONE) != 0) {
+abort();
+}
+
+return ptr;
+}
+
+void qemu_free_stack(void *stack, size_t sz)
+{
+munmap(stack, sz + getpagesize());
+}
-- 
1.9.1
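
The mmap+mprotect scheme described in the patch above can be sketched as a small standalone program. This is only an illustration for the common "stack grows down" case, not the code from the series: demo_alloc_stack() and demo_free_stack() are hypothetical names, and the caller is assumed to pass a size that is already a multiple of the page size.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not QEMU's): map `usable` bytes plus one guard
 * page below them. `usable` must be a multiple of the page size.
 * Returns the lowest mapped address, or NULL on failure. */
static void *demo_alloc_stack(size_t usable)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    assert(usable % pagesz == 0);

    void *ptr = mmap(NULL, usable + pagesz, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED) {
        return NULL;
    }
    /* Make the lowest page inaccessible so that running past the end of
     * a downward-growing stack faults instead of corrupting memory. */
    if (mprotect(ptr, pagesz, PROT_NONE) != 0) {
        munmap(ptr, usable + pagesz);
        return NULL;
    }
    return ptr;
}

/* Unmap the usable area together with its guard page. */
static int demo_free_stack(void *stack, size_t usable)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    return munmap(stack, usable + pagesz);
}
```

Unlike the helper in the patch, this sketch returns NULL instead of aborting, which makes it easier to exercise in isolation.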

Re: [Qemu-devel] [PATCH 2/5] apic: add send_msi() to APICCommonClass

2016-09-26 Thread Igor Mammedov

On Thu, 22 Sep 2016 23:04:29 +0200
Radim Krčmář  wrote:

> The MMIO based interface to APIC doesn't work well with MSIs that have
> upper address bits set (remapped x2APIC MSIs).  A specialized interface
> is a quick and dirty way to avoid the shortcoming.
> 
> Signed-off-by: Radim Krčmář 
> ---
>  hw/i386/kvm/apic.c  | 19 +--
>  hw/i386/xen/xen_apic.c  |  6 ++
>  hw/intc/apic.c  |  6 ++
>  include/hw/i386/apic_internal.h |  4 
>  4 files changed, 29 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
> index feb00024f20c..7cc1acd63d32 100644
> --- a/hw/i386/kvm/apic.c
> +++ b/hw/i386/kvm/apic.c
> @@ -168,6 +168,17 @@ static void kvm_apic_external_nmi(APICCommonState *s)
>  run_on_cpu(CPU(s->cpu), do_inject_external_nmi, s);
>  }
>  
> +static void kvm_send_msi(MSIMessage *msg)
> +{
> +int ret;
> +
> +ret = kvm_irqchip_send_msi(kvm_state, *msg);
> +if (ret < 0) {
> +fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
> +strerror(-ret));
> +}
> +}
> +
>  static uint64_t kvm_apic_mem_read(void *opaque, hwaddr addr,
>unsigned size)
>  {
> @@ -178,13 +189,8 @@ static void kvm_apic_mem_write(void *opaque, hwaddr addr,
> uint64_t data, unsigned size)
>  {
>  MSIMessage msg = { .address = addr, .data = data };
> -int ret;
>  
> -ret = kvm_irqchip_send_msi(kvm_state, msg);
> -if (ret < 0) {
> -fprintf(stderr, "KVM: injection failed, MSI lost (%s)\n",
> -strerror(-ret));
> -}
> +kvm_send_msi(&msg);
>  }
>  
>  static const MemoryRegionOps kvm_apic_io_ops = {
> @@ -231,6 +237,7 @@ static void kvm_apic_class_init(ObjectClass *klass, void *data)
>  k->enable_tpr_reporting = kvm_apic_enable_tpr_reporting;
>  k->vapic_base_update = kvm_apic_vapic_base_update;
>  k->external_nmi = kvm_apic_external_nmi;
> +k->send_msi = kvm_send_msi;
>  }
>  
>  static const TypeInfo kvm_apic_info = {
> diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
> index 21d68ee04b0a..55769eba7ede 100644
> --- a/hw/i386/xen/xen_apic.c
> +++ b/hw/i386/xen/xen_apic.c
> @@ -68,6 +68,11 @@ static void xen_apic_external_nmi(APICCommonState *s)
>  {
>  }
>  
> +static void xen_send_msi(MSIMessage *msi)
> +{
> +xen_hvm_inject_msi(msi->address, msi->data);
> +}
> +
>  static void xen_apic_class_init(ObjectClass *klass, void *data)
>  {
>  APICCommonClass *k = APIC_COMMON_CLASS(klass);
> @@ -78,6 +83,7 @@ static void xen_apic_class_init(ObjectClass *klass, void *data)
>  k->get_tpr = xen_apic_get_tpr;
>  k->vapic_base_update = xen_apic_vapic_base_update;
>  k->external_nmi = xen_apic_external_nmi;
> +k->send_msi = xen_send_msi;
>  }
>  
>  static const TypeInfo xen_apic_info = {
> diff --git a/hw/intc/apic.c b/hw/intc/apic.c
> index 7bd1d279c463..4f3fb44d05e4 100644
> --- a/hw/intc/apic.c
> +++ b/hw/intc/apic.c
> @@ -900,6 +900,11 @@ static void apic_unrealize(DeviceState *dev, Error **errp)
>  local_apics[s->id] = NULL;
>  }
>  
> +static void apic_send_msi_struct(MSIMessage *msi)
> +{
> +apic_send_msi(msi->address, msi->data);
> +}
Why not make apic_send_msi() take an MSIMessage * instead of adding a wrapper?

Also, when the interface is switched to send_msi() in 3/5,
aren't you losing the following checks in apic_mem_writel():

if (addr > 0xfff || !index) {   
  


> +
>  static void apic_class_init(ObjectClass *klass, void *data)
>  {
>  APICCommonClass *k = APIC_COMMON_CLASS(klass);
> @@ -913,6 +918,7 @@ static void apic_class_init(ObjectClass *klass, void *data)
>  k->external_nmi = apic_external_nmi;
>  k->pre_save = apic_pre_save;
>  k->post_load = apic_post_load;
> +k->send_msi = apic_send_msi_struct;
>  }
>  
>  static const TypeInfo apic_info = {
> diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
> index 9ba8a5c87f90..32b083ad2926 100644
> --- a/include/hw/i386/apic_internal.h
> +++ b/include/hw/i386/apic_internal.h
> @@ -146,6 +146,10 @@ typedef struct APICCommonClass
>  void (*pre_save)(APICCommonState *s);
>  void (*post_load)(APICCommonState *s);
>  void (*reset)(APICCommonState *s);
> +/* send_msi emulates an APIC bus and its proper place would be in a new
> + * device, but it's convenient to have it here for now.
> + */
> +void (*send_msi)(MSIMessage *msi);
>  } APICCommonClass;
>  
>  struct APICCommonState {
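
The first review comment above (take the MSIMessage directly rather than adding a wrapper) can be illustrated with a reduced model of the class hook. This is a sketch under assumptions: DemoAPICClass, demo_apic_send_msi(), demo_class_init() and demo_deliver() are hypothetical names, and MSIMessage here is a simplified stand-in for the QEMU type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the QEMU type. */
typedef struct MSIMessage {
    uint64_t address;
    uint32_t data;
} MSIMessage;

/* Hypothetical class with a single per-backend hook. */
typedef struct DemoAPICClass {
    void (*send_msi)(MSIMessage *msi);
} DemoAPICClass;

/* Record of the last delivered message, for demonstration only. */
static uint64_t last_address;
static uint32_t last_data;

/* The backend takes the struct directly, so it can be installed as the
 * hook without an adapter function. */
static void demo_apic_send_msi(MSIMessage *msi)
{
    last_address = msi->address;
    last_data = msi->data;
}

static void demo_class_init(DemoAPICClass *k)
{
    k->send_msi = demo_apic_send_msi;
}

/* Caller side: dispatch through the class hook instead of an MMIO write,
 * so address bits above the MMIO window are not lost. */
static void demo_deliver(const DemoAPICClass *k, uint64_t addr, uint32_t data)
{
    MSIMessage msg = { .address = addr, .data = data };
    assert(k->send_msi != NULL);
    k->send_msi(&msg);
}
```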

[Qemu-devel] [PATCH 10/18] target-riscv: Add Single Precision Floating-Point Instructions

2016-09-26 Thread Sagar Karandikar

Signed-off-by: Sagar Karandikar 
---
 target-riscv/fpu_helper.c | 206 ++
 target-riscv/helper.h |  28 +++
 target-riscv/translate.c  | 146 
 3 files changed, 380 insertions(+)

diff --git a/target-riscv/fpu_helper.c b/target-riscv/fpu_helper.c
index 9023d10..8d33fa1 100644
--- a/target-riscv/fpu_helper.c
+++ b/target-riscv/fpu_helper.c
@@ -149,3 +149,209 @@ uint64_t helper_fnmadd_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
 set_fp_exceptions();
 return frs1;
 }
+
+uint64_t helper_fadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float32_add(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fsub_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float32_sub(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fmul_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float32_mul(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
+   uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float32_div(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fsgnj_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = (frs1 & ~(uint32_t)INT32_MIN) | (frs2 & (uint32_t)INT32_MIN);
+return frs1;
+}
+
+uint64_t helper_fsgnjn_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = (frs1 & ~(uint32_t)INT32_MIN) | ((~frs2) & (uint32_t)INT32_MIN);
+return frs1;
+}
+
+uint64_t helper_fsgnjx_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = frs1 ^ (frs2 & (uint32_t)INT32_MIN);
+return frs1;
+}
+
+uint64_t helper_fmin_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float32_is_any_nan(frs2) ||
+   float32_lt_quiet(frs1, frs2, &env->fp_status) ? frs1 : frs2;
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fmax_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float32_is_any_nan(frs2) ||
+   float32_le_quiet(frs2, frs1, &env->fp_status) ? frs1 : frs2;
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float32_sqrt(frs1, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_fle_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float32_le(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_flt_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float32_lt(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_feq_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
+{
+frs1 = float32_eq(frs1, frs2, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_fcvt_w_s(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = (int64_t)((int32_t)float32_to_int32(frs1, &env->fp_status));
+set_fp_exceptions();
+return frs1;
+}
+
+target_ulong helper_fcvt_wu_s(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = (int64_t)((int32_t)float32_to_uint32(frs1, &env->fp_status));
+set_fp_exceptions();
+return frs1;
+}
+
+#if defined(TARGET_RISCV64)
+uint64_t helper_fcvt_l_s(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float32_to_int64(frs1, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+
+uint64_t helper_fcvt_lu_s(CPURISCVState *env, uint64_t frs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+frs1 = float32_to_uint64(frs1, &env->fp_status);
+set_fp_exceptions();
+return frs1;
+}
+#endif
+
+uint64_t helper_fcvt_s_w(CPURISCVState *env, target_ulong rs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+rs1 = int32_to_float32((int32_t)rs1, &env->fp_status);
+set_fp_exceptions();
+return rs1;
+}
+
+uint64_t helper_fcvt_s_wu(CPURISCVState *env, target_ulong rs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+rs1 = uint32_to_float32((uint32_t)rs1, &env->fp_status);
+set_fp_exceptions();
+return rs1;
+}
+
+#if defined(TARGET_RISCV64)
+uint64_t helper_fcvt_s_l(CPURISCVState *env, uint64_t rs1, uint64_t rm)
+{
+set_float_rounding_mode(RM, &env->fp_status);
+rs1 = int64_to_float32(rs1, &env->fp_status);
+set_fp_exceptions();
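
The three sign-injection helpers in the patch above (fsgnj, fsgnjn, fsgnjx) are pure bit operations on the IEEE 754 single-precision sign bit. The following sketch shows the same masks on plain 32-bit values; the demo_* names are hypothetical, and unlike the patch it does not deal with values held in 64-bit registers.

```c
#include <stdint.h>

#define DEMO_SIGN32 UINT32_C(0x80000000)

/* Result has the magnitude of a and the sign of b. */
static uint32_t demo_fsgnj(uint32_t a, uint32_t b)
{
    return (a & ~DEMO_SIGN32) | (b & DEMO_SIGN32);
}

/* Result has the magnitude of a and the inverted sign of b. */
static uint32_t demo_fsgnjn(uint32_t a, uint32_t b)
{
    return (a & ~DEMO_SIGN32) | (~b & DEMO_SIGN32);
}

/* Result has the magnitude of a and the XOR of both signs. */
static uint32_t demo_fsgnjx(uint32_t a, uint32_t b)
{
    return a ^ (b & DEMO_SIGN32);
}
```

With a = 0x3f800000 (1.0f) and b = 0xc0000000 (-2.0f), demo_fsgnj() yields 0xbf800000, which is -1.0f.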

[Qemu-devel] [PATCH V8 0/6] coroutine: mmap stack memory and stack size

2016-09-26 Thread Peter Lieven

I decided to split this from the rest of the Qemu RSS usage series as
it contains the more or less non contentious patches.

I omitted the MAP_GROWSDOWN flag in mmap as we are not 100% sure which
side effects it has.

I kept the guard page, which now nicely makes the stacks visible in
smaps. The old version of the relevant patch lacked the MAP_FIXED flag
in the second call to mmap.

v7->v8:
 The series failed on platforms with 64kB page size. Thus the following changes
 were made:
 - Patch 1: add the guard page to the stack memory and do not deduct it [Kevin, Stephan]
 - Patch 1: Submit the requested stack size as a pointer so that qemu_alloc_stack can
   adjust the size according to system requirements and so that the full size is
   usable to the caller.
 - Patch 6: reduced stack size to 60kB so that on systems with 4kB page size we still
   get 64kB allocations.

v6->v7:
 - Patch 1: avoid multiple calls to sysconf and getpagesize [Richard]

v5->v6:
 - Patch 1: added info that the guard page is deducted from stack memory to
commit msg and headers [Stefan]
 - rebased to master

v4->v5:
 - Patch 1: check if _SC_THREAD_STACK_MIN is defined
 - Patch 1: guard against sysconf(_SC_THREAD_STACK_MIN) returning -1 [Eric]

v3->v4:
 - Patch 1: add a static function to adjust the stack size [Richard]
 - Patch 1: round up the stack size to multiple of the pagesize.

v2->v3:
 - Patch 1,6: adjusted commit message to mention the guard page [Markus]

v1->v2:
 - Patch 1: added an architecture dependent guard page [Richard]
 - Patch 1: avoid stacks smaller than _SC_THREAD_STACK_MIN [Richard]
 - Patch 1: use mmap+mprotect instead of mmap+mmap [Richard]
 - Patch 5: u_int32_t -> uint32_t [Richard]
 - Patch 5: only available if stack grows down

Peter Lieven (6):
  oslib-posix: add helpers for stack alloc and free
  coroutine: add a macro for the coroutine stack size
  coroutine-ucontext: use helper for allocating stack memory
  coroutine-sigaltstack: use helper for allocating stack memory
  oslib-posix: add a configure switch to debug stack usage
  coroutine: reduce stack size to 60kB

 configure| 19 +++
 include/qemu/coroutine_int.h |  2 ++
 include/sysemu/os-posix.h| 27 +++
 util/coroutine-sigaltstack.c |  9 ++---
 util/coroutine-ucontext.c| 11 +++---
 util/coroutine-win32.c   |  2 +-
 util/oslib-posix.c   | 81 
 7 files changed, 141 insertions(+), 10 deletions(-)

-- 
1.9.1
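
The v7->v8 sizing notes above can be checked with a little arithmetic. The sketch below models the rule as described (clamp to a minimum, round up to the page size, add one guard page); demo_round_up() and demo_mapping_size() are hypothetical names and the minimum stack size passed in is an assumed value, not one queried from the system.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align > 0). */
static size_t demo_round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Hypothetical model of the sizing rule. Stores the usable stack size in
 * *usable and returns the total size of the mapping including the guard
 * page. */
static size_t demo_mapping_size(size_t requested, size_t min_stack,
                                size_t pagesz, size_t *usable)
{
    assert(pagesz > 0);
    size_t sz = requested < min_stack ? min_stack : requested;
    sz = demo_round_up(sz, pagesz);
    *usable = sz;
    return sz + pagesz;
}
```

For a 60kB request this gives 60kB usable and a 64kB mapping with 4kB pages, but 64kB usable and a 128kB mapping with 64kB pages.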

[Qemu-devel] [PATCH V8 4/6] coroutine-sigaltstack: use helper for allocating stack memory

2016-09-26 Thread Peter Lieven

Signed-off-by: Peter Lieven 
---
 util/coroutine-sigaltstack.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index 9c2854c..d9c7f66 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -33,6 +33,7 @@
 typedef struct {
 Coroutine base;
 void *stack;
+size_t stack_size;
 sigjmp_buf env;
 } CoroutineUContext;
 
@@ -143,7 +144,6 @@ static void coroutine_trampoline(int signal)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineUContext *co;
 CoroutineThreadState *coTS;
 struct sigaction sa;
@@ -164,7 +164,8 @@ Coroutine *qemu_coroutine_new(void)
  */
 
 co = g_malloc0(sizeof(*co));
-co->stack = g_malloc(stack_size);
+co->stack_size = COROUTINE_STACK_SIZE;
+co->stack = qemu_alloc_stack(&co->stack_size);
 co->base.entry_arg = &old_env; /* stash away our jmp_buf */
 
 coTS = coroutine_get_thread_state();
@@ -189,7 +190,7 @@ Coroutine *qemu_coroutine_new(void)
  * Set the new stack.
  */
 ss.ss_sp = co->stack;
-ss.ss_size = stack_size;
+ss.ss_size = co->stack_size;
 ss.ss_flags = 0;
 if (sigaltstack(&ss, &oss) < 0) {
 abort();
@@ -253,7 +254,7 @@ void qemu_coroutine_delete(Coroutine *co_)
 {
 CoroutineUContext *co = DO_UPCAST(CoroutineUContext, base, co_);
 
-g_free(co->stack);
+qemu_free_stack(co->stack, co->stack_size);
 g_free(co);
 }
 
-- 
1.9.1

[Qemu-devel] [PATCH 05/18] target-riscv: Add Arithmetic instructions

2016-09-26 Thread Sagar Karandikar

Arithmetic Instructions
Arithmetic Immediate Instructions
MULHSU Helper
GPR Helpers necessary for above

Signed-off-by: Sagar Karandikar 
---
 target-riscv/helper.h|   4 +
 target-riscv/op_helper.c |  10 ++
 target-riscv/translate.c | 338 +++
 3 files changed, 352 insertions(+)

diff --git a/target-riscv/helper.h b/target-riscv/helper.h
index 0461118..c489222 100644
--- a/target-riscv/helper.h
+++ b/target-riscv/helper.h
@@ -2,3 +2,7 @@
 DEF_HELPER_2(raise_exception, noreturn, env, i32)
 DEF_HELPER_1(raise_exception_debug, noreturn, env)
 DEF_HELPER_3(raise_exception_mbadaddr, noreturn, env, i32, tl)
+
+#if defined(TARGET_RISCV64)
+DEF_HELPER_FLAGS_3(mulhsu, TCG_CALL_NO_RWG_SE, tl, env, tl, tl)
+#endif
diff --git a/target-riscv/op_helper.c b/target-riscv/op_helper.c
index fd1ef3c..1a7fb18 100644
--- a/target-riscv/op_helper.c
+++ b/target-riscv/op_helper.c
@@ -50,6 +50,16 @@ void helper_raise_exception_mbadaddr(CPURISCVState *env, uint32_t exception,
 do_raise_exception_err(env, exception, 0);
 }
 
+#if defined(TARGET_RISCV64)
+target_ulong helper_mulhsu(CPURISCVState *env, target_ulong arg1,
+  target_ulong arg2)
+{
+int64_t a = arg1;
+uint64_t b = arg2;
+return (int64_t)((__int128_t)a * b >> 64);
+}
+#endif
+
 #ifndef CONFIG_USER_ONLY
 void riscv_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
MMUAccessType access_type, int mmu_idx,
diff --git a/target-riscv/translate.c b/target-riscv/translate.c
index 55f20ee..ccfb795 100644
--- a/target-riscv/translate.c
+++ b/target-riscv/translate.c
@@ -124,10 +124,327 @@ static inline void gen_goto_tb(DisasContext *ctx, int n, target_ulong dest)
 }
 }
 
+/* Wrapper for getting reg values - need to check of reg is zero since
+ * cpu_gpr[0] is not actually allocated
+ */
+static inline void gen_get_gpr(TCGv t, int reg_num)
+{
+if (reg_num == 0) {
+tcg_gen_movi_tl(t, 0);
+} else {
+tcg_gen_mov_tl(t, cpu_gpr[reg_num]);
+}
+}
+
+/* Wrapper for setting reg values - need to check of reg is zero since
+ * cpu_gpr[0] is not actually allocated. this is more for safety purposes,
+ * since we usually avoid calling the OP_TYPE_gen function if we see a write to
+ * $zero
+ */
+static inline void gen_set_gpr(int reg_num_dst, TCGv t)
+{
+if (reg_num_dst != 0) {
+tcg_gen_mov_tl(cpu_gpr[reg_num_dst], t);
+}
+}
+
+static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
+{
+#if defined(TARGET_RISCV64)
+gen_helper_mulhsu(ret, cpu_env, arg1, arg2);
+#else
+TCGv_i64 t0 = tcg_temp_new_i64();
+TCGv_i64 t1 = tcg_temp_new_i64();
+
+tcg_gen_ext_i32_i64(t0, arg1);
+tcg_gen_extu_i32_i64(t1, arg2);
+tcg_gen_mul_i64(t0, t0, t1);
+
+tcg_gen_shri_i64(t0, t0, 32);
+tcg_gen_extrl_i64_i32(ret, t0);
+
+tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
+#endif
+}
+
+static inline void gen_arith(DisasContext *ctx, uint32_t opc, int rd, int rs1,
+int rs2)
+{
+TCGv source1, source2, cond1, cond2, zeroreg, resultopt1;
+cond1 = tcg_temp_new();
+cond2 = tcg_temp_new();
+source1 = tcg_temp_new();
+source2 = tcg_temp_new();
+zeroreg = tcg_temp_new();
+resultopt1 = tcg_temp_new();
+gen_get_gpr(source1, rs1);
+gen_get_gpr(source2, rs2);
+tcg_gen_movi_tl(zeroreg, 0); /* hardcoded zero for compare in DIV, etc */
+
+switch (opc) {
+#if defined(TARGET_RISCV64)
+case OPC_RISC_ADDW:
+#endif
+case OPC_RISC_ADD:
+tcg_gen_add_tl(source1, source1, source2);
+break;
+#if defined(TARGET_RISCV64)
+case OPC_RISC_SUBW:
+#endif
+case OPC_RISC_SUB:
+tcg_gen_sub_tl(source1, source1, source2);
+break;
+#if defined(TARGET_RISCV64)
+case OPC_RISC_SLLW:
+tcg_gen_andi_tl(source2, source2, 0x1F);
+/* fall through to SLL */
+#endif
+case OPC_RISC_SLL:
+tcg_gen_andi_tl(source2, source2, TARGET_LONG_BITS - 1);
+tcg_gen_shl_tl(source1, source1, source2);
+break;
+case OPC_RISC_SLT:
+tcg_gen_setcond_tl(TCG_COND_LT, source1, source1, source2);
+break;
+case OPC_RISC_SLTU:
+tcg_gen_setcond_tl(TCG_COND_LTU, source1, source1, source2);
+break;
+case OPC_RISC_XOR:
+tcg_gen_xor_tl(source1, source1, source2);
+break;
+#if defined(TARGET_RISCV64)
+case OPC_RISC_SRLW:
+/* clear upper 32 */
+tcg_gen_andi_tl(source1, source1, 0xFFFFFFFFLL);
+tcg_gen_andi_tl(source2, source2, 0x1F);
+/* fall through to SRL */
+#endif
+case OPC_RISC_SRL:
+tcg_gen_andi_tl(source2, source2, TARGET_LONG_BITS - 1);
+tcg_gen_shr_tl(source1, source1, source2);
+break;
+#if defined(TARGET_RISCV64)
+case OPC_RISC_SRAW:
+/* first, trick to get it to act like working on 32 bits (get rid of
+upper 32, sign extend to fill space) */
+
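
MULHSU returns the upper half of a signed-by-unsigned product. The 32-bit path in gen_mulhsu() above (sign-extend one operand, zero-extend the other, multiply in 64 bits, shift right by 32) can be mirrored in plain C as a sketch; demo_mulhsu32() is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/* High 32 bits of (signed a) * (unsigned b).
 * |a| <= 2^31 and b < 2^32, so the product always fits in an int64_t. */
static uint32_t demo_mulhsu32(int32_t a, uint32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;
    /* Shift the bit pattern as unsigned to avoid relying on how the
     * compiler shifts negative values. */
    return (uint32_t)((uint64_t)prod >> 32);
}
```

For example, -1 times 0xFFFFFFFF is 0xFFFFFFFF00000001 as a 64-bit pattern, so the high word is 0xFFFFFFFF.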

[Qemu-devel] [PATCH V8 2/6] coroutine: add a macro for the coroutine stack size

2016-09-26 Thread Peter Lieven

Signed-off-by: Peter Lieven 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Richard Henderson 
---
 include/qemu/coroutine_int.h | 2 ++
 util/coroutine-sigaltstack.c | 2 +-
 util/coroutine-ucontext.c| 2 +-
 util/coroutine-win32.c   | 2 +-
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index 6df9d33..14d4f1d 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -28,6 +28,8 @@
 #include "qemu/queue.h"
 #include "qemu/coroutine.h"
 
+#define COROUTINE_STACK_SIZE (1 << 20)
+
 typedef enum {
 COROUTINE_YIELD = 1,
 COROUTINE_TERMINATE = 2,
diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index a7c3366..9c2854c 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -143,7 +143,7 @@ static void coroutine_trampoline(int signal)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = 1 << 20;
+const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineUContext *co;
 CoroutineThreadState *coTS;
 struct sigaction sa;
diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 2bb7e10..31254ab 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -82,7 +82,7 @@ static void coroutine_trampoline(int i0, int i1)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = 1 << 20;
+const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineUContext *co;
 ucontext_t old_uc, uc;
 sigjmp_buf old_env;
diff --git a/util/coroutine-win32.c b/util/coroutine-win32.c
index 02e28e8..de6bd4f 100644
--- a/util/coroutine-win32.c
+++ b/util/coroutine-win32.c
@@ -71,7 +71,7 @@ static void CALLBACK coroutine_trampoline(void *co_)
 
 Coroutine *qemu_coroutine_new(void)
 {
-const size_t stack_size = 1 << 20;
+const size_t stack_size = COROUTINE_STACK_SIZE;
 CoroutineWin32 *co;
 
 co = g_malloc0(sizeof(*co));
-- 
1.9.1

[Qemu-devel] [PATCH V8 5/6] oslib-posix: add a configure switch to debug stack usage

2016-09-26 Thread Peter Lieven

this adds a knob to track the maximum stack usage of stacks
created by qemu_alloc_stack.

Signed-off-by: Peter Lieven 
---
 configure  | 19 +++
 util/oslib-posix.c | 40 +++-
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 8fa62ad..93ef00a 100755
--- a/configure
+++ b/configure
@@ -296,6 +296,7 @@ libiscsi=""
 libnfs=""
 coroutine=""
 coroutine_pool=""
+debug_stack_usage="no"
 seccomp=""
 glusterfs=""
 glusterfs_xlator_opt="no"
@@ -1004,6 +1005,8 @@ for opt do
   ;;
   --enable-coroutine-pool) coroutine_pool="yes"
   ;;
+  --enable-debug-stack-usage) debug_stack_usage="yes"
+  ;;
   --disable-docs) docs="no"
   ;;
   --enable-docs) docs="yes"
@@ -4276,6 +4279,17 @@ if test "$coroutine" = "gthread" -a "$coroutine_pool" = "yes"; then
   error_exit "'gthread' coroutine backend does not support pool (use --disable-coroutine-pool)"
 fi
 
+if test "$debug_stack_usage" = "yes"; then
+  if test "$cpu" = "ia64" -o "$cpu" = "hppa"; then
+error_exit "stack usage debugging is not supported for $cpu"
+  fi
+  if test "$coroutine_pool" = "yes"; then
+echo "WARN: disabling coroutine pool for stack usage debugging"
+coroutine_pool=no
+  fi
+fi
+
+
 ##
 # check if we have open_by_handle_at
 
@@ -4861,6 +4875,7 @@ echo "QGA MSI support   $guest_agent_msi"
 echo "seccomp support   $seccomp"
 echo "coroutine backend $coroutine"
 echo "coroutine pool$coroutine_pool"
+echo "debug stack usage $debug_stack_usage"
 echo "GlusterFS support $glusterfs"
 echo "Archipelago support $archipelago"
 echo "gcov  $gcov_tool"
@@ -5330,6 +5345,10 @@ else
   echo "CONFIG_COROUTINE_POOL=0" >> $config_host_mak
 fi
 
+if test "$debug_stack_usage" = "yes" ; then
+  echo "CONFIG_DEBUG_STACK_USAGE=y" >> $config_host_mak
+fi
+
 if test "$open_by_handle_at" = "yes" ; then
   echo "CONFIG_OPEN_BY_HANDLE=y" >> $config_host_mak
 fi
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 7d053b8..18940d9 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -50,6 +50,10 @@
 
 #include "qemu/mmap-alloc.h"
 
+#ifdef CONFIG_DEBUG_STACK_USAGE
+#include "qemu/error-report.h"
+#endif
+
 int qemu_get_thread_id(void)
 {
 #if defined(__linux__)
@@ -503,6 +507,9 @@ pid_t qemu_fork(Error **errp)
 void *qemu_alloc_stack(size_t *sz)
 {
 void *ptr, *guardpage;
+#ifdef CONFIG_DEBUG_STACK_USAGE
+void *ptr2;
+#endif
 size_t pagesz = getpagesize();
 size_t allocsz;
 #ifdef _SC_THREAD_STACK_MIN
@@ -535,10 +542,41 @@ void *qemu_alloc_stack(size_t *sz)
 abort();
 }
 
+#ifdef CONFIG_DEBUG_STACK_USAGE
+for (ptr2 = ptr + pagesz; ptr2 < ptr + allocsz; ptr2 += sizeof(uint32_t)) {
+*(uint32_t *)ptr2 = 0xdeadbeaf;
+}
+#endif
+
 return ptr;
 }
 
+#ifdef CONFIG_DEBUG_STACK_USAGE
+static __thread unsigned int max_stack_usage;
+#endif
+
 void qemu_free_stack(void *stack, size_t sz)
 {
-munmap(stack, sz + getpagesize());
+#ifdef CONFIG_DEBUG_STACK_USAGE
+unsigned int usage;
+void *ptr;
+#endif
+size_t pagesz = getpagesize();
+size_t allocsz = sz + pagesz;
+
+#ifdef CONFIG_DEBUG_STACK_USAGE
+for (ptr = stack + pagesz; ptr < stack + allocsz; ptr += sizeof(uint32_t)) {
+if (*(uint32_t *)ptr != 0xdeadbeaf) {
+break;
+}
+}
+usage = sz - (uintptr_t) (ptr - stack);
+if (usage > max_stack_usage) {
+error_report("thread %d max stack usage increased from %u to %u",
+ qemu_get_thread_id(), max_stack_usage, usage);
+max_stack_usage = usage;
+}
+#endif
+
+munmap(stack, allocsz);
 }
-- 
1.9.1

Re: [Qemu-devel] [PATCH v2 2/5] vmstateify rc4030

2016-09-26 Thread Dr. David Alan Gilbert

Hi Aurelien, Yongbok,
  Can you pick up this one patch from my vmstate series, which is for
MIPS? Peter said he'd prefer it to go through your MIPS trees.

Thanks,

Dave

* Dr. David Alan Gilbert (git) (dgilb...@redhat.com) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> rc4030 seems to be part of a MIPS chipset; this converts it to
> VMState.
> 
>   Note:
> a) It builds but I've not found a way to boot a MIPS Jazz image to
> test it.
> b) It was saving 0..<15 on the 16 entry rem_speed array; I've not
> got a clue what that array is but I'm now saving the whole 16 entries
> rather than 15.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> Reviewed-by: Hervé Poussineau 
> Tested-by: Hervé Poussineau 
> ---
>  hw/dma/rc4030.c | 81 
> +++--
>  1 file changed, 27 insertions(+), 54 deletions(-)
> 
> diff --git a/hw/dma/rc4030.c b/hw/dma/rc4030.c
> index 2f2576f..17c8518 100644
> --- a/hw/dma/rc4030.c
> +++ b/hw/dma/rc4030.c
> @@ -616,34 +616,9 @@ static void rc4030_reset(DeviceState *dev)
>  qemu_irq_lower(s->jazz_bus_irq);
>  }
>  
> -static int rc4030_load(QEMUFile *f, void *opaque, int version_id)
> +static int rc4030_post_load(void *opaque, int version_id)
>  {
>  rc4030State* s = opaque;
> -int i, j;
> -
> -if (version_id != 2)
> -return -EINVAL;
> -
> -s->config = qemu_get_be32(f);
> -s->invalid_address_register = qemu_get_be32(f);
> -for (i = 0; i < 8; i++)
> -for (j = 0; j < 4; j++)
> -s->dma_regs[i][j] = qemu_get_be32(f);
> -s->dma_tl_base = qemu_get_be32(f);
> -s->dma_tl_limit = qemu_get_be32(f);
> -s->cache_maint = qemu_get_be32(f);
> -s->remote_failed_address = qemu_get_be32(f);
> -s->memory_failed_address = qemu_get_be32(f);
> -s->cache_ptag = qemu_get_be32(f);
> -s->cache_ltag = qemu_get_be32(f);
> -s->cache_bmask = qemu_get_be32(f);
> -s->memory_refresh_rate = qemu_get_be32(f);
> -s->nvram_protect = qemu_get_be32(f);
> -for (i = 0; i < 15; i++)
> -s->rem_speed[i] = qemu_get_be32(f);
> -s->imr_jazz = qemu_get_be32(f);
> -s->isr_jazz = qemu_get_be32(f);
> -s->itr = qemu_get_be32(f);
>  
>  set_next_tick(s);
>  update_jazz_irq(s);
> @@ -651,32 +626,31 @@ static int rc4030_load(QEMUFile *f, void *opaque, int version_id)
>  return 0;
>  }
>  
> -static void rc4030_save(QEMUFile *f, void *opaque)
> -{
> -rc4030State* s = opaque;
> -int i, j;
> -
> -qemu_put_be32(f, s->config);
> -qemu_put_be32(f, s->invalid_address_register);
> -for (i = 0; i < 8; i++)
> -for (j = 0; j < 4; j++)
> -qemu_put_be32(f, s->dma_regs[i][j]);
> -qemu_put_be32(f, s->dma_tl_base);
> -qemu_put_be32(f, s->dma_tl_limit);
> -qemu_put_be32(f, s->cache_maint);
> -qemu_put_be32(f, s->remote_failed_address);
> -qemu_put_be32(f, s->memory_failed_address);
> -qemu_put_be32(f, s->cache_ptag);
> -qemu_put_be32(f, s->cache_ltag);
> -qemu_put_be32(f, s->cache_bmask);
> -qemu_put_be32(f, s->memory_refresh_rate);
> -qemu_put_be32(f, s->nvram_protect);
> -for (i = 0; i < 15; i++)
> -qemu_put_be32(f, s->rem_speed[i]);
> -qemu_put_be32(f, s->imr_jazz);
> -qemu_put_be32(f, s->isr_jazz);
> -qemu_put_be32(f, s->itr);
> -}
> +static const VMStateDescription vmstate_rc4030 = {
> +.name = "rc4030",
> +.version_id = 3,
> +.post_load = rc4030_post_load,
> +.fields = (VMStateField []) {
> +VMSTATE_UINT32(config, rc4030State),
> +VMSTATE_UINT32(invalid_address_register, rc4030State),
> +VMSTATE_UINT32_2DARRAY(dma_regs, rc4030State, 8, 4),
> +VMSTATE_UINT32(dma_tl_base, rc4030State),
> +VMSTATE_UINT32(dma_tl_limit, rc4030State),
> +VMSTATE_UINT32(cache_maint, rc4030State),
> +VMSTATE_UINT32(remote_failed_address, rc4030State),
> +VMSTATE_UINT32(memory_failed_address, rc4030State),
> +VMSTATE_UINT32(cache_ptag, rc4030State),
> +VMSTATE_UINT32(cache_ltag, rc4030State),
> +VMSTATE_UINT32(cache_bmask, rc4030State),
> +VMSTATE_UINT32(memory_refresh_rate, rc4030State),
> +VMSTATE_UINT32(nvram_protect, rc4030State),
> +VMSTATE_UINT32_ARRAY(rem_speed, rc4030State, 16),
> +VMSTATE_UINT32(imr_jazz, rc4030State),
> +VMSTATE_UINT32(isr_jazz, rc4030State),
> +VMSTATE_UINT32(itr, rc4030State),
> +VMSTATE_END_OF_LIST()
> +}
> +};
>  
>  static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_write)
>  {
> @@ -753,8 +727,6 @@ static void rc4030_initfn(Object *obj)
>  sysbus_init_irq(sysbus, &s->timer_irq);
>  sysbus_init_irq(sysbus, &s->jazz_bus_irq);
>  
> -register_savevm(NULL, "rc4030", 0, 2, rc4030_save, rc4030_load, s);
> -
>  sysbus_init_mmio(sysbus, &s->iomem_chipset);
>

Re: [Qemu-devel] vhost-user-test failure

2016-09-26 Thread Eduardo Habkost

On Sun, Sep 25, 2016 at 04:55:53PM -0400, Marc-André Lureau wrote:
> Hi
> 
> - Original Message -
> > This time with Marc-André in cc:...
> > 
> > On 09/23/2016 07:40 PM, Maxime Coquelin wrote:
> > >
> > >
> > > On 09/23/2016 05:41 PM, Michael S. Tsirkin wrote:
> > >> On Fri, Sep 23, 2016 at 12:36:12PM -0300, Eduardo Habkost wrote:
> > >>> Hi,
> > >>>
> > >>> I hit a weird vhost-user-test failure on travis-ci recently, on a
> > >>> branch where I didn't touch any vhost-related code. From a quick
> > >>> look at the code, it looks like the vhost-user code is unhappy to
> > >>> see a disconnected socket.
> > >>>
> > >>> I wasn't able to reproduce it. It seems to be a hard to reproduce
> > >>> race between vhost-user code and socket reconnection.
> > >>>
> > >>> The failure can be seen at:
> > >>>
> > >>> https://travis-ci.org/ehabkost/qemu-hacks/jobs/162077239
> > >>
> > >> Maxime looked at something similar. Any idea?
> > > No, not really.
> > > Marc-André contributed a lot to these tests, I add him in cc: in case
> > > he has an idea.
> > >
> > > I will have a look in the mean time.
> > >
> 
> I am unable to reproduce locally (over 500x iterations), and I
> have no clue what's going on: the warnings there aren't the
> problem (that's the main reason why we use the subprocess, to
> silence those). Do you have a local reproducer or is it only on
> travis? Afaik, there are no other reports of this test failing,
> are you sure it's not related to changes on your branch?

I don't have a local reproducer, I could only see it once on
travis-ci. Maybe it is not possible to reproduce it if the
machine isn't loaded enough to make the right thread/process be
delayed.

I am pretty sure it's not related to my changes. Below is the
diffstat between master and the commit that was being tested. All
the changes were limited to x86 CPUID code (which shouldn't
affect qtest code at all).

$ git diff --stat master...8de32e0
 include/hw/i386/pc.h  |   7 +-
 include/sysemu/cpus.h |   5 +-
 target-i386/cpu.c | 567 +++-
 target-i386/cpu.h |  15 +++-
 target-ppc/translate_init.c   |   3 +-
 tests/Makefile.include|   2 +
 tests/test-x86-cpuid-compat.c | 171 
 7 files changed, 516 insertions(+), 254 deletions(-)


-- 
Eduardo

Re: [Qemu-devel] [PATCH 00/18] target-riscv: Add full-system emulation support for the RISC-V Instruction Set Architecture (RV64G, RV32G)

2016-09-26 Thread Paolo Bonzini



On 26/09/2016 12:56, Sagar Karandikar wrote:
> -cpu-qom.h merged into cpu.h

Please follow the model of other targets.  RISCVCPUClass and the
RISCVCPU typedef should be in cpu-qom.h.

Paolo

Re: [Qemu-devel] [PATCH 12/18] target-riscv: Add system instructions

2016-09-26 Thread Paolo Bonzini



On 26/09/2016 12:56, Sagar Karandikar wrote:
> +#ifndef CONFIG_USER_ONLY
> +DEF_HELPER_4(csrrw, tl, env, tl, tl, tl)
> +DEF_HELPER_5(csrrs, tl, env, tl, tl, tl, tl)
> +DEF_HELPER_5(csrrc, tl, env, tl, tl, tl, tl)
> +DEF_HELPER_2(sret, tl, env, tl)
> +DEF_HELPER_2(mret, tl, env, tl)
> +DEF_HELPER_1(tlb_flush, void, env)
> +DEF_HELPER_1(fence_i, void, env)
> +#endif /* !CONFIG_USER_ONLY */

The system emulation spec is still in flux, I think we should only add
user-mode emulation for now.

Paolo

Re: [Qemu-devel] [PATCH 12/18] target-riscv: Add system instructions

2016-09-26 Thread Bastian Koppelmann

On 09/26/2016 02:21 PM, Paolo Bonzini wrote:
> 
> 
> On 26/09/2016 12:56, Sagar Karandikar wrote:
>> +#ifndef CONFIG_USER_ONLY
>> +DEF_HELPER_4(csrrw, tl, env, tl, tl, tl)
>> +DEF_HELPER_5(csrrs, tl, env, tl, tl, tl, tl)
>> +DEF_HELPER_5(csrrc, tl, env, tl, tl, tl, tl)
>> +DEF_HELPER_2(sret, tl, env, tl)
>> +DEF_HELPER_2(mret, tl, env, tl)
>> +DEF_HELPER_1(tlb_flush, void, env)
>> +DEF_HELPER_1(fence_i, void, env)
>> +#endif /* !CONFIG_USER_ONLY */
> 
> The system emulation spec is still in flux, I think we should only add
> user-mode emulation for now.
> 

Hi Paolo,

by user-mode emulation you still mean softmmu and not linux-user, right?
So just drop the system instructions for now.

Cheers,
Bastian

[Qemu-devel] [PATCH v5] MC146818 RTC: coordinate guest clock base to destination host after migration

2016-09-26 Thread Junlian Bell

QEMU tracks guest time with the vector [base_rtc, last_update], where
last_update is a monotonic tick that is in practice the uptime of the
host.

According to the RTC implementation in recent releases and upstream,
the time base vector [base_rtc, last_update] is not updated after
migration to match the destination host, i.e. QEMU doesn't set
last_update to the uptime of the destination host.

What problem does this bug cause? After migration, guest time may jump
back to several days ago, which makes some critical business
applications, such as Lotus Notes, malfunction.

This patch tries to fix the problem. First, while vmsave is in
progress, we call rtc_update_time() to refresh the time stamp in the
cmos array. Then, during vmrestore, we call rtc_set_time() to update
QEMU's base_rtc and last_update variables according to the time stamp
in the cmos array.
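
To make the failure mode concrete, here is a minimal sketch of the time
base arithmetic described above. The struct, the function names and the
numbers are hypothetical and are not taken from hw/timer/mc146818rtc.c;
the sketch only assumes that guest time is computed as base_rtc plus the
monotonic ticks elapsed since last_update.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the [base_rtc, last_update] vector. */
typedef struct {
    int64_t base_rtc;     /* guest time (seconds) when last_update was taken */
    int64_t last_update;  /* host monotonic clock (uptime) at that moment */
} demo_rtc_base;

/* Guest time = base + ticks elapsed on the host since the base was taken. */
int64_t demo_guest_time(const demo_rtc_base *b, int64_t host_uptime)
{
    assert(b != 0);
    return b->base_rtc + (host_uptime - b->last_update);
}

/* Re-anchor the vector on the current host: keep the guest time,
 * replace the tick with the uptime of the host we are running on now. */
void demo_rebase(demo_rtc_base *b, int64_t guest_time, int64_t host_uptime)
{
    assert(b != 0);
    b->base_rtc = guest_time;
    b->last_update = host_uptime;
}
```

If the vector is carried over unchanged to a destination host with a
smaller uptime, the elapsed-tick term goes negative and guest time jumps
backwards; re-anchoring with the saved guest time avoids that.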

Signed-off-by: Junlian Bell 
---
 hw/timer/mc146818rtc.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index ea625f2..4e4af43 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -717,11 +717,19 @@ static void rtc_set_date_from_host(ISADevice *dev)
 rtc_set_cmos(s, );
 }
 
+static void rtc_pre_save(void *opaque)
+{
+RTCState *s = opaque;
+
+rtc_update_time(s);
+}
+
 static int rtc_post_load(void *opaque, int version_id)
 {
 RTCState *s = opaque;
 
-if (version_id <= 2) {
+if (version_id <= 2 ||
+rtc_clock == QEMU_CLOCK_REALTIME){
 rtc_set_time(s);
 s->offset = 0;
 check_update_timer(s);
@@ -764,6 +772,7 @@ static const VMStateDescription vmstate_rtc = {
 .name = "mc146818rtc",
 .version_id = 3,
 .minimum_version_id = 1,
+.pre_save = rtc_pre_save,
 .post_load = rtc_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_BUFFER(cmos_data, RTCState),
-- 
2.9.0.windows.1

Re: [Qemu-devel] KVM call for agenda for 2016-08-27

2016-09-26 Thread Juan Quintela

Juan Quintela  wrote:
> Hi
>

Kind reminder that the call is tomorrow and there aren't any topics yet.

Thanks, Juan.


> Please, send any topic that you are interested in covering.
>
> At the end of Monday I will send an email with the agenda or the
> cancellation of the call, so hurry up.
>
> Call details:
>
> Every two Tuesdays.  Look at the google calendar entry to see the
> correct time and date.
>
>   
> https://www.google.com/calendar/embed?src=dG9iMXRqcXAzN3Y4ZXZwNzRoMHE4a3BqcXNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
>
> (Let me know if you have any problems with the calendar entry.  I just
> gave up about getting right at the same time CEST, CET, EDT and DST).
>
> If you need phone number details,  contact me privately
>
> Thanks, Juan.

[Qemu-devel] [PULL 01/28] memory: introduce IOMMUNotifier and its caps

2016-09-26 Thread Paolo Bonzini

From: Peter Xu 

IOMMU Notifier list is used for notifying IO address mapping changes.
Currently VFIO is the only user.

However, it is possible that future consumers like vhost would like to
listen to only part of its notifications (e.g., cache invalidations).

This patch introduces IOMMUNotifier and IOMMUNotifierFlag bits for
finer-grained control of it.

IOMMUNotifier contains a bitfield for the notify consumer describing
what kind of notification it is interested in. Currently two kinds of
notifications are defined:

- IOMMU_NOTIFIER_MAP:    for newly mapped entries (additions)
- IOMMU_NOTIFIER_UNMAP:  for entries to be removed (cache invalidates)

When registering the IOMMU notifier, we need to specify one or multiple
types of messages to listen to.

When notifications are triggered, its type will be checked against the
notifier's type bits, and only notifiers with registered bits will be
notified.

(For any IOMMU implementation, an in-place mapping change should be
 notified with an UNMAP followed by a MAP.)
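
The filtering rule above can be sketched as follows. This is only an
illustration under assumptions: the demo_* names are hypothetical and
this is not the code in memory.c. It assumes that an event with no
permission bits set is an UNMAP and anything else is a MAP, and that a
notifier is called only when its registered bits include the event type.

```c
#include <assert.h>
#include <stddef.h>

enum {
    DEMO_NOTIFIER_NONE  = 0,
    DEMO_NOTIFIER_UNMAP = 0x1,
    DEMO_NOTIFIER_MAP   = 0x2,
    DEMO_NOTIFIER_ALL   = DEMO_NOTIFIER_MAP | DEMO_NOTIFIER_UNMAP
};

typedef struct demo_notifier {
    unsigned flags;               /* kinds of events this consumer wants */
    int calls;                    /* how many events were delivered to it */
    struct demo_notifier *next;
} demo_notifier;

/* perm == 0 stands for an UNMAP event, anything else for a MAP event.
 * Returns the number of notifiers the event was delivered to. */
int demo_notify(demo_notifier *head, unsigned perm)
{
    unsigned request = perm ? DEMO_NOTIFIER_MAP : DEMO_NOTIFIER_UNMAP;
    demo_notifier *n;
    int delivered = 0;

    for (n = head; n != NULL; n = n->next) {
        assert((n->flags & ~(unsigned)DEMO_NOTIFIER_ALL) == 0);
        if (n->flags & request) {
            n->calls++;
            delivered++;
        }
    }
    return delivered;
}
```

A consumer that registers only DEMO_NOTIFIER_UNMAP (the vhost-like case)
never sees MAP events, while one registered with DEMO_NOTIFIER_ALL (the
VFIO-like case) sees both.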

Signed-off-by: Peter Xu 
Message-Id: <1474606948-14391-2-git-send-email-pet...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/vfio/common.c  |  4 ++--
 include/exec/memory.h | 47 ---
 include/hw/vfio/vfio-common.h |  2 +-
 memory.c  | 37 +-
 4 files changed, 71 insertions(+), 19 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index b313e7c..29188a1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -293,11 +293,10 @@ static bool vfio_listener_skipped_section(MemoryRegionSection *section)
section->offset_within_address_space & (1ULL << 63);
 }
 
-static void vfio_iommu_map_notify(Notifier *n, void *data)
+static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
 {
 VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
 VFIOContainer *container = giommu->container;
-IOMMUTLBEntry *iotlb = data;
 hwaddr iova = iotlb->iova + giommu->iommu_offset;
 MemoryRegion *mr;
 hwaddr xlat;
@@ -454,6 +453,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
section->offset_within_region;
 giommu->container = container;
 giommu->n.notify = vfio_iommu_map_notify;
+giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
 QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
 memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 3e4d416..14cda67 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -67,6 +67,27 @@ struct IOMMUTLBEntry {
 IOMMUAccessFlags perm;
 };
 
+/*
+ * Bitmap for different IOMMUNotifier capabilities. Each notifier can
+ * register with one or multiple IOMMU Notifier capability bit(s).
+ */
+typedef enum {
+IOMMU_NOTIFIER_NONE = 0,
+/* Notify cache invalidations */
+IOMMU_NOTIFIER_UNMAP = 0x1,
+/* Notify entry changes (newly created entries) */
+IOMMU_NOTIFIER_MAP = 0x2,
+} IOMMUNotifierFlag;
+
+#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
+
+struct IOMMUNotifier {
+void (*notify)(struct IOMMUNotifier *notifier, IOMMUTLBEntry *data);
+IOMMUNotifierFlag notifier_flags;
+QLIST_ENTRY(IOMMUNotifier) node;
+};
+typedef struct IOMMUNotifier IOMMUNotifier;
+
 /* New-style MMIO accessors can indicate that the transaction failed.
  * A zero (MEMTX_OK) response means success; anything else is a failure
  * of some kind. The memory subsystem will bitwise-OR together results
@@ -201,7 +222,7 @@ struct MemoryRegion {
 const char *name;
 unsigned ioeventfd_nb;
 MemoryRegionIoeventfd *ioeventfds;
-NotifierList iommu_notify;
+QLIST_HEAD(, IOMMUNotifier) iommu_notify;
 };
 
 /**
@@ -607,6 +628,15 @@ uint64_t memory_region_iommu_get_min_page_size(MemoryRegion *mr);
 /**
  * memory_region_notify_iommu: notify a change in an IOMMU translation entry.
  *
+ * The notification type will be decided by entry.perm bits:
+ *
+ * - For UNMAP (cache invalidation) notifies: set entry.perm to IOMMU_NONE.
+ * - For MAP (newly added entry) notifies: set entry.perm to the
+ *   permission of the page (which is definitely !IOMMU_NONE).
+ *
+ * Note: for any IOMMU implementation, an in-place mapping change
+ * should be notified with an UNMAP followed by a MAP.
+ *
  * @mr: the memory region that was changed
  * @entry: the new entry in the IOMMU translation table.  The entry
  * replaces all old entries for the same virtual I/O address range.
@@ -620,11 +650,12 @@ void memory_region_notify_iommu(MemoryRegion *mr,
  * IOMMU translation entries.
  *
  * @mr: the memory region to observe
- * @n: the notifier to be added; the notifier receives a pointer to an
- * #IOMMUTLBEntry as the opaque value; the pointer ceases to be
- *

[Qemu-devel] [PULL 03/28] intel_iommu: allow UNMAP notifiers

2016-09-26 Thread Paolo Bonzini

From: Peter Xu 

Intel vIOMMU still lacks a complete IOMMU notifier mechanism.
Until that is in place, let's open a door for vhost DMAR support, which
only requires cache invalidations (UNMAP operations).

Meanwhile, convert hw_error() to error_report() and exit(1), to make
the error messages clean and obvious (so no CPU registers will be
dumped).
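
As a rough illustration of the policy described above (accept UNMAP-only
notifiers, reject anything that asks for MAP), here is a small sketch.
The names are hypothetical, and unlike the patch it returns an error
code instead of calling error_report() and exit(1).

```c
#include <assert.h>

enum { DEMO_VTD_UNMAP = 0x1, DEMO_VTD_MAP = 0x2 };

/* Returns 0 when the requested notifier flags can be served,
 * -1 when MAP notifications are requested (not supported here). */
int demo_vtd_check_flags(unsigned new_flags)
{
    /* Only the two known bits may be set. */
    assert((new_flags & ~(unsigned)(DEMO_VTD_UNMAP | DEMO_VTD_MAP)) == 0);
    return (new_flags & DEMO_VTD_MAP) ? -1 : 0;
}
```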

Reviewed-by: David Gibson 
Signed-off-by: Peter Xu 
Message-Id: <1474606948-14391-4-git-send-email-pet...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/i386/intel_iommu.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9d49be7..e4c3681 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1980,10 +1980,14 @@ static void vtd_iommu_notify_flag_changed(MemoryRegion *iommu,
 {
 VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
 
-hw_error("Device at bus %s addr %02x.%d requires iommu notifier which "
- "is currently not supported by intel-iommu emulation",
- vtd_as->bus->qbus.name, PCI_SLOT(vtd_as->devfn),
- PCI_FUNC(vtd_as->devfn));
+if (new & IOMMU_NOTIFIER_MAP) {
+error_report("Device at bus %s addr %02x.%d requires iommu "
+ "notifier which is currently not supported by "
+ "intel-iommu emulation",
+ vtd_as->bus->qbus.name, PCI_SLOT(vtd_as->devfn),
+ PCI_FUNC(vtd_as->devfn));
+exit(1);
+}
 }
 
 static const VMStateDescription vtd_vmstate = {
-- 
2.7.4

[Qemu-devel] [PULL 02/28] memory: introduce IOMMUOps.notify_flag_changed

2016-09-26 Thread Paolo Bonzini

From: Peter Xu 

The new interface can be used to replace the old notify_started() and
notify_stopped(). It also provides explicit flags, so that an IOMMU
knows which kinds of notifications are being requested.
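
A minimal sketch of how a single flag-change hook can stand in for the
started/stopped pair. The demo_* names are hypothetical, and the
assumption that the region tracks the union of all registered notifier
flags is mine, since the memory.c hunk is cut off in this mail.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_FLAG_NONE = 0, DEMO_FLAG_UNMAP = 0x1, DEMO_FLAG_MAP = 0x2 };

typedef struct {
    unsigned current;   /* union of the flags of all registered notifiers */
    int started;        /* times the NONE -> non-NONE edge was seen */
    int stopped;        /* times the non-NONE -> NONE edge was seen */
} demo_region;

/* Stand-in for an IOMMU's notify_flag_changed hook: the two edges
 * correspond to the old notify_started()/notify_stopped() callbacks. */
void demo_flag_changed(demo_region *r, unsigned old_flags, unsigned new_flags)
{
    if (old_flags == DEMO_FLAG_NONE && new_flags != DEMO_FLAG_NONE) {
        r->started++;
    } else if (old_flags != DEMO_FLAG_NONE && new_flags == DEMO_FLAG_NONE) {
        r->stopped++;
    }
}

/* Recompute the union over the registered notifiers and report a change. */
void demo_update_flags(demo_region *r, const unsigned *notifier_flags, size_t n)
{
    unsigned flags = DEMO_FLAG_NONE;
    size_t i;

    assert(r != NULL);
    for (i = 0; i < n; i++) {
        flags |= notifier_flags[i];
    }
    if (flags != r->current) {
        demo_flag_changed(r, r->current, flags);
    }
    r->current = flags;
}
```

Going from UNMAP to UNMAP|MAP triggers the hook but hits neither edge,
which is the extra information the old pair of callbacks could not carry.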

Acked-by: David Gibson 
Signed-off-by: Peter Xu 
Message-Id: <1474606948-14391-3-git-send-email-pet...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/i386/intel_iommu.c |  6 --
 hw/ppc/spapr_iommu.c  | 18 ++
 include/exec/memory.h |  9 +
 memory.c  | 29 +
 4 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 28c31a2..9d49be7 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1974,7 +1974,9 @@ static IOMMUTLBEntry vtd_iommu_translate(MemoryRegion *iommu, hwaddr addr,
 return ret;
 }
 
-static void vtd_iommu_notify_started(MemoryRegion *iommu)
+static void vtd_iommu_notify_flag_changed(MemoryRegion *iommu,
+  IOMMUNotifierFlag old,
+  IOMMUNotifierFlag new)
 {
 VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
 
@@ -2348,7 +2350,7 @@ static void vtd_init(IntelIOMMUState *s)
 memset(s->womask, 0, DMAR_REG_SIZE);
 
 s->iommu_ops.translate = vtd_iommu_translate;
-s->iommu_ops.notify_started = vtd_iommu_notify_started;
+s->iommu_ops.notify_flag_changed = vtd_iommu_notify_flag_changed;
 s->root = 0;
 s->root_extended = false;
 s->dmar_enabled = false;
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index f20b0b8..ae30bbe 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -156,14 +156,17 @@ static uint64_t spapr_tce_get_min_page_size(MemoryRegion *iommu)
 return 1ULL << tcet->page_shift;
 }
 
-static void spapr_tce_notify_started(MemoryRegion *iommu)
+static void spapr_tce_notify_flag_changed(MemoryRegion *iommu,
+  IOMMUNotifierFlag old,
+  IOMMUNotifierFlag new)
 {
-spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), true);
-}
+struct sPAPRTCETable *tbl = container_of(iommu, sPAPRTCETable, iommu);
 
-static void spapr_tce_notify_stopped(MemoryRegion *iommu)
-{
-spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), false);
+if (old == IOMMU_NOTIFIER_NONE && new != IOMMU_NOTIFIER_NONE) {
+spapr_tce_set_need_vfio(tbl, true);
+} else if (old != IOMMU_NOTIFIER_NONE && new == IOMMU_NOTIFIER_NONE) {
+spapr_tce_set_need_vfio(tbl, false);
+}
 }
 
 static int spapr_tce_table_post_load(void *opaque, int version_id)
@@ -246,8 +249,7 @@ static const VMStateDescription vmstate_spapr_tce_table = {
 static MemoryRegionIOMMUOps spapr_iommu_ops = {
 .translate = spapr_tce_translate_iommu,
 .get_min_page_size = spapr_tce_get_min_page_size,
-.notify_started = spapr_tce_notify_started,
-.notify_stopped = spapr_tce_notify_stopped,
+.notify_flag_changed = spapr_tce_notify_flag_changed,
 };
 
 static int spapr_tce_table_realize(DeviceState *dev)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 14cda67..a3f988b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -174,10 +174,10 @@ struct MemoryRegionIOMMUOps {
 IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr, bool is_write);
 /* Returns minimum supported page size */
 uint64_t (*get_min_page_size)(MemoryRegion *iommu);
-/* Called when the first notifier is set */
-void (*notify_started)(MemoryRegion *iommu);
-/* Called when the last notifier is removed */
-void (*notify_stopped)(MemoryRegion *iommu);
+/* Called when IOMMU Notifier flag changed */
+void (*notify_flag_changed)(MemoryRegion *iommu,
+IOMMUNotifierFlag old_flags,
+IOMMUNotifierFlag new_flags);
 };
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
@@ -223,6 +223,7 @@ struct MemoryRegion {
 unsigned ioeventfd_nb;
 MemoryRegionIoeventfd *ioeventfds;
 QLIST_HEAD(, IOMMUNotifier) iommu_notify;
+IOMMUNotifierFlag iommu_notify_flags;
 };
 
 /**
diff --git a/memory.c b/memory.c
index 69d9d9a..27a3f2f 100644
--- a/memory.c
+++ b/memory.c
@@ -1414,6 +1414,7 @@ void memory_region_init_iommu(MemoryRegion *mr,
 mr->iommu_ops = ops,
 mr->terminates = true;  /* then re-forwards */
 QLIST_INIT(&mr->iommu_notify);
+mr->iommu_notify_flags = IOMMU_NOTIFIER_NONE;
 }
 
 static void memory_region_finalize(Object *obj)
@@ -1508,16 +1509,31 @@ bool memory_region_is_logging(MemoryRegion *mr, uint8_t client)
 return memory_region_get_dirty_log_mask(mr) & (1 << client);
 }
 
+static void memory_region_update_iommu_notify_flags(MemoryRegion *mr)
+{
+IOMMUNotifierFlag flags =

[Qemu-devel] [PULL 09/28] build-sys: put glib_cflags in QEMU_CFLAGS

2016-09-26 Thread Paolo Bonzini

From: Marc-André Lureau 

This way, overriding CFLAGS on make command line keeps glib-cflags
and doesn't break the build.

Signed-off-by: Marc-André Lureau 
Message-Id: <20160925205748.6280-2-marcandre.lur...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index c831600..5412d4f 100755
--- a/configure
+++ b/configure
@@ -2933,7 +2933,7 @@ for i in $glib_modules; do
 if $pkg_config --atleast-version=$glib_req_ver $i; then
 glib_cflags=$($pkg_config --cflags $i)
 glib_libs=$($pkg_config --libs $i)
-CFLAGS="$glib_cflags $CFLAGS"
+QEMU_CFLAGS="$glib_cflags $QEMU_CFLAGS"
 LIBS="$glib_libs $LIBS"
 libs_qga="$glib_libs $libs_qga"
 else
-- 
2.7.4

[Qemu-devel] [PULL 15/28] cpus-common: move CPU list management to common code

2016-09-26 Thread Paolo Bonzini

Add a mutex for the CPU list to system emulation, as it will be used to
manage safe work.  Abstract manipulation of the CPU list in new functions
cpu_list_add and cpu_list_remove.
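
For readers who want the idea without the QEMU types, here is a minimal
sketch of a mutex-protected CPU list with add and remove helpers. It
uses pthreads and a hand-rolled singly linked list in place of QemuMutex
and QTAILQ; all demo_* names are hypothetical, and the index assignment
(next free index = current list length) only mirrors the simple
auto-assignment case.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_UNASSIGNED (-1)

typedef struct demo_cpu {
    int index;
    struct demo_cpu *next;
} demo_cpu;

static pthread_mutex_t demo_list_lock = PTHREAD_MUTEX_INITIALIZER;
static demo_cpu *demo_cpus;     /* singly linked list, newest at the tail */

/* Append a CPU to the list, assigning an index if it has none yet. */
void demo_cpu_list_add(demo_cpu *cpu)
{
    demo_cpu **p;
    int count = 0;

    assert(cpu != NULL);
    pthread_mutex_lock(&demo_list_lock);
    for (p = &demo_cpus; *p != NULL; p = &(*p)->next) {
        count++;
    }
    if (cpu->index == DEMO_UNASSIGNED) {
        cpu->index = count;     /* next free index = current list length */
    }
    cpu->next = NULL;
    *p = cpu;
    pthread_mutex_unlock(&demo_list_lock);
}

/* Unlink a CPU if it is on the list; a no-op otherwise. */
void demo_cpu_list_remove(demo_cpu *cpu)
{
    demo_cpu **p;

    assert(cpu != NULL);
    pthread_mutex_lock(&demo_list_lock);
    for (p = &demo_cpus; *p != NULL; p = &(*p)->next) {
        if (*p == cpu) {
            *p = cpu->next;
            cpu->next = NULL;
            cpu->index = DEMO_UNASSIGNED;
            break;
        }
    }
    pthread_mutex_unlock(&demo_list_lock);
}
```

Keeping the lock inside the helpers means callers never touch the list
directly, which is what lets the same code serve system and user mode.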

Reviewed-by: Richard Henderson 
Reviewed-by: Alex Bennée 
Signed-off-by: Paolo Bonzini 
---
 Makefile.objs |  2 +-
 bsd-user/main.c   |  9 +
 cpus-common.c | 83 +++
 exec.c| 37 ++---
 include/exec/cpu-common.h |  5 +++
 include/exec/exec-all.h   | 11 ---
 include/qom/cpu.h | 12 +++
 linux-user/main.c | 17 +++---
 vl.c  |  1 +
 9 files changed, 109 insertions(+), 68 deletions(-)
 create mode 100644 cpus-common.c

diff --git a/Makefile.objs b/Makefile.objs
index 7301544..a8e0224 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -89,7 +89,7 @@ endif
 
 ###
 # Target-independent parts used in system and user emulation
-common-obj-y += tcg-runtime.o
+common-obj-y += tcg-runtime.o cpus-common.o
 common-obj-y += hw/
 common-obj-y += qom/
 common-obj-y += disas/
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 0fb08e4..591c424 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -95,14 +95,6 @@ void fork_end(int child)
 }
 }
 
-void cpu_list_lock(void)
-{
-}
-
-void cpu_list_unlock(void)
-{
-}
-
 #ifdef TARGET_I386
 /***/
 /* CPUX86 core interface */
@@ -748,6 +740,7 @@ int main(int argc, char **argv)
 if (argc <= 1)
 usage();
 
+qemu_init_cpu_list();
 module_call_init(MODULE_INIT_QOM);
 
 if ((envlist = envlist_create()) == NULL) {
diff --git a/cpus-common.c b/cpus-common.c
new file mode 100644
index 000..fda3848
--- /dev/null
+++ b/cpus-common.c
@@ -0,0 +1,83 @@
+/*
+ * CPU thread main loop - common bits for user and system mode emulation
+ *
+ *  Copyright (c) 2003-2005 Fabrice Bellard
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "exec/cpu-common.h"
+#include "qom/cpu.h"
+#include "sysemu/cpus.h"
+
+static QemuMutex qemu_cpu_list_lock;
+
+void qemu_init_cpu_list(void)
+{
+    qemu_mutex_init(&qemu_cpu_list_lock);
+}
+
+void cpu_list_lock(void)
+{
+    qemu_mutex_lock(&qemu_cpu_list_lock);
+}
+
+void cpu_list_unlock(void)
+{
+    qemu_mutex_unlock(&qemu_cpu_list_lock);
+}
+
+static bool cpu_index_auto_assigned;
+
+static int cpu_get_free_index(void)
+{
+CPUState *some_cpu;
+int cpu_index = 0;
+
+cpu_index_auto_assigned = true;
+CPU_FOREACH(some_cpu) {
+cpu_index++;
+}
+return cpu_index;
+}
+
+void cpu_list_add(CPUState *cpu)
+{
+    qemu_mutex_lock(&qemu_cpu_list_lock);
+    if (cpu->cpu_index == UNASSIGNED_CPU_INDEX) {
+        cpu->cpu_index = cpu_get_free_index();
+        assert(cpu->cpu_index != UNASSIGNED_CPU_INDEX);
+    } else {
+        assert(!cpu_index_auto_assigned);
+    }
+    QTAILQ_INSERT_TAIL(&cpus, cpu, node);
+    qemu_mutex_unlock(&qemu_cpu_list_lock);
+}
+
+void cpu_list_remove(CPUState *cpu)
+{
+    qemu_mutex_lock(&qemu_cpu_list_lock);
+    if (!QTAILQ_IN_USE(cpu, node)) {
+        /* there is nothing to undo since cpu_exec_init() hasn't been called */
+        qemu_mutex_unlock(&qemu_cpu_list_lock);
+        return;
+    }
+
+    assert(!(cpu_index_auto_assigned && cpu != QTAILQ_LAST(&cpus, CPUTailQ)));
+
+    QTAILQ_REMOVE(&cpus, cpu, node);
+    cpu->cpu_index = UNASSIGNED_CPU_INDEX;
+    qemu_mutex_unlock(&qemu_cpu_list_lock);
+}
diff --git a/exec.c b/exec.c
index c81d5ab..c8389f9 100644
--- a/exec.c
+++ b/exec.c
@@ -598,36 +598,11 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
 }
 #endif
 
-static bool cpu_index_auto_assigned;
-
-static int cpu_get_free_index(void)
-{
-CPUState *some_cpu;
-int cpu_index = 0;
-
-cpu_index_auto_assigned = true;
-CPU_FOREACH(some_cpu) {
-cpu_index++;
-}
-return cpu_index;
-}
-
 void cpu_exec_exit(CPUState *cpu)
 {
 CPUClass *cc = CPU_GET_CLASS(cpu);
 
-cpu_list_lock();
-if (!QTAILQ_IN_USE(cpu, node)) {
-/* there is nothing to undo since cpu_exec_init() hasn't been called */
-cpu_list_unlock();
-return;
-}
-
-assert(!(cpu_index_auto_assigned && cpu !=

[Qemu-devel] [PULL 12/28] cpus: Rename flush_queued_work()

2016-09-26 Thread Paolo Bonzini

From: Sergey Fedorov 

To avoid possible confusion, rename flush_queued_work() to
process_queued_cpu_work().

Signed-off-by: Sergey Fedorov 
Signed-off-by: Sergey Fedorov 
Reviewed-by: Alex Bennée 
Signed-off-by: Alex Bennée 
Message-Id: <1470158864-17651-6-git-send-email-alex.ben...@linaro.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 cpus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index ed7d30a..28d6206 100644
--- a/cpus.c
+++ b/cpus.c
@@ -983,7 +983,7 @@ static void qemu_tcg_destroy_vcpu(CPUState *cpu)
 {
 }
 
-static void flush_queued_work(CPUState *cpu)
+static void process_queued_cpu_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
 
@@ -1018,7 +1018,7 @@ static void qemu_wait_io_event_common(CPUState *cpu)
 cpu->stopped = true;
 qemu_cond_broadcast(&qemu_pause_cond);
 }
-flush_queued_work(cpu);
+process_queued_cpu_work(cpu);
 cpu->thread_kicked = false;
 }
 
-- 
2.7.4

Re: [Qemu-devel] [PATCH V8 1/6] oslib-posix: add helpers for stack alloc and free

2016-09-26 Thread Kevin Wolf

Am 26.09.2016 um 13:44 hat Peter Lieven geschrieben:
> the allocated stack will be adjusted to the minimum supported stack size
> by the OS and rounded up to be a multiple of the system pagesize.
> Additionally an architecture dependent guard page is added to the stack
> to catch stack overflows.
> 
> Signed-off-by: Peter Lieven 
> ---
>  include/sysemu/os-posix.h | 27 +++
>  util/oslib-posix.c| 43 +++
>  2 files changed, 70 insertions(+)
> 
> diff --git a/include/sysemu/os-posix.h b/include/sysemu/os-posix.h
> index 9c7dfdf..4a0f493 100644
> --- a/include/sysemu/os-posix.h
> +++ b/include/sysemu/os-posix.h
> @@ -60,4 +60,31 @@ int qemu_utimens(const char *path, const qemu_timespec *times);
>  
>  bool is_daemonized(void);
>  
> +/**
> + * qemu_alloc_stack:
> + * @sz: pointer to a size_t holding the requested stack size
> + *
> + * Allocate memory that can be used as a stack, for instance for
> + * coroutines. If the memory cannot be allocated, this function
> + * will abort (like g_malloc()). This function also inserts an
> + * additional guard page to catch a potential stack overflow.
> + * Note that the useable stack memory can be greater than the
> + * requested stack size due to alignment and minimal stack size
> + * restrictions. In this case the value of sz is adjusted.
> + *
> + * The allocated stack must be freed with qemu_free_stack().
> + *
> + * Returns: pointer to (the lowest address of) the stack memory.

Not quite. It's the pointer to the lowest address of the guard page,
while the returned stack size doesn't include the guard page. This is an
awkward interface, and consequently patch 3 fails to use it correctly.

So you end up with something like:

|GGGG|....|....|....|
 **************

G = guard page
. = allocated stack page
* = stack as used for makecontext()

That is, the guard page is included in the stack used to create the
coroutine context, and the last page stays unused. On systems where we
only allocate a single page for the stack, this obviously means that the
tests still fail.
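
To spell out the off-by-one-page problem, here is a small sketch of the
address arithmetic. The struct and function names are hypothetical and
are not the qemu_alloc_stack() interface; the sketch assumes the guard
page sits at the lowest address and that the reported size excludes it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned char *base;   /* lowest address of the allocation (guard page) */
    size_t guard_size;     /* bytes reserved for the guard page */
    size_t stack_size;     /* usable bytes, guard page NOT included */
} demo_stack;

/* First usable byte: skip the guard page at the bottom. */
unsigned char *demo_stack_start(const demo_stack *s)
{
    assert(s != NULL);
    return s->base + s->guard_size;
}

/* One past the last usable byte. */
unsigned char *demo_stack_end(const demo_stack *s)
{
    return demo_stack_start(s) + s->stack_size;
}
```

Passing base and stack_size straight to makecontext() covers the range
[base, base + stack_size), which includes the guard page and stops one
guard page short of demo_stack_end().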

Kevin

[Qemu-devel] [PULL 19/28] docs: include formal model for TCG exclusive sections

2016-09-26 Thread Paolo Bonzini

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 docs/tcg-exclusive.promela | 177 +
 1 file changed, 177 insertions(+)
 create mode 100644 docs/tcg-exclusive.promela

diff --git a/docs/tcg-exclusive.promela b/docs/tcg-exclusive.promela
new file mode 100644
index 000..5889b40
--- /dev/null
+++ b/docs/tcg-exclusive.promela
@@ -0,0 +1,177 @@
+/*
+ * This model describes the implementation of exclusive sections in
+ * cpus-common.c (start_exclusive, end_exclusive, cpu_exec_start,
+ * cpu_exec_end).
+ *
+ * Author: Paolo Bonzini 
+ *
+ * This file is in the public domain.  If you really want a license,
+ * the WTFPL will do.
+ *
+ * To verify it:
+ * spin -a docs/tcg-exclusive.promela
+ * gcc pan.c -O2
+ * ./a.out -a
+ *
+ * Tunable processor macros: N_CPUS, N_EXCLUSIVE, N_CYCLES, TEST_EXPENSIVE.
+ */
+
+// Define the missing parameters for the model
+#ifndef N_CPUS
+#define N_CPUS 2
+#warning defaulting to 2 CPU processes
+#endif
+
+// the expensive test is not so expensive for <= 3 CPUs
+#if N_CPUS <= 3
+#define TEST_EXPENSIVE
+#endif
+
+#ifndef N_EXCLUSIVE
+# if !defined N_CYCLES || N_CYCLES <= 1 || defined TEST_EXPENSIVE
+#  define N_EXCLUSIVE 2
+#  warning defaulting to 2 concurrent exclusive sections
+# else
+#  define N_EXCLUSIVE 1
+#  warning defaulting to 1 concurrent exclusive sections
+# endif
+#endif
+#ifndef N_CYCLES
+# if N_EXCLUSIVE <= 1 || defined TEST_EXPENSIVE
+#  define N_CYCLES 2
+#  warning defaulting to 2 CPU cycles
+# else
+#  define N_CYCLES 1
+#  warning defaulting to 1 CPU cycles
+# endif
+#endif
+
+
+// synchronization primitives.  condition variables require a
+// process-local "cond_t saved;" variable.
+
+#define mutex_t              byte
+#define MUTEX_LOCK(m)        atomic { m == 0 -> m = 1 }
+#define MUTEX_UNLOCK(m)      m = 0
+
+#define cond_t               int
+#define COND_WAIT(c, m)      {                                  \
+                               saved = c;                       \
+                               MUTEX_UNLOCK(m);                 \
+                               c != saved -> MUTEX_LOCK(m);     \
+                             }
+#define COND_BROADCAST(c)    c++
+
+// this is the logic from cpus-common.c
+
+mutex_t mutex;
+cond_t exclusive_cond;
+cond_t exclusive_resume;
+byte pending_cpus;
+
+byte running[N_CPUS];
+byte has_waiter[N_CPUS];
+
+#define exclusive_idle()  \
+  do  \
+  :: pending_cpus -> COND_WAIT(exclusive_resume, mutex);  \
+  :: else -> break;   \
+  od
+
+#define start_exclusive() \
+MUTEX_LOCK(mutex);\
+exclusive_idle(); \
+pending_cpus = 1; \
+  \
+i = 0;\
+do\
+   :: i < N_CPUS -> { \
+   if \
+  :: running[i] -> has_waiter[i] = 1; pending_cpus++; \
+  :: else   -> skip;  \
+   fi;\
+   i++;   \
+   }  \
+   :: else -> break;  \
+od;   \
+  \
+do\
+  :: pending_cpus > 1 -> COND_WAIT(exclusive_cond, mutex);\
+  :: else -> break;   \
+od
+
+#define end_exclusive()   \
+pending_cpus = 0; \
+COND_BROADCAST(exclusive_resume); \
+MUTEX_UNLOCK(mutex);
+
+#define cpu_exec_start(id)   \
+MUTEX_LOCK(mutex);   \
+exclusive_idle();\
+running[id] = 1; \
+MUTEX_UNLOCK(mutex);
+
+#define cpu_exec_end(id) \
+MUTEX_LOCK(mutex);   \
+running[id] = 0; \
+

[Qemu-devel] [PULL 06/28] compiler: Swap 'public domain' header for license

2016-09-26 Thread Paolo Bonzini

From: Felipe Franciosi 

As discussed on the list [1], having a comment stating that this file
is "public domain" is arguably wrong and not legally binding. This patch
replaces that comment with a clear GPLv2+ license as proposed in [2].

[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg06151.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg06217.html

Worth noting, compiler.h was originally created on 5c026320 by splitting
qemu-common.h. At the time, qemu-common.h was already GPLv2+.

Signed-off-by: Felipe Franciosi 
Message-Id: <1474642971-11866-1-git-send-email-fel...@nutanix.com>
Signed-off-by: Paolo Bonzini 
---
 include/qemu/compiler.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 338d3a6..157698b 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -1,4 +1,8 @@
-/* public domain */
+/* compiler.h: macros to abstract away compiler specifics
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
 
 #ifndef COMPILER_H
 #define COMPILER_H
-- 
2.7.4

[Qemu-devel] [PULL 14/28] linux-user: Add qemu_cpu_is_self() and qemu_cpu_kick()

2016-09-26 Thread Paolo Bonzini

From: Sergey Fedorov 

Signed-off-by: Sergey Fedorov 
Signed-off-by: Sergey Fedorov 
Reviewed-by: Alex Bennée 
Signed-off-by: Alex Bennée 
Message-Id: <1470158864-17651-9-git-send-email-alex.ben...@linaro.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 linux-user/main.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/linux-user/main.c b/linux-user/main.c
index 7a056fc..6e14010 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -3777,6 +3777,16 @@ void cpu_loop(CPUTLGState *env)
 
 THREAD CPUState *thread_cpu;
 
+bool qemu_cpu_is_self(CPUState *cpu)
+{
+return thread_cpu == cpu;
+}
+
+void qemu_cpu_kick(CPUState *cpu)
+{
+cpu_exit(cpu);
+}
+
 void task_settid(TaskState *ts)
 {
 if (ts->ts_tid == 0) {
-- 
2.7.4

[Qemu-devel] [PULL 23/28] cpus-common: Introduce async_safe_run_on_cpu()

2016-09-26 Thread Paolo Bonzini

Reviewed-by: Richard Henderson 
Reviewed-by: Alex Bennée 
Signed-off-by: Paolo Bonzini 
---
 cpus-common.c | 33 +++--
 include/qom/cpu.h | 14 ++
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 429652c..38b1d55 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -18,6 +18,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/main-loop.h"
 #include "exec/cpu-common.h"
 #include "qom/cpu.h"
 #include "sysemu/cpus.h"
@@ -106,7 +107,7 @@ struct qemu_work_item {
 struct qemu_work_item *next;
 run_on_cpu_func func;
 void *data;
-bool free, done;
+bool free, exclusive, done;
 };
 
 static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
@@ -139,6 +140,7 @@ void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data,
 wi.data = data;
 wi.done = false;
 wi.free = false;
+wi.exclusive = false;
 
     queue_work_on_cpu(cpu, &wi);
     while (!atomic_mb_read(&wi.done)) {
@@ -229,6 +231,19 @@ void cpu_exec_end(CPUState *cpu)
     qemu_mutex_unlock(&qemu_cpu_list_lock);
 }
 
+void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
+{
+struct qemu_work_item *wi;
+
+wi = g_malloc0(sizeof(struct qemu_work_item));
+wi->func = func;
+wi->data = data;
+wi->free = true;
+wi->exclusive = true;
+
+queue_work_on_cpu(cpu, wi);
+}
+
 void process_queued_cpu_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
@@ -245,7 +260,21 @@ void process_queued_cpu_work(CPUState *cpu)
 cpu->queued_work_last = NULL;
 }
         qemu_mutex_unlock(&cpu->work_mutex);
-        wi->func(cpu, wi->data);
+        if (wi->exclusive) {
+            /* Running work items outside the BQL avoids the following deadlock:
+             * 1) start_exclusive() is called with the BQL taken while another
+             * CPU is running; 2) cpu_exec in the other CPU tries to take the
+             * BQL, so it goes to sleep; start_exclusive() is sleeping too, so
+             * neither CPU can proceed.
+             */
+qemu_mutex_unlock_iothread();
+start_exclusive();
+wi->func(cpu, wi->data);
+end_exclusive();
+qemu_mutex_lock_iothread();
+} else {
+wi->func(cpu, wi->data);
+}
         qemu_mutex_lock(&cpu->work_mutex);
 if (wi->free) {
 g_free(wi);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 934c07a..4092dd9 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -656,6 +656,20 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
 void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
 
 /**
+ * async_safe_run_on_cpu:
+ * @cpu: The vCPU to run on.
+ * @func: The function to be executed.
+ * @data: Data to pass to the function.
+ *
+ * Schedules the function @func for execution on the vCPU @cpu asynchronously,
+ * while all other vCPUs are sleeping.
+ *
+ * Unlike run_on_cpu and async_run_on_cpu, the function is run outside the
+ * BQL.
+ */
+void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
+
+/**
  * qemu_get_cpu:
  * @index: The CPUState@cpu_index value of the CPU to obtain.
  *
-- 
2.7.4

[Qemu-devel] [PULL 20/28] cpus-common: always defer async_run_on_cpu work items

2016-09-26 Thread Paolo Bonzini

async_run_on_cpu is only called from the I/O thread, not from CPU threads,
so it doesn't make any difference.  It will make a difference however
for async_safe_run_on_cpu.

Reviewed-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 cpus-common.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 7d935fd..115f3d4 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -153,11 +153,6 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
 {
 struct qemu_work_item *wi;
 
-if (qemu_cpu_is_self(cpu)) {
-func(cpu, data);
-return;
-}
-
 wi = g_malloc0(sizeof(struct qemu_work_item));
 wi->func = func;
 wi->data = data;
-- 
2.7.4

[Qemu-devel] [PULL 25/28] cpus-common: lock-free fast path for cpu_exec_start/end

2016-09-26 Thread Paolo Bonzini

Set cpu->running without taking the cpu_list lock, only requiring it if
there is a concurrent exclusive section.  This requires adding a new
field to CPUState, which records whether a running CPU is being counted
in pending_cpus.

When an exclusive section is started concurrently with cpu_exec_start,
cpu_exec_start can use the new field to determine if it has to wait for
the end of the exclusive section.  Likewise, cpu_exec_end can use it to
see if start_exclusive is waiting for that CPU.

This is a separate patch for easier bisection of issues.
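
The handshake described above can be sketched with C11 atomics. This is a
simplified single-vCPU model and not the QEMU code: the model_* function
names are hypothetical, the mutex and condition variables are left out, and
only the "store, full barrier, load" ordering on each side is shown. With a
barrier on both sides, at least one of the two parties is guaranteed to
observe the other's store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified model; names mirror the patch but this is an illustration
 * only (no cpu_list lock, no condition variables, a single vCPU). */
static atomic_int  pending_cpus;  /* >= 1 while an exclusive section is active */
static atomic_bool running;       /* the vCPU is executing guest code */
static bool        has_waiter;    /* the vCPU is counted in pending_cpus */

/* vCPU side.  Returns true if the lock-free fast path was taken, false
 * if the caller would have to take the lock and inspect has_waiter. */
static bool model_cpu_exec_start(void)
{
    atomic_store_explicit(&running, true, memory_order_relaxed);
    /* Write running before reading pending_cpus. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&pending_cpus, memory_order_relaxed) == 0;
}

/* Exclusive side.  Returns true if the vCPU was seen running and must be
 * waited for, false if the exclusive section can start immediately. */
static bool model_start_exclusive(void)
{
    atomic_store_explicit(&pending_cpus, 1, memory_order_relaxed);
    /* Write pending_cpus before reading running. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&running, memory_order_relaxed)) {
        has_waiter = true;
        atomic_store_explicit(&pending_cpus, 2, memory_order_relaxed);
        return true;
    }
    return false;
}
```

If model_cpu_exec_start() runs first, model_start_exclusive() sees the vCPU
running and counts it; if the exclusive side runs first, the vCPU sees a
nonzero pending_cpus and falls back to the slow path.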

Signed-off-by: Paolo Bonzini 
---
 cpus-common.c  | 95 ++
 docs/tcg-exclusive.promela | 53 --
 include/qom/cpu.h  |  5 ++-
 3 files changed, 133 insertions(+), 20 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 38b1d55..3e11452 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -28,6 +28,9 @@ static QemuCond exclusive_cond;
 static QemuCond exclusive_resume;
 static QemuCond qemu_work_cond;
 
+/* >= 1 if a thread is inside start_exclusive/end_exclusive.  Written
+ * under qemu_cpu_list_lock, read with atomic operations.
+ */
 static int pending_cpus;
 
 void qemu_init_cpu_list(void)
@@ -177,18 +180,26 @@ static inline void exclusive_idle(void)
 void start_exclusive(void)
 {
 CPUState *other_cpu;
+int running_cpus;
 
 qemu_mutex_lock(&qemu_cpu_list_lock);
 exclusive_idle();
 
 /* Make all other cpus stop executing.  */
-pending_cpus = 1;
+atomic_set(&pending_cpus, 1);
+
+/* Write pending_cpus before reading other_cpu->running.  */
+smp_mb();
+running_cpus = 0;
 CPU_FOREACH(other_cpu) {
-if (other_cpu->running) {
-pending_cpus++;
+if (atomic_read(&other_cpu->running)) {
+other_cpu->has_waiter = true;
+running_cpus++;
 qemu_cpu_kick(other_cpu);
 }
 }
+
+atomic_set(&pending_cpus, running_cpus + 1);
 while (pending_cpus > 1) {
 qemu_cond_wait(&exclusive_cond, &qemu_cpu_list_lock);
 }
@@ -203,7 +214,7 @@ void start_exclusive(void)
 void end_exclusive(void)
 {
 qemu_mutex_lock(&qemu_cpu_list_lock);
-pending_cpus = 0;
+atomic_set(&pending_cpus, 0);
 qemu_cond_broadcast(&exclusive_resume);
 qemu_mutex_unlock(&qemu_cpu_list_lock);
 }
@@ -211,24 +222,78 @@ void end_exclusive(void)
 /* Wait for exclusive ops to finish, and begin cpu execution.  */
 void cpu_exec_start(CPUState *cpu)
 {
-qemu_mutex_lock(&qemu_cpu_list_lock);
-exclusive_idle();
-cpu->running = true;
-qemu_mutex_unlock(&qemu_cpu_list_lock);
+atomic_set(&cpu->running, true);
+
+/* Write cpu->running before reading pending_cpus.  */
+smp_mb();
+
+/* 1. start_exclusive saw cpu->running == true and pending_cpus >= 1.
+ * After taking the lock we'll see cpu->has_waiter == true and run---not
+ * for long because start_exclusive kicked us.  cpu_exec_end will
+ * decrement pending_cpus and signal the waiter.
+ *
+ * 2. start_exclusive saw cpu->running == false but pending_cpus >= 1.
+ * This includes the case when an exclusive item is running now.
+ * Then we'll see cpu->has_waiter == false and wait for the item to
+ * complete.
+ *
+ * 3. pending_cpus == 0.  Then start_exclusive is definitely going to
+ * see cpu->running == true, and it will kick the CPU.
+ */
+if (unlikely(atomic_read(&pending_cpus))) {
+qemu_mutex_lock(&qemu_cpu_list_lock);
+if (!cpu->has_waiter) {
+/* Not counted in pending_cpus, let the exclusive item
+ * run.  Since we have the lock, just set cpu->running to true
+ * while holding it; no need to check pending_cpus again.
+ */
+atomic_set(&cpu->running, false);
+exclusive_idle();
+/* Now pending_cpus is zero.  */
+atomic_set(&cpu->running, true);
+} else {
+/* Counted in pending_cpus, go ahead and release the
+ * waiter at cpu_exec_end.
+ */
+}
+qemu_mutex_unlock(&qemu_cpu_list_lock);
+}
 }
 
 /* Mark cpu as not executing, and release pending exclusive ops.  */
 void cpu_exec_end(CPUState *cpu)
 {
-qemu_mutex_lock(&qemu_cpu_list_lock);
-cpu->running = false;
-if (pending_cpus > 1) {
-pending_cpus--;
-if (pending_cpus == 1) {
-qemu_cond_signal(&exclusive_cond);
+atomic_set(&cpu->running, false);
+
+/* Write cpu->running before reading pending_cpus.  */
+smp_mb();
+
+/* 1. start_exclusive saw cpu->running == true.  Then it will increment
+ * pending_cpus and wait for exclusive_cond.  After taking the lock
+ * we'll see cpu->has_waiter == true.
+ *
+ * 2. start_exclusive saw cpu->running == false but here pending_cpus >= 1.
+ * This includes the case when an exclusive item started after setting
+ * cpu->running to false and before we read pending_cpus.  Then we'll see
+ * cpu->has_waiter == false and not touch

Re: [Qemu-devel] [PATCH v2 4/5] trace: [tcg] Do not generate TCG code to trace dinamically-disabled events

2016-09-26 Thread Stefan Hajnoczi

On Thu, Sep 15, 2016 at 05:50:58PM +0200, Lluís Vilanova wrote:

In the subject line:
s/dinamically-disabled/dynamically/

>  for e in events:
> +# tracer without checks
> +out('',
> +'static inline void __nocheck__%(api)s(%(args)s)',

In QEMU we avoid using double underscore since that part of the
namespace is reserved according to the C standard:

"7.1.3 Reserved identifiers
[...]
All identifiers that begin with an underscore and either an uppercase
letter or another underscore are always reserved for any use."
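
As an illustration of a naming scheme that stays outside the reserved
namespace, here is a minimal sketch. All identifiers in it are hypothetical
and are not taken from the patch or from QEMU's generated tracing headers;
the point is only that a suffix such as "_nocheck" avoids the leading double
underscore.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only. */
static int example_event_count;

/* Unchecked helper: the "_nocheck" suffix keeps the identifier out of
 * the reserved namespace, unlike a "__nocheck__" prefix. */
static inline void trace_example_event_nocheck(int value)
{
    example_event_count += value;
}

/* Checked wrapper: records the event only when it is enabled. */
static inline void trace_example_event(bool enabled, int value)
{
    if (enabled) {
        trace_example_event_nocheck(value);
    }
}
```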



Re: [Qemu-devel] [PATCH] proto: add 'shift' extension.

2016-09-26 Thread Paolo Bonzini



On 26/09/2016 15:53, Vladimir Sementsov-Ogievskiy wrote:
> On 26.09.2016 15:51, Paolo Bonzini wrote:
>> This is very ad hoc.  Can we instead have a block size common to all
>> commands?  Block devices in practice have one, in fact that's why
>> they're called block devices...
> 
> Block size can be too small to clear the whole disk in one request (i.e.
> (2**31 * block_size) is too small..)

Considering that NBD supports multiple outstanding requests, is it a big
deal to require one request per terabyte of storage?

Paolo
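
To put numbers on the exchange above, here is a small worked sketch. It
assumes a request length counted in blocks and limited to 2**31 blocks, as
in Vladimir's expression, and a 512-byte block size; the block size is an
assumption that the thread does not state. Under those assumptions a single
request covers 1 TiB, which matches the "one request per terabyte" estimate.

```c
#include <assert.h>
#include <stdint.h>

/* Largest span, in bytes, that one request can cover when the length is
 * a block count limited to 2**31.  The limit and the block sizes used
 * with this helper are illustrative assumptions, not values taken from
 * the NBD specification. */
static uint64_t max_bytes_per_request(uint64_t block_size)
{
    assert(block_size > 0 && block_size <= (UINT64_C(1) << 32));
    return (UINT64_C(1) << 31) * block_size;
}

/* Number of requests needed to cover a disk of disk_bytes bytes. */
static uint64_t requests_needed(uint64_t disk_bytes, uint64_t block_size)
{
    uint64_t per_req = max_bytes_per_request(block_size);

    return (disk_bytes + per_req - 1) / per_req;   /* round up */
}
```

For example, requests_needed(4 TiB, 512) evaluates to 4 under these
assumptions, and NBD's support for multiple outstanding requests means they
need not be issued one at a time.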

Re: [Qemu-devel] [PATCH v2 0/5] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches

2016-09-26 Thread Stefan Hajnoczi

On Thu, Sep 15, 2016 at 05:50:37PM +0200, Lluís Vilanova wrote:
> Avoids generating TCG code to call guest code tracing events in vCPUs that are
> not dynamically tracing that event.
> 
> Currently, events with the 'tcg' property always generate TCG code to trace 
> that
> event at guest code execution time, when their dynamic tracing state is 
> checked.
> 
> This series adds a performance optimization where TCG code for events with the
> 'tcg' and 'vcpu' properties is not generated if the event is dynamically
> disabled. This optimization raises two issues:
> 
> * An event can be dynamically disabled/enabled after the corresponding TCG 
> code
>   has been generated (i.e., a new TB with the corresponding code should be
>   used).
> 
> * Each vCPU can have a different dynamic state for the same event (i.e., 
> tracing
>   the memory accesses of only one process pinned to a vCPU).
> 
> To handle both issues, this series replicates the shared physical TB cache,
> creating a separate physical TB cache for every combination of event states
> (those with the 'vcpu' and 'tcg' properties). Then, all vCPUs tracing the same
> events will use the same physical TB cache.
> 
> Sharing physical TBs makes this very space efficient (only the physical TB
> caches, simple arrays of pointers, are replicated), sharing physical TB caches
> maximizes TB reuse across vCPUs whenever possible, and makes dynamic event 
> state
> changes more efficient (simply use a different TB array).
> 
> The physical TB cache array is indexed with the vCPU's trace event state
> bitmask. This is simpler and more efficient than emitting TCG code to check if
> an event needs tracing; then we should still move the tracing call code to
> either a cold path (making tracing performance worse), or leave it inlined
> (making non-tracing performance worse).
> 
> It is also more efficient than eliding TCG code only when *zero* vCPUs are
> tracing an event, since enabling it on a single vCPU will impact the 
> performance
> of all other vCPUs that are not tracing that event.
> 
> Signed-off-by: Lluís Vilanova 
> ---

TCG folks?

The design of this patch is more related to TCG than tracing since it
affects TB caching.

Stefan
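
The indexing scheme in the quoted cover letter can be sketched as follows.
This is a toy model under stated assumptions: the struct and function names
are hypothetical, a "cache" is reduced to a placeholder struct, and two
events with both the 'tcg' and 'vcpu' properties are assumed, giving 2**2
caches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: two per-vCPU TCG events give four possible
 * event-state combinations, hence four cache slots. */
enum { NUM_VCPU_TCG_EVENTS = 2, NUM_CACHES = 1 << NUM_VCPU_TCG_EVENTS };

struct tb_cache {          /* stand-in for a physical TB lookup table */
    int generation;
};

struct vcpu {
    uint32_t trace_state;  /* bit i set: event i is traced on this vCPU */
};

static struct tb_cache caches[NUM_CACHES];

/* Select the cache that matches the vCPU's current event-state bitmask. */
static struct tb_cache *vcpu_tb_cache(const struct vcpu *cpu)
{
    assert(cpu->trace_state < NUM_CACHES);
    return &caches[cpu->trace_state];
}
```

Two vCPUs with the same bitmask resolve to the same cache, and changing a
vCPU's event state only changes which slot it indexes.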



[Qemu-devel] [PULL 00/28] Misc patches for 2016-09-26

2016-09-26 Thread Paolo Bonzini

The following changes since commit eaff9c4367ac3f7ac44f6c6f4cb7bcd4daa89af5:

  Merge remote-tracking branch 'remotes/lalrae/tags/mips-20160923' into staging (2016-09-23 15:28:07 +0100)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to cb9cdb6a50c0e775ebeed2d45921a6e13c62ce5c:

  replay: allow replay stopping and restarting (2016-09-26 10:35:34 +0200)


* thread-safe tb_flush (Fred, Alex, Sergey, me, Richard, Emilio,... :-)
* license clarification for compiler.h (Felipe)
* glib cflags improvement (Marc-André)
* checkpatch silencing (Paolo)
* SMRAM migration fix (Paolo)
* Replay improvements (Pavel)
* IOMMU notifier improvements (Peter)
* IOAPIC now defaults to version 0x20 (Peter)


Alex Bennée (1):
  cpus: pass CPUState to run_on_cpu helpers

Felipe Franciosi (1):
  compiler: Swap 'public domain' header for license

Marc-André Lureau (2):
  build-sys: remove unused GLIB_CFLAGS
  build-sys: put glib_cflags in QEMU_CFLAGS

Paolo Bonzini (11):
  checkpatch: downgrade "architecture specific defines should be avoided"
  migration: sync all address spaces
  cpus-common: move CPU list management to common code
  cpus-common: fix uninitialized variable use in run_on_cpu
  cpus-common: move exclusive work infrastructure from linux-user
  docs: include formal model for TCG exclusive sections
  cpus-common: always defer async_run_on_cpu work items
  cpus-common: remove redundant call to exclusive_idle()
  cpus-common: simplify locking for start_exclusive/end_exclusive
  cpus-common: Introduce async_safe_run_on_cpu()
  cpus-common: lock-free fast path for cpu_exec_start/end

Pavel Dovgalyuk (3):
  replay: move internal data to the structure
  replay: vmstate for replay module
  replay: allow replay stopping and restarting

Peter Xu (4):
  memory: introduce IOMMUNotifier and its caps
  memory: introduce IOMMUOps.notify_flag_changed
  intel_iommu: allow UNMAP notifiers
  x86: ioapic: boost default version to 0x20

Sergey Fedorov (6):
  cpus: Move common code out of {async_, }run_on_cpu()
  cpus: Rename flush_queued_work()
  linux-user: Use QemuMutex and QemuCond
  linux-user: Add qemu_cpu_is_self() and qemu_cpu_kick()
  cpus-common: move CPU work item management to common code
  tcg: Make tb_flush() thread safe

 Makefile.objs |   2 +-
 block/blkreplay.c |  15 +-
 bsd-user/main.c   |  33 +---
 configure |   3 +-
 cpu-exec.c|  12 +-
 cpus-common.c | 352 ++
 cpus.c| 100 +---
 docs/tcg-exclusive.promela| 225 +++
 exec.c|  37 +
 hw/i386/intel_iommu.c |  18 ++-
 hw/i386/kvm/apic.c|   5 +-
 hw/i386/kvmvapic.c|   6 +-
 hw/intc/ioapic.c  |   2 +-
 hw/ppc/ppce500_spin.c |  31 ++--
 hw/ppc/spapr.c|   6 +-
 hw/ppc/spapr_hcall.c  |  17 +-
 hw/ppc/spapr_iommu.c  |  18 ++-
 hw/vfio/common.c  |   4 +-
 include/exec/cpu-common.h |   5 +
 include/exec/exec-all.h   |  11 --
 include/exec/memory.h |  63 ++--
 include/exec/tb-context.h |   2 +-
 include/hw/compat.h   |   4 +
 include/hw/vfio/vfio-common.h |   2 +-
 include/qemu/compiler.h   |   6 +-
 include/qom/cpu.h | 102 ++--
 include/sysemu/replay.h   |   4 +
 kvm-all.c |  21 +--
 linux-user/main.c | 130 +---
 memory.c  | 106 +
 migration/ram.c   |   2 +-
 replay/Makefile.objs  |   1 +
 replay/replay-events.c|  10 +-
 replay/replay-internal.c  |  20 ++-
 replay/replay-internal.h  |  23 ++-
 replay/replay-snapshot.c  |  61 
 replay/replay-time.c  |   2 +-
 replay/replay.c   |  16 +-
 scripts/checkpatch.pl |   2 +-
 stubs/replay.c|   5 +
 target-i386/helper.c  |  19 +--
 target-i386/kvm.c |   6 +-
 target-s390x/cpu.c|   4 +-
 target-s390x/cpu.h|   7 +-
 target-s390x/kvm.c|  98 ++--
 target-s390x/misc_helper.c|   4 +-
 translate-all.c   |  38 +++--
 vl.c  |   2 +
 48 files changed, 1146 insertions(+), 516 deletions(-)
 create mode 100644 cpus-common.c
 create mode 100644 docs/tcg-exclusive.promela
 create mode 100644 replay/replay-snapshot.c
-- 
2.7.4

[Qemu-devel] [PULL 07/28] migration: sync all address spaces

2016-09-26 Thread Paolo Bonzini

Migrating a VM during reboot sometimes results in differences
between the source and destination in the SMRAM area.

This is because migration_bitmap_sync() only fetches from KVM
the dirty log of address_space_memory.  SMRAM memory slots
are ignored and the modifications to SMRAM are not sent to the
destination.

Reported-by: He Rongguang 
Reviewed-by: He Rongguang 
Signed-off-by: Paolo Bonzini 
---
 include/exec/memory.h |  7 +++
 memory.c  | 46 +-
 migration/ram.c   |  2 +-
 3 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index a3f988b..10d7eac 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1188,12 +1188,11 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
hwaddr addr, uint64_t size);
 
 /**
- * address_space_sync_dirty_bitmap: synchronize the dirty log for all memory
+ * memory_global_dirty_log_sync: synchronize the dirty log for all memory
  *
- * Synchronizes the dirty page log for an entire address space.
- * @as: the address space that contains the memory being synchronized
+ * Synchronizes the dirty page log for all address spaces.
  */
-void address_space_sync_dirty_bitmap(AddressSpace *as);
+void memory_global_dirty_log_sync(void);
 
 /**
  * memory_region_transaction_begin: Start a transaction.
diff --git a/memory.c b/memory.c
index 27a3f2f..58f9269 100644
--- a/memory.c
+++ b/memory.c
@@ -158,14 +158,10 @@ static bool memory_listener_match(MemoryListener *listener,
 
 /* No need to ref/unref .mr, the FlatRange keeps it alive.  */
 #define MEMORY_LISTENER_UPDATE_REGION(fr, as, dir, callback, _args...)  \
-MEMORY_LISTENER_CALL(callback, dir, (&(MemoryRegionSection) {   \
-.mr = (fr)->mr, \
-.address_space = (as),  \
-.offset_within_region = (fr)->offset_in_region, \
-.size = (fr)->addr.size,\
-.offset_within_address_space = int128_get64((fr)->addr.start),  \
-.readonly = (fr)->readonly, \
-  }), ##_args)
+do {\
+MemoryRegionSection mrs = section_from_flat_range(fr, as);  \
+MEMORY_LISTENER_CALL(callback, dir, &mrs, ##_args); \
+} while(0)
 
 struct CoalescedMemoryRange {
 AddrRange addr;
@@ -245,6 +241,19 @@ typedef struct AddressSpaceOps AddressSpaceOps;
 #define FOR_EACH_FLAT_RANGE(var, view)  \
 for (var = (view)->ranges; var < (view)->ranges + (view)->nr; ++var)
 
+static inline MemoryRegionSection
+section_from_flat_range(FlatRange *fr, AddressSpace *as)
+{
+return (MemoryRegionSection) {
+.mr = fr->mr,
+.address_space = as,
+.offset_within_region = fr->offset_in_region,
+.size = fr->addr.size,
+.offset_within_address_space = int128_get64(fr->addr.start),
+.readonly = fr->readonly,
+};
+}
+
 static bool flatrange_equal(FlatRange *a, FlatRange *b)
 {
 return a->mr == b->mr
@@ -2156,16 +2165,27 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr)
 return mr && mr != container;
 }
 
-void address_space_sync_dirty_bitmap(AddressSpace *as)
+void memory_global_dirty_log_sync(void)
 {
+MemoryListener *listener;
+AddressSpace *as;
 FlatView *view;
 FlatRange *fr;
 
-view = address_space_get_flatview(as);
-FOR_EACH_FLAT_RANGE(fr, view) {
-MEMORY_LISTENER_UPDATE_REGION(fr, as, Forward, log_sync);
+QTAILQ_FOREACH(listener, &memory_listeners, link) {
+if (!listener->log_sync) {
+continue;
+}
+/* Global listeners are being phased out.  */
+assert(listener->address_space_filter);
+as = listener->address_space_filter;
+view = address_space_get_flatview(as);
+FOR_EACH_FLAT_RANGE(fr, view) {
+MemoryRegionSection mrs = section_from_flat_range(fr, as);
+listener->log_sync(listener, &mrs);
+}
+flatview_unref(view);
 }
-flatview_unref(view);
 }
 
 void memory_global_dirty_log_start(void)
diff --git a/migration/ram.c b/migration/ram.c
index a6e1c63..c8ec9f2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -626,7 +626,7 @@ static void migration_bitmap_sync(void)
 }
 
 trace_migration_bitmap_sync_start();
-address_space_sync_dirty_bitmap(&address_space_memory);
+memory_global_dirty_log_sync();
 
 qemu_mutex_lock(&migration_bitmap_mutex);
 rcu_read_lock();
-- 
2.7.4

[Qemu-devel] [PULL 04/28] x86: ioapic: boost default version to 0x20

2016-09-26 Thread Paolo Bonzini

From: Peter Xu 

It's 2.8 now, and maybe it's time to switch IOAPIC default version to
0x20.

Signed-off-by: Peter Xu 
Message-Id: <1474608795-23058-1-git-send-email-pet...@redhat.com>
Signed-off-by: Paolo Bonzini 
---
 hw/intc/ioapic.c| 2 +-
 include/hw/compat.h | 4 
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 31791b0..fd9208f 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -416,7 +416,7 @@ static void ioapic_realize(DeviceState *dev, Error **errp)
 }
 
 static Property ioapic_properties[] = {
-DEFINE_PROP_UINT8("version", IOAPICCommonState, version, 0x11),
+DEFINE_PROP_UINT8("version", IOAPICCommonState, version, 0x20),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/compat.h b/include/hw/compat.h
index a1d6694..46412b2 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -6,6 +6,10 @@
 .driver   = "virtio-pci",\
 .property = "page-per-vq",\
 .value= "on",\
+},{\
+.driver   = "ioapic",\
+.property = "version",\
+.value= "0x11",\
 },
 
 #define HW_COMPAT_2_6 \
-- 
2.7.4

[Qemu-devel] [PULL 05/28] checkpatch: downgrade "architecture specific defines should be avoided"

2016-09-26 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 scripts/checkpatch.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index dde3f5f..3afa19a 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2407,7 +2407,7 @@ sub process {
 # we have e.g. CONFIG_LINUX and CONFIG_WIN32 for common cases
 # where they might be necessary.
if ($line =~ m@^.\s*\#\s*if.*\b__@) {
-   ERROR("architecture specific defines should be avoided\n" .  $herecurr);
+   WARN("architecture specific defines should be avoided\n" .  $herecurr);
}
 
 # Check that the storage class is at the beginning of a declaration
-- 
2.7.4

[Qemu-devel] [PULL 17/28] cpus-common: fix uninitialized variable use in run_on_cpu

2016-09-26 Thread Paolo Bonzini

Reviewed-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 cpus-common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 2005bfe..d6cd426 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -88,8 +88,7 @@ struct qemu_work_item {
 struct qemu_work_item *next;
 run_on_cpu_func func;
 void *data;
-int done;
-bool free;
+bool free, done;
 };
 
 static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
@@ -120,6 +119,7 @@ void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data,
 
 wi.func = func;
 wi.data = data;
+wi.done = false;
 wi.free = false;
 
 queue_work_on_cpu(cpu, &wi);
-- 
2.7.4

[Qemu-devel] [PULL 08/28] build-sys: remove unused GLIB_CFLAGS

2016-09-26 Thread Paolo Bonzini

From: Marc-André Lureau 

Message-Id: <20160925205748.6280-1-marcandre.lur...@redhat.com>

Signed-off-by: Paolo Bonzini 
---
 configure | 1 -
 1 file changed, 1 deletion(-)

diff --git a/configure b/configure
index 8fa62ad..c831600 100755
--- a/configure
+++ b/configure
@@ -5140,7 +5140,6 @@ fi
 if test "$glib_subprocess" = "yes" ; then
   echo "CONFIG_HAS_GLIB_SUBPROCESS_TESTS=y" >> $config_host_mak
 fi
-echo "GLIB_CFLAGS=$glib_cflags" >> $config_host_mak
 if test "$gtk" = "yes" ; then
   echo "CONFIG_GTK=y" >> $config_host_mak
   echo "CONFIG_GTKABI=$gtkabi" >> $config_host_mak
-- 
2.7.4

[Qemu-devel] [PULL 11/28] cpus: Move common code out of {async_, }run_on_cpu()

2016-09-26 Thread Paolo Bonzini

From: Sergey Fedorov 

Move the code common between run_on_cpu() and async_run_on_cpu() into a
new function queue_work_on_cpu().

Signed-off-by: Sergey Fedorov 
Signed-off-by: Sergey Fedorov 
Reviewed-by: Alex Bennée 
Signed-off-by: Alex Bennée 
Message-Id: <1470158864-17651-4-git-send-email-alex.ben...@linaro.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 cpus.c | 42 ++
 1 file changed, 18 insertions(+), 24 deletions(-)

diff --git a/cpus.c b/cpus.c
index 1a2a9b0..ed7d30a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -916,6 +916,22 @@ void qemu_init_cpu_loop(void)
 qemu_thread_get_self(&io_thread);
 }
 
+static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
+{
+qemu_mutex_lock(&cpu->work_mutex);
+if (cpu->queued_work_first == NULL) {
+cpu->queued_work_first = wi;
+} else {
+cpu->queued_work_last->next = wi;
+}
+cpu->queued_work_last = wi;
+wi->next = NULL;
+wi->done = false;
+qemu_mutex_unlock(&cpu->work_mutex);
+
+qemu_cpu_kick(cpu);
+}
+
 void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
 {
 struct qemu_work_item wi;
@@ -929,18 +945,7 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
 wi.data = data;
 wi.free = false;
 
-qemu_mutex_lock(&cpu->work_mutex);
-if (cpu->queued_work_first == NULL) {
-cpu->queued_work_first = &wi;
-} else {
-cpu->queued_work_last->next = &wi;
-}
-cpu->queued_work_last = &wi;
-wi.next = NULL;
-wi.done = false;
-qemu_mutex_unlock(&cpu->work_mutex);
-
-qemu_cpu_kick(cpu);
+queue_work_on_cpu(cpu, &wi);
 while (!atomic_mb_read(&wi.done)) {
 CPUState *self_cpu = current_cpu;
 
@@ -963,18 +968,7 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
 wi->data = data;
 wi->free = true;
 
-qemu_mutex_lock(&cpu->work_mutex);
-if (cpu->queued_work_first == NULL) {
-cpu->queued_work_first = wi;
-} else {
-cpu->queued_work_last->next = wi;
-}
-cpu->queued_work_last = wi;
-wi->next = NULL;
-wi->done = false;
-qemu_mutex_unlock(&cpu->work_mutex);
-
-qemu_cpu_kick(cpu);
+queue_work_on_cpu(cpu, wi);
 }
 
 static void qemu_kvm_destroy_vcpu(CPUState *cpu)
-- 
2.7.4
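
The queueing logic factored out by this patch can be illustrated with a
stand-alone sketch. It is a simplified model and not the QEMU code: the
struct and function names are hypothetical stand-ins, and the work_mutex
locking and the CPU kick are deliberately omitted so that only the
first/last pointer handling remains.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for qemu_work_item and the per-CPU queue. */
struct work_item {
    struct work_item *next;
    void (*func)(void *data);
    void *data;
};

struct work_queue {
    struct work_item *first;
    struct work_item *last;
};

/* Append wi at the tail, mirroring what queue_work_on_cpu() does while
 * holding work_mutex. */
static void queue_work(struct work_queue *q, struct work_item *wi)
{
    assert(wi != NULL);
    if (q->first == NULL) {
        q->first = wi;
    } else {
        q->last->next = wi;
    }
    q->last = wi;
    wi->next = NULL;
}

/* Pop and run every queued item in FIFO order; returns how many ran. */
static int process_queued_work(struct work_queue *q)
{
    int ran = 0;

    while (q->first != NULL) {
        struct work_item *wi = q->first;

        q->first = wi->next;
        if (q->first == NULL) {
            q->last = NULL;
        }
        wi->func(wi->data);
        ran++;
    }
    return ran;
}

/* Example callbacks: each appends one digit to an int to record order. */
static void record_one(void *data) { int *log = data; *log = *log * 10 + 1; }
static void record_two(void *data) { int *log = data; *log = *log * 10 + 2; }
```

Queueing record_one and then record_two against the same int leaves it at
12 after processing, which shows that items run in the order they were
queued.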

[Qemu-devel] [PULL 16/28] cpus-common: move CPU work item management to common code

2016-09-26 Thread Paolo Bonzini

From: Sergey Fedorov 

Make CPU work core functions common between system and user-mode
emulation. User-mode does not use run_on_cpu, so do not implement it.

Signed-off-by: Sergey Fedorov 
Signed-off-by: Sergey Fedorov 
Reviewed-by: Alex Bennée 
Signed-off-by: Alex Bennée 
Message-Id: <1470158864-17651-10-git-send-email-alex.ben...@linaro.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 bsd-user/main.c   | 11 +--
 cpus-common.c | 94 +++
 cpus.c| 82 +---
 include/qom/cpu.h | 27 +++-
 linux-user/main.c | 25 +++
 5 files changed, 148 insertions(+), 91 deletions(-)

diff --git a/bsd-user/main.c b/bsd-user/main.c
index 591c424..6dfa912 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -68,11 +68,11 @@ int cpu_get_pic_interrupt(CPUX86State *env)
 #endif
 
 /* These are no-ops because we are not threadsafe.  */
-static inline void cpu_exec_start(CPUArchState *env)
+static inline void cpu_exec_start(CPUState *cpu)
 {
 }
 
-static inline void cpu_exec_end(CPUArchState *env)
+static inline void cpu_exec_end(CPUState *cpu)
 {
 }
 
@@ -164,7 +164,11 @@ void cpu_loop(CPUX86State *env)
 //target_siginfo_t info;
 
 for(;;) {
+cpu_exec_start(cs);
 trapnr = cpu_exec(cs);
+cpu_exec_end(cs);
+process_queued_cpu_work(cs);
+
 switch(trapnr) {
 case 0x80:
 /* syscall from int $0x80 */
@@ -505,7 +509,10 @@ void cpu_loop(CPUSPARCState *env)
 //target_siginfo_t info;
 
 while (1) {
+cpu_exec_start(cs);
 trapnr = cpu_exec(cs);
+cpu_exec_end(cs);
+process_queued_cpu_work(cs);
 
 switch (trapnr) {
 #ifndef TARGET_SPARC64
diff --git a/cpus-common.c b/cpus-common.c
index fda3848..2005bfe 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -23,10 +23,12 @@
 #include "sysemu/cpus.h"
 
 static QemuMutex qemu_cpu_list_lock;
+static QemuCond qemu_work_cond;
 
 void qemu_init_cpu_list(void)
 {
 qemu_mutex_init(&qemu_cpu_list_lock);
+qemu_cond_init(&qemu_work_cond);
 }
 
 void cpu_list_lock(void)
@@ -81,3 +83,95 @@ void cpu_list_remove(CPUState *cpu)
 cpu->cpu_index = UNASSIGNED_CPU_INDEX;
 qemu_mutex_unlock(&qemu_cpu_list_lock);
 }
+
+struct qemu_work_item {
+struct qemu_work_item *next;
+run_on_cpu_func func;
+void *data;
+int done;
+bool free;
+};
+
+static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
+{
+qemu_mutex_lock(&cpu->work_mutex);
+if (cpu->queued_work_first == NULL) {
+cpu->queued_work_first = wi;
+} else {
+cpu->queued_work_last->next = wi;
+}
+cpu->queued_work_last = wi;
+wi->next = NULL;
+wi->done = false;
+qemu_mutex_unlock(&cpu->work_mutex);
+
+qemu_cpu_kick(cpu);
+}
+
+void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data,
+   QemuMutex *mutex)
+{
+struct qemu_work_item wi;
+
+if (qemu_cpu_is_self(cpu)) {
+func(cpu, data);
+return;
+}
+
+wi.func = func;
+wi.data = data;
+wi.free = false;
+
+queue_work_on_cpu(cpu, &wi);
+while (!atomic_mb_read(&wi.done)) {
+CPUState *self_cpu = current_cpu;
+
+qemu_cond_wait(&qemu_work_cond, mutex);
+current_cpu = self_cpu;
+}
+}
+
+void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
+{
+struct qemu_work_item *wi;
+
+if (qemu_cpu_is_self(cpu)) {
+func(cpu, data);
+return;
+}
+
+wi = g_malloc0(sizeof(struct qemu_work_item));
+wi->func = func;
+wi->data = data;
+wi->free = true;
+
+queue_work_on_cpu(cpu, wi);
+}
+
+void process_queued_cpu_work(CPUState *cpu)
+{
+struct qemu_work_item *wi;
+
+if (cpu->queued_work_first == NULL) {
+return;
+}
+
+qemu_mutex_lock(&cpu->work_mutex);
+while (cpu->queued_work_first != NULL) {
+wi = cpu->queued_work_first;
+cpu->queued_work_first = wi->next;
+if (!cpu->queued_work_first) {
+cpu->queued_work_last = NULL;
+}
+qemu_mutex_unlock(&cpu->work_mutex);
+wi->func(cpu, wi->data);
+qemu_mutex_lock(&cpu->work_mutex);
+if (wi->free) {
+g_free(wi);
+} else {
+atomic_mb_set(&wi->done, true);
+}
+}
+qemu_mutex_unlock(&cpu->work_mutex);
+qemu_cond_broadcast(&qemu_work_cond);
+}
diff --git a/cpus.c b/cpus.c
index 28d6206..c3afd18 100644
--- a/cpus.c
+++ b/cpus.c
@@ -902,73 +902,21 @@ static QemuThread io_thread;
 static QemuCond qemu_cpu_cond;
 /* system init */
 static QemuCond qemu_pause_cond;
-static QemuCond qemu_work_cond;
 
 void qemu_init_cpu_loop(void)
 {
 qemu_init_sigbus();
 qemu_cond_init(&qemu_cpu_cond);
 qemu_cond_init(&qemu_pause_cond);
-

Re: [Qemu-devel] [PATCH V8 4/6] coroutine-sigaltstack: use helper for allocating stack memory

2016-09-26 Thread Kevin Wolf

Am 26.09.2016 um 13:44 hat Peter Lieven geschrieben:
> Signed-off-by: Peter Lieven 
> ---
>  util/coroutine-sigaltstack.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
> index 9c2854c..d9c7f66 100644
> --- a/util/coroutine-sigaltstack.c
> +++ b/util/coroutine-sigaltstack.c
> @@ -33,6 +33,7 @@
>  typedef struct {
>  Coroutine base;
>  void *stack;
> +size_t stack_size;
>  sigjmp_buf env;
>  } CoroutineUContext;

Not related to your patch, but somehow I feel some renaming would be in
order... (compare the struct name and the source file name)

Kevin

[Qemu-devel] [PULL 13/28] linux-user: Use QemuMutex and QemuCond

2016-09-26 Thread Paolo Bonzini

From: Sergey Fedorov 

Convert pthread_mutex_t and pthread_cond_t to QemuMutex and QemuCond.
This will allow making some locks and condition variables common
between user and system mode emulation.

Signed-off-by: Sergey Fedorov 
Signed-off-by: Sergey Fedorov 
Reviewed-by: Alex Bennée 
Signed-off-by: Alex Bennée 
Message-Id: <1470158864-17651-7-git-send-email-alex.ben...@linaro.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 linux-user/main.c | 55 ---
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 8daebe0..7a056fc 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -111,17 +111,25 @@ int cpu_get_pic_interrupt(CPUX86State *env)
We don't require a full sync, only that no cpus are executing guest code.
The alternative is to map target atomic ops onto host equivalents,
which requires quite a lot of per host/target work.  */
-static pthread_mutex_t cpu_list_mutex = PTHREAD_MUTEX_INITIALIZER;
-static pthread_mutex_t exclusive_lock = PTHREAD_MUTEX_INITIALIZER;
-static pthread_cond_t exclusive_cond = PTHREAD_COND_INITIALIZER;
-static pthread_cond_t exclusive_resume = PTHREAD_COND_INITIALIZER;
+static QemuMutex cpu_list_lock;
+static QemuMutex exclusive_lock;
+static QemuCond exclusive_cond;
+static QemuCond exclusive_resume;
 static int pending_cpus;
 
+void qemu_init_cpu_loop(void)
+{
+qemu_mutex_init(&cpu_list_lock);
+qemu_mutex_init(&exclusive_lock);
+qemu_cond_init(&exclusive_cond);
+qemu_cond_init(&exclusive_resume);
+}
+
 /* Make sure everything is in a consistent state for calling fork().  */
 void fork_start(void)
 {
 qemu_mutex_lock(&tcg_ctx.tb_ctx.tb_lock);
-pthread_mutex_lock(&exclusive_lock);
+qemu_mutex_lock(&exclusive_lock);
 mmap_fork_start();
 }
 
@@ -138,14 +146,14 @@ void fork_end(int child)
 }
 }
 pending_cpus = 0;
-pthread_mutex_init(&exclusive_lock, NULL);
-pthread_mutex_init(&cpu_list_mutex, NULL);
-pthread_cond_init(&exclusive_cond, NULL);
-pthread_cond_init(&exclusive_resume, NULL);
+qemu_mutex_init(&exclusive_lock);
+qemu_mutex_init(&cpu_list_lock);
+qemu_cond_init(&exclusive_cond);
+qemu_cond_init(&exclusive_resume);
 qemu_mutex_init(&tcg_ctx.tb_ctx.tb_lock);
 gdbserver_fork(thread_cpu);
 } else {
-pthread_mutex_unlock(&exclusive_lock);
+qemu_mutex_unlock(&exclusive_lock);
 qemu_mutex_unlock(&tcg_ctx.tb_ctx.tb_lock);
 }
 }
@@ -155,7 +163,7 @@ void fork_end(int child)
 static inline void exclusive_idle(void)
 {
 while (pending_cpus) {
-pthread_cond_wait(&exclusive_resume, &exclusive_lock);
+qemu_cond_wait(&exclusive_resume, &exclusive_lock);
 }
 }
 
@@ -165,7 +173,7 @@ static inline void start_exclusive(void)
 {
 CPUState *other_cpu;
 
-pthread_mutex_lock(&exclusive_lock);
+qemu_mutex_lock(&exclusive_lock);
 exclusive_idle();
 
 pending_cpus = 1;
@@ -176,8 +184,8 @@ static inline void start_exclusive(void)
 cpu_exit(other_cpu);
 }
 }
-if (pending_cpus > 1) {
-pthread_cond_wait(&exclusive_cond, &exclusive_lock);
+while (pending_cpus > 1) {
+qemu_cond_wait(&exclusive_cond, &exclusive_lock);
 }
 }
 
@@ -185,42 +193,42 @@ static inline void start_exclusive(void)
 static inline void __attribute__((unused)) end_exclusive(void)
 {
 pending_cpus = 0;
-pthread_cond_broadcast(&exclusive_resume);
-pthread_mutex_unlock(&exclusive_lock);
+qemu_cond_broadcast(&exclusive_resume);
+qemu_mutex_unlock(&exclusive_lock);
 }
 
 /* Wait for exclusive ops to finish, and begin cpu execution.  */
 static inline void cpu_exec_start(CPUState *cpu)
 {
-pthread_mutex_lock(&exclusive_lock);
+qemu_mutex_lock(&exclusive_lock);
 exclusive_idle();
 cpu->running = true;
-pthread_mutex_unlock(&exclusive_lock);
+qemu_mutex_unlock(&exclusive_lock);
 }
 
 /* Mark cpu as not executing, and release pending exclusive ops.  */
 static inline void cpu_exec_end(CPUState *cpu)
 {
-pthread_mutex_lock(&exclusive_lock);
+qemu_mutex_lock(&exclusive_lock);
 cpu->running = false;
 if (pending_cpus > 1) {
 pending_cpus--;
 if (pending_cpus == 1) {
-pthread_cond_signal(&exclusive_cond);
+qemu_cond_signal(&exclusive_cond);
 }
 }
 exclusive_idle();
-pthread_mutex_unlock(&exclusive_lock);
+qemu_mutex_unlock(&exclusive_lock);
 }
 
 void cpu_list_lock(void)
 {
-pthread_mutex_lock(_list_mutex);
+qemu_mutex_lock(_list_lock);
 }
 
 void cpu_list_unlock(void)
 {
-pthread_mutex_unlock(_list_mutex);
+qemu_mutex_unlock(_list_lock);
 }
 
 
@@ -4211,6 +4219,7 @@ int main(int argc, char **argv, char **envp)
 int ret;
 int execfd;
 
+qemu_init_cpu_loop();
 module_call_init(MODULE_INIT_QOM);
 
 if ((envlist = envlist_create()) == NULL) {
-- 
2.7.4

[Qemu-devel] [PULL 10/28] cpus: pass CPUState to run_on_cpu helpers

2016-09-26 Thread Paolo Bonzini

From: Alex Bennée 

CPUState is a fairly common pointer to pass to these helpers. This means
if you need other arguments for the async_run_on_cpu case you end up
having to do a g_malloc to stuff additional data into the routine. For
the current users this isn't a massive deal but for MTTCG this gets
cumbersome when the only other parameter is often an address.

This adds the typedef run_on_cpu_func for helper functions which has an
explicit CPUState * passed as the first parameter. All the users of
run_on_cpu and async_run_on_cpu have had their helpers updated to use
CPUState where available.
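
As a rough illustration of the calling convention described above (not
code from this series), here is a minimal standalone sketch. CPUState is
reduced to a stub struct, and run_on_cpu_sketch() and set_halted() are
hypothetical names invented for the example; only the shape of the
run_on_cpu_func typedef is meant to mirror the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stub stand-in for QEMU's CPUState; the real struct is far larger. */
typedef struct CPUState {
    int cpu_index;
    int halted;
} CPUState;

/* Helper signature: the target CPU is passed explicitly as first argument. */
typedef void (*run_on_cpu_func)(CPUState *cpu, void *data);

/* Hypothetical helper: the CPU no longer has to be smuggled through data. */
static void set_halted(CPUState *cpu, void *data)
{
    (void)data;                 /* unused here, so callers may pass NULL */
    cpu->halted = 1;
}

/* Simplified dispatcher that always runs the helper synchronously. */
static void run_on_cpu_sketch(CPUState *cpu, run_on_cpu_func func, void *data)
{
    assert(cpu != NULL && func != NULL);
    func(cpu, data);
}
```

With this shape the data pointer stays free for a genuine extra argument
(an address, say) and no allocation is needed just to carry the CPU.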

Signed-off-by: Alex Bennée 
[Sergey Fedorov:
 - eliminate more CPUState in user data;
 - remove unnecessary user data passing;
 - fix target-s390x/kvm.c and target-s390x/misc_helper.c]
Signed-off-by: Sergey Fedorov 
Acked-by: David Gibson  (ppc parts)
Reviewed-by: Christian Borntraeger  (s390 parts)
Signed-off-by: Alex Bennée 
Message-Id: <1470158864-17651-3-git-send-email-alex.ben...@linaro.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 cpus.c | 15 ---
 hw/i386/kvm/apic.c |  5 +--
 hw/i386/kvmvapic.c |  6 +--
 hw/ppc/ppce500_spin.c  | 31 +--
 hw/ppc/spapr.c |  6 +--
 hw/ppc/spapr_hcall.c   | 17 
 include/qom/cpu.h  |  8 ++--
 kvm-all.c  | 21 --
 target-i386/helper.c   | 19 -
 target-i386/kvm.c  |  6 +--
 target-s390x/cpu.c |  4 +-
 target-s390x/cpu.h |  7 +---
 target-s390x/kvm.c | 98 +++---
 target-s390x/misc_helper.c |  4 +-
 14 files changed, 109 insertions(+), 138 deletions(-)

diff --git a/cpus.c b/cpus.c
index e39ccb7..1a2a9b0 100644
--- a/cpus.c
+++ b/cpus.c
@@ -557,9 +557,8 @@ static const VMStateDescription vmstate_timers = {
 }
 };
 
-static void cpu_throttle_thread(void *opaque)
+static void cpu_throttle_thread(CPUState *cpu, void *opaque)
 {
-CPUState *cpu = opaque;
 double pct;
 double throttle_ratio;
 long sleeptime_ns;
@@ -589,7 +588,7 @@ static void cpu_throttle_timer_tick(void *opaque)
 }
 CPU_FOREACH(cpu) {
 if (!atomic_xchg(&cpu->throttle_thread_scheduled, 1)) {
-async_run_on_cpu(cpu, cpu_throttle_thread, cpu);
+async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
 }
 }
 
@@ -917,12 +916,12 @@ void qemu_init_cpu_loop(void)
 qemu_thread_get_self(&io_thread);
 }
 
-void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
+void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
 {
 struct qemu_work_item wi;
 
 if (qemu_cpu_is_self(cpu)) {
-func(data);
+func(cpu, data);
 return;
 }
 
@@ -950,12 +949,12 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
 }
 }
 
-void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
+void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
 {
 struct qemu_work_item *wi;
 
 if (qemu_cpu_is_self(cpu)) {
-func(data);
+func(cpu, data);
 return;
 }
 
@@ -1006,7 +1005,7 @@ static void flush_queued_work(CPUState *cpu)
 cpu->queued_work_last = NULL;
 }
 qemu_mutex_unlock(&cpu->work_mutex);
-wi->func(wi->data);
+wi->func(cpu, wi->data);
 qemu_mutex_lock(&cpu->work_mutex);
 if (wi->free) {
 g_free(wi);
diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index f57fed1..c016e63 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -125,7 +125,7 @@ static void kvm_apic_vapic_base_update(APICCommonState *s)
 }
 }
 
-static void kvm_apic_put(void *data)
+static void kvm_apic_put(CPUState *cs, void *data)
 {
 APICCommonState *s = data;
 struct kvm_lapic_state kapic;
@@ -146,10 +146,9 @@ static void kvm_apic_post_load(APICCommonState *s)
 run_on_cpu(CPU(s->cpu), kvm_apic_put, s);
 }
 
-static void do_inject_external_nmi(void *data)
+static void do_inject_external_nmi(CPUState *cpu, void *data)
 {
 APICCommonState *s = data;
-CPUState *cpu = CPU(s->cpu);
 uint32_t lvt;
 int ret;
 
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index a1cd9b5..74a549b 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -483,7 +483,7 @@ typedef struct VAPICEnableTPRReporting {
 bool enable;
 } VAPICEnableTPRReporting;
 
-static void vapic_do_enable_tpr_reporting(void *data)
+static void vapic_do_enable_tpr_reporting(CPUState *cpu, void *data)
 {
 VAPICEnableTPRReporting *info = data;
 
@@ -734,10 +734,10 @@ static void vapic_realize(DeviceState *dev, Error **errp)
 nb_option_roms++;
 }
 
-static void do_vapic_enable(void *data)
+static void

[Qemu-devel] [PULL 21/28] cpus-common: remove redundant call to exclusive_idle()

2016-09-26 Thread Paolo Bonzini

No need to call exclusive_idle() from cpu_exec_end since it is done
immediately afterwards in cpu_exec_start.  Any exclusive section could
run as soon as cpu_exec_end leaves, because cpu->running is false and the
mutex is not taken, so the call does not add any protection either.

Reviewed-by: Richard Henderson 
Reviewed-by: Alex Bennée 
Signed-off-by: Paolo Bonzini 
---
 cpus-common.c  | 1 -
 docs/tcg-exclusive.promela | 1 -
 2 files changed, 2 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 115f3d4..80aaf9b 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -221,7 +221,6 @@ void cpu_exec_end(CPUState *cpu)
 qemu_cond_signal(&exclusive_cond);
 }
 }
-exclusive_idle();
 qemu_mutex_unlock(&qemu_cpu_list_lock);
 }
 
diff --git a/docs/tcg-exclusive.promela b/docs/tcg-exclusive.promela
index 5889b40..8bb0967 100644
--- a/docs/tcg-exclusive.promela
+++ b/docs/tcg-exclusive.promela
@@ -124,7 +124,6 @@ byte has_waiter[N_CPUS];
 }\
 :: else -> skip; \
 fi;  \
-exclusive_idle();\
 MUTEX_UNLOCK(mutex);
 
 // Promela processes
-- 
2.7.4

[Qemu-devel] [PULL 18/28] cpus-common: move exclusive work infrastructure from linux-user

2016-09-26 Thread Paolo Bonzini

This will serve as the base for async_safe_run_on_cpu.  Because
start_exclusive uses CPU_FOREACH, merge exclusive_lock with
qemu_cpu_list_lock: together with a call to exclusive_idle (via
cpu_exec_start/end) in cpu_list_add, this protects exclusive work
against concurrent CPU addition and removal.
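
For readers who want to experiment with the scheme outside QEMU, the
following is a minimal sketch of the same handshake using plain pthreads.
It is an approximation under stated assumptions: SketchCPU, the fixed
cpus[] array and list_lock are hypothetical stand-ins for CPUState, the
CPU list and qemu_cpu_list_lock, and the kick of running CPUs is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define N_CPUS 4

/* Stub CPU: only the flag that the exclusive-section logic looks at. */
typedef struct SketchCPU {
    bool running;
} SketchCPU;

static SketchCPU cpus[N_CPUS];
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t exclusive_cond = PTHREAD_COND_INITIALIZER;
static pthread_cond_t exclusive_resume = PTHREAD_COND_INITIALIZER;
static int pending_cpus;   /* 0: idle; N: requester plus N-1 running CPUs */

/* Wait for a pending exclusive section to finish; list_lock must be held. */
static void exclusive_idle(void)
{
    while (pending_cpus) {
        pthread_cond_wait(&exclusive_resume, &list_lock);
    }
}

/* Begin an exclusive section; returns with list_lock held. */
static void start_exclusive(void)
{
    pthread_mutex_lock(&list_lock);
    exclusive_idle();

    pending_cpus = 1;
    for (int i = 0; i < N_CPUS; i++) {
        if (cpus[i].running) {
            pending_cpus++;    /* the real code also kicks this CPU */
        }
    }
    while (pending_cpus > 1) {
        pthread_cond_wait(&exclusive_cond, &list_lock);
    }
}

/* End the exclusive section and release list_lock. */
static void end_exclusive(void)
{
    assert(pending_cpus == 1);
    pending_cpus = 0;
    pthread_cond_broadcast(&exclusive_resume);
    pthread_mutex_unlock(&list_lock);
}

/* Wait for exclusive work to finish, then mark the CPU as running. */
static void cpu_exec_start(SketchCPU *cpu)
{
    pthread_mutex_lock(&list_lock);
    exclusive_idle();
    cpu->running = true;
    pthread_mutex_unlock(&list_lock);
}

/* Mark the CPU as stopped and wake the requester once all have stopped. */
static void cpu_exec_end(SketchCPU *cpu)
{
    pthread_mutex_lock(&list_lock);
    cpu->running = false;
    if (pending_cpus > 1) {
        pending_cpus--;
        if (pending_cpus == 1) {
            pthread_cond_signal(&exclusive_cond);
        }
    }
    exclusive_idle();
    pthread_mutex_unlock(&list_lock);
}
```

Both waits are loops rather than single checks because a condition
variable wait may return spuriously.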

Reviewed-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 bsd-user/main.c   | 17 ---
 cpus-common.c | 82 +++
 cpus.c|  2 ++
 include/qom/cpu.h | 44 +++-
 linux-user/main.c | 87 ---
 5 files changed, 127 insertions(+), 105 deletions(-)

diff --git a/bsd-user/main.c b/bsd-user/main.c
index 6dfa912..35125b7 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -67,23 +67,6 @@ int cpu_get_pic_interrupt(CPUX86State *env)
 }
 #endif
 
-/* These are no-ops because we are not threadsafe.  */
-static inline void cpu_exec_start(CPUState *cpu)
-{
-}
-
-static inline void cpu_exec_end(CPUState *cpu)
-{
-}
-
-static inline void start_exclusive(void)
-{
-}
-
-static inline void end_exclusive(void)
-{
-}
-
 void fork_start(void)
 {
 }
diff --git a/cpus-common.c b/cpus-common.c
index d6cd426..7d935fd 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -23,11 +23,21 @@
 #include "sysemu/cpus.h"
 
 static QemuMutex qemu_cpu_list_lock;
+static QemuCond exclusive_cond;
+static QemuCond exclusive_resume;
 static QemuCond qemu_work_cond;
 
+static int pending_cpus;
+
 void qemu_init_cpu_list(void)
 {
+/* This is needed because qemu_init_cpu_list is also called by the
+ * child process in a fork.  */
+pending_cpus = 0;
+
 qemu_mutex_init(&qemu_cpu_list_lock);
+qemu_cond_init(&exclusive_cond);
+qemu_cond_init(&exclusive_resume);
 qemu_cond_init(&qemu_work_cond);
 }
 
@@ -55,6 +65,12 @@ static int cpu_get_free_index(void)
 return cpu_index;
 }
 
+static void finish_safe_work(CPUState *cpu)
+{
+cpu_exec_start(cpu);
+cpu_exec_end(cpu);
+}
+
 void cpu_list_add(CPUState *cpu)
 {
 qemu_mutex_lock(&qemu_cpu_list_lock);
@@ -66,6 +82,8 @@ void cpu_list_add(CPUState *cpu)
 }
 QTAILQ_INSERT_TAIL(&cpus, cpu, node);
 qemu_mutex_unlock(&qemu_cpu_list_lock);
+
+finish_safe_work(cpu);
 }
 
 void cpu_list_remove(CPUState *cpu)
@@ -148,6 +166,70 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
 queue_work_on_cpu(cpu, wi);
 }
 
+/* Wait for pending exclusive operations to complete.  The CPU list lock
+   must be held.  */
+static inline void exclusive_idle(void)
+{
+while (pending_cpus) {
+qemu_cond_wait(&exclusive_resume, &qemu_cpu_list_lock);
+}
+}
+
+/* Start an exclusive operation.
+   Must only be called from outside cpu_exec, takes
+   qemu_cpu_list_lock.   */
+void start_exclusive(void)
+{
+CPUState *other_cpu;
+
+qemu_mutex_lock(&qemu_cpu_list_lock);
+exclusive_idle();
+
+/* Make all other cpus stop executing.  */
+pending_cpus = 1;
+CPU_FOREACH(other_cpu) {
+if (other_cpu->running) {
+pending_cpus++;
+qemu_cpu_kick(other_cpu);
+}
+}
+while (pending_cpus > 1) {
+qemu_cond_wait(&exclusive_cond, &qemu_cpu_list_lock);
+}
+}
+
+/* Finish an exclusive operation.  Releases qemu_cpu_list_lock.  */
+void end_exclusive(void)
+{
+pending_cpus = 0;
+qemu_cond_broadcast(&exclusive_resume);
+qemu_mutex_unlock(&qemu_cpu_list_lock);
+}
+
+/* Wait for exclusive ops to finish, and begin cpu execution.  */
+void cpu_exec_start(CPUState *cpu)
+{
+qemu_mutex_lock(&qemu_cpu_list_lock);
+exclusive_idle();
+cpu->running = true;
+qemu_mutex_unlock(&qemu_cpu_list_lock);
+}
+
+/* Mark cpu as not executing, and release pending exclusive ops.  */
+void cpu_exec_end(CPUState *cpu)
+{
+qemu_mutex_lock(&qemu_cpu_list_lock);
+cpu->running = false;
+if (pending_cpus > 1) {
+pending_cpus--;
+if (pending_cpus == 1) {
+qemu_cond_signal(&exclusive_cond);
+}
+}
+exclusive_idle();
+qemu_mutex_unlock(&qemu_cpu_list_lock);
+}
+
 void process_queued_cpu_work(CPUState *cpu)
 {
 struct qemu_work_item *wi;
diff --git a/cpus.c b/cpus.c
index c3afd18..fbd70f5 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1457,7 +1457,9 @@ static int tcg_cpu_exec(CPUState *cpu)
 cpu->icount_decr.u16.low = decr;
 cpu->icount_extra = count;
 }
+cpu_exec_start(cpu);
 ret = cpu_exec(cpu);
+cpu_exec_end(cpu);
 #ifdef CONFIG_PROFILER
 tcg_time += profile_getclock() - ti;
 #endif
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index c04e510..f872614 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -242,7 +242,8 @@ struct qemu_work_item;
  * @nr_threads: Number of threads within this CPU.
  * @numa_node: NUMA node this CPU is belonging to.
  * @host_tid: Host thread ID.
- * @running: #true if CPU is currently running (usermode).
+ * @running: #true if CPU is currently running;
+ *

[Qemu-devel] [PULL 24/28] tcg: Make tb_flush() thread safe

2016-09-26 Thread Paolo Bonzini

From: Sergey Fedorov 

Use async_safe_run_on_cpu() to make tb_flush() thread safe.  This is
possible now that code generation does not happen in the middle of
execution.

It can happen that multiple threads schedule a safe work to flush the
translation buffer. To keep statistics and debugging output sane, always
check if the translation buffer has already been flushed.
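
The "already flushed" check can be pictured with a small standalone
sketch. It assumes the requester snapshots a flush counter when it
schedules the work and the deferred work compares against it;
request_flush_sketch() and do_flush_sketch() are hypothetical names, and
the actual flushing is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned tb_flush_count;   /* number of flushes performed so far */

/* Requester side: remember the counter value seen when asking for a flush. */
static unsigned request_flush_sketch(void)
{
    return tb_flush_count;
}

/* Deferred work: flush only if nobody has flushed since the request. */
static bool do_flush_sketch(unsigned tb_flush_req)
{
    if (tb_flush_count != tb_flush_req) {
        return false;             /* another request already did the work */
    }
    /* ... the real code would discard all translation blocks here ... */
    tb_flush_count++;
    return true;
}
```

Two requests made against the same counter value therefore result in a
single flush, which keeps the statistics from being counted twice.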

Signed-off-by: Sergey Fedorov 
Signed-off-by: Sergey Fedorov 
[AJB: minor re-base fixes]
Signed-off-by: Alex Bennée 
Message-Id: <1470158864-17651-13-git-send-email-alex.ben...@linaro.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 cpu-exec.c| 12 ++--
 include/exec/tb-context.h |  2 +-
 include/qom/cpu.h |  2 --
 translate-all.c   | 38 --
 4 files changed, 31 insertions(+), 23 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 9f4bd0b..8823d23 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -204,20 +204,16 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
  TranslationBlock *orig_tb, bool ignore_icount)
 {
 TranslationBlock *tb;
-bool old_tb_flushed;
 
 /* Should never happen.
We only end up here when an existing TB is too long.  */
 if (max_cycles > CF_COUNT_MASK)
 max_cycles = CF_COUNT_MASK;
 
-old_tb_flushed = cpu->tb_flushed;
-cpu->tb_flushed = false;
 tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
  max_cycles | CF_NOCACHE
  | (ignore_icount ? CF_IGNORE_ICOUNT : 0));
-tb->orig_tb = cpu->tb_flushed ? NULL : orig_tb;
-cpu->tb_flushed |= old_tb_flushed;
+tb->orig_tb = orig_tb;
 /* execute the generated code */
 trace_exec_tb_nocache(tb, tb->pc);
 cpu_tb_exec(cpu, tb);
@@ -338,10 +334,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
 tb_lock();
 have_tb_lock = true;
 }
-/* Check if translation buffer has been flushed */
-if (cpu->tb_flushed) {
-cpu->tb_flushed = false;
-} else if (!tb->invalid) {
+if (!tb->invalid) {
 tb_add_jump(last_tb, tb_exit, tb);
 }
 }
@@ -606,7 +599,6 @@ int cpu_exec(CPUState *cpu)
 break;
 }
 
-atomic_mb_set(&cpu->tb_flushed, false); /* reset before first TB lookup */
 for(;;) {
 cpu_handle_interrupt(cpu, &last_tb);
 tb = tb_find(cpu, last_tb, tb_exit);
diff --git a/include/exec/tb-context.h b/include/exec/tb-context.h
index dce95d9..c7f17f2 100644
--- a/include/exec/tb-context.h
+++ b/include/exec/tb-context.h
@@ -38,7 +38,7 @@ struct TBContext {
 QemuMutex tb_lock;
 
 /* statistics */
-int tb_flush_count;
+unsigned tb_flush_count;
 int tb_phys_invalidate_count;
 };
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 4092dd9..5dfe74a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -253,7 +253,6 @@ struct qemu_work_item;
  * @crash_occurred: Indicates the OS reported a crash (panic) for this CPU
  * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
  *   CPU and return to its top level loop.
- * @tb_flushed: Indicates the translation buffer has been flushed.
  * @singlestep_enabled: Flags for single-stepping.
  * @icount_extra: Instructions until next timer event.
  * @icount_decr: Number of cycles left, with interrupt flag in high bit.
@@ -306,7 +305,6 @@ struct CPUState {
 bool unplug;
 bool crash_occurred;
 bool exit_request;
-bool tb_flushed;
 uint32_t interrupt_request;
 int singlestep_enabled;
 int64_t icount_extra;
diff --git a/translate-all.c b/translate-all.c
index e9bc90c..8ca393c 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -834,12 +834,19 @@ static void page_flush_tb(void)
 }
 
 /* flush all the translation blocks */
-/* XXX: tb_flush is currently not thread safe */
-void tb_flush(CPUState *cpu)
+static void do_tb_flush(CPUState *cpu, void *data)
 {
-if (!tcg_enabled()) {
-return;
+unsigned tb_flush_req = (unsigned) (uintptr_t) data;
+
+tb_lock();
+
+/* If it's already been done on request of another CPU,
+ * just retry.
+ */
+if (tcg_ctx.tb_ctx.tb_flush_count != tb_flush_req) {
+goto done;
 }
+
 #if defined(DEBUG_FLUSH)
 printf("qemu: flush code_size=%ld nb_tbs=%d avg_tb_size=%ld\n",
(unsigned long)(tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer),
@@ -858,7 +865,6 @@ void tb_flush(CPUState *cpu)
 for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) {
 atomic_set(&cpu->tb_jmp_cache[i], NULL);
 }
-atomic_mb_set(&cpu->tb_flushed, true);
 }
 
 tcg_ctx.tb_ctx.nb_tbs = 0;
@@ -868,7 +874,19 @@ void tb_flush(CPUState *cpu)

[Qemu-devel] [PULL 27/28] replay: vmstate for replay module

2016-09-26 Thread Paolo Bonzini

From: Pavel Dovgalyuk 

This patch introduces vmstate for replay data structures.
It allows saving and loading vmstate while replaying.
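
The core idea, recording the log position before a save and seeking back
to it after a load, can be sketched with standard C stdio. This is only
an illustration: SketchState, sketch_pre_save() and sketch_post_load()
are hypothetical names, and plain ftell()/fseek() stand in for the
64-bit variants used in the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Minimal stand-in for the part of the replay state that gets saved. */
typedef struct SketchState {
    long file_offset;
} SketchState;

/* Before saving: remember where we currently are in the log file. */
static void sketch_pre_save(FILE *log, SketchState *state)
{
    assert(log != NULL);
    state->file_offset = ftell(log);
}

/* After loading: move the log file back to the remembered position. */
static int sketch_post_load(FILE *log, const SketchState *state)
{
    assert(log != NULL);
    return fseek(log, state->file_offset, SEEK_SET);
}
```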

Signed-off-by: Pavel Dovgalyuk 
Message-Id: <20160926080810.6992.68420.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini 
---
 replay/Makefile.objs |  1 +
 replay/replay-internal.h |  9 
 replay/replay-snapshot.c | 60 
 replay/replay.c  |  1 +
 4 files changed, 71 insertions(+)
 create mode 100644 replay/replay-snapshot.c

diff --git a/replay/Makefile.objs b/replay/Makefile.objs
index fcb3f74..c8ad3eb 100644
--- a/replay/Makefile.objs
+++ b/replay/Makefile.objs
@@ -4,3 +4,4 @@ common-obj-y += replay-events.o
 common-obj-y += replay-time.o
 common-obj-y += replay-input.o
 common-obj-y += replay-char.o
+common-obj-y += replay-snapshot.o
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 9b02d7d..e07eb7d 100644
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -66,6 +66,8 @@ typedef struct ReplayState {
 unsigned int data_kind;
 /*! Flag which indicates that event is not processed yet. */
 unsigned int has_unread_data;
+/*! Temporary variable for saving current log offset. */
+uint64_t file_offset;
 } ReplayState;
 extern ReplayState replay_state;
 
@@ -157,4 +159,11 @@ void replay_event_char_read_save(void *opaque);
 /*! Reads char event read from the file. */
 void *replay_event_char_read_load(void);
 
+/* VMState-related functions */
+
+/* Registers replay VMState.
+   Should be called before virtual devices initialization
+   to make cached timers available for post_load functions. */
+void replay_vmstate_register(void);
+
 #endif
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
new file mode 100644
index 000..bd95dcd
--- /dev/null
+++ b/replay/replay-snapshot.c
@@ -0,0 +1,60 @@
+/*
+ * replay-snapshot.c
+ *
+ * Copyright (c) 2010-2016 Institute for System Programming
+ * of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "sysemu/replay.h"
+#include "replay-internal.h"
+#include "sysemu/sysemu.h"
+#include "monitor/monitor.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/error-report.h"
+#include "migration/vmstate.h"
+
+static void replay_pre_save(void *opaque)
+{
+ReplayState *state = opaque;
+state->file_offset = ftello64(replay_file);
+}
+
+static int replay_post_load(void *opaque, int version_id)
+{
+ReplayState *state = opaque;
+fseeko64(replay_file, state->file_offset, SEEK_SET);
+/* If this was a vmstate, saved in recording mode,
+   we need to initialize replay data fields. */
+replay_fetch_data_kind();
+
+return 0;
+}
+
+static const VMStateDescription vmstate_replay = {
+.name = "replay",
+.version_id = 1,
+.minimum_version_id = 1,
+.pre_save = replay_pre_save,
+.post_load = replay_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_INT64_ARRAY(cached_clock, ReplayState, REPLAY_CLOCK_COUNT),
+VMSTATE_UINT64(current_step, ReplayState),
+VMSTATE_INT32(instructions_count, ReplayState),
+VMSTATE_UINT32(data_kind, ReplayState),
+VMSTATE_UINT32(has_unread_data, ReplayState),
+VMSTATE_UINT64(file_offset, ReplayState),
+VMSTATE_END_OF_LIST()
+},
+};
+
+void replay_vmstate_register(void)
+{
+vmstate_register(NULL, 0, &vmstate_replay, &replay_state);
+}
diff --git a/replay/replay.c b/replay/replay.c
index cc2238d..c797aea 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -292,6 +292,7 @@ void replay_configure(QemuOpts *opts)
 exit(1);
 }
 
+replay_vmstate_register();
 replay_enable(fname, mode);
 
 out:
-- 
2.7.4

[Qemu-devel] [PULL 22/28] cpus-common: simplify locking for start_exclusive/end_exclusive

2016-09-26 Thread Paolo Bonzini

It is not necessary to hold qemu_cpu_list_lock throughout the
exclusive section, because no other exclusive section can run
while pending_cpus != 0.

exclusive_idle() is called in cpu_exec_start(), and that prevents
any CPUs created after start_exclusive() from entering cpu_exec()
during an exclusive section.

Reviewed-by: Richard Henderson 
Reviewed-by: Alex Bennée 
Signed-off-by: Paolo Bonzini 
---
 cpus-common.c  | 11 ---
 docs/tcg-exclusive.promela |  4 +++-
 include/qom/cpu.h  |  4 
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 80aaf9b..429652c 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -171,8 +171,7 @@ static inline void exclusive_idle(void)
 }
 
 /* Start an exclusive operation.
-   Must only be called from outside cpu_exec, takes
-   qemu_cpu_list_lock.   */
+   Must only be called from outside cpu_exec.  */
 void start_exclusive(void)
 {
 CPUState *other_cpu;
@@ -191,11 +190,17 @@ void start_exclusive(void)
 while (pending_cpus > 1) {
 qemu_cond_wait(&exclusive_cond, &qemu_cpu_list_lock);
 }
+
+/* Can release mutex, no one will enter another exclusive
+ * section until end_exclusive resets pending_cpus to 0.
+ */
+qemu_mutex_unlock(&qemu_cpu_list_lock);
 }
 
-/* Finish an exclusive operation.  Releases qemu_cpu_list_lock.  */
+/* Finish an exclusive operation.  */
 void end_exclusive(void)
 {
+qemu_mutex_lock(&qemu_cpu_list_lock);
 pending_cpus = 0;
 qemu_cond_broadcast(&exclusive_resume);
 qemu_mutex_unlock(&qemu_cpu_list_lock);
diff --git a/docs/tcg-exclusive.promela b/docs/tcg-exclusive.promela
index 8bb0967..feac679 100644
--- a/docs/tcg-exclusive.promela
+++ b/docs/tcg-exclusive.promela
@@ -98,9 +98,11 @@ byte has_waiter[N_CPUS];
 do\
   :: pending_cpus > 1 -> COND_WAIT(exclusive_cond, mutex);\
   :: else -> break;   \
-od
+od;   \
+MUTEX_UNLOCK(mutex);
 
 #define end_exclusive()   \
+MUTEX_LOCK(mutex);\
 pending_cpus = 0; \
 COND_BROADCAST(exclusive_resume); \
 MUTEX_UNLOCK(mutex);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index f872614..934c07a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -846,9 +846,6 @@ void cpu_exec_end(CPUState *cpu);
  * cpu_exec are exited immediately.  CPUs that call cpu_exec_start
  * during the exclusive section go to sleep until this CPU calls
  * end_exclusive.
- *
- * Returns with the CPU list lock taken (which nests outside all
- * other locks except the BQL).
  */
 void start_exclusive(void);
 
@@ -856,7 +853,6 @@ void start_exclusive(void);
  * end_exclusive:
  *
  * Concludes an exclusive execution section started by start_exclusive.
- * Releases the CPU list lock.
  */
 void end_exclusive(void);
 
-- 
2.7.4

[Qemu-devel] [PULL 20/27] e1000e: Fix CTRL_EXT.EIAME behavior

2016-09-26 Thread Jason Wang

From: Dmitry Fleytman 

CTRL_EXT.EIAME bit controls clearing of IAM bits,
but current code clears IMS bits instead.

See spec. 10.2.2.5 Extended Device Control Register.

Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000e_core.c | 4 ++--
 hw/net/trace-events  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index e8d50f6..a198a88 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2008,8 +2008,8 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t cause, uint32_t int_cfg)
 }
 
 if (core->mac[CTRL_EXT] & E1000_CTRL_EXT_EIAME) {
-trace_e1000e_irq_ims_clear_eiame(core->mac[IAM], cause);
-e1000e_clear_ims_bits(core, core->mac[IAM] & cause);
+trace_e1000e_irq_iam_clear_eiame(core->mac[IAM], cause);
+core->mac[IAM] &= ~cause;
 }
 
 trace_e1000e_irq_icr_clear_eiac(core->mac[ICR], core->mac[EIAC]);
diff --git a/hw/net/trace-events b/hw/net/trace-events
index 47ab14a..1a5c909 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -223,7 +223,7 @@ e1000e_irq_icr_read_entry(uint32_t icr) "Starting ICR read. Current ICR: 0x%x"
 e1000e_irq_icr_read_exit(uint32_t icr) "Ending ICR read. Current ICR: 0x%x"
 e1000e_irq_icr_clear_zero_ims(void) "Clearing ICR on read due to zero IMS"
 e1000e_irq_icr_clear_iame(void) "Clearing ICR on read due to IAME"
-e1000e_irq_ims_clear_eiame(uint32_t iam, uint32_t cause) "Clearing IMS due to EIAME, IAM: 0x%X, cause: 0x%X"
+e1000e_irq_iam_clear_eiame(uint32_t iam, uint32_t cause) "Clearing IMS due to EIAME, IAM: 0x%X, cause: 0x%X"
 e1000e_irq_icr_clear_eiac(uint32_t icr, uint32_t eiac) "Clearing ICR bits due to EIAC, ICR: 0x%X, EIAC: 0x%X"
 e1000e_irq_ims_clear_set_imc(uint32_t val) "Clearing IMS bits due to IMC write 0x%x"
 e1000e_irq_fire_delayed_interrupts(void) "Firing delayed interrupts"
-- 
2.7.4

Re: [Qemu-devel] [PATCH v3 0/2] Produce better termination message

2016-09-26 Thread Michal Privoznik

On 22.09.2016 18:43, Paolo Bonzini wrote:
> 
> 
> On 21/09/2016 18:27, Michal Privoznik wrote:
>> This is v2 of:
>> http://lists.nongnu.org/archive/html/qemu-devel/2016-09/msg05058.html
>>
>> Diff to v2:
>> - In 1/2 I've dropped stdio funcs in favour of g_file_get_contents() (thanks Dan!)
>>
>> Michal Privoznik (2):
>>   util: Introduce qemu_get_pid_name
>>   qemu_kill_report: Report PID name too
>>
>>  include/qemu/osdep.h | 10 ++
>>  util/oslib-posix.c   | 27 +++
>>  util/oslib-win32.c   |  7 +++
>>  vl.c |  8 ++--
>>  4 files changed, 50 insertions(+), 2 deletions(-)
>>
> 
> Patch 2/2 breaks "make check".  You cannot call malloc from a signal
> handler, and this shows as a deadlock in
> /x86_64/virtio/scsi/pci/hotplug.  You have to use the large buffer,
> _but_ I cannot just keep patch 2 because you also have to use
> open/read/close instead of stdio.

Huh, this has become more hairy than I initially thought. An
alternative suggestion might be to not call the PID->name translate function
from the signal handler, but to call it just from qemu_kill_report().
Yes, this will increase the chances of reporting an incorrect process name,
but there's no way to make this 100% correct. I mean, even at the time
that our signal callback is run, the sender might be dead already and
the kernel might have spawned a different process under the same PID.
Therefore I guess there's no real harm in doing the translation later.
Moreover, if we want this to work on *BSD-s (where a libutil function
is called which does malloc), then we must call the translate function
from a safe place. On the other hand, malloc there could be reentrant.
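
For comparison, the open/read/close variant with a caller-supplied
buffer, as suggested in the quoted review, could look roughly like the
sketch below. It is an assumption-laden illustration, not the code from
the series: build_comm_path() and read_pid_name() are hypothetical
names, and /proc/<pid>/comm is a Linux-specific path picked for the
example. The path is assembled by hand so that no stdio or malloc call
is needed.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "/proc/<pid>/comm" without stdio; returns its length or -1. */
static int build_comm_path(pid_t pid, char *out, size_t cap)
{
    static const char prefix[] = "/proc/";
    static const char suffix[] = "/comm";
    char digits[24];
    size_t nd = 0, pos = 0;
    unsigned long v = (unsigned long)pid;

    if (pid <= 0) {
        return -1;
    }
    do {                                  /* decimal digits, reversed */
        digits[nd++] = (char)('0' + v % 10);
        v /= 10;
    } while (v);
    if (sizeof(prefix) - 1 + nd + sizeof(suffix) > cap) {
        return -1;                        /* would not fit, NUL included */
    }
    for (size_t i = 0; i < sizeof(prefix) - 1; i++) {
        out[pos++] = prefix[i];
    }
    while (nd) {
        out[pos++] = digits[--nd];
    }
    for (size_t i = 0; i < sizeof(suffix); i++) {
        out[pos++] = suffix[i];           /* copies the trailing NUL too */
    }
    return (int)(pos - 1);
}

/* Read the process name into buf using only open/read/close. */
static ssize_t read_pid_name(pid_t pid, char *buf, size_t cap)
{
    char path[64];
    ssize_t n;
    int fd;

    if (cap == 0 || build_comm_path(pid, path, sizeof(path)) < 0) {
        return -1;
    }
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    n = read(fd, buf, cap - 1);
    close(fd);
    if (n < 0) {
        return -1;
    }
    if (n > 0 && buf[n - 1] == '\n') {
        n--;                              /* drop the trailing newline */
    }
    buf[n] = '\0';
    return n;
}
```

Even this does not close the PID reuse window described above; it only
avoids the allocation inside the handler.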

Michal

[Qemu-devel] Questions about gcc linker errors in crypto sub-directory

2016-09-26 Thread Gonglei (Arei)

Hi Daniel,

I'm coding cryptodev-vhost-user.c as a new cryptodev backend,
but gcc reports some linker errors:

crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_crypto_cleanup':
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:87: undefined reference to `vhost_dev_cleanup'
crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_crypto_init':
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:126: undefined reference to `vhost_dev_init'
crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_opened':
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:187: undefined reference to `qemu_chr_find'
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:194: undefined reference to `qemu_chr_fe_claim_no_fail'
crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_event':
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:213: undefined reference to `qemu_chr_set_reconnect_time'
crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_init':
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:257: undefined reference to `qemu_chr_add_handlers'
crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_finalize':
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:353: undefined reference to `qemu_chr_add_handlers'
/mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:354: undefined reference to `qemu_chr_fe_release'
make: *** [qemu-nbd] Error 1

Currently I only change the crypto/Makefile.objs:

diff --git a/crypto/Makefile.objs b/crypto/Makefile.objs
index b9ad26a..575f64e 100644
--- a/crypto/Makefile.objs
+++ b/crypto/Makefile.objs
@@ -28,6 +28,7 @@ crypto-obj-y += block-qcow.o
 crypto-obj-y += block-luks.o
 crypto-obj-y += cryptodev.o
 crypto-obj-y += cryptodev-builtin.o
+crypto-obj-y += cryptodev-vhost-user.o
 
 # Let the userspace emulators avoid linking gnutls/etc
 crypto-aes-obj-y = aes.o

Is there anything else I need to change? Thanks!

Regards,
-Gonglei

Re: [Qemu-devel] Questions about gcc linker errors in crypto sub-directory

2016-09-26 Thread Daniel P. Berrange

On Mon, Sep 26, 2016 at 09:03:45AM +, Gonglei (Arei) wrote:
> Hi Daniel,
> 
> I'm coding cryptodev-vhost-user.c as a new cryptodev backend,
> but the gcc report some linker errors:
> 
> crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_crypto_cleanup':
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:87: undefined reference to `vhost_dev_cleanup'
> crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_crypto_init':
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:126: undefined reference to `vhost_dev_init'
> crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_opened':
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:187: undefined reference to `qemu_chr_find'
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:194: undefined reference to `qemu_chr_fe_claim_no_fail'
> crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_event':
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:213: undefined reference to `qemu_chr_set_reconnect_time'
> crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_init':
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:257: undefined reference to `qemu_chr_add_handlers'
> crypto/cryptodev-vhost-user.o: In function `qcrypto_cryptodev_vhost_user_finalize':
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:353: undefined reference to `qemu_chr_add_handlers'
> /mnt/sdb/gonglei/qemu.git/qemu/crypto/cryptodev-vhost-user.c:354: undefined reference to `qemu_chr_fe_release'
> collect2: ld returned 1 exit status
> make: *** [qemu-nbd] Error 1
> 
> Currently I only change the crypto/Makefile.objs:
> 
> diff --git a/crypto/Makefile.objs b/crypto/Makefile.objs
> index b9ad26a..575f64e 100644
> --- a/crypto/Makefile.objs
> +++ b/crypto/Makefile.objs
> @@ -28,6 +28,7 @@ crypto-obj-y += block-qcow.o
>  crypto-obj-y += block-luks.o
>  crypto-obj-y += cryptodev.o
>  crypto-obj-y += cryptodev-builtin.o
> +crypto-obj-y += cryptodev-vhost-user.o

The $(crypto-obj-y) variable is intended to only contain general purpose
crypto code, since it needs to be linked to all QEMU programs. Your
cryptodev file is specific to system emulators, so must only be linked
to the system emulator targets. Thus, it should not be added to the
crypto-obj-y variable.

It probably needs to be added to either common-obj-y or obj-y - I can't
remember which is "best"

Regards,
Daniel

[Qemu-devel] [PULL 26/27] mcf_fec: fix error in qemu_send_packet argument

2016-09-26 Thread Jason Wang

From: Paolo Bonzini 

This uses the wrong frame size for packets composed of multiple
descriptors.
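
A minimal sketch of why the accumulated size is the right argument:
when a frame is gathered from several buffers, the length of the last
buffer alone undercounts it. gather_frame() and its parameters are
hypothetical and only loosely modelled on the transmit loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy n buffers back to back into frame (capacity cap) and return the
 * total number of bytes gathered, truncating if the frame is full.
 */
static size_t gather_frame(uint8_t *frame, size_t cap,
                           const uint8_t *const bufs[],
                           const size_t lens[], size_t n)
{
    size_t frame_size = 0;

    assert(frame != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t len = lens[i];

        if (len > cap - frame_size) {
            len = cap - frame_size;       /* do not overflow the frame */
        }
        memcpy(frame + frame_size, bufs[i], len);
        frame_size += len;
    }
    return frame_size;                    /* not the last buffer's len */
}
```

With two buffers of 3 and 5 bytes the frame is 8 bytes long, whereas
the last buffer's length alone would report 5.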

Signed-off-by: Paolo Bonzini 
Signed-off-by: Jason Wang 
---
 hw/net/mcf_fec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/mcf_fec.c b/hw/net/mcf_fec.c
index d31fea1..dc61bac 100644
--- a/hw/net/mcf_fec.c
+++ b/hw/net/mcf_fec.c
@@ -177,7 +177,7 @@ static void mcf_fec_do_tx(mcf_fec_state *s)
 if (bd.flags & FEC_BD_L) {
 /* Last buffer in frame.  */
 DPRINTF("Sending packet\n");
-qemu_send_packet(qemu_get_queue(s->nic), frame, len);
+qemu_send_packet(qemu_get_queue(s->nic), frame, frame_size);
 ptr = frame;
 frame_size = 0;
 s->eir |= FEC_INT_TXF;
-- 
2.7.4

Re: [Qemu-devel] [Qemu-block] [PATCH] MAINTAINERS: Add some more headers to the IDE section

2016-09-26 Thread Thomas Huth

On 26.09.2016 10:22, Kevin Wolf wrote:
> Am 23.09.2016 um 18:42 hat John Snow geschrieben:
>> On 09/23/2016 12:09 PM, Thomas Huth wrote:
>>> The folder include/hw/ide/ belongs to the IDE section.
>>>
>>> Signed-off-by: Thomas Huth 
>>> ---
>>> MAINTAINERS | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index d8a0cfc..acf6d6c 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -791,6 +791,7 @@ M: John Snow 
>>> L: qemu-bl...@nongnu.org
>>> S: Supported
>>> F: include/hw/ide.h
>>> +F: include/hw/ide/
>>> F: hw/ide/
>>> F: hw/block/block.c
>>> F: hw/block/cdrom.c
>>>
>>
>> Ah, yeah. These got missed when they were moved over. Thanks.
>>
>> Reviewed-by: John Snow 
> 
> Who is supposed to merge this if you only give an R-b?

I've CC:ed this patch to qemu-trivial, so I hope it will get picked up
there if John does not want to apply this directly.

 Thomas





[Qemu-devel] [PATCH v3 0/3] object: Add 'help' option for all available backends and properties

2016-09-26 Thread Lin Ma

Print available object backend types and the relevant properties.

V2->v3:
* make type user-creatable abstract.
* auto generate enum value strings during qemu configuration. (Borrowed
Daniel's code)
* save the generated enum value strings into member description of 
ObjectProperty.
* drop the judgement logic of whether a property has an enumeration type 
anymore,
  output member description of ObjectProperty directly.
* at least, user_creatable_help_func should be put after
  'object_property_add_child(object_get_root(), 
"machine",OBJECT(current_machine), ...)',
  because host_memory_backend_init needs to access an instance of type machine.

V1->V2:
* Output the acceptable values of enum types by "-object TYPE-NAME,help"

Lin Ma (3):
  qom: make base type user-creatable abstract
  qapi: auto generate enum value strings
  object: Add 'help' option for all available backends and properties

 backends/hostmem.c  |  4 
 crypto/secret.c |  4 
 crypto/tlscreds.c   |  4 
 include/qom/object_interfaces.h |  2 ++
 net/filter.c|  4 
 qemu-options.hx |  7 +-
 qom/object_interfaces.c | 49 +
 scripts/qapi-types.py   |  2 ++
 scripts/qapi.py |  9 
 vl.c|  5 +
 10 files changed, 89 insertions(+), 1 deletion(-)

-- 
2.9.2

[Qemu-devel] [PATCH v4]MC146818 RTC: coordinate guest clock base to destination host after migration

2016-09-26 Thread zhong...@sangfor.com.cn

MC146818 RTC: coordinate guest clock base to destination host after migration

qemu tracks guest time based on the vector [base_rtc, last_update], in which
last_update stands for a monotonic tick which is actually the uptime of the host.
According to the rtc implementation in recent releases and upstream, after
migration the time base vector [base_rtc, last_update] isn't updated to
coordinate with the destination host, i.e. qemu doesn't update last_update to
the uptime of the destination host.
What problem does this bug cause? After migration, guest time may
jump back by several days, which makes some critical business applications,
such as Lotus Notes, malfunction.
This patch tries to fix the problem. First, while vmsave is in progress, we call
rtc_update_time to refresh the time stamp in the cmos array; then, during vmrestore,
we call rtc_set_time to update the qemu base_rtc and last_update variables
according to the time stamp in the cmos array.

Signed-off-by: Junlian Bell 
---
 hw/timer/mc146818rtc.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index ea625f2..4e4af43 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -717,11 +717,19 @@ static void rtc_set_date_from_host(ISADevice *dev)
 rtc_set_cmos(s, &tm);
 }
 
+static void rtc_pre_save(void *opaque)
+{
+RTCState *s = opaque;
+
+rtc_update_time(s);
+}
+
 static int rtc_post_load(void *opaque, int version_id)
 {
 RTCState *s = opaque;
 
-if (version_id <= 2) {
+if (version_id <= 2 ||
+rtc_clock == QEMU_CLOCK_REALTIME){
 rtc_set_time(s);
 s->offset = 0;
 check_update_timer(s);
@@ -764,6 +772,7 @@ static const VMStateDescription vmstate_rtc = {
 .name = "mc146818rtc",
 .version_id = 3,
 .minimum_version_id = 1,
+.pre_save = rtc_pre_save,
 .post_load = rtc_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_BUFFER(cmos_data, RTCState),
-- 
2.9.0.windows.1
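
To illustrate the problem described in the commit message, here is a
simplified model of the [base_rtc, last_update] pair. It is a sketch under
assumptions, not the mc146818rtc code: all values are plain seconds, and
struct rtc_base, guest_time() and rebase() are hypothetical names.

```c
#include <stdint.h>

/* Simplified model, all values in seconds. */
struct rtc_base {
    int64_t base_rtc;    /* guest wall-clock time at the last update */
    int64_t last_update; /* host monotonic clock at the last update */
};

/* Guest time is the base plus the host time elapsed since the base. */
static int64_t guest_time(const struct rtc_base *s, int64_t host_clock)
{
    return s->base_rtc + (host_clock - s->last_update);
}

/* Re-anchor the pair to the current host clock, keeping guest time. */
static void rebase(struct rtc_base *s, int64_t guest_now, int64_t host_clock)
{
    s->base_rtc = guest_now;
    s->last_update = host_clock;
}
```

If the source host has been up for ten days and the destination for one hour,
evaluating guest_time() on the destination with the stale last_update yields a
value almost ten days in the past. Calling rebase() with the destination's
clock at load time removes the jump.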

Re: [Qemu-devel] [PATCH v2 07/14] pc: apic_common: extend APIC ID property to 32bit

2016-09-26 Thread Igor Mammedov

On Thu, 22 Sep 2016 18:16:47 +0200
Paolo Bonzini  wrote:

> On 22/09/2016 18:00, Igor Mammedov wrote:
> > > Why not just return initial_apic_id?  This is the meaning the property
> > > had before your patch.  
> > 
> > initial_apic_id is immutable but 'id' could be changed at runtime by guest 
> > in xAPIC mode
> > so returned value depends on xAPIC/x2APIC mode  
> 
> Understood, but this is just a possibly poorly-named property.  "id"
> (e.g. from info qtree as opposed to info lapic) used to be the initial
> APIC ID always, even in x2APIC mode.

'info qtree' doesn't show CPUs anymore (since ICC bus has been removed),
but if it were it would show effective APIC ID. Same applie[ds] for
reading property value with qom-get.

> 
> Not a big deal, but thought I'd mention it since you can keep using
> static properties.
PS:
changing initial APIC ID from guest probably wouldn't work anyway
and break somewhere else, so we could just continue to ignore
it and use static properties for now if you prefer.


> 
> Paolo
> 
> > so I'm just following spec here.

Re: [Qemu-devel] [PATCH v4]MC146818 RTC: coordinate guest clock base to destination host after migration

2016-09-26 Thread Paolo Bonzini



On 26/09/2016 12:54, zhong...@sangfor.com.cn wrote:
> MC146818 RTC: coordinate guest clock base to destination host after migration
> 
> qemu tracks guest time based on the vector [base_rtc, last_update], in which
> last_update stands for a monotonic tick which is actually the uptime of the host.
> According to the rtc implementation in recent releases and upstream, after
> migration the time base vector [base_rtc, last_update] isn't updated to
> coordinate with the destination host, i.e. qemu doesn't update last_update to
> the uptime of the destination host.
> What problem does this bug cause? After migration, guest time may
> jump back by several days, which makes some critical business applications,
> such as Lotus Notes, malfunction.
> This patch tries to fix the problem. First, while vmsave is in progress, we call
> rtc_update_time to refresh the time stamp in the cmos array; then, during vmrestore,
> we call rtc_set_time to update the qemu base_rtc and last_update variables
> according to the time stamp in the cmos array.
> 
> Signed-off-by: Junlian Bell 
> ---
>  hw/timer/mc146818rtc.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
> index ea625f2..4e4af43 100644
> --- a/hw/timer/mc146818rtc.c
> +++ b/hw/timer/mc146818rtc.c
> @@ -717,11 +717,19 @@ static void rtc_set_date_from_host(ISADevice *dev)
>  rtc_set_cmos(s, &tm);
>  }
>  
> +static void rtc_pre_save(void *opaque)
> +{
> +RTCState *s = opaque;
> +
> +rtc_update_time(s);
> +}
> +
>  static int rtc_post_load(void *opaque, int version_id)
>  {
>  RTCState *s = opaque;
>  
> -if (version_id <= 2) {
> +if (version_id <= 2 ||
> +rtc_clock == QEMU_CLOCK_REALTIME){
>  rtc_set_time(s);
>  s->offset = 0;
>  check_update_timer(s);
> @@ -764,6 +772,7 @@ static const VMStateDescription vmstate_rtc = {
>  .name = "mc146818rtc",
>  .version_id = 3,
>  .minimum_version_id = 1,
> +.pre_save = rtc_pre_save,
>  .post_load = rtc_post_load,
>  .fields = (VMStateField[]) {
>  VMSTATE_BUFFER(cmos_data, RTCState),
> -- 
> 2.9.0.windows.1
> 

Still doesn't pass scripts/checkpatch.pl.  Also please avoid HTML email.

Paolo

Re: [Qemu-devel] [PATCH 15/18] target-riscv: Interrupt Handling

2016-09-26 Thread Richard Henderson


On 09/26/2016 03:56 AM, Sagar Karandikar wrote:

+if (interruptno + 1) {


This is a very odd way to write interruptno != -1.
And did you really mean interruptno >= 0?


r~
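
A small sketch of the three spellings discussed above, to make the difference
concrete. The function names are made up for illustration and do not come
from the patch.

```c
#include <stdbool.h>

/* The form used in the patch: true for every value except -1. */
static bool pending_plus_one(int interruptno)
{
    return interruptno + 1;
}

/* The same test written explicitly. */
static bool pending_not_minus_one(int interruptno)
{
    return interruptno != -1;
}

/* The stricter test: also rejects other negative values. */
static bool pending_non_negative(int interruptno)
{
    return interruptno >= 0;
}
```

The first two agree for every input that does not overflow, while the third
differs for values such as -2.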

Re: [Qemu-devel] [Nbd] [PATCH] proto: add 'shift' extension.

2016-09-26 Thread Wouter Verhelst

On Mon, Sep 26, 2016 at 03:21:46PM -0500, Eric Blake wrote:
> I'd much rather support a single flag that says to zero the entire disk
> than to introduce stateful variable-amount shifting.

That's almost exactly the opposite of what I said :)

Now, I don't feel very strong either way, but what matters to me is:

- NBD is a simple, easy to understand protocol; that is a feature, and
  so it should remain that way.
- Every time we add another option, flag, or command, we make the
  protocol slightly more complex, which is counter to that goal.
- Adding a command with a single use case (i.e., a "wipe the whole
  device" command) seems like it would not see much use, except perhaps
  in the use case that Virtuozzo is thinking about. In other words, it
  makes things slightly more complex for little benefit.

I thought a negotiated shift size could be creatively used for other
things beyond just "wipe the whole disk" commands, and that it might be
elegant in that way. On the other hand, I recognize that adding state in
that manner also complicates the protocol in that an observer which sees
only part of the traffic may not understand what's going on anymore.

So let's just say that an NBD_CMD_FLAG_SHIFT would:
- Left-shift the size by 16 bits; no more, no less
  - 2^32-1 is too large a granularity for this to be useful beyond "wipe
whole disk" commands; 2^16-1 (65535) seems like a more useful
granularity.
  - This allows for a maximum number of 2^48-1 bytes (one byte shy of
256 tebibytes) to be affected by a single command, which seems
sufficient for the given purpose.
  - If someone really wants to wipe 2^64-1 bytes (i.e., 16 exbibytes),
they are probably using the wrong tools.
- Be only valid for commands that don't send or expect data to be sent
  out over the wire.
  - currently TRIM and WRITE_ZEROES, but not READ or WRITE.

Thoughts?
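
The arithmetic behind the proposal can be sketched as follows, assuming the
request length stays a 32-bit field on the wire. NBD_SHIFT_BITS and
nbd_shifted_length() are illustrative names, not part of the protocol.

```c
#include <stdint.h>

#define NBD_SHIFT_BITS 16 /* fixed shift proposed in the message above */

/* Effective byte count when the (hypothetical) shift flag is set on a
 * request whose wire length field is 32 bits wide. */
static uint64_t nbd_shifted_length(uint32_t wire_len)
{
    return (uint64_t)wire_len << NBD_SHIFT_BITS;
}
```

With a 16-bit shift the granularity is 65536 bytes (64 KiB), and the largest
length that fits is 2^48 - 2^16 bytes, which is just under 256 tebibytes.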

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
   people in the world who think they really understand all of its rules,
   and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

Re: [Qemu-devel] [PATCH] checkpatch.pl: disable arch-specific test for linux-user

2016-09-26 Thread Peter Maydell

On 26 September 2016 at 16:36, Riku Voipio  wrote:
> On 27 September 2016 at 00:08, Peter Maydell  wrote:
>> Do you have some examples of the false positives you want
>> to suppress here? For new code I would hope that we can
>> handle host-arch-specifics by having new files (or just
>> new #defines etc) in linux-user/host/$ARCH/ rather than
>> inline #ifdeffery in the main files.
>
> One example from your patch:
>
> https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg05650.html
>
> And another from Laurent:
>
> https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg06486.html
>
> Every new syscall will come with "#ifdef TARGET_NR_foo and
> defined(__NR_foo)", while host/target combos catch up. Now, most
> TARGET_NR_foo's are needed only for unicore32, but the __NR_foo
> defines will be needed for a very long time.

Oh, I see; I don't think of the __NR_foo as being "architecture
specific". I think we'd be better off specifically whitelisting
those in checkpatch rather than turning off the whole check
for linux-user.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v3 07/10] ppc/pnv: add XSCOM infrastructure

2016-09-26 Thread Cédric Le Goater

On 09/27/2016 04:35 AM, David Gibson wrote:
> On Mon, Sep 26, 2016 at 06:11:36PM +0200, Cédric Le Goater wrote:
>> On 09/23/2016 04:46 AM, David Gibson wrote:
>>> On Thu, Sep 22, 2016 at 10:25:59AM +0200, Cédric Le Goater wrote:
>> @@ -493,6 +525,8 @@ static void pnv_chip_power9_class_init(ObjectClass 
>> *klass, void *data)
>>  k->chip_cfam_id = 0x100d10498000ull; /* P9 Nimbus DD1.0 */
>>  k->cores_mask = POWER9_CORE_MASK;
>>  k->core_pir = pnv_chip_core_pir_p9;
>> +k->xscom_addr = pnv_chip_xscom_addr_p9;
>> +k->xscom_pcba = pnv_chip_xscom_pcba_p9;
>
> So if you do as BenH (and I) suggested and have the "scom address
> space" actually be addressed by (pcba << 3), I think you can probably
> avoid these.  

 I will look at that option again. 

 I was trying to untangle a few things at the same time. I have a better
 view of the problem to solve now. The bus is gone, that was one
 thing. How we map these xscom regions is the next.

 Ben suggested to add some P7/P8 mangling before the dispatch in
 the address_space_xscom. This should make things cleaner. I had
 not thought of doing that and this is why I introduced these helpers :

 +uint32_t pnv_xscom_pcba(PnvXScomInterface *dev, uint64_t addr)
 +uint64_t pnv_xscom_addr(PnvXScomInterface *dev, uint32_t pcba)

 which I don't really like ...

 but we must make sure that we can do the mapping of the xscom
 subregions in the address_space_xscom using (pcba << 3)
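
 A minimal sketch of the (pcba << 3) addressing mentioned here, assuming each
 SCOM register is 8 bytes wide. The helper names are hypothetical and are not
 the ones from the patch series.

```c
#include <stdint.h>

/* Register number (PCB address) n occupies bytes [n * 8, n * 8 + 7]
 * of the xscom address space, assuming 8-byte registers. */
static uint64_t xscom_offset_from_pcba(uint32_t pcba)
{
    return (uint64_t)pcba << 3;
}

/* Inverse mapping: byte offset in the address space back to a PCB address. */
static uint32_t xscom_pcba_from_offset(uint64_t offset)
{
    return (uint32_t)(offset >> 3);
}
```

 With this layout a subregion covering N registers starting at PCB address
 base would be mapped at offset (base << 3) with a size of (N << 3) bytes.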


> Instead you can handle it in the chip or ADU realize function by either:
>
> P8: * map one big subregion for the ADU into address_space_memory
> * have the handler for that subregion do the address mangling,
>   then redispatch into the xscom address space
>
> P9: * Map the appropriate chunk of the xscom address space
>   directly into address_space_memory

 Yes that was my feeling for a better solution but Ben chimed in with the 
 HMER topic. I need to look at that.
>>>
>>> Right.  Doesn't change the basic concept though - it just means you
>>> need (slightly different) redispatchers for both P8 and P9.
>>
>> In fact they are the same, you only need an "addr to pcba" handler at the
>> chip class level : 
> 
> Ok.  I'd been thinking of using different dispatchers as an
> alternative to using the chip class translator hook, 

Ah, yes, why not. We could have per-chip dispatchers but they
would have a lot in common. However, I think we can get rid of
the 'xscom_pcba' handlers, they should not be needed anywhere
else than in the XSCOM dispatchers.

> but I guess if you have the decoding of those "core" registers 
> here as well, then that doesn't make so much sense.

yes and there is also the handling of the XSCOM failures.

I can add some prologue handler to cover those "core" registers
but adding a MemoryRegion, ops, init and mapping would be a lot 
of churn just to return 0.

Thanks,

C. 


>> static uint64_t xscom_read(void *opaque, hwaddr addr, unsigned width)
>> {
>>  PnvChip *chip = opaque;
>>  uint32_t pcba = PNV_CHIP_GET_CLASS(chip)->xscom_pcba(addr);
>>  uint64_t val = 0;
>>  MemTxResult result;
>>
>>  ...
>>
>> val = address_space_ldq(&chip->xscom_as, pcba << 3,
>> MEMTXATTRS_UNSPECIFIED, &result);
>> if (result != MEMTX_OK) {
>>
>>   
>>
>> And so, the result is pretty clean. I killed the proxy object and merged 
>> the regions in the chip but I have kept the pnv_xscom.c file because the 
>> code related to xscom is rather large : ~250 lines. 
> 
> Sure, makes sense.
> 
>> The objects declaring a xscom region need to do some register shifting but 
>> this is usual in mmio regions.
>>
>> You will see in v4.
> 
> Ok.
> 
>> +static bool xscom_dispatch_read(PnvXScom *xscom, hwaddr addr, uint64_t 
>> *val)
>> +{
>> +uint32_t success;
>> +uint8_t data[8];
>> +
>> +success = !address_space_rw(&xscom->xscom_as, addr,
>> MEMTXATTRS_UNSPECIFIED,
>> +data, 8, false);
>> +*val = (((uint64_t) data[0]) << 56 |
>> +((uint64_t) data[1]) << 48 |
>> +((uint64_t) data[2]) << 40 |
>> +((uint64_t) data[3]) << 32 |
>> +((uint64_t) data[4]) << 24 |
>> +((uint64_t) data[5]) << 16 |
>> +((uint64_t) data[6]) << 8  |
>> +((uint64_t) data[7]));
>
> AFAICT this is basically assuming data is always encoded BE.  With the
> right choice of endian flags on the individual SCOM device
> registrations with the scom address space, I think you should be able
> to avoid this mangling.

 yes. I should but curiously I had to do this, and this works the same on
 an intel host or a ppc64 host.
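
 The byte-by-byte assembly in the quoted xscom_dispatch_read() is a big-endian
 decode. A standalone sketch of the same idea follows; load_be64() is a
 hypothetical name used only for this illustration.

```c
#include <stdint.h>

/* Assemble a 64-bit value from 8 bytes stored most significant byte first.
 * Shifts operate on values, not on memory layout, so the result does not
 * depend on the endianness of the host running this code. */
static uint64_t load_be64(const uint8_t data[8])
{
    uint64_t val = 0;

    for (int i = 0; i < 8; i++) {
        val = (val << 8) | data[i];
    }
    return val;
}
```

 This is why the decode behaves the same on an x86 host and on a ppc64 host:
 only the byte order of the buffer matters, not that of the host.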
>>>
>>> Hmm.. I suspect what you actually need is

Re: [Qemu-devel] [PATCH] spapr_vscsi: fix build error introduced by f19661c8

2016-09-26 Thread David Gibson

On Mon, Sep 26, 2016 at 03:17:44PM +0100, Felipe Franciosi wrote:
> A typo introduced in f19661c8 prevents qemu from building when configured
> with --enable-trace-backend=dtrace.
> 
> Signed-off-by: Felipe Franciosi 

Applied to ppc-for-2.8, thanks.


> ---
>  hw/scsi/spapr_vscsi.c | 2 +-
>  hw/scsi/trace-events  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
> index d8a2296..6090a20 100644
> --- a/hw/scsi/spapr_vscsi.c
> +++ b/hw/scsi/spapr_vscsi.c
> @@ -658,7 +658,7 @@ static void vscsi_process_login(VSCSIState *s, vscsi_req 
> *req)
>  struct srp_login_rsp *rsp = &iu->srp.login_rsp;
>  uint64_t tag = iu->srp.rsp.tag;
>  
> -trace_spapr_vscsi__process_login();
> +trace_spapr_vscsi_process_login();
>  
>  /* TODO handle case that requested size is wrong and
>   * buffer format is wrong
> diff --git a/hw/scsi/trace-events b/hw/scsi/trace-events
> index d1995b8..4a2e5d6 100644
> --- a/hw/scsi/trace-events
> +++ b/hw/scsi/trace-events
> @@ -225,7 +225,7 @@ spapr_vscsi_command_complete_sense_data2(unsigned s8, 
> unsigned s9, unsigned s10,
>  spapr_vscsi_command_complete_status(uint32_t status) "Command complete 
> err=%"PRIu32
>  spapr_vscsi_save_request(uint32_t qtag, unsigned desc, unsigned offset) 
> "saving tag=%"PRIu32", current desc#%u, offset=0x%x"
>  spapr_vscsi_load_request(uint32_t qtag, unsigned desc, unsigned offset) 
> "restoring tag=%"PRIu32", current desc#%u, offset=0x%x"
> -spapr_vscsi__process_login(void) "Got login, sending response !"
> +spapr_vscsi_process_login(void) "Got login, sending response !"
>  spapr_vscsi_queue_cmd_no_drive(uint64_t lun) "Command for lun %08" PRIx64 " 
> with no drive"
>  spapr_vscsi_queue_cmd(uint32_t qtag, unsigned cdb, const char *cmd, int lun, 
> int ret) "Queued command tag 0x%"PRIx32" CMD 0x%x=%s LUN %d ret: %d"
>  spapr_vscsi_do_crq(unsigned c0, unsigned c1) "crq: %02x %02x ..."

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


