date:20180206

Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as migration

2018-02-06 Thread Tan, Jianfeng



> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Tuesday, February 6, 2018 1:32 AM
> To: Igor Mammedov
> Cc: Tan, Jianfeng; qemu-devel@nongnu.org; Jason Wang; Maxime Coquelin;
> Michael S . Tsirkin
> Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> migration
> 
> On 05/02/2018 18:15, Igor Mammedov wrote:
> >>>
> >>> Then we would have both ram block named pc.ram:
> >>>   Block Name    PSize
> >>>   pc.ram 4 KiB
> >>>   /objects/pc.ram    2 MiB
> >>>
> >>> But I assume it's a corner case which not really happen.
> >> Yeah, you're right. :/  I hadn't thought of hotplug.  It can happen indeed.
> >
> > perhaps we should fail object_add memory-backend-foo if it resulted
> > in creating ramblock with duplicate id
> 
> Note that it would only be duplicated with Jianfeng's patch.  So I'm
> worried that his patch is worse than what we have now, because it may
> create conflicts with system RAMBlock names are not necessarily
> predictable.  Right now, -object creates RAMBlock names that are nicely
> constrained within /object/.

So we are trading off the benefit this patch brings against the harm it may cause.

I'm wondering whether the example above is the only case this patch breaks,
i.e., only when the src VM has both a RAM block named "pc.ram" and one named
"/object/pc.ram"?

Please also consider the second option, adding an alias name for the RAMBlock.
I'm not a big fan of that one, as it just pushes the problem to
OpenStack/Libvirt.

Or are there any other suggestions?

Thanks,
Jianfeng

[Qemu-devel] [PATCH v2 4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation

2018-02-06 Thread Haozhong Zhang

Guest writes to vNVDIMM labels are intercepted and performed on the
backend by QEMU. When the backend is real persistent memory, QEMU
needs to take proper operations to ensure that its writes are persistent
on the persistent memory. Otherwise, a host power failure may result in
the loss of guest label configurations.
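
A minimal sketch of the dispatch described above, assuming a boolean that
says whether the label area lives on real PMEM. The names prefixed with
sketch_ are hypothetical, and sketch_memcpy_persist() is only a stand-in
for libpmem's pmem_memcpy_persist() so that the example builds without
libpmem:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter, present only so the sketch can be observed. */
static unsigned persist_calls;

/*
 * Stand-in for libpmem's pmem_memcpy_persist(). A real build would call
 * the libpmem function, which also makes the written range persistent.
 */
static void *sketch_memcpy_persist(void *dest, const void *src, size_t len)
{
    persist_calls++;
    return memcpy(dest, src, len);
}

/* Copy label data, taking the persistent path only for real PMEM. */
void sketch_write_label(void *label_area, size_t offset,
                        const void *buf, size_t size, bool is_pmem)
{
    unsigned char *dst;

    assert(label_area != NULL && buf != NULL);
    dst = (unsigned char *)label_area + offset;

    if (is_pmem) {
        sketch_memcpy_persist(dst, buf, size);
    } else {
        memcpy(dst, buf, size);
    }
}
```

The non-PMEM path stays a plain memcpy(), so backends on ordinary files or
RAM should see no behaviour change.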

Signed-off-by: Haozhong Zhang 
---
 hw/mem/nvdimm.c |  9 -
 include/qemu/pmem.h | 31 +++
 2 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 include/qemu/pmem.h

diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 61e677f92f..18861d1a7a 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -23,6 +23,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/pmem.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "qapi-visit.h"
@@ -156,11 +157,17 @@ static void nvdimm_write_label_data(NVDIMMDevice *nvdimm, 
const void *buf,
 {
 MemoryRegion *mr;
 PCDIMMDevice *dimm = PC_DIMM(nvdimm);
+bool is_pmem = object_property_get_bool(OBJECT(dimm->hostmem),
+"pmem", NULL);
 uint64_t backend_offset;
 
 nvdimm_validate_rw_label_data(nvdimm, size, offset);
 
-memcpy(nvdimm->label_data + offset, buf, size);
+if (!is_pmem) {
+memcpy(nvdimm->label_data + offset, buf, size);
+} else {
+pmem_memcpy_persist(nvdimm->label_data + offset, buf, size);
+}
 
 mr = host_memory_backend_get_memory(dimm->hostmem, &error_abort);
 backend_offset = memory_region_size(mr) - nvdimm->label_size + offset;
diff --git a/include/qemu/pmem.h b/include/qemu/pmem.h
new file mode 100644
index 00..9017596ff0
--- /dev/null
+++ b/include/qemu/pmem.h
@@ -0,0 +1,31 @@
+/*
+ * Stub functions for libpmem.
+ *
+ * Copyright (c) 2018 Intel Corporation.
+ *
+ * Author: Haozhong Zhang 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_PMEM_H
+#define QEMU_PMEM_H
+
+#ifdef CONFIG_LIBPMEM
+#include <libpmem.h>
+#else  /* !CONFIG_LIBPMEM */
+
+#include <string.h>
+
+/* Stubs */
+
+static inline void *
+pmem_memcpy_persist(void *pmemdest, const void *src, size_t len)
+{
+return memcpy(pmemdest, src, len);
+}
+
+#endif /* CONFIG_LIBPMEM */
+
+#endif /* !QEMU_PMEM_H */
-- 
2.14.1

[Qemu-devel] [PATCH v2 3/8] configure: add libpmem support

2018-02-06 Thread Haozhong Zhang

Add a pair of configure options --{enable,disable}-libpmem to control
whether QEMU is compiled with PMDK libpmem [1].

QEMU may write to the host persistent memory (e.g. in vNVDIMM label
emulation and live migration), so it must take the proper operations
to ensure the persistence of its own writes. Depending on the CPU
models and available instructions, the optimal operation can vary [2].
PMDK libpmem has already implemented those operations on multiple CPU
models (x86 and ARM) and the logic to select the optimal ones, so QEMU
can just use libpmem rather than re-implement them.

[1] PMDK (formerly known as NVML), https://github.com/pmem/pmdk/
[2] 
https://github.com/pmem/pmdk/blob/38bfa652721a37fd94c0130ce0e3f5d8baa3ed40/src/libpmem/pmem.c#L33
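
For illustration, a sketch of how code might consume the CONFIG_LIBPMEM
symbol that configure emits: with libpmem present the real header is used,
otherwise a fallback with the same signature degrades to memcpy().
SKETCH_HAVE_LIBPMEM and sketch_copy_persist() are hypothetical names used
only in this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef CONFIG_LIBPMEM
#include <libpmem.h>            /* real implementation from PMDK */
#define SKETCH_HAVE_LIBPMEM 1
#else
#define SKETCH_HAVE_LIBPMEM 0

/* Fallback with the same signature as libpmem's pmem_memcpy_persist(). */
static inline void *
pmem_memcpy_persist(void *pmemdest, const void *src, size_t len)
{
    return memcpy(pmemdest, src, len);
}
#endif /* CONFIG_LIBPMEM */

/* Callers use one name regardless of how the build was configured. */
void *sketch_copy_persist(void *dest, const void *src, size_t len)
{
    assert(dest != NULL && src != NULL);
    return pmem_memcpy_persist(dest, src, len);
}
```

Keeping the fallback behind the same function name means call sites need no
#ifdef of their own.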

Signed-off-by: Haozhong Zhang 
---
 configure | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/configure b/configure
index 302fdc92ff..595967e5df 100755
--- a/configure
+++ b/configure
@@ -436,6 +436,7 @@ jemalloc="no"
 replication="yes"
 vxhs=""
 libxml2=""
+libpmem=""
 
 supported_cpu="no"
 supported_os="no"
@@ -1341,6 +1342,10 @@ for opt do
   ;;
   --disable-git-update) git_update=no
   ;;
+  --enable-libpmem) libpmem=yes
+  ;;
+  --disable-libpmem) libpmem=no
+  ;;
   *)
   echo "ERROR: unknown option $opt"
   echo "Try '$0 --help' for more information"
@@ -1592,6 +1597,7 @@ disabled with --disable-FEATURE, default is enabled if 
available:
   crypto-afalgLinux AF_ALG crypto backend driver
   vhost-user  vhost-user support
   capstonecapstone disassembler support
+  libpmem libpmem support
 
 NOTE: The object files are built at the place where configure is launched
 EOF
@@ -5205,6 +5211,30 @@ if compile_prog "" "" ; then
 have_utmpx=yes
 fi
 
+##
+# check for libpmem
+
+if test "$libpmem" != "no"; then
+  cat > $TMPC <<EOF
+#include <libpmem.h>
+int main(void)
+{
+  pmem_is_pmem(0, 0);
+  return 0;
+}
+EOF
+  libpmem_libs="-lpmem"
+  if compile_prog "" "$libpmem_libs" ; then
+libs_softmmu="$libpmem_libs $libs_softmmu"
+libpmem="yes"
+  else
+if test "$libpmem" = "yes" ; then
+  feature_not_found "libpmem" "Install nvml or pmdk"
+fi
+libpmem="no"
+  fi
+fi
+
 ##
 # End of CC checks
 # After here, no more $cc or $ld runs
@@ -5657,6 +5687,7 @@ echo "avx2 optimization $avx2_opt"
 echo "replication support $replication"
 echo "VxHS block device $vxhs"
 echo "capstone  $capstone"
+echo "libpmem support   $libpmem"
 
 if test "$sdl_too_old" = "yes"; then
 echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -6374,6 +6405,10 @@ if test "$vxhs" = "yes" ; then
   echo "VXHS_LIBS=$vxhs_libs" >> $config_host_mak
 fi
 
+if test "$libpmem" = "yes" ; then
+  echo "CONFIG_LIBPMEM=y" >> $config_host_mak
+fi
+
 if test "$tcg_interpreter" = "yes"; then
   QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
 elif test "$ARCH" = "sparc64" ; then
-- 
2.14.1

[Qemu-devel] [PATCH v2 6/8] migration/ram: ensure write persistence on loading normal pages to PMEM

2018-02-06 Thread Haozhong Zhang

When loading a normal page to persistent memory, load its data with the
libpmem function pmem_memcpy_nodrain() instead of memcpy(). Combined
with a call to pmem_drain() at the end of memory loading, we can
guarantee all those normal pages are persistently loaded to PMEM.
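
A rough sketch of the "copy without drain, drain once at the end" pattern
described above. The sketch_ functions and counters are hypothetical
stand-ins for pmem_memcpy_nodrain() and pmem_drain(), and the 4 KiB page
size is an assumption for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size for the sketch */

/* Hypothetical counters so the behaviour can be checked without libpmem. */
static unsigned nodrain_copies;
static unsigned drains;

/* Stand-in for libpmem's pmem_memcpy_nodrain(). */
static void *sketch_memcpy_nodrain(void *dest, const void *src, size_t len)
{
    nodrain_copies++;
    return memcpy(dest, src, len);
}

/* Stand-in for libpmem's pmem_drain(). */
static void sketch_drain(void)
{
    drains++;
}

/* Load npages pages; drain once at the end if anything went to PMEM. */
void sketch_load_pages(unsigned char *dest, const unsigned char *src,
                       size_t npages, bool is_pmem)
{
    bool need_drain = false;

    assert(dest != NULL && src != NULL);
    for (size_t i = 0; i < npages; i++) {
        unsigned char *d = dest + i * SKETCH_PAGE_SIZE;
        const unsigned char *s = src + i * SKETCH_PAGE_SIZE;

        if (is_pmem) {
            sketch_memcpy_nodrain(d, s, SKETCH_PAGE_SIZE);
            need_drain = true;
        } else {
            memcpy(d, s, SKETCH_PAGE_SIZE);
        }
    }

    if (need_drain) {
        sketch_drain();
    }
}
```

The single drain at the end is what keeps the per-page cost low.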

Signed-off-by: Haozhong Zhang 
---
 include/migration/qemu-file-types.h |  1 +
 include/qemu/pmem.h |  6 ++
 migration/qemu-file.c   | 41 -
 migration/ram.c |  6 +-
 4 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/include/migration/qemu-file-types.h 
b/include/migration/qemu-file-types.h
index bd6d7dd7f9..bb5c547498 100644
--- a/include/migration/qemu-file-types.h
+++ b/include/migration/qemu-file-types.h
@@ -34,6 +34,7 @@ void qemu_put_be16(QEMUFile *f, unsigned int v);
 void qemu_put_be32(QEMUFile *f, unsigned int v);
 void qemu_put_be64(QEMUFile *f, uint64_t v);
 size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size);
+size_t qemu_get_buffer_to_pmem(QEMUFile *f, uint8_t *buf, size_t size);
 
 int qemu_get_byte(QEMUFile *f);
 
diff --git a/include/qemu/pmem.h b/include/qemu/pmem.h
index 861d8ecc21..77ee1fc4eb 100644
--- a/include/qemu/pmem.h
+++ b/include/qemu/pmem.h
@@ -26,6 +26,12 @@ pmem_memcpy_persist(void *pmemdest, const void *src, size_t 
len)
 return memcpy(pmemdest, src, len);
 }
 
+static inline void *
+pmem_memcpy_nodrain(void *pmemdest, const void *src, size_t len)
+{
+return memcpy(pmemdest, src, len);
+}
+
 static inline void *pmem_memset_nodrain(void *pmemdest, int c, size_t len)
 {
 return memset(pmemdest, c, len);
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 2ab2bf362d..7e573010d9 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -26,6 +26,7 @@
 #include "qemu-common.h"
 #include "qemu/error-report.h"
 #include "qemu/iov.h"
+#include "qemu/pmem.h"
 #include "migration.h"
 #include "qemu-file.h"
 #include "trace.h"
@@ -471,15 +472,8 @@ size_t qemu_peek_buffer(QEMUFile *f, uint8_t **buf, size_t 
size, size_t offset)
 return size;
 }
 
-/*
- * Read 'size' bytes of data from the file into buf.
- * 'size' can be larger than the internal buffer.
- *
- * It will return size bytes unless there was an error, in which case it will
- * return as many as it managed to read (assuming blocking fd's which
- * all current QEMUFile are)
- */
-size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
+static size_t
+qemu_get_buffer_common(QEMUFile *f, uint8_t *buf, size_t size, bool is_pmem)
 {
 size_t pending = size;
 size_t done = 0;
@@ -492,7 +486,11 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t 
size)
 if (res == 0) {
 return done;
 }
-memcpy(buf, src, res);
+if (!is_pmem) {
+memcpy(buf, src, res);
+} else {
+pmem_memcpy_nodrain(buf, src, res);
+}
 qemu_file_skip(f, res);
 buf += res;
 pending -= res;
@@ -501,6 +499,29 @@ size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t 
size)
 return done;
 }
 
+/*
+ * Read 'size' bytes of data from the file into buf.
+ * 'size' can be larger than the internal buffer.
+ *
+ * It will return size bytes unless there was an error, in which case it will
+ * return as many as it managed to read (assuming blocking fd's which
+ * all current QEMUFile are)
+ */
+size_t qemu_get_buffer(QEMUFile *f, uint8_t *buf, size_t size)
+{
+return qemu_get_buffer_common(f, buf, size, false);
+}
+
+/*
+ * Mostly the same as qemu_get_buffer(), except that
+ * 1) it's for the case that 'buf' is in the persistent memory, and
+ * 2) it takes necessary operations to ensure the data persistence in 'buf'.
+ */
+size_t qemu_get_buffer_to_pmem(QEMUFile *f, uint8_t *buf, size_t size)
+{
+return qemu_get_buffer_common(f, buf, size, true);
+}
+
 /*
  * Read 'size' bytes of data from the file.
  * 'size' can be larger than the internal buffer.
diff --git a/migration/ram.c b/migration/ram.c
index 5a0e503818..5a79bbff64 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2950,7 +2950,11 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 break;
 
 case RAM_SAVE_FLAG_PAGE:
-qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
+if (!is_pmem) {
+qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
+} else {
+qemu_get_buffer_to_pmem(f, host, TARGET_PAGE_SIZE);
+}
 break;
 
 case RAM_SAVE_FLAG_COMPRESS_PAGE:
-- 
2.14.1

Re: [Qemu-devel] [PATCH v4] m68k: implement movep instruction

2018-02-06 Thread Pavel Dovgalyuk

> From: Laurent Vivier [mailto:laur...@vivier.eu]
> Le 06/02/2018 à 14:30, Pavel Dovgalyuk a écrit :
> >> From: Laurent Vivier [mailto:laur...@vivier.eu]
> > Thanks!
> >
> > By the way, we also handled reset interrupt, but it is not compatible with
> > other m68k platforms:
> >
> > @@ -66,8 +66,9 @@ static void m68k_cpu_reset(CPUState *s)
> >  cpu_m68k_set_fpcr(env, 0);
> >  env->fpsr = 0;
> >
> > -/* TODO: We should set PC from the interrupt vector.  */
> > -env->pc = 0;
> > +env->vbr = 0;
> > +/* PC and SP (for m68k) will be initialized by the reset handler */
> > +s->exception_index = EXCP_RESET;
> >  }
> >
> > @@ -378,6 +380,8 @@ static void m68k_interrupt_all(CPUM68KState *env, int 
> > is_hw)
> >  cpu_m68k_set_sr(env, sr &= ~SR_M);
> >  sp = env->aregs[7] & ~1;
> >  do_stack_frame(env, &sp, 1, oldsr, 0, retaddr);
> > +} else if (cs->exception_index == EXCP_RESET) {
> > +sp = cpu_ldl_kernel(env, env->vbr + vector - 4);
> >  } else {
> >  do_stack_frame(env, &sp, 0, oldsr, 0, retaddr);
> >  }
> 
> It looks better of what I have already coded :)
> 
> Do you work using code in
> https://github.com/vivier/qemu-m68k , branch q800-dev ?

No, it was a project for our students a couple of years ago.
We used QEMU 2.3 with not-yet-included patches for the 68000.
I believe that someday we'll port our peripherals to the new version.

There were some fixes for interrupt processing. As far as I can see, none of
them are needed for mainline QEMU.

We didn't find a solution for the 24-bit address bus of the 68000. The
Macintosh stores 32-bit values in address registers and uses them to access
memory. We just duplicated the memory layout, but I believe that there is a
better solution.

> I'm already emulating a Quadra 800, it can help for Macintosh-128k

Here is the repository with Mac-128: https://github.com/Dovgalyuk/qemu
We never fixed all the bugs, but it can boot the OS using some hacks.
One of the hacks is related to the IWM. We couldn't emulate all of its timings.
The CPU controls the disk rotation speed by controlling the strobe signal.
It was hard to synchronize this, because icount wasn't fully working and we used
semihosting: we intercepted the file operation system calls and didn't execute
the ROM code, emulating them in QEMU instead.

Pavel Dovgalyuk

[Qemu-devel] [PATCH v2 8/8] migration/ram: ensure write persistence on loading xbzrle pages to PMEM

2018-02-06 Thread Haozhong Zhang

When loading an XBZRLE-encoded page to persistent memory, load the data
via the libpmem function pmem_memcpy_nodrain() instead of memcpy().
Combined with a call to pmem_drain() at the end of memory loading, we
can guarantee those XBZRLE-encoded pages are persistently loaded to PMEM.
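
The refactoring used here (one common decoder taking an is_pmem flag plus
two thin wrappers) can be sketched as follows. This is a deliberately
simplified delta format with one-byte run lengths, not the real XBZRLE
encoding, and every sketch_ name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static unsigned nodrain_copies;   /* hypothetical, for observation only */

/* Stand-in for libpmem's pmem_memcpy_nodrain(). */
static void *sketch_memcpy_nodrain(void *dest, const void *src, size_t len)
{
    nodrain_copies++;
    return memcpy(dest, src, len);
}

/*
 * Simplified delta decoder: the input is a sequence of
 * (skip length, data length, data bytes) records with one-byte lengths.
 * Skipped bytes keep the old page contents. Returns the number of bytes
 * covered in dst, or -1 on malformed input.
 */
static int sketch_decode_common(const uint8_t *src, int slen,
                                uint8_t *dst, int dlen, bool is_pmem)
{
    int i = 0, d = 0;

    assert(src != NULL && dst != NULL);
    while (i < slen) {
        int skip, count;

        if (slen - i < 2) {
            return -1;              /* need both length bytes */
        }
        skip = src[i++];
        count = src[i++];
        d += skip;

        /* reject empty data runs and any overflow of either buffer */
        if (count == 0 || d + count > dlen || i + count > slen) {
            return -1;
        }

        if (is_pmem) {
            sketch_memcpy_nodrain(dst + d, src + i, count);
        } else {
            memcpy(dst + d, src + i, count);
        }
        d += count;
        i += count;
    }

    return d;
}

int sketch_decode(const uint8_t *src, int slen, uint8_t *dst, int dlen)
{
    return sketch_decode_common(src, slen, dst, dlen, false);
}

int sketch_decode_to_pmem(const uint8_t *src, int slen,
                          uint8_t *dst, int dlen)
{
    return sketch_decode_common(src, slen, dst, dlen, true);
}
```

Existing callers keep the original two-buffer signature, and only the PMEM
path pays for the no-drain copies.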

Signed-off-by: Haozhong Zhang 
---
 migration/ram.c| 15 ++-
 migration/xbzrle.c | 20 ++--
 migration/xbzrle.h |  1 +
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 924d2b9537..87f977617d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2388,10 +2388,10 @@ static void ram_save_pending(QEMUFile *f, void *opaque, 
uint64_t max_size,
 }
 }
 
-static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host)
+static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host, bool is_pmem)
 {
 unsigned int xh_len;
-int xh_flags;
+int xh_flags, rc;
 uint8_t *loaded_data;
 
 /* extract RLE header */
@@ -2413,8 +2413,13 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, 
void *host)
 qemu_get_buffer_in_place(f, &loaded_data, xh_len);
 
 /* decode RLE */
-if (xbzrle_decode_buffer(loaded_data, xh_len, host,
- TARGET_PAGE_SIZE) == -1) {
+if (!is_pmem) {
+rc = xbzrle_decode_buffer(loaded_data, xh_len, host, TARGET_PAGE_SIZE);
+} else {
+rc = xbzrle_decode_buffer_to_pmem(loaded_data, xh_len, host,
+  TARGET_PAGE_SIZE);
+}
+if (rc == -1) {
 error_report("Failed to load XBZRLE page - decode error!");
 return -1;
 }
@@ -2974,7 +2979,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 break;
 
 case RAM_SAVE_FLAG_XBZRLE:
-if (load_xbzrle(f, addr, host) < 0) {
+if (load_xbzrle(f, addr, host, is_pmem) < 0) {
 error_report("Failed to decompress XBZRLE page at "
  RAM_ADDR_FMT, addr);
 ret = -EINVAL;
diff --git a/migration/xbzrle.c b/migration/xbzrle.c
index 1ba482ded9..499d8e1bfb 100644
--- a/migration/xbzrle.c
+++ b/migration/xbzrle.c
@@ -12,6 +12,7 @@
  */
 #include "qemu/osdep.h"
 #include "qemu/cutils.h"
+#include "qemu/pmem.h"
 #include "xbzrle.h"
 
 /*
@@ -126,7 +127,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t 
*new_buf, int slen,
 return d;
 }
 
-int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
+static int xbzrle_decode_buffer_common(uint8_t *src, int slen, uint8_t *dst,
+   int dlen, bool is_pmem)
 {
 int i = 0, d = 0;
 int ret;
@@ -167,10 +169,24 @@ int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t 
*dst, int dlen)
 return -1;
 }
 
-memcpy(dst + d, src + i, count);
+if (!is_pmem) {
+memcpy(dst + d, src + i, count);
+} else {
+pmem_memcpy_nodrain(dst + d, src + i, count);
+}
 d += count;
 i += count;
 }
 
 return d;
 }
+
+int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
+{
+return xbzrle_decode_buffer_common(src, slen, dst, dlen, false);
+}
+
+int xbzrle_decode_buffer_to_pmem(uint8_t *src, int slen, uint8_t *dst, int 
dlen)
+{
+return xbzrle_decode_buffer_common(src, slen, dst, dlen, true);
+}
diff --git a/migration/xbzrle.h b/migration/xbzrle.h
index a0db507b9c..ac5ae32666 100644
--- a/migration/xbzrle.h
+++ b/migration/xbzrle.h
@@ -18,4 +18,5 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, 
int slen,
  uint8_t *dst, int dlen);
 
 int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
+int xbzrle_decode_buffer_to_pmem(uint8_t *src, int slen, uint8_t *dst, int 
dlen);
 #endif
-- 
2.14.1

[Qemu-devel] [PATCH v2 2/8] hostmem-file: add the 'pmem' option

2018-02-06 Thread Haozhong Zhang

When QEMU emulates vNVDIMM labels and migrates vNVDIMM devices, it
needs to know whether the backend storage is real persistent memory,
in order to decide whether special operations should be performed to
ensure data persistence.

The boolean option 'pmem' allows users to specify whether the backend
storage of memory-backend-file is real persistent memory. If
'pmem=on', QEMU will set the flag RAM_PMEM in the RAM block of the
corresponding memory region.

Signed-off-by: Haozhong Zhang 
---
 backends/hostmem-file.c | 26 +-
 docs/nvdimm.txt | 14 ++
 exec.c  | 16 +++-
 include/exec/memory.h   |  2 ++
 include/exec/ram_addr.h |  3 +++
 qemu-options.hx |  9 -
 6 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 30df843d90..5d706d471f 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -34,6 +34,7 @@ struct HostMemoryBackendFile {
 bool discard_data;
 char *mem_path;
 uint64_t align;
+bool is_pmem;
 };
 
 static void
@@ -59,7 +60,8 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error 
**errp)
 memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
  path,
  backend->size, fb->align,
- backend->share ? QEMU_RAM_SHARE : 0,
+ (backend->share ? QEMU_RAM_SHARE : 0) |
+ (fb->is_pmem ? QEMU_RAM_PMEM : 0),
  fb->mem_path, errp);
 g_free(path);
 }
@@ -131,6 +133,25 @@ static void file_memory_backend_set_align(Object *o, 
Visitor *v,
 error_propagate(errp, local_err);
 }
 
+static bool file_memory_backend_get_pmem(Object *o, Error **errp)
+{
+return MEMORY_BACKEND_FILE(o)->is_pmem;
+}
+
+static void file_memory_backend_set_pmem(Object *o, bool value, Error **errp)
+{
+HostMemoryBackend *backend = MEMORY_BACKEND(o);
+HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+
+if (host_memory_backend_mr_inited(backend)) {
+error_setg(errp, "cannot change property 'pmem' of %s '%s'",
+   object_get_typename(o), backend->id);
+return;
+}
+
+fb->is_pmem = value;
+}
+
 static void file_backend_unparent(Object *obj)
 {
 HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -162,6 +183,9 @@ file_backend_class_init(ObjectClass *oc, void *data)
 file_memory_backend_get_align,
 file_memory_backend_set_align,
 NULL, NULL, &error_abort);
+object_class_property_add_bool(oc, "pmem",
+file_memory_backend_get_pmem, file_memory_backend_set_pmem,
+&error_abort);
 }
 
 static void file_backend_instance_finalize(Object *o)
diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
index e903d8bb09..bcb2032672 100644
--- a/docs/nvdimm.txt
+++ b/docs/nvdimm.txt
@@ -153,3 +153,17 @@ guest NVDIMM region mapping structure.  This unarmed flag 
indicates
 guest software that this vNVDIMM device contains a region that cannot
 accept persistent writes. In result, for example, the guest Linux
 NVDIMM driver, marks such vNVDIMM device as read-only.
+
+If the vNVDIMM backend is on the host persistent memory that can be
+accessed in SNIA NVM Programming Model [1] (e.g., Intel NVDIMM), it's
+suggested to set the 'pmem' option of memory-backend-file to 'on'. When
+'pmem=on' and QEMU is built with libpmem [2] support (configured with
+--enable-libpmem), QEMU will take necessary operations to guarantee
+the persistence of its own writes to the vNVDIMM backend (e.g., in
+vNVDIMM label emulation and live migration).
+
+References
+--
+
+[1] SNIA NVM Programming Model: 
https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf
+[2] PMDK: http://pmem.io/pmdk/
diff --git a/exec.c b/exec.c
index 16b373a86b..1d83441afe 100644
--- a/exec.c
+++ b/exec.c
@@ -99,6 +99,9 @@ static MemoryRegion io_mem_unassigned;
  */
 #define RAM_RESIZEABLE (1 << 2)
 
+/* RAM is backed by the persistent memory. */
+#define RAM_PMEM   (1 << 3)
+
 #endif
 
 #ifdef TARGET_PAGE_BITS_VARY
@@ -2007,6 +2010,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, 
MemoryRegion *mr,
 Error *local_err = NULL;
 int64_t file_size;
 bool share = flags & QEMU_RAM_SHARE;
+bool is_pmem = flags & QEMU_RAM_PMEM;
 
 if (xen_enabled()) {
 error_setg(errp, "-mem-path not supported with Xen");
@@ -2043,7 +2047,8 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, 
MemoryRegion *mr,
 new_block->mr = mr;
 new_block->used_length = size;
 new_block->max_length = size;
-new_block->flags = share ? RAM_SHARED : 0;
+new_block->flags = (share ? RAM_SHARED : 0) |
+   (is_pmem ? RAM_PMEM : 0);
 new_block->host = file_ram_alloc(new_block, size, fd, !file_size, errp);
 if

[Qemu-devel] [PATCH v2 5/8] migration/ram: ensure write persistence on loading zero pages to PMEM

2018-02-06 Thread Haozhong Zhang

When loading a zero page, check whether it will be loaded to
persistent memory. If so, load it with the libpmem function
pmem_memset_nodrain().  Combined with a call to pmem_drain() at the
end of RAM loading, we can guarantee all those zero pages are
persistently loaded.

Depending on the host HW/SW configurations, pmem_drain() can be an
"sfence".  Therefore, we do not call pmem_drain() after each
pmem_memset_nodrain(), or use pmem_memset_persist() (equivalent to
pmem_memset_nodrain() + pmem_drain()), in order to avoid unnecessary
overhead.
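
As an illustration of the zero-page path, the sketch below skips the write
when the destination is already zero and otherwise picks memset() or a
no-drain variant. sketch_memset_nodrain() stands in for
pmem_memset_nodrain(), sketch_is_zero_range() is a naive replacement for
QEMU's is_zero_range(), and all sketch_ names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static unsigned nodrain_sets;   /* hypothetical, for observation only */

/* Stand-in for libpmem's pmem_memset_nodrain(). */
static void *sketch_memset_nodrain(void *dest, int c, size_t len)
{
    nodrain_sets++;
    return memset(dest, c, len);
}

/* Naive byte-by-byte check; QEMU's is_zero_range() is optimised. */
static bool sketch_is_zero_range(const uint8_t *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Fill a page with ch, avoiding the write if it is already all zero. */
void sketch_handle_fill_page(uint8_t *host, uint8_t ch, size_t size,
                             bool is_pmem)
{
    assert(host != NULL);
    if (ch == 0 && sketch_is_zero_range(host, size)) {
        return;     /* nothing to write, so nothing to persist */
    }

    if (is_pmem) {
        sketch_memset_nodrain(host, ch, size);
    } else {
        memset(host, ch, size);
    }
}
```

A caller would still issue one drain after the whole load when any page
took the PMEM branch.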

Signed-off-by: Haozhong Zhang 
---
 include/qemu/pmem.h |  9 +
 migration/ram.c | 34 +-
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/include/qemu/pmem.h b/include/qemu/pmem.h
index 9017596ff0..861d8ecc21 100644
--- a/include/qemu/pmem.h
+++ b/include/qemu/pmem.h
@@ -26,6 +26,15 @@ pmem_memcpy_persist(void *pmemdest, const void *src, size_t 
len)
 return memcpy(pmemdest, src, len);
 }
 
+static inline void *pmem_memset_nodrain(void *pmemdest, int c, size_t len)
+{
+return memset(pmemdest, c, len);
+}
+
+static inline void pmem_drain(void)
+{
+}
+
 #endif /* CONFIG_LIBPMEM */
 
 #endif /* !QEMU_PMEM_H */
diff --git a/migration/ram.c b/migration/ram.c
index cb1950f3eb..5a0e503818 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -49,6 +49,7 @@
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
 #include "migration/block.h"
+#include "qemu/pmem.h"
 
 /***/
 /* ram save/restore */
@@ -2467,6 +2468,20 @@ static inline void *host_from_ram_block_offset(RAMBlock 
*block,
 return block->host + offset;
 }
 
+static void ram_handle_compressed_common(void *host, uint8_t ch, uint64_t size,
+ bool is_pmem)
+{
+if (!ch && is_zero_range(host, size)) {
+return;
+}
+
+if (!is_pmem) {
+memset(host, ch, size);
+} else {
+pmem_memset_nodrain(host, ch, size);
+}
+}
+
 /**
  * ram_handle_compressed: handle the zero page case
  *
@@ -2479,9 +2494,7 @@ static inline void *host_from_ram_block_offset(RAMBlock 
*block,
  */
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
 {
-if (ch != 0 || !is_zero_range(host, size)) {
-memset(host, ch, size);
-}
+return ram_handle_compressed_common(host, ch, size, false);
 }
 
 static void *do_data_decompress(void *opaque)
@@ -2823,6 +2836,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 bool postcopy_running = postcopy_is_running();
 /* ADVISE is earlier, it shows the source has the postcopy capability on */
 bool postcopy_advised = postcopy_is_advised();
+bool need_pmem_drain = false;
 
 seq_iter++;
 
@@ -2848,6 +2862,8 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 ram_addr_t addr, total_ram_bytes;
 void *host = NULL;
 uint8_t ch;
+RAMBlock *block = NULL;
+bool is_pmem = false;
 
 addr = qemu_get_be64(f);
 flags = addr & ~TARGET_PAGE_MASK;
@@ -2864,7 +2880,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 
 if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
  RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
-RAMBlock *block = ram_block_from_stream(f, flags);
+block = ram_block_from_stream(f, flags);
 
 host = host_from_ram_block_offset(block, addr);
 if (!host) {
@@ -2874,6 +2890,9 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 }
 ramblock_recv_bitmap_set(block, host);
 trace_ram_load_loop(block->idstr, (uint64_t)addr, flags, host);
+
+is_pmem = ramblock_is_pmem(block);
+need_pmem_drain = need_pmem_drain || is_pmem;
 }
 
 switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
@@ -2927,7 +2946,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 
 case RAM_SAVE_FLAG_ZERO:
 ch = qemu_get_byte(f);
-ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
+ram_handle_compressed_common(host, ch, TARGET_PAGE_SIZE, is_pmem);
 break;
 
 case RAM_SAVE_FLAG_PAGE:
@@ -2970,6 +2989,11 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 }
 
 wait_for_decompress_done();
+
+if (need_pmem_drain) {
+pmem_drain();
+}
+
 rcu_read_unlock();
 trace_ram_load_complete(ret, seq_iter);
 return ret;
-- 
2.14.1

[Qemu-devel] [PATCH v2 7/8] migration/ram: ensure write persistence on loading compressed pages to PMEM

2018-02-06 Thread Haozhong Zhang

When loading a compressed page to persistent memory, flush the CPU cache
after the data is decompressed. Combined with a call to pmem_drain()
at the end of memory loading, we can guarantee those compressed pages
are persistently loaded to PMEM.
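
When the bytes are written by a routine we cannot change (a decompressor,
for instance), the copy itself cannot be replaced, so the written range is
flushed afterwards. A minimal sketch under that assumption follows;
sketch_flush() stands in for pmem_flush(), sketch_copy_decoder() is a
trivial placeholder for a real decompressor such as zlib's uncompress(),
and the sketch flushes the whole destination range:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static unsigned flushes;   /* hypothetical, for observation only */

/* Stand-in for libpmem's pmem_flush(). */
static void sketch_flush(const void *addr, size_t len)
{
    (void)addr;
    (void)len;
    flushes++;
}

/* Any routine that fills dest with ordinary stores; 0 means success. */
typedef int (*sketch_decoder)(void *dest, size_t dest_len,
                              const void *src, size_t src_len);

/* Placeholder decoder: copies the input unchanged. */
int sketch_copy_decoder(void *dest, size_t dest_len,
                        const void *src, size_t src_len)
{
    if (src_len > dest_len) {
        return -1;
    }
    memcpy(dest, src, src_len);
    return 0;
}

/* Decode into dest and, on success, flush it when dest is on PMEM. */
int sketch_decode_page(sketch_decoder decode, void *dest, size_t dest_len,
                       const void *src, size_t src_len, bool is_pmem)
{
    int rc;

    assert(decode != NULL);
    rc = decode(dest, dest_len, src, src_len);
    if (rc == 0 && is_pmem) {
        sketch_flush(dest, dest_len);
    }
    return rc;
}
```

As with the other paths, a final drain after all pages are loaded would
complete the persistence guarantee.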

Signed-off-by: Haozhong Zhang 
---
 include/qemu/pmem.h |  4 
 migration/ram.c | 16 +++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/qemu/pmem.h b/include/qemu/pmem.h
index 77ee1fc4eb..20e3f6e71d 100644
--- a/include/qemu/pmem.h
+++ b/include/qemu/pmem.h
@@ -37,6 +37,10 @@ static inline void *pmem_memset_nodrain(void *pmemdest, int 
c, size_t len)
 return memset(pmemdest, c, len);
 }
 
+static inline void pmem_flush(const void *addr, size_t len)
+{
+}
+
 static inline void pmem_drain(void)
 {
 }
diff --git a/migration/ram.c b/migration/ram.c
index 5a79bbff64..924d2b9537 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -274,6 +274,7 @@ struct DecompressParam {
 void *des;
 uint8_t *compbuf;
 int len;
+bool is_pmem;
 };
 typedef struct DecompressParam DecompressParam;
 
@@ -2502,7 +2503,7 @@ static void *do_data_decompress(void *opaque)
 DecompressParam *param = opaque;
 unsigned long pagesize;
 uint8_t *des;
-int len;
+int len, rc;
 
 qemu_mutex_lock(&param->mutex);
 while (!param->quit) {
@@ -2518,8 +2519,11 @@ static void *do_data_decompress(void *opaque)
  * not a problem because the dirty page will be retransferred
  * and uncompress() won't break the data in other pages.
  */
-uncompress((Bytef *)des, &pagesize,
-   (const Bytef *)param->compbuf, len);
+rc = uncompress((Bytef *)des, &pagesize,
+(const Bytef *)param->compbuf, len);
+if (rc == Z_OK && param->is_pmem) {
+pmem_flush(des, len);
+}
 
 qemu_mutex_lock(&decomp_done_lock);
 param->done = true;
@@ -2605,7 +2609,8 @@ static void compress_threads_load_cleanup(void)
 }
 
 static void decompress_data_with_multi_threads(QEMUFile *f,
-   void *host, int len)
+   void *host, int len,
+   bool is_pmem)
 {
 int idx, thread_count;
 
@@ -2619,6 +2624,7 @@ static void decompress_data_with_multi_threads(QEMUFile 
*f,
 qemu_get_buffer(f, decomp_param[idx].compbuf, len);
 decomp_param[idx].des = host;
 decomp_param[idx].len = len;
+decomp_param[idx].is_pmem = is_pmem;
 qemu_cond_signal(&decomp_param[idx].cond);
 qemu_mutex_unlock(&decomp_param[idx].mutex);
 break;
@@ -2964,7 +2970,7 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
 ret = -EINVAL;
 break;
 }
-decompress_data_with_multi_threads(f, host, len);
+decompress_data_with_multi_threads(f, host, len, is_pmem);
 break;
 
 case RAM_SAVE_FLAG_XBZRLE:
-- 
2.14.1

[Qemu-devel] [PATCH v2 1/8] memory, exec: switch file ram allocation functions to 'flags' parameters

2018-02-06 Thread Haozhong Zhang

As more flag parameters besides the existing 'share' are going to be
added to the following functions
    memory_region_init_ram_from_file
    qemu_ram_alloc_from_fd
    qemu_ram_alloc_from_file
let's switch them to use a 'flags' parameter so as to ease future
flag additions.

The existing 'share' flag is converted to the QEMU_RAM_SHARE bit in
flags, and other flag bits are ignored by the above functions for now.

Signed-off-by: Haozhong Zhang 
---
 backends/hostmem-file.c |  3 ++-
 exec.c  |  7 ---
 include/exec/memory.h   | 10 --
 include/exec/ram_addr.h | 25 +++--
 memory.c|  8 +---
 numa.c  |  2 +-
 6 files changed, 43 insertions(+), 12 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 134b08d63a..30df843d90 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -58,7 +58,8 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error 
**errp)
 path = object_get_canonical_path(OBJECT(backend));
 memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
  path,
- backend->size, fb->align, backend->share,
+ backend->size, fb->align,
+ backend->share ? QEMU_RAM_SHARE : 0,
  fb->mem_path, errp);
 g_free(path);
 }
diff --git a/exec.c b/exec.c
index 5e56efefeb..16b373a86b 100644
--- a/exec.c
+++ b/exec.c
@@ -2000,12 +2000,13 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp, bool shared)
 
 #ifdef __linux__
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
- bool share, int fd,
+ uint64_t flags, int fd,
  Error **errp)
 {
 RAMBlock *new_block;
 Error *local_err = NULL;
 int64_t file_size;
+bool share = flags & QEMU_RAM_SHARE;
 
 if (xen_enabled()) {
 error_setg(errp, "-mem-path not supported with Xen");
@@ -2061,7 +2062,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, 
MemoryRegion *mr,
 
 
 RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-   bool share, const char *mem_path,
+   uint64_t flags, const char *mem_path,
Error **errp)
 {
 int fd;
@@ -2073,7 +2074,7 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, 
MemoryRegion *mr,
 return NULL;
 }
 
-block = qemu_ram_alloc_from_fd(size, mr, share, fd, errp);
+block = qemu_ram_alloc_from_fd(size, mr, flags, fd, errp);
 if (!block) {
 if (created) {
 unlink(mem_path);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 1b02bbd334..d87258b6ae 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -479,6 +479,9 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
void *host),
Error **errp);
 #ifdef __linux__
+
+#define QEMU_RAM_SHARE  (1UL << 0)
+
 /**
  * memory_region_init_ram_from_file:  Initialize RAM memory region with a
  *mmap-ed backend.
@@ -490,7 +493,10 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
  * @size: size of the region.
  * @align: alignment of the region base address; if 0, the default alignment
  * (getpagesize()) will be used.
- * @share: %true if memory must be mmaped with the MAP_SHARED flag
+ * @flags: specify properties of this memory region, which can be one or bit-or
+ * of following values:
+ * - QEMU_RAM_SHARE: memory must be mmaped with the MAP_SHARED flag
+ * Other bits are ignored.
  * @path: the path in which to allocate the RAM.
  * @errp: pointer to Error*, to store an error if it happens.
  *
@@ -502,7 +508,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
   const char *name,
   uint64_t size,
   uint64_t align,
-  bool share,
+  uint64_t flags,
   const char *path,
   Error **errp);
 
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index cf2446a176..b8b01d1eb9 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -72,12 +72,33 @@ static inline unsigned long int 
ramblock_recv_bitmap_offset(void *host_addr,
 
 long qemu_getrampagesize(void);
 unsigned long last_ram_page(void);
+
+/**
+ * qemu_ram_alloc_from_file,
+ * qemu_ram_alloc_from_fd:  Allocate a ram block from the specified back
+ *  file or device
+ *
+ * Parameters:
+ *  @size:

[Qemu-devel] [PATCH v2 0/8] nvdimm: guarantee persistence of QEMU writes to persistent memory

2018-02-06 Thread Haozhong Zhang

This v2 patch series extends v1 [1] by covering the migration path as
well.

QEMU writes to vNVDIMM backends during vNVDIMM label emulation and
live migration. If the backend is on persistent memory, QEMU needs
to take appropriate steps to ensure that its writes are persistent on
the persistent memory. Otherwise, a host power failure may result in
the loss of guest data on the persistent memory.

This patch series is based on Marcel's patch "mem: add share parameter
to memory-backend-ram" [2] because of the changes in patch 1.

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg05040.html
[2] http://lists.gnu.org/archive/html/qemu-devel/2018-02/msg00768.html

Changes in v2:
 * (Patch 1) Use a flags parameter in file ram allocation functions.
 * (Patch 2) Add a new option 'pmem' to hostmem-file.
 * (Patch 3) Use libpmem to operate on the persistent memory, rather
   than re-implementing those operations in QEMU.
 * (Patch 5-8) Consider the write persistence in the migration path.

Haozhong Zhang (8):
 [1/8] memory, exec: switch file ram allocation functions to 'flags' parameters
 [2/8] hostmem-file: add the 'pmem' option
 [3/8] configure: add libpmem support
 [4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation
 [5/8] migration/ram: ensure write persistence on loading zero pages to PMEM
 [6/8] migration/ram: ensure write persistence on loading normal pages to PMEM
 [7/8] migration/ram: ensure write persistence on loading compressed pages to PMEM
 [8/8] migration/ram: ensure write persistence on loading xbzrle pages to PMEM

 backends/hostmem-file.c | 27 +-
 configure   | 35 ++
 docs/nvdimm.txt | 14 
 exec.c  | 23 +---
 hw/mem/nvdimm.c |  9 -
 include/exec/memory.h   | 12 +--
 include/exec/ram_addr.h | 28 +--
 include/migration/qemu-file-types.h |  1 +
 include/qemu/pmem.h | 50 ++
 memory.c|  8 +++--
 migration/qemu-file.c   | 41 +++--
 migration/ram.c | 71 -
 migration/xbzrle.c  | 20 +--
 migration/xbzrle.h  |  1 +
 numa.c  |  2 +-
 qemu-options.hx |  9 -
 16 files changed, 308 insertions(+), 43 deletions(-)
 create mode 100644 include/qemu/pmem.h

-- 
2.14.1
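
As a rough illustration of the bool-to-flags switch in patch 1, here is a
minimal standalone sketch. QEMU_RAM_SHARE and its value come from the
patch; EXAMPLE_RAM_PMEM, ExampleRamOpts and example_ram_opts_from_flags()
are hypothetical names invented for this example and are not taken from
the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QEMU_RAM_SHARE   (1UL << 0)  /* from the patch */
#define EXAMPLE_RAM_PMEM (1UL << 1)  /* hypothetical second property bit */

/* Hypothetical decoded view of the flags word. */
typedef struct ExampleRamOpts {
    bool share;
    bool pmem;
} ExampleRamOpts;

/*
 * Decode the bitmask once at the top of the allocation function, in the
 * same spirit as "bool share = flags & QEMU_RAM_SHARE;" in the patch.
 * Unknown bits are ignored, matching the documented behaviour.
 */
static ExampleRamOpts example_ram_opts_from_flags(uint64_t flags)
{
    ExampleRamOpts opts = {
        .share = (flags & QEMU_RAM_SHARE) != 0,
        .pmem  = (flags & EXAMPLE_RAM_PMEM) != 0,
    };
    return opts;
}
```

The point of the design is that a new property only needs a new bit, so
the function signatures do not have to change again.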

[Qemu-devel] [qemu-web PATCH v2] Add a blog post with the presentations from DevConf and FOSDEM 2018

2018-02-06 Thread Thomas Huth

Let's provide some links to the videos from FOSDEM and DevConf.

Reviewed-by: Alex Bennée 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Eduardo Otubo 
Signed-off-by: Thomas Huth 
---
 v2:
 - Added link to the virtio 1.1 talk
 - Added links to the schedules of the virtualization tracks

 _posts/2018-02-06-devconf-fosdem.md | 46 +
 1 file changed, 46 insertions(+)
 create mode 100644 _posts/2018-02-06-devconf-fosdem.md

diff --git a/_posts/2018-02-06-devconf-fosdem.md b/_posts/2018-02-06-devconf-fosdem.md
new file mode 100644
index 000..74f7eb4
--- /dev/null
+++ b/_posts/2018-02-06-devconf-fosdem.md
@@ -0,0 +1,46 @@
+---
+layout: post
+title:  "Presentations from DevConf and FOSDEM 2018"
+date:   2018-02-06 17:00:00 +0100
+author: Thomas Huth
+categories: [presentations, conferences]
+---
+During the past two weeks, there were two important conferences for Open
+Source developers in Europe, where you could also enjoy some QEMU related
+presentations. The following QEMU-related talks were held at the
+[DevConf 2018](https://devconf.cz/cz/2018) conference in Brno:
+
+* [Eliminating guest page cache](https://www.youtube.com/watch?v=NG0n5MTXOa4)
+  by Pankaj Gupta
+
+* [Anatomy of KVM Guest](https://www.youtube.com/watch?v=t-MSukwDqeM)
+  by Prasad J Pandit
+
+* [QEMU Sandboxing for dummies](https://www.youtube.com/watch?v=_7yGiafZdVc)
+  by Eduardo Otubo
+
+And at the [FOSDEM 2018](https://fosdem.org/2018/) in Brussels, you could
+listen to the following QEMU related talks:
+
+* [QEMU in UEFI](https://fosdem.org/2018/schedule/event/vai_qemu_in_uefi/)
+  by Alexander Graf
+
+* [What's new in Virtio 1.1](https://fosdem.org/2018/schedule/event/virtio/)
+  by Jens Freimann
+
+* [Live Block Device Operations in
+  QEMU](https://fosdem.org/2018/schedule/event/vai_qemu_live_dev_operations/)
+  by Kashyap Chamarthy
+
+* [Vectors Meet
+  Virtualization](https://fosdem.org/2018/schedule/event/vai_vectors_meet_virtualization/)
+  by Alex Bennée
+
+* [Finding your way through the QEMU parameter
+  jungle](https://fosdem.org/2018/schedule/event/vai_qemu_jungle/)
+  by Thomas Huth
+
+More virtualization related talks can be found in the [schedule from
+the DevConf](https://devconfcz2018.sched.com/type/virtualization) and
+in the [schedule from the
+FOSDEM](https://fosdem.org/2018/schedule/track/virtualization_and_iaas/).
-- 
1.8.3.1

[Qemu-devel] [PATCH] ratelimit: don't align wait time with slices

2018-02-06 Thread Wolfgang Bumiller

It is possible for rate-limited writes to keep overshooting a slice's
quota by a tiny amount, causing the slice-aligned waiting period to
effectively halve the rate.

Signed-off-by: Wolfgang Bumiller 
---
Copied the Ccs from the discussion thread, hope that's fine, as I also
just noticed that for my reply containing this snippet I had hit reply
on the mail that did not contain those Ccs yet, sorry about that.

 include/qemu/ratelimit.h | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
index 8dece483f5..1b38291823 100644
--- a/include/qemu/ratelimit.h
+++ b/include/qemu/ratelimit.h
@@ -36,7 +36,7 @@ typedef struct {
 static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
 {
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-uint64_t delay_slices;
+double delay_slices;
 
 assert(limit->slice_quota && limit->slice_ns);
 
@@ -55,12 +55,11 @@ static inline int64_t ratelimit_calculate_delay(RateLimit 
*limit, uint64_t n)
 return 0;
 }
 
-/* Quota exceeded. Calculate the next time slice we may start
- * sending data again. */
-delay_slices = (limit->dispatched + limit->slice_quota - 1) /
-limit->slice_quota;
+/* Quota exceeded. Wait based on the excess amount and then start a new
+ * slice. */
+delay_slices = (double)limit->dispatched / limit->slice_quota;
 limit->slice_end_time = limit->slice_start_time +
-delay_slices * limit->slice_ns;
+(uint64_t)(delay_slices * limit->slice_ns);
 return limit->slice_end_time - now;
 }
 
-- 
2.11.0
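
To make the effect described in the commit message concrete, the two
calculations from the diff can be compared side by side. This is only a
standalone sketch: the function names and the flat parameter list are
hypothetical, while the arithmetic mirrors the removed and added lines of
ratelimit_calculate_delay().

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: round the wait up to a whole number of slices. */
static uint64_t example_slice_end_aligned(uint64_t start_ns,
                                          uint64_t slice_ns,
                                          uint64_t quota,
                                          uint64_t dispatched)
{
    assert(quota && slice_ns);
    uint64_t delay_slices = (dispatched + quota - 1) / quota;
    return start_ns + delay_slices * slice_ns;
}

/* New behaviour: wait in proportion to the amount actually dispatched. */
static uint64_t example_slice_end_proportional(uint64_t start_ns,
                                               uint64_t slice_ns,
                                               uint64_t quota,
                                               uint64_t dispatched)
{
    assert(quota && slice_ns);
    double delay_slices = (double)dispatched / quota;
    return start_ns + (uint64_t)(delay_slices * slice_ns);
}
```

With a quota of 100 units per 100 ms slice and 101 units dispatched, the
aligned version ends the slice at 200 ms, so only about 101 units go out
per 200 ms. The proportional version ends the slice after roughly 101 ms.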

Re: [Qemu-devel] [PATCH v2 3/6] qapi: add nbd-server-remove

2018-02-06 Thread Markus Armbruster

"Dr. David Alan Gilbert"  writes:

> * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
>> 26.01.2018 18:05, Dr. David Alan Gilbert wrote:
>> > * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
[...]
>> > > most of commands, ported to hmp are done in same style: they just call
>> > > corresponding qmp command.

HMP commands *should* call the QMP command to do the actual work.  That
way, we *know* all the functionality is available in QMP, and HMP is
consistent with it.

Sometimes, calling helpers shared with QMP is more convenient, and
that's okay, but then you have to think about QMP completeness and
HMP/QMP consistency.

The only exceptions are HMP commands that don't make sense in QMP, such
as @cpu.

>> > > Isn't it better to provide common interface for calling qmp commands
>> > > through HMP monitor, to never create hmp versions of new commands?
>> > > they will be available automatically.
>> > It would be nice to do that, but they're not that consistent in how they
>> > convert parameters and options, but I occasionally wonder if we could
>> > automate more of it.
>> 
>> 
>> What about allowing some new syntax in hmp, directly mapped to qmp?
>> 
>> something like
>> 
>> >>> blockdev-add id disk driver qcow2 cache {writeback true direct true} aio
>> native discard unmap file {driver file filename /tmp/somedisk}
>> 
>> ?
>
> Hmm, I don't particularly find that easy to read either; however the
> actual block device specification for HMP should be the same as what we
> pass on the command line, so we only have to worry about any extra
> things that are part of blockdev_add.
> (I'm sure we can find a way of making the one we pass on the commandline
> more readable as well, there's so much duplication).

Good points.

QMP syntax is different for a good reason: it serves machines rather
than humans.

Both HMP and command line serve the same humans, yet the syntax they
wrap around common functionality is different.  Sad waste of developer
time, sad waste of user brain power.  The former could perhaps be
reduced with better tooling, say having QAPI generate the details.

If you have QAPI generate HMP and command line from the exact same
definitions as QMP, you get what Vladimir wants: different interfaces to
the exact same functionality, without additional coding.

Note that the needs of humans and machines differ in more ways than just
syntax.  For instance, humans appreciate convenience features to save
typing.  In a machine interface, they'd add unnecessary and
inappropriate complexity.  Adding convenience is a good reason for
actually designing the HMP interface, rather than copying the QMP one
blindly.
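
The "HMP calls QMP" pattern can be sketched in a few lines of standalone
C. Everything here is hypothetical and greatly simplified: the function
names, the error buffer convention and the "export" command are invented
for illustration and are not QEMU's monitor API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a QMP command: it does the actual work and
 * reports failure through a machine-friendly error buffer.
 */
static int example_qmp_export_remove(const char *name,
                                     char *err, size_t errlen)
{
    if (name == NULL || name[0] == '\0') {
        snprintf(err, errlen, "export name must not be empty");
        return -1;
    }
    /* Real code would look up and remove the export here. */
    return 0;
}

/*
 * Hypothetical HMP-style wrapper: it adds a small convenience for humans
 * (leading blanks are skipped), delegates the work to the QMP function,
 * and turns the error into a printed message.
 */
static int example_hmp_export_remove(const char *args)
{
    char err[128] = "";

    while (*args == ' ') {
        args++;
    }
    if (example_qmp_export_remove(args, err, sizeof(err)) < 0) {
        fprintf(stderr, "Error: %s\n", err);
        return -1;
    }
    return 0;
}
```

Because the wrapper contains no logic of its own beyond parsing and
printing, anything it can do is by construction also available through
the QMP-style function.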

Re: [Qemu-devel] [PATCH 00/54] Patch Round-up for stable 2.11.1, freeze on 2018-02-12

2018-02-06 Thread Thomas Huth

On 06.02.2018 20:14, Michael Roth wrote:
> Hi everyone,  
> 
> 
> The following new patches are queued for QEMU stable v2.11.1:
> 
>   https://github.com/mdroth/qemu/commits/stable-2.11-staging
> 
> The release is planned for 2018-02-14:
> 
>   https://wiki.qemu.org/Planning/2.11
> 
> Please respond here or CC qemu-sta...@nongnu.org on any patches you
> think should be included in the release.

Looking for "CVE" in the changelog, these look like good candidates for
stable as well:

191f59dc17396bb5a8da50f8c59b6e0a430711a4
vga: check the validation of memory addr when draw text

f887cf165db20f405cb8805c716bd363aaadf815
ui: place a hard cap on VNC server output buffer size
(and the preceding patches)

> Of particular importance would be any feedback on the various QEMU
> patches relating to Spectre/Meltdown mitigation. The current tree has
> what I understand to be the QEMU components required for x86, s390,
> and pseries, but feedback/confirmation from the various authors would
> be greatly appreciated.
[...]
> Christian Borntraeger (2):
>   s390x/kvm: Handle bpb feature
>   s390x/kvm: provide stfle.81

Confirmed, AFAIK those are the only two patches that are required for
Spectre on s390x (together with the linux-headers update).

 Thomas

Re: [Qemu-devel] [PATCH v5 1/4] target/arm: implement SHA-512 instructions

2018-02-06 Thread Richard Henderson

On 02/06/2018 11:15 AM, Peter Maydell wrote:
> My test setup doesn't capture register values from
> before the insn executes...

I have patches for RISU to dump each record written
to the trace file, which does allow one to go back
and examine previous register values.


r~

Re: [Qemu-devel] [PATCH v5 0/5] coroutine-lock: polymorphic CoQueue

2018-02-06 Thread Fam Zheng

On Sat, 02/03 10:39, Paolo Bonzini wrote:
> There are cases in which a queued coroutine must be restarted from
> non-coroutine context (with qemu_co_enter_next).  In these cases,
> qemu_co_enter_next also needs to be thread-safe, but it cannot use a
> CoMutex and so cannot qemu_co_queue_wait.  This happens in curl (which
> right now is rolling its own list of Coroutines) and will happen in
> Fam's NVMe driver as well.
> 
> This series extracts the idea of a polymorphic lockable object
> from my "scoped lock guard" proposal, and applies it to CoQueue.
> The implementation of QemuLockable is similar to C11 _Generic, but
> redone using the preprocessor and GCC builtins for compatibility.
> 
> In general, while a bit on the esoteric side, the functionality used
> to emulate _Generic is fairly old in GCC, and the builtins are already
> used by include/qemu/atomic.h; the series was tested with Fedora 27 (boot
> Damn Small Linux via http) and CentOS 6 (compiled only).
> 
> Paolo
> 
> v4->v5: fix checkpatch complaints

Queued, thanks.

Fam

Re: [Qemu-devel] [RFC PATCH] vfio/pci: Add ioeventfd support

2018-02-06 Thread Alexey Kardashevskiy

On 07/02/18 15:25, Alex Williamson wrote:
> On Wed, 7 Feb 2018 15:09:22 +1100
> Alexey Kardashevskiy  wrote:
>> On 07/02/18 11:08, Alex Williamson wrote:
>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>>> index e3301dbd27d4..07966a5f0832 100644
>>> --- a/include/uapi/linux/vfio.h
>>> +++ b/include/uapi/linux/vfio.h
>>> @@ -503,6 +503,30 @@ struct vfio_pci_hot_reset {
>>>  
>>>  #define VFIO_DEVICE_PCI_HOT_RESET  _IO(VFIO_TYPE, VFIO_BASE + 13)
>>>  
>>> +/**
>>> + * VFIO_DEVICE_IOEVENTFD - _IOW(VFIO_TYPE, VFIO_BASE + 14,
>>> + *  struct vfio_device_ioeventfd)
>>> + *
>>> + * Perform a write to the device at the specified device fd offset, with
>>> + * the specified data and width when the provided eventfd is triggered.
>>> + *
>>> + * Return: 0 on success, -errno on failure.
>>> + */
>>> +struct vfio_device_ioeventfd {
>>> +   __u32   argsz;
>>> +   __u32   flags;
>>> +#define VFIO_DEVICE_IOEVENTFD_8    (1 << 0) /* 1-byte write */
>>> +#define VFIO_DEVICE_IOEVENTFD_16   (1 << 1) /* 2-byte write */
>>> +#define VFIO_DEVICE_IOEVENTFD_32   (1 << 2) /* 4-byte write */
>>> +#define VFIO_DEVICE_IOEVENTFD_64   (1 << 3) /* 8-byte write */
>>> +#define VFIO_DEVICE_IOEVENTFD_SIZE_MASK    (0xf)
>>> +   __u64   offset; /* device fd offset of write */
>>> +   __u64   data;   /* data to be written */
>>> +   __s32   fd; /* -1 for de-assignment */
>>> +};
>>> +
>>> +#define VFIO_DEVICE_IOEVENTFD  _IO(VFIO_TYPE, VFIO_BASE + 14)  
>>
>>
>> Is this a first ioctl with endianness fixed to little-endian? I'd suggest
>> to comment on that as things like vfio_info_cap_header do use the host
>> endianness.
> 
> Look at our current read and write interface, we call leXX_to_cpu
> before calling iowriteXX there and I think a user would logically
> expect to use the same data format here as they would there.

If the data is "char data[8]" (i.e. bytestream), then it can be expected to
be device/bus endian (i.e. PCI == little endian), but if it is u64 - then I
am not so sure really, and this made me look around. It could be "__le64
data" too.

> Also note
> that iowriteXX does a cpu_to_leXX, so are we really defining the
> interface as little-endian or are we just trying to make ourselves
> endian neutral and counter that implicit conversion?  Thanks,

Defining it LE is fine, I just find it a bit confusing when
vfio_info_cap_header is host endian but vfio_device_ioeventfd is not.


-- 
Alexey

Re: [Qemu-devel] [RFC PATCH] vfio/pci: Add ioeventfd support

2018-02-06 Thread Alex Williamson

On Wed, 7 Feb 2018 15:09:22 +1100
Alexey Kardashevskiy  wrote:
> On 07/02/18 11:08, Alex Williamson wrote:
> > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> > index e3301dbd27d4..07966a5f0832 100644
> > --- a/include/uapi/linux/vfio.h
> > +++ b/include/uapi/linux/vfio.h
> > @@ -503,6 +503,30 @@ struct vfio_pci_hot_reset {
> >  
> >  #define VFIO_DEVICE_PCI_HOT_RESET  _IO(VFIO_TYPE, VFIO_BASE + 13)
> >  
> > +/**
> > + * VFIO_DEVICE_IOEVENTFD - _IOW(VFIO_TYPE, VFIO_BASE + 14,
> > + *  struct vfio_device_ioeventfd)
> > + *
> > + * Perform a write to the device at the specified device fd offset, with
> > + * the specified data and width when the provided eventfd is triggered.
> > + *
> > + * Return: 0 on success, -errno on failure.
> > + */
> > +struct vfio_device_ioeventfd {
> > +   __u32   argsz;
> > +   __u32   flags;
> > +#define VFIO_DEVICE_IOEVENTFD_8    (1 << 0) /* 1-byte write */
> > +#define VFIO_DEVICE_IOEVENTFD_16   (1 << 1) /* 2-byte write */
> > +#define VFIO_DEVICE_IOEVENTFD_32   (1 << 2) /* 4-byte write */
> > +#define VFIO_DEVICE_IOEVENTFD_64   (1 << 3) /* 8-byte write */
> > +#define VFIO_DEVICE_IOEVENTFD_SIZE_MASK    (0xf)
> > +   __u64   offset; /* device fd offset of write */
> > +   __u64   data;   /* data to be written */
> > +   __s32   fd; /* -1 for de-assignment */
> > +};
> > +
> > +#define VFIO_DEVICE_IOEVENTFD  _IO(VFIO_TYPE, VFIO_BASE + 14)  
> 
> 
> Is this a first ioctl with endianness fixed to little-endian? I'd suggest
> to comment on that as things like vfio_info_cap_header do use the host
> endianness.

Look at our current read and write interface, we call leXX_to_cpu
before calling iowriteXX there and I think a user would logically
expect to use the same data format here as they would there.  Also note
that iowriteXX does a cpu_to_leXX, so are we really defining the
interface as little-endian or are we just trying to make ourselves
endian neutral and counter that implicit conversion?  Thanks,

Alex
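
For readers following the endianness question, here is a small standalone
sketch of what "treat data as a little-endian byte sequence" means for the
different write widths. The flag values are copied from the RFC patch
quoted above; example_ioeventfd_width() and example_store_le() are
hypothetical helpers written for this illustration and do not describe
what the kernel side actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Width flags as proposed in the RFC patch. */
#define VFIO_DEVICE_IOEVENTFD_8         (1 << 0) /* 1-byte write */
#define VFIO_DEVICE_IOEVENTFD_16        (1 << 1) /* 2-byte write */
#define VFIO_DEVICE_IOEVENTFD_32        (1 << 2) /* 4-byte write */
#define VFIO_DEVICE_IOEVENTFD_64        (1 << 3) /* 8-byte write */
#define VFIO_DEVICE_IOEVENTFD_SIZE_MASK (0xf)

/* Hypothetical helper: map the width flag to a byte count, 0 if invalid. */
static unsigned example_ioeventfd_width(uint32_t flags)
{
    switch (flags & VFIO_DEVICE_IOEVENTFD_SIZE_MASK) {
    case VFIO_DEVICE_IOEVENTFD_8:
        return 1;
    case VFIO_DEVICE_IOEVENTFD_16:
        return 2;
    case VFIO_DEVICE_IOEVENTFD_32:
        return 4;
    case VFIO_DEVICE_IOEVENTFD_64:
        return 8;
    default:
        return 0; /* no width bit, or more than one, was set */
    }
}

/*
 * Hypothetical helper: lay out the low "width" bytes of data in
 * little-endian order. Shifting instead of casting pointers makes the
 * result identical on big-endian and little-endian hosts.
 */
static void example_store_le(uint8_t *buf, uint64_t data, unsigned width)
{
    assert(width == 1 || width == 2 || width == 4 || width == 8);
    for (unsigned i = 0; i < width; i++) {
        buf[i] = (uint8_t)(data >> (8 * i));
    }
}
```

A 4-byte write of 0x11223344 therefore always produces the bytes
44 33 22 11, which is the order a little-endian bus such as PCI expects.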

[Qemu-devel] [PATCH v5 10/14] pci: Add support for Designware IP block

2018-02-06 Thread Andrey Smirnov

Add code needed to get a functional PCI subsystem when used in
conjunction with an upstream Linux guest (4.13+). Tested to work against
"e1000e" (network adapter, using MSI interrupts) as well as
"usb-ehci" (USB controller, using legacy PCI interrupts).

Based on "i.MX6 Applications Processor Reference Manual" (Document
Number: IMX6DQRM Rev. 4) as well as the corresponding driver in the Linux
kernel (circa 4.13 - 4.16, found in drivers/pci/dwc/*)

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Signed-off-by: Andrey Smirnov 
---
 default-configs/arm-softmmu.mak  |   2 +
 hw/pci-host/Makefile.objs|   2 +
 hw/pci-host/designware.c | 759 +++
 include/hw/pci-host/designware.h |  97 +
 include/hw/pci/pci_ids.h |   2 +
 5 files changed, 862 insertions(+)
 create mode 100644 hw/pci-host/designware.c
 create mode 100644 include/hw/pci-host/designware.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index b0d6e65038..0c5ae914ed 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -132,3 +132,5 @@ CONFIG_GPIO_KEY=y
 CONFIG_MSF2=y
 CONFIG_FW_CFG_DMA=y
 CONFIG_XILINX_AXI=y
+CONFIG_PCI_DESIGNWARE=y
+
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index 4b69f737b5..6d6597c065 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -17,3 +17,5 @@ common-obj-$(CONFIG_PCI_PIIX) += piix.o
 common-obj-$(CONFIG_PCI_Q35) += q35.o
 common-obj-$(CONFIG_PCI_GENERIC) += gpex.o
 common-obj-$(CONFIG_PCI_XILINX) += xilinx-pcie.o
+
+common-obj-$(CONFIG_PCI_DESIGNWARE) += designware.o
diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
new file mode 100644
index 00..551a881af0
--- /dev/null
+++ b/hw/pci-host/designware.c
@@ -0,0 +1,759 @@
+/*
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * Designware PCIe IP block emulation
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pci_host.h"
+#include "hw/pci/pcie_port.h"
+#include "hw/pci-host/designware.h"
+
+#define PCIE_PORT_LINK_CONTROL              0x710
+
+#define PCIE_PHY_DEBUG_R1                   0x72C
+#define PCIE_PHY_DEBUG_R1_XMLH_LINK_UP      BIT(4)
+
+#define PCIE_LINK_WIDTH_SPEED_CONTROL       0x80C
+#define PORT_LOGIC_SPEED_CHANGE             (0x1 << 17)
+
+#define PCIE_MSI_ADDR_LO                    0x820
+#define PCIE_MSI_ADDR_HI                    0x824
+#define PCIE_MSI_INTR0_ENABLE               0x828
+#define PCIE_MSI_INTR0_MASK                 0x82C
+#define PCIE_MSI_INTR0_STATUS               0x830
+
+#define PCIE_ATU_VIEWPORT                   0x900
+#define PCIE_ATU_REGION_INBOUND             (0x1 << 31)
+#define PCIE_ATU_REGION_OUTBOUND            (0x0 << 31)
+#define PCIE_ATU_REGION_INDEX2              (0x2 << 0)
+#define PCIE_ATU_REGION_INDEX1              (0x1 << 0)
+#define PCIE_ATU_REGION_INDEX0              (0x0 << 0)
+#define PCIE_ATU_CR1                        0x904
+#define PCIE_ATU_TYPE_MEM                   (0x0 << 0)
+#define PCIE_ATU_TYPE_IO                    (0x2 << 0)
+#define PCIE_ATU_TYPE_CFG0                  (0x4 << 0)
+#define PCIE_ATU_TYPE_CFG1                  (0x5 << 0)
+#define PCIE_ATU_CR2                        0x908
+#define PCIE_ATU_ENABLE                     (0x1 << 31)
+#define PCIE_ATU_BAR_MODE_ENABLE            (0x1 << 30)
+#define PCIE_ATU_LOWER_BASE                 0x90C
+#define PCIE_ATU_UPPER_BASE                 0x910
+#define PCIE_ATU_LIMIT                      0x914
+#define PCIE_ATU_LOWER_TARGET               0x918
+#define PCIE_ATU_BUS(x)                     (((x) >> 24) & 0xff)
+#define PCIE_ATU_DEVFN(x)                   (((x) >> 16) & 0xff)
+#define PCIE_ATU_UPPER_TARGET               0x91C
+
+static DesignwarePCIEHost *
+designware_pcie_root_to_host(DesignwarePCIERoot *root)
+{
+BusState *bus = qdev_get_parent_bus(DEVICE(root));
+return DESIGNWARE_PCIE_HOST(bus->parent);
+}
+
+static void designware_pcie_root_msi_write(void *opaque, hwaddr addr,
+   uint64_t val, unsigned len)
+{
+DesignwarePCIERoot

[Qemu-devel] [PATCH v5 09/14] pci: Use pci_config_size in pci_data_* accessors

2018-02-06 Thread Andrey Smirnov

Use pci_config_size (as opposed to PCI_CONFIG_SPACE_SIZE) in
pci_data_read() and pci_data_write(), so that these functions work for
both classic PCI and PCIe use-cases.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Signed-off-by: Andrey Smirnov 
---
 hw/pci/pci_host.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c
index 5eaa935cb5..ea52ea07cd 100644
--- a/hw/pci/pci_host.c
+++ b/hw/pci/pci_host.c
@@ -89,30 +89,35 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, 
uint32_t addr,
 void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len)
 {
 PCIDevice *pci_dev = pci_dev_find_by_addr(s, addr);
-uint32_t config_addr = addr & (PCI_CONFIG_SPACE_SIZE - 1);
+uint32_t config_addr;
 
 if (!pci_dev) {
 return;
 }
 
+config_addr = addr & (pci_config_size(pci_dev) - 1);
+
 PCI_DPRINTF("%s: %s: addr=%02" PRIx32 " val=%08" PRIx32 " len=%d\n",
 __func__, pci_dev->name, config_addr, val, len);
-pci_host_config_write_common(pci_dev, config_addr, PCI_CONFIG_SPACE_SIZE,
+pci_host_config_write_common(pci_dev, config_addr,
+ pci_config_size(pci_dev),
  val, len);
 }
 
 uint32_t pci_data_read(PCIBus *s, uint32_t addr, int len)
 {
 PCIDevice *pci_dev = pci_dev_find_by_addr(s, addr);
-uint32_t config_addr = addr & (PCI_CONFIG_SPACE_SIZE - 1);
+uint32_t config_addr;
 uint32_t val;
 
 if (!pci_dev) {
 return ~0x0;
 }
 
+config_addr = addr & (pci_config_size(pci_dev) - 1);
+
 val = pci_host_config_read_common(pci_dev, config_addr,
-  PCI_CONFIG_SPACE_SIZE, len);
+  pci_config_size(pci_dev), len);
 PCI_DPRINTF("%s: %s: addr=%02"PRIx32" val=%08"PRIx32" len=%d\n",
 __func__, pci_dev->name, config_addr, val, len);
 
-- 
2.14.3
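
The reason the fixed mask is a problem for PCIe can be shown with a short
standalone sketch. The two size constants match the values QEMU uses for
conventional and express config space (256 bytes and 4 KiB);
example_config_offset() is a hypothetical helper that stands in for the
"addr & (size - 1)" expression in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_CONFIG_SPACE_SIZE  0x100   /* conventional PCI: 256 bytes */
#define PCIE_CONFIG_SPACE_SIZE 0x1000  /* PCI Express: 4 KiB */

/*
 * Hypothetical helper: reduce a host-side address to an offset within
 * the device's config space. Both sizes are powers of two, so masking
 * with (size - 1) keeps exactly the in-range offset bits.
 */
static uint32_t example_config_offset(uint32_t addr, bool is_express)
{
    uint32_t size = is_express ? PCIE_CONFIG_SPACE_SIZE
                               : PCI_CONFIG_SPACE_SIZE;

    assert((size & (size - 1)) == 0);
    return addr & (size - 1);
}
```

For an address whose low bits are 0x345, the conventional mask yields
0x45 and silently aliases the access into the first 256 bytes, while the
express mask preserves 0x345 and reaches the extended config space.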

[Qemu-devel] [PATCH v5 08/14] i.MX: Add implementation of i.MX7 GPR IP block

2018-02-06 Thread Andrey Smirnov

Add minimal code needed to allow upstream Linux guest to boot.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/misc/Makefile.objs  |   1 +
 hw/misc/imx7_gpr.c | 124 +
 hw/misc/trace-events   |   4 ++
 include/hw/misc/imx7_gpr.h |  28 ++
 4 files changed, 157 insertions(+)
 create mode 100644 hw/misc/imx7_gpr.c
 create mode 100644 include/hw/misc/imx7_gpr.h

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 019886912c..fce426eb75 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -36,6 +36,7 @@ obj-$(CONFIG_IMX) += imx6_src.o
 obj-$(CONFIG_IMX) += imx7_ccm.o
 obj-$(CONFIG_IMX) += imx2_wdt.o
 obj-$(CONFIG_IMX) += imx7_snvs.o
+obj-$(CONFIG_IMX) += imx7_gpr.o
 obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o
 obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o
 obj-$(CONFIG_MAINSTONE) += mst_fpga.o
diff --git a/hw/misc/imx7_gpr.c b/hw/misc/imx7_gpr.c
new file mode 100644
index 00..c2a9df29c6
--- /dev/null
+++ b/hw/misc/imx7_gpr.c
@@ -0,0 +1,124 @@
+/*
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * i.MX7 GPR IP block emulation code
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * Bare minimum emulation code needed to support being able to shut
+ * down linux guest gracefully.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/misc/imx7_gpr.h"
+#include "qemu/log.h"
+#include "sysemu/sysemu.h"
+
+#include "trace.h"
+
+enum IMX7GPRRegisters {
+IOMUXC_GPR0  = 0x00,
+IOMUXC_GPR1  = 0x04,
+IOMUXC_GPR2  = 0x08,
+IOMUXC_GPR3  = 0x0c,
+IOMUXC_GPR4  = 0x10,
+IOMUXC_GPR5  = 0x14,
+IOMUXC_GPR6  = 0x18,
+IOMUXC_GPR7  = 0x1c,
+IOMUXC_GPR8  = 0x20,
+IOMUXC_GPR9  = 0x24,
+IOMUXC_GPR10 = 0x28,
+IOMUXC_GPR11 = 0x2c,
+IOMUXC_GPR12 = 0x30,
+IOMUXC_GPR13 = 0x34,
+IOMUXC_GPR14 = 0x38,
+IOMUXC_GPR15 = 0x3c,
+IOMUXC_GPR16 = 0x40,
+IOMUXC_GPR17 = 0x44,
+IOMUXC_GPR18 = 0x48,
+IOMUXC_GPR19 = 0x4c,
+IOMUXC_GPR20 = 0x50,
+IOMUXC_GPR21 = 0x54,
+IOMUXC_GPR22 = 0x58,
+};
+
+#define IMX7D_GPR1_IRQ_MASK                 BIT(12)
+#define IMX7D_GPR1_ENET1_TX_CLK_SEL_MASK    BIT(13)
+#define IMX7D_GPR1_ENET2_TX_CLK_SEL_MASK    BIT(14)
+#define IMX7D_GPR1_ENET_TX_CLK_SEL_MASK     (0x3 << 13)
+#define IMX7D_GPR1_ENET1_CLK_DIR_MASK       BIT(17)
+#define IMX7D_GPR1_ENET2_CLK_DIR_MASK       BIT(18)
+#define IMX7D_GPR1_ENET_CLK_DIR_MASK        (0x3 << 17)
+
+#define IMX7D_GPR5_CSI_MUX_CONTROL_MIPI     BIT(4)
+#define IMX7D_GPR12_PCIE_PHY_REFCLK_SEL     BIT(5)
+#define IMX7D_GPR22_PCIE_PHY_PLL_LOCKED     BIT(31)
+
+
+static uint64_t imx7_gpr_read(void *opaque, hwaddr offset, unsigned size)
+{
+trace_imx7_gpr_read(offset);
+
+if (offset == IOMUXC_GPR22) {
+return IMX7D_GPR22_PCIE_PHY_PLL_LOCKED;
+}
+
+return 0;
+}
+
+static void imx7_gpr_write(void *opaque, hwaddr offset,
+   uint64_t v, unsigned size)
+{
+trace_imx7_gpr_write(offset, v);
+}
+
+static const struct MemoryRegionOps imx7_gpr_ops = {
+.read = imx7_gpr_read,
+.write = imx7_gpr_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+/*
+ * Our device would not work correctly if the guest was doing
+ * unaligned access. This might not be a limitation on the
+ * real device but in practice there is no reason for a guest
+ * to access this device unaligned.
+ */
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false,
+},
+};
+
+static void imx7_gpr_init(Object *obj)
+{
+SysBusDevice *sd = SYS_BUS_DEVICE(obj);
+IMX7GPRState *s = IMX7_GPR(obj);
+
+memory_region_init_io(&s->mmio, obj, &imx7_gpr_ops, s,
+  TYPE_IMX7_GPR, 64 * 1024);
+sysbus_init_mmio(sd, &s->mmio);
+}
+
+static void imx7_gpr_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->desc  = "i.MX7 General Purpose Registers Module";
+}
+
+static const TypeInfo imx7_gpr_info = {
+.name  = TYPE_IMX7_GPR,
+.parent= TYPE_SYS_BUS_DEVICE,
+.instance_size = sizeof(IMX7GPRState),
+.instance_init = imx7_gpr_init,
+.class_init= imx7_gpr_class_init,
+};
+
+static void imx7_gpr_register_type(void)
+{
+type_register_static(&imx7_gpr_info);
+}
+type_init(imx7_gpr_register_type)
diff --git a/hw/misc/trace-events b/hw/misc/trace-events
index

[Qemu-devel] [PATCH v5 12/14] i.MX: Add i.MX7 SOC implementation.

2018-02-06 Thread Andrey Smirnov

The following interfaces are partially or fully emulated:

* up to 2 Cortex A7 cores (SMP works with PSCI)
* A7 MPCORE (identical to A15 MPCORE)
* 4 GPT modules
* 7 GPIO controllers
* 2 IOMUXC controllers
* 1 CCM module
* 1 SNVS module
* 1 SRC module
* 1 GPCv2 controller
* 4 eCSPI controllers
* 4 I2C controllers
* 7 i.MX UART controllers
* 2 FlexCAN controllers
* 2 Ethernet controllers (FEC)
* 3 SD controllers (USDHC)
* 4 WDT modules
* 1 SDMA module
* 1 GPR module
* 2 USBMISC modules
* 2 ADC modules
* 1 PCIe controller

Tested to boot and work with upstream Linux (4.13+) guest.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 default-configs/arm-softmmu.mak |   1 +
 hw/arm/Makefile.objs|   2 +
 hw/arm/fsl-imx7.c   | 580 
 include/hw/arm/fsl-imx7.h   | 221 +++
 4 files changed, 804 insertions(+)
 create mode 100644 hw/arm/fsl-imx7.c
 create mode 100644 include/hw/arm/fsl-imx7.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 0c5ae914ed..99fe1cd1fb 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -118,6 +118,7 @@ CONFIG_ALLWINNER_A10=y
 CONFIG_FSL_IMX6=y
 CONFIG_FSL_IMX31=y
 CONFIG_FSL_IMX25=y
+CONFIG_FSL_IMX7=y
 
 CONFIG_IMX_I2C=y
 
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 1c896bafb4..1f306c6a19 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -20,3 +20,5 @@ obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
 obj-$(CONFIG_MSF2) += msf2-soc.o msf2-som.o
+obj-$(CONFIG_FSL_IMX7) += fsl-imx7.o
+
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
new file mode 100644
index 00..5e78f64ac4
--- /dev/null
+++ b/hw/arm/fsl-imx7.c
@@ -0,0 +1,580 @@
+/*
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * i.MX7 SoC definitions
+ *
+ * Author: Andrey Smirnov 
+ *
+ * Based on hw/arm/fsl-imx6.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "hw/arm/fsl-imx7.h"
+#include "hw/misc/unimp.h"
+#include "sysemu/sysemu.h"
+#include "qemu/error-report.h"
+
+#define NAME_SIZE 20
+
+static void fsl_imx7_init(Object *obj)
+{
+BusState *sysbus = sysbus_get_default();
+FslIMX7State *s = FSL_IMX7(obj);
+char name[NAME_SIZE];
+int i;
+
+if (smp_cpus > FSL_IMX7_NUM_CPUS) {
+error_report("%s: Only %d CPUs are supported (%d requested)",
+ TYPE_FSL_IMX7, FSL_IMX7_NUM_CPUS, smp_cpus);
+exit(1);
+}
+
+for (i = 0; i < smp_cpus; i++) {
+object_initialize(>cpu[i], sizeof(s->cpu[i]),
+  ARM_CPU_TYPE_NAME("cortex-a7"));
+snprintf(name, NAME_SIZE, "cpu%d", i);
+object_property_add_child(obj, name, OBJECT(>cpu[i]),
+  _fatal);
+}
+
+/*
+ * A7MPCORE
+ */
+object_initialize(>a7mpcore, sizeof(s->a7mpcore), TYPE_A15MPCORE_PRIV);
+qdev_set_parent_bus(DEVICE(>a7mpcore), sysbus);
+object_property_add_child(obj, "a7mpcore",
+  OBJECT(>a7mpcore), _fatal);
+
+/*
+ * GPIOs 1 to 7
+ */
+for (i = 0; i < FSL_IMX7_NUM_GPIOS; i++) {
+object_initialize(&s->gpio[i], sizeof(s->gpio[i]),
+  TYPE_IMX_GPIO);
+qdev_set_parent_bus(DEVICE(&s->gpio[i]), sysbus);
+snprintf(name, NAME_SIZE, "gpio%d", i);
+object_property_add_child(obj, name,
+  OBJECT(&s->gpio[i]), &error_fatal);
+}
+
+/*
+ * GPT1, 2, 3, 4
+ */
+for (i = 0; i < FSL_IMX7_NUM_GPTS; i++) {
+object_initialize(&s->gpt[i], sizeof(s->gpt[i]), TYPE_IMX7_GPT);
+qdev_set_parent_bus(DEVICE(&s->gpt[i]), sysbus);
+snprintf(name, NAME_SIZE, "gpt%d", i);
+object_property_add_child(obj, name, OBJECT(&s->gpt[i]),
+  &error_fatal);
+}
+
+/*
+ * CCM
+ */
+object_initialize(&s->ccm,

[Qemu-devel] [PATCH v5 07/14] i.MX: Add i.MX7 GPT variant

2018-02-06 Thread Andrey Smirnov

Add minimal code needed to allow upstream Linux guest to boot.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/timer/imx_gpt.c | 25 +
 include/hw/timer/imx_gpt.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/hw/timer/imx_gpt.c b/hw/timer/imx_gpt.c
index 4b9b54bf2e..65e4ee6bcf 100644
--- a/hw/timer/imx_gpt.c
+++ b/hw/timer/imx_gpt.c
@@ -113,6 +113,17 @@ static const IMXClk imx6_gpt_clocks[] = {
 CLK_HIGH,  /* 111 reference clock */
 };
 
+static const IMXClk imx7_gpt_clocks[] = {
+CLK_NONE,  /* 000 No clock source */
+CLK_IPG,   /* 001 ipg_clk, 532MHz*/
+CLK_IPG_HIGH,  /* 010 ipg_clk_highfreq */
+CLK_EXT,   /* 011 External clock */
+CLK_32k,   /* 100 ipg_clk_32k */
+CLK_HIGH,  /* 101 reference clock */
+CLK_NONE,  /* 110 not defined */
+CLK_NONE,  /* 111 not defined */
+};
+
 static void imx_gpt_set_freq(IMXGPTState *s)
 {
 uint32_t clksrc = extract32(s->cr, GPT_CR_CLKSRC_SHIFT, 3);
@@ -512,6 +523,13 @@ static void imx6_gpt_init(Object *obj)
 s->clocks = imx6_gpt_clocks;
 }
 
+static void imx7_gpt_init(Object *obj)
+{
+IMXGPTState *s = IMX_GPT(obj);
+
+s->clocks = imx7_gpt_clocks;
+}
+
 static const TypeInfo imx25_gpt_info = {
 .name = TYPE_IMX25_GPT,
 .parent = TYPE_SYS_BUS_DEVICE,
@@ -532,11 +550,18 @@ static const TypeInfo imx6_gpt_info = {
 .instance_init = imx6_gpt_init,
 };
 
+static const TypeInfo imx7_gpt_info = {
+.name = TYPE_IMX7_GPT,
+.parent = TYPE_IMX25_GPT,
+.instance_init = imx7_gpt_init,
+};
+
 static void imx_gpt_register_types(void)
 {
 type_register_static(&imx25_gpt_info);
 type_register_static(&imx31_gpt_info);
 type_register_static(&imx6_gpt_info);
+type_register_static(&imx7_gpt_info);
 }
 
 type_init(imx_gpt_register_types)
diff --git a/include/hw/timer/imx_gpt.h b/include/hw/timer/imx_gpt.h
index eac59b2a70..20ccb327c4 100644
--- a/include/hw/timer/imx_gpt.h
+++ b/include/hw/timer/imx_gpt.h
@@ -77,6 +77,7 @@
 #define TYPE_IMX25_GPT "imx25.gpt"
 #define TYPE_IMX31_GPT "imx31.gpt"
 #define TYPE_IMX6_GPT "imx6.gpt"
+#define TYPE_IMX7_GPT "imx7.gpt"
 
 #define TYPE_IMX_GPT TYPE_IMX25_GPT
 
-- 
2.14.3

[Qemu-devel] [PATCH v5 13/14] hw/arm: Move virt's PSCI DT fixup code to arm/boot.c

2018-02-06 Thread Andrey Smirnov

Move virt's PSCI DT fixup code to arm/boot.c and set this fixup to
happen automatically for every board that doesn't mark "psci-conduit"
as disabled. This way emulated boards other than "virt" that rely on
PSCI for SMP could benefit from that code.
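
For readers skimming the thread, the conduit-to-method mapping at the core of the moved fixup can be sketched in isolation. This is only an illustration: `PsciConduit`, its enumerators and `psci_method_name()` are hypothetical stand-ins, not QEMU's definitions, and unlike the patch (which asserts on unknown values) the sketch simply returns NULL for anything it does not recognise.

```c
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's QEMU_PSCI_CONDUIT_* constants. */
enum PsciConduit {
    PSCI_CONDUIT_DISABLED,
    PSCI_CONDUIT_SMC,
    PSCI_CONDUIT_HVC,
};

/*
 * Return the string to store in the device tree "method" property,
 * or NULL when PSCI is disabled and no /psci node should be created.
 */
static const char *psci_method_name(enum PsciConduit conduit)
{
    switch (conduit) {
    case PSCI_CONDUIT_HVC:
        return "hvc";
    case PSCI_CONDUIT_SMC:
        return "smc";
    case PSCI_CONDUIT_DISABLED:
    default:
        return NULL;
    }
}
```

A caller would skip creating the /psci node whenever the helper returns NULL, which corresponds to the early return in the patch.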

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/arm/boot.c | 65 +++
 hw/arm/virt.c | 61 ---
 2 files changed, 65 insertions(+), 61 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index c2720c8046..18ada9152c 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -384,6 +384,69 @@ static void set_kernel_args_old(const struct arm_boot_info *info)
 }
 }
 
+static void fdt_add_psci_node(void *fdt)
+{
+uint32_t cpu_suspend_fn;
+uint32_t cpu_off_fn;
+uint32_t cpu_on_fn;
+uint32_t migrate_fn;
+ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0));
+const char *psci_method;
+int64_t psci_conduit;
+
+psci_conduit = object_property_get_int(OBJECT(armcpu),
+   "psci-conduit",
+   &error_abort);
+switch (psci_conduit) {
+case QEMU_PSCI_CONDUIT_DISABLED:
+return;
+case QEMU_PSCI_CONDUIT_HVC:
+psci_method = "hvc";
+break;
+case QEMU_PSCI_CONDUIT_SMC:
+psci_method = "smc";
+break;
+default:
+g_assert_not_reached();
+}
+
+qemu_fdt_add_subnode(fdt, "/psci");
+if (armcpu->psci_version == 2) {
+const char comp[] = "arm,psci-0.2\0arm,psci";
+qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
+
+cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
+if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
+cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
+cpu_on_fn = QEMU_PSCI_0_2_FN64_CPU_ON;
+migrate_fn = QEMU_PSCI_0_2_FN64_MIGRATE;
+} else {
+cpu_suspend_fn = QEMU_PSCI_0_2_FN_CPU_SUSPEND;
+cpu_on_fn = QEMU_PSCI_0_2_FN_CPU_ON;
+migrate_fn = QEMU_PSCI_0_2_FN_MIGRATE;
+}
+} else {
+qemu_fdt_setprop_string(fdt, "/psci", "compatible", "arm,psci");
+
+cpu_suspend_fn = QEMU_PSCI_0_1_FN_CPU_SUSPEND;
+cpu_off_fn = QEMU_PSCI_0_1_FN_CPU_OFF;
+cpu_on_fn = QEMU_PSCI_0_1_FN_CPU_ON;
+migrate_fn = QEMU_PSCI_0_1_FN_MIGRATE;
+}
+
+/* We adopt the PSCI spec's nomenclature, and use 'conduit' to refer
+ * to the instruction that should be used to invoke PSCI functions.
+ * However, the device tree binding uses 'method' instead, so that is
+ * what we should use here.
+ */
+qemu_fdt_setprop_string(fdt, "/psci", "method", psci_method);
+
+qemu_fdt_setprop_cell(fdt, "/psci", "cpu_suspend", cpu_suspend_fn);
+qemu_fdt_setprop_cell(fdt, "/psci", "cpu_off", cpu_off_fn);
+qemu_fdt_setprop_cell(fdt, "/psci", "cpu_on", cpu_on_fn);
+qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
+}
+
 /**
  * load_dtb() - load a device tree binary image into memory
  * @addr:   the address to load the image at
@@ -540,6 +603,8 @@ static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
 }
 }
 
+fdt_add_psci_node(fdt);
+
 if (binfo->modify_dtb) {
 binfo->modify_dtb(binfo, fdt);
 }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b334c82eda..dbb3c8036a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -244,66 +244,6 @@ static void create_fdt(VirtMachineState *vms)
 }
 }
 
-static void fdt_add_psci_node(const VirtMachineState *vms)
-{
-uint32_t cpu_suspend_fn;
-uint32_t cpu_off_fn;
-uint32_t cpu_on_fn;
-uint32_t migrate_fn;
-void *fdt = vms->fdt;
-ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0));
-const char *psci_method;
-
-switch (vms->psci_conduit) {
-case QEMU_PSCI_CONDUIT_DISABLED:
-return;
-case QEMU_PSCI_CONDUIT_HVC:
-psci_method = "hvc";
-break;
-case QEMU_PSCI_CONDUIT_SMC:
-psci_method = "smc";
-break;
-default:
-g_assert_not_reached();
-}
-
-qemu_fdt_add_subnode(fdt, "/psci");
-if (armcpu->psci_version == 2) {
-const char comp[] = "arm,psci-0.2\0arm,psci";
-qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
-
-cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
-if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
-cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
-cpu_on_fn = QEMU_PSCI_0_2_FN64_CPU_ON;
-migrate_fn = QEMU_PSCI_0_2_FN64_MIGRATE;
-} else {

[Qemu-devel] [PATCH v5 06/14] i.MX: Add code to emulate GPCv2 IP block

2018-02-06 Thread Andrey Smirnov

Add minimal code needed to allow upstream Linux guest to boot.
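
The one piece of behaviour this model has beyond plain register storage is that the PHY request bits in the software power-up/power-down request registers read back as zero right after being written. A minimal sketch of that rule follows; the offsets and bit positions are copied from the patch below, while the helper name `gpcv2_stored_value()` is made up for illustration.

```c
#include <stdint.h>

/* Register offsets and request bits as defined in the patch. */
#define GPC_PU_PGC_SW_PUP_REQ 0x0f8
#define GPC_PU_PGC_SW_PDN_REQ 0x104
#define PHY_SW_REQ_MASK       0x1fu /* BIT(0) through BIT(4) */

/*
 * Hypothetical helper: compute what ends up stored in the register
 * after a guest write. Request bits are cleared immediately, which
 * the guest interprets as "request complete".
 */
static uint32_t gpcv2_stored_value(uint64_t offset, uint32_t value)
{
    if (offset == GPC_PU_PGC_SW_PUP_REQ ||
        offset == GPC_PU_PGC_SW_PDN_REQ) {
        value &= ~PHY_SW_REQ_MASK;
    }
    return value;
}
```

Any other offset behaves as ordinary storage, so a guest polling those two registers sees its request acknowledged on the very next read.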

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/intc/Makefile.objs   |   2 +-
 hw/intc/imx_gpcv2.c | 125 
 include/hw/intc/imx_gpcv2.h |  22 
 3 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 hw/intc/imx_gpcv2.c
 create mode 100644 include/hw/intc/imx_gpcv2.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 571e094a14..0e9963f5ee 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -6,7 +6,7 @@ common-obj-$(CONFIG_XILINX) += xilinx_intc.o
 common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-pmu-iomod-intc.o
 common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-ipi.o
 common-obj-$(CONFIG_ETRAXFS) += etraxfs_pic.o
-common-obj-$(CONFIG_IMX) += imx_avic.o
+common-obj-$(CONFIG_IMX) += imx_avic.o imx_gpcv2.o
 common-obj-$(CONFIG_LM32) += lm32_pic.o
 common-obj-$(CONFIG_REALVIEW) += realview_gic.o
 common-obj-$(CONFIG_SLAVIO) += slavio_intctl.o
diff --git a/hw/intc/imx_gpcv2.c b/hw/intc/imx_gpcv2.c
new file mode 100644
index 00..4eb9ce2668
--- /dev/null
+++ b/hw/intc/imx_gpcv2.c
@@ -0,0 +1,125 @@
+/*
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * i.MX7 GPCv2 block emulation code
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/intc/imx_gpcv2.h"
+#include "qemu/log.h"
+
+#define GPC_PU_PGC_SW_PUP_REQ   0x0f8
+#define GPC_PU_PGC_SW_PDN_REQ   0x104
+
+#define USB_HSIC_PHY_SW_Pxx_REQ BIT(4)
+#define USB_OTG2_PHY_SW_Pxx_REQ BIT(3)
+#define USB_OTG1_PHY_SW_Pxx_REQ BIT(2)
+#define PCIE_PHY_SW_Pxx_REQ BIT(1)
+#define MIPI_PHY_SW_Pxx_REQ BIT(0)
+
+
+static void imx_gpcv2_reset(DeviceState *dev)
+{
+IMXGPCv2State *s = IMX_GPCV2(dev);
+
+memset(s->regs, 0, sizeof(s->regs));
+}
+
+static uint64_t imx_gpcv2_read(void *opaque, hwaddr offset,
+   unsigned size)
+{
+IMXGPCv2State *s = opaque;
+
+return s->regs[offset / sizeof(uint32_t)];
+}
+
+static void imx_gpcv2_write(void *opaque, hwaddr offset,
+uint64_t value, unsigned size)
+{
+IMXGPCv2State *s = opaque;
+const size_t idx = offset / sizeof(uint32_t);
+
+s->regs[idx] = value;
+
+/*
+ * Real HW will clear those bits once as a way to indicate that
+ * power up request is complete
+ */
+if (offset == GPC_PU_PGC_SW_PUP_REQ ||
+offset == GPC_PU_PGC_SW_PDN_REQ) {
+s->regs[idx] &= ~(USB_HSIC_PHY_SW_Pxx_REQ |
+  USB_OTG2_PHY_SW_Pxx_REQ |
+  USB_OTG1_PHY_SW_Pxx_REQ |
+  PCIE_PHY_SW_Pxx_REQ |
+  MIPI_PHY_SW_Pxx_REQ);
+}
+}
+
+static const struct MemoryRegionOps imx_gpcv2_ops = {
+.read = imx_gpcv2_read,
+.write = imx_gpcv2_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+/*
+ * Our device would not work correctly if the guest was doing
+ * unaligned access. This might not be a limitation on the real
+ * device but in practice there is no reason for a guest to access
+ * this device unaligned.
+ */
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false,
+},
+};
+
+static void imx_gpcv2_init(Object *obj)
+{
+SysBusDevice *sd = SYS_BUS_DEVICE(obj);
+IMXGPCv2State *s = IMX_GPCV2(obj);
+
+memory_region_init_io(&s->iomem,
+  obj,
+  &imx_gpcv2_ops,
+  s,
+  TYPE_IMX_GPCV2 ".iomem",
+  sizeof(s->regs));
+sysbus_init_mmio(sd, &s->iomem);
+}
+
+static const VMStateDescription vmstate_imx_gpcv2 = {
+.name = TYPE_IMX_GPCV2,
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32_ARRAY(regs, IMXGPCv2State, GPC_NUM),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static void imx_gpcv2_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->reset = imx_gpcv2_reset;
+dc->vmsd  = &vmstate_imx_gpcv2;
+dc->desc  = "i.MX GPCv2 Module";
+}
+
+static const TypeInfo imx_gpcv2_info = {
+.name  = TYPE_IMX_GPCV2,
+.parent= TYPE_SYS_BUS_DEVICE,
+.instance_size = sizeof(IMXGPCv2State),
+.instance_init = imx_gpcv2_init,
+.class_init= imx_gpcv2_class_init,
+};
+
+static void

[Qemu-devel] [PATCH v5 11/14] usb: Add basic code to emulate Chipidea USB IP

2018-02-06 Thread Andrey Smirnov

Add code to emulate Chipidea USB IP (used in i.MX SoCs). Tested to
work against:

-usb -drive if=none,id=stick,file=usb.img,format=raw -device \
 usb-storage,bus=usb-bus.0,drive=stick

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/usb/Makefile.objs  |   1 +
 hw/usb/chipidea.c | 176 ++
 include/hw/usb/chipidea.h |  16 +
 3 files changed, 193 insertions(+)
 create mode 100644 hw/usb/chipidea.c
 create mode 100644 include/hw/usb/chipidea.h

diff --git a/hw/usb/Makefile.objs b/hw/usb/Makefile.objs
index fbcd498c59..41be700812 100644
--- a/hw/usb/Makefile.objs
+++ b/hw/usb/Makefile.objs
@@ -12,6 +12,7 @@ common-obj-$(CONFIG_USB_XHCI_NEC) += hcd-xhci-nec.o
 common-obj-$(CONFIG_USB_MUSB) += hcd-musb.o
 
 obj-$(CONFIG_TUSB6010) += tusb6010.o
+obj-$(CONFIG_IMX)  += chipidea.o
 
 # emulated usb devices
 common-obj-$(CONFIG_USB) += dev-hub.o
diff --git a/hw/usb/chipidea.c b/hw/usb/chipidea.c
new file mode 100644
index 00..60d67f88b8
--- /dev/null
+++ b/hw/usb/chipidea.c
@@ -0,0 +1,176 @@
+/*
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * Chipidea USB block emulation code
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/usb/hcd-ehci.h"
+#include "hw/usb/chipidea.h"
+#include "qemu/log.h"
+
+enum {
+CHIPIDEA_USBx_DCIVERSION   = 0x000,
+CHIPIDEA_USBx_DCCPARAMS= 0x004,
+CHIPIDEA_USBx_DCCPARAMS_HC = BIT(8),
+};
+
+static uint64_t chipidea_read(void *opaque, hwaddr offset,
+   unsigned size)
+{
+return 0;
+}
+
+static void chipidea_write(void *opaque, hwaddr offset,
+uint64_t value, unsigned size)
+{
+}
+
+static const struct MemoryRegionOps chipidea_ops = {
+.read = chipidea_read,
+.write = chipidea_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+/*
+ * Our device would not work correctly if the guest was doing
+ * unaligned access. This might not be a limitation on the
+ * real device but in practice there is no reason for a guest
+ * to access this device unaligned.
+ */
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false,
+},
+};
+
+static uint64_t chipidea_dc_read(void *opaque, hwaddr offset,
+ unsigned size)
+{
+switch (offset) {
+case CHIPIDEA_USBx_DCIVERSION:
+return 0x1;
+case CHIPIDEA_USBx_DCCPARAMS:
+/*
+ * Real hardware (at least i.MX7) will also report the
+ * controller as "Device Capable" (and 8 supported endpoints),
+ * but there doesn't seem to be much point in doing so, since
+ * we don't emulate that part.
+ */
+return CHIPIDEA_USBx_DCCPARAMS_HC;
+}
+
+return 0;
+}
+
+static void chipidea_dc_write(void *opaque, hwaddr offset,
+  uint64_t value, unsigned size)
+{
+}
+
+static const struct MemoryRegionOps chipidea_dc_ops = {
+.read = chipidea_dc_read,
+.write = chipidea_dc_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+/*
+ * Our device would not work correctly if the guest was doing
+ * unaligned access. This might not be a limitation on the real
+ * device but in practice there is no reason for a guest to access
+ * this device unaligned.
+ */
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false,
+},
+};
+
+static void chipidea_init(Object *obj)
+{
+EHCIState *ehci = &SYS_BUS_EHCI(obj)->ehci;
+ChipideaState *ci = CHIPIDEA(obj);
+int i;
+
+for (i = 0; i < ARRAY_SIZE(ci->iomem); i++) {
+const struct {
+const char *name;
+hwaddr offset;
+uint64_t size;
+const struct MemoryRegionOps *ops;
+} regions[ARRAY_SIZE(ci->iomem)] = {
+/*
+ * Registers located between offsets 0x000 and 0xFC
+ */
+{
+.name   = TYPE_CHIPIDEA ".misc",
+.offset = 0x000,
+.size   = 0x100,
+.ops= &chipidea_ops,
+},
+/*
+ * Registers located between offsets 0x1A4 and 0x1DC
+ */
+{
+.name   = TYPE_CHIPIDEA ".endpoints",
+.offset = 0x1A4,
+.size   = 0x1DC - 0x1A4 + 4,
+.ops= &chipidea_ops,
+

[Qemu-devel] [PATCH v5 05/14] i.MX: Add code to emulate i.MX7 SNVS IP-block

2018-02-06 Thread Andrey Smirnov

Add code to emulate SNVS IP-block. Currently only the bits needed to
be able to emulate machine shutdown are implemented.
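
To make the shutdown condition easy to see at a glance, here is a small standalone sketch of the check that the write handler performs. The register offset and bit positions are taken from the header added by this patch; the function name `snvs_write_requests_shutdown()` is hypothetical and only exists for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from include/hw/misc/imx7_snvs.h in this patch. */
#define SNVS_LPCR        0x38
#define SNVS_LPCR_TOP    (1u << 6)
#define SNVS_LPCR_DP_EN  (1u << 5)

/*
 * Hypothetical helper: true when a guest write should be treated as
 * a shutdown request, i.e. it targets LPCR and sets both TOP and
 * DP_EN at the same time.
 */
static bool snvs_write_requests_shutdown(uint64_t offset, uint32_t value)
{
    const uint32_t mask = SNVS_LPCR_TOP | SNVS_LPCR_DP_EN;

    return offset == SNVS_LPCR && (value & mask) == mask;
}
```

Requiring both bits means a write that sets only one of them is ignored, which matches the `(value & mask) == mask` test in the patch.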

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/misc/Makefile.objs   |  1 +
 hw/misc/imx7_snvs.c | 83 +
 include/hw/misc/imx7_snvs.h | 35 +++
 3 files changed, 119 insertions(+)
 create mode 100644 hw/misc/imx7_snvs.c
 create mode 100644 include/hw/misc/imx7_snvs.h

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 4b2b705a6c..019886912c 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -35,6 +35,7 @@ obj-$(CONFIG_IMX) += imx6_ccm.o
 obj-$(CONFIG_IMX) += imx6_src.o
 obj-$(CONFIG_IMX) += imx7_ccm.o
 obj-$(CONFIG_IMX) += imx2_wdt.o
+obj-$(CONFIG_IMX) += imx7_snvs.o
 obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o
 obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o
 obj-$(CONFIG_MAINSTONE) += mst_fpga.o
diff --git a/hw/misc/imx7_snvs.c b/hw/misc/imx7_snvs.c
new file mode 100644
index 00..4df482b282
--- /dev/null
+++ b/hw/misc/imx7_snvs.c
@@ -0,0 +1,83 @@
+/*
+ * IMX7 Secure Non-Volatile Storage
+ *
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ * Bare minimum emulation code needed to support being able to shut
+ * down linux guest gracefully.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/misc/imx7_snvs.h"
+#include "qemu/log.h"
+#include "sysemu/sysemu.h"
+
+static uint64_t imx7_snvs_read(void *opaque, hwaddr offset, unsigned size)
+{
+return 0;
+}
+
+static void imx7_snvs_write(void *opaque, hwaddr offset,
+uint64_t v, unsigned size)
+{
+const uint32_t value = v;
+const uint32_t mask  = SNVS_LPCR_TOP | SNVS_LPCR_DP_EN;
+
+if (offset == SNVS_LPCR && ((value & mask) == mask)) {
+qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
+}
+}
+
+static const struct MemoryRegionOps imx7_snvs_ops = {
+.read = imx7_snvs_read,
+.write = imx7_snvs_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+/*
+ * Our device would not work correctly if the guest was doing
+ * unaligned access. This might not be a limitation on the real
+ * device but in practice there is no reason for a guest to access
+ * this device unaligned.
+ */
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false,
+},
+};
+
+static void imx7_snvs_init(Object *obj)
+{
+SysBusDevice *sd = SYS_BUS_DEVICE(obj);
+IMX7SNVSState *s = IMX7_SNVS(obj);
+
+memory_region_init_io(&s->mmio, obj, &imx7_snvs_ops, s,
+  TYPE_IMX7_SNVS, 0x1000);
+
+sysbus_init_mmio(sd, &s->mmio);
+}
+
+static void imx7_snvs_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->desc  = "i.MX7 Secure Non-Volatile Storage Module";
+}
+
+static const TypeInfo imx7_snvs_info = {
+.name  = TYPE_IMX7_SNVS,
+.parent= TYPE_SYS_BUS_DEVICE,
+.instance_size = sizeof(IMX7SNVSState),
+.instance_init = imx7_snvs_init,
+.class_init= imx7_snvs_class_init,
+};
+
+static void imx7_snvs_register_type(void)
+{
+type_register_static(&imx7_snvs_info);
+}
+type_init(imx7_snvs_register_type)
diff --git a/include/hw/misc/imx7_snvs.h b/include/hw/misc/imx7_snvs.h
new file mode 100644
index 00..255f8f26f9
--- /dev/null
+++ b/include/hw/misc/imx7_snvs.h
@@ -0,0 +1,35 @@
+/*
+ * Copyright (c) 2017, Impinj, Inc.
+ *
+ * i.MX7 SNVS block emulation code
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef IMX7_SNVS_H
+#define IMX7_SNVS_H
+
+#include "qemu/bitops.h"
+#include "hw/sysbus.h"
+
+
+enum IMX7SNVSRegisters {
+SNVS_LPCR = 0x38,
+SNVS_LPCR_TOP   = BIT(6),
+SNVS_LPCR_DP_EN = BIT(5)
+};
+
+#define TYPE_IMX7_SNVS "imx7.snvs"
+#define IMX7_SNVS(obj) OBJECT_CHECK(IMX7SNVSState, (obj), TYPE_IMX7_SNVS)
+
+typedef struct IMX7SNVSState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+MemoryRegion mmio;
+} IMX7SNVSState;
+
+#endif /* IMX7_SNVS_H */
-- 
2.14.3

[Qemu-devel] [PATCH v5 01/14] sdhci: Add i.MX specific subtype of SDHCI

2018-02-06 Thread Andrey Smirnov

IP block found on several generations of i.MX family does not use
vanilla SDHCI implementation and it comes with a number of quirks.

Introduce i.MX SDHCI subtype of SDHCI block to add code necessary to
support unmodified Linux guest driver.
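
One of the quirks in question is that the i.MX host control register lays out the DMA select and bus width fields differently from standard SDHCI. The read-side translation done in usdhc_read() can be sketched on its own as below. The constants are copied from sdhci-internal.h as changed by this patch; the function name `usdhc_hostctl_view()` is invented for the example, and only the low 16 bits are modelled (the patch additionally merges the block gap and wakeup registers into the upper bits).

```c
#include <stdint.h>

/* Standard SDHCI host control bits, as in sdhci-internal.h. */
#define SDHC_CTRL_DMA_CHECK_MASK 0x18
#define SDHC_CTRL_4BITBUS        0x02
#define SDHC_CTRL_8BITBUS        0x20

/* i.MX encoding of the bus width field, as added by this patch. */
#define ESDHC_CTRL_4BITBUS       (0x1 << 1)
#define ESDHC_CTRL_8BITBUS       (0x2 << 1)

/*
 * Hypothetical helper: convert the standard SDHCI host control byte
 * into the low half of the i.MX register as seen by the guest.
 */
static uint16_t usdhc_hostctl_view(uint8_t hostctl)
{
    /* DMA select moves from bits 4:3 up to bits 9:8. */
    uint16_t ret = (uint16_t)((hostctl & SDHC_CTRL_DMA_CHECK_MASK) << (8 - 3));

    if (hostctl & SDHC_CTRL_8BITBUS) {
        ret |= ESDHC_CTRL_8BITBUS;
    }
    if (hostctl & SDHC_CTRL_4BITBUS) {
        ret |= ESDHC_CTRL_4BITBUS;
    }
    return ret;
}
```

For instance, a standard value of 0x12 (ADMA2 32-bit plus 4-bit bus) would be presented to the guest as 0x202.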

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/sd/sdhci-internal.h |  20 +
 hw/sd/sdhci.c  | 230 -
 include/hw/sd/sdhci.h  |  13 +++
 3 files changed, 262 insertions(+), 1 deletion(-)

diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h
index fc807f08f3..f91b73af59 100644
--- a/hw/sd/sdhci-internal.h
+++ b/hw/sd/sdhci-internal.h
@@ -84,12 +84,18 @@
 
 /* R/W Host control Register 0x0 */
 #define SDHC_HOSTCTL   0x28
+#define SDHC_CTRL_LED  0x01
 #define SDHC_CTRL_DMA_CHECK_MASK   0x18
 #define SDHC_CTRL_SDMA 0x00
 #define SDHC_CTRL_ADMA1_32 0x08
 #define SDHC_CTRL_ADMA2_32 0x10
 #define SDHC_CTRL_ADMA2_64 0x18
 #define SDHC_DMA_TYPE(x)   ((x) & SDHC_CTRL_DMA_CHECK_MASK)
+#define SDHC_CTRL_4BITBUS  0x02
+#define SDHC_CTRL_8BITBUS  0x20
+#define SDHC_CTRL_CDTEST_INS   0x40
+#define SDHC_CTRL_CDTEST_EN0x80
+
 
 /* R/W Power Control Register 0x0 */
 #define SDHC_PWRCON0x29
@@ -226,4 +232,18 @@ enum {
 sdhc_gap_write  = 2   /* SDHC stopped at block gap during write operation */
 };
 
+extern const VMStateDescription sdhci_vmstate;
+
+
+#define ESDHC_MIX_CTRL  0x48
+#define ESDHC_VENDOR_SPEC   0xc0
+#define ESDHC_DLL_CTRL  0x60
+
+#define ESDHC_TUNING_CTRL   0xcc
+#define ESDHC_TUNE_CTRL_STATUS  0x68
+#define ESDHC_WTMK_LVL  0x44
+
+#define ESDHC_CTRL_4BITBUS  (0x1 << 1)
+#define ESDHC_CTRL_8BITBUS  (0x2 << 1)
+
 #endif
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index fac7fa5c72..7c9683b47d 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -244,7 +244,8 @@ static void sdhci_send_command(SDHCIState *s)
 }
 }
 
-if ((s->norintstsen & SDHC_NISEN_TRSCMP) &&
+if (!(s->quirks & SDHCI_QUIRK_NO_BUSY_IRQ) &&
+(s->norintstsen & SDHC_NISEN_TRSCMP) &&
 (s->cmdreg & SDHC_CMD_RESPONSE) == SDHC_CMD_RSP_WITH_BUSY) {
 s->norintsts |= SDHC_NIS_TRSCMP;
 }
@@ -1189,6 +1190,8 @@ static void sdhci_initfn(SDHCIState *s)
 
 s->insert_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sdhci_raise_insertion_irq, s);
 s->transfer_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sdhci_data_transfer, s);
+
+s->io_ops = &sdhci_mmio_ops;
 }
 
 static void sdhci_uninitfn(SDHCIState *s)
@@ -1396,6 +1399,10 @@ static void sdhci_sysbus_realize(DeviceState *dev, Error ** errp)
 }
 
 sysbus_init_irq(sbd, &s->irq);
+
+memory_region_init_io(&s->iomem, OBJECT(s), s->io_ops, s, "sdhci",
+SDHC_REGISTERS_MAP_SIZE);
+
 sysbus_init_mmio(sbd, &s->iomem);
 }
 
@@ -1447,11 +1454,232 @@ static const TypeInfo sdhci_bus_info = {
 .class_init = sdhci_bus_class_init,
 };
 
+static uint64_t usdhc_read(void *opaque, hwaddr offset, unsigned size)
+{
+SDHCIState *s = SYSBUS_SDHCI(opaque);
+uint32_t ret;
+uint16_t hostctl;
+
+switch (offset) {
+default:
+return sdhci_read(opaque, offset, size);
+
+case SDHC_HOSTCTL:
+/*
+ * For a detailed explanation on the following bit
+ * manipulation code see comments in a similar part of
+ * usdhc_write()
+ */
+hostctl = SDHC_DMA_TYPE(s->hostctl) << (8 - 3);
+
+if (s->hostctl & SDHC_CTRL_8BITBUS) {
+hostctl |= ESDHC_CTRL_8BITBUS;
+}
+
+if (s->hostctl & SDHC_CTRL_4BITBUS) {
+hostctl |= ESDHC_CTRL_4BITBUS;
+}
+
+ret  = hostctl;
+ret |= (uint32_t)s->blkgap << 16;
+ret |= (uint32_t)s->wakcon << 24;
+
+break;
+
+case ESDHC_DLL_CTRL:
+case ESDHC_TUNE_CTRL_STATUS:
+case 0x6c:
+case ESDHC_TUNING_CTRL:
+case ESDHC_VENDOR_SPEC:
+case ESDHC_MIX_CTRL:
+case ESDHC_WTMK_LVL:
+ret = 0;
+break;
+}
+
+return ret;
+}
+
+static void
+usdhc_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
+{
+SDHCIState *s = SYSBUS_SDHCI(opaque);
+uint8_t hostctl;
+uint32_t value = (uint32_t)val;
+
+switch (offset) {
+case ESDHC_DLL_CTRL:
+case ESDHC_TUNE_CTRL_STATUS:
+case 0x6c:
+case ESDHC_TUNING_CTRL:
+case ESDHC_WTMK_LVL:
+case ESDHC_VENDOR_SPEC:
+break;
+
+

[Qemu-devel] [PATCH v5 04/14] i.MX: Add code to emulate i.MX2 watchdog IP block

2018-02-06 Thread Andrey Smirnov

Add enough code to emulate i.MX2 watchdog IP block so it would be
possible to reboot the machine running Linux Guest.
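
The reboot trigger itself is a very small rule, sketched below outside of QEMU. The bit positions come from the patch; the WCR offset of 0 is an assumption (the control register is taken to be the first one in the block), and `wdt_write_triggers_action()` is a hypothetical name. The sketch mirrors the patch as posted, which fires when either bit is set in the written value; it makes no claim about how real hardware interprets those bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMX2_WDT_WCR      0x0000     /* assumed offset of the control register */
#define IMX2_WDT_WCR_WDA  (1u << 5)  /* external reset WDOG_B */
#define IMX2_WDT_WCR_SRS  (1u << 4)  /* software reset signal */

/*
 * Hypothetical helper: true when a guest write would make the model
 * perform the configured watchdog action.
 */
static bool wdt_write_triggers_action(uint64_t addr, uint64_t value)
{
    return addr == IMX2_WDT_WCR &&
           (value & (IMX2_WDT_WCR_WDA | IMX2_WDT_WCR_SRS)) != 0;
}
```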

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/misc/Makefile.objs  |  1 +
 hw/misc/imx2_wdt.c | 89 ++
 include/hw/misc/imx2_wdt.h | 33 +
 3 files changed, 123 insertions(+)
 create mode 100644 hw/misc/imx2_wdt.c
 create mode 100644 include/hw/misc/imx2_wdt.h

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index a28e5e49b0..4b2b705a6c 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -34,6 +34,7 @@ obj-$(CONFIG_IMX) += imx25_ccm.o
 obj-$(CONFIG_IMX) += imx6_ccm.o
 obj-$(CONFIG_IMX) += imx6_src.o
 obj-$(CONFIG_IMX) += imx7_ccm.o
+obj-$(CONFIG_IMX) += imx2_wdt.o
 obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o
 obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o
 obj-$(CONFIG_MAINSTONE) += mst_fpga.o
diff --git a/hw/misc/imx2_wdt.c b/hw/misc/imx2_wdt.c
new file mode 100644
index 00..e47e442592
--- /dev/null
+++ b/hw/misc/imx2_wdt.c
@@ -0,0 +1,89 @@
+/*
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * i.MX2 Watchdog IP block
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/bitops.h"
+#include "sysemu/watchdog.h"
+
+#include "hw/misc/imx2_wdt.h"
+
+#define IMX2_WDT_WCR_WDA BIT(5)  /* -> External Reset WDOG_B */
+#define IMX2_WDT_WCR_SRS BIT(4)  /* -> Software Reset Signal */
+
+static uint64_t imx2_wdt_read(void *opaque, hwaddr addr,
+  unsigned int size)
+{
+return 0;
+}
+
+static void imx2_wdt_write(void *opaque, hwaddr addr,
+   uint64_t value, unsigned int size)
+{
+if (addr == IMX2_WDT_WCR &&
+(value & (IMX2_WDT_WCR_WDA | IMX2_WDT_WCR_SRS))) {
+watchdog_perform_action();
+}
+}
+
+static const MemoryRegionOps imx2_wdt_ops = {
+.read  = imx2_wdt_read,
+.write = imx2_wdt_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+/*
+ * Our device would not work correctly if the guest was doing
+ * unaligned access. This might not be a limitation on the
+ * real device but in practice there is no reason for a guest
+ * to access this device unaligned.
+ */
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false,
+},
+};
+
+static void imx2_wdt_realize(DeviceState *dev, Error **errp)
+{
+IMX2WdtState *s = IMX2_WDT(dev);
+
+memory_region_init_io(&s->mmio, OBJECT(dev),
+  &imx2_wdt_ops, s,
+  TYPE_IMX2_WDT".mmio",
+  IMX2_WDT_REG_NUM * sizeof(uint16_t));
+sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->mmio);
+}
+
+static void imx2_wdt_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->realize = imx2_wdt_realize;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static const TypeInfo imx2_wdt_info = {
+.name  = TYPE_IMX2_WDT,
+.parent= TYPE_SYS_BUS_DEVICE,
+.instance_size = sizeof(IMX2WdtState),
+.class_init= imx2_wdt_class_init,
+};
+
+static WatchdogTimerModel model = {
+.wdt_name = "imx2-watchdog",
+.wdt_description = "i.MX2 Watchdog",
+};
+
+static void imx2_wdt_register_type(void)
+{
+watchdog_add_model(&model);
+type_register_static(&imx2_wdt_info);
+}
+type_init(imx2_wdt_register_type)
diff --git a/include/hw/misc/imx2_wdt.h b/include/hw/misc/imx2_wdt.h
new file mode 100644
index 00..8afc99a10e
--- /dev/null
+++ b/include/hw/misc/imx2_wdt.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright (c) 2017, Impinj, Inc.
+ *
+ * i.MX2 Watchdog IP block
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef IMX2_WDT_H
+#define IMX2_WDT_H
+
+#include "hw/sysbus.h"
+
+#define TYPE_IMX2_WDT "imx2.wdt"
+#define IMX2_WDT(obj) OBJECT_CHECK(IMX2WdtState, (obj), TYPE_IMX2_WDT)
+
+enum IMX2WdtRegisters {
+IMX2_WDT_WCR = 0x0000,
+IMX2_WDT_REG_NUM = 0x0008 / sizeof(uint16_t) + 1,
+};
+
+
+typedef struct IMX2WdtState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+MemoryRegion mmio;
+} IMX2WdtState;
+
+#endif /* IMX2_WDT_H */
-- 
2.14.3

[Qemu-devel] [PATCH v5 02/14] hw: i.MX: Convert i.MX6 to use TYPE_IMX_USDHC

2018-02-06 Thread Andrey Smirnov

Convert i.MX6 to use TYPE_IMX_USDHC since that's what real HW comes
with.

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Andrey Smirnov 
---
 hw/arm/fsl-imx6.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index b0d4088290..e6559a8b12 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -93,7 +93,7 @@ static void fsl_imx6_init(Object *obj)
 }
 
 for (i = 0; i < FSL_IMX6_NUM_ESDHCS; i++) {
-object_initialize(&s->esdhc[i], sizeof(s->esdhc[i]), TYPE_SYSBUS_SDHCI);
+object_initialize(&s->esdhc[i], sizeof(s->esdhc[i]), TYPE_IMX_USDHC);
 qdev_set_parent_bus(DEVICE(&s->esdhc[i]), sysbus_get_default());
 snprintf(name, NAME_SIZE, "sdhc%d", i + 1);
 object_property_add_child(obj, name, OBJECT(&s->esdhc[i]), NULL);
-- 
2.14.3

[Qemu-devel] [PATCH v5 03/14] i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks

2018-02-06 Thread Andrey Smirnov

Add minimal code needed to allow upstream Linux guest to boot.
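
Most of the registers modelled here come in groups of four words: the register itself followed by SET, CLR and TOG aliases at offsets +4, +8 and +0xC. The decoding used by the patch can be sketched as follows. The helper names `ccm_reg_index()` and `ccm_apply_bitop()` are hypothetical; the offsets and operations mirror the CCM_INDEX/CCM_BITOP macros and the switch statement in the diff.

```c
#include <stdint.h>

enum {
    CCM_BITOP_NONE = 0x00, /* plain write */
    CCM_BITOP_SET  = 0x04, /* set the bits given in value */
    CCM_BITOP_CLR  = 0x08, /* clear the bits given in value */
    CCM_BITOP_TOG  = 0x0C, /* toggle the bits given in value */
};

/* Index of the backing 32-bit word: the alias nibble is masked off. */
static uint32_t ccm_reg_index(uint64_t offset)
{
    return (uint32_t)((offset & ~(uint64_t)0xF) / sizeof(uint32_t));
}

/* New register contents after a write of 'value' through an alias. */
static uint32_t ccm_apply_bitop(uint32_t old, uint64_t offset, uint32_t value)
{
    switch (offset & 0xF) {
    case CCM_BITOP_SET:
        return old | value;
    case CCM_BITOP_CLR:
        return old & ~value;
    case CCM_BITOP_TOG:
        return old ^ value;
    case CCM_BITOP_NONE:
    default:
        return value;
    }
}
```

Note that with this indexing only every fourth word of the backing array is used, which is how the patch's CCM_INDEX macro behaves as well.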

Cc: Peter Maydell 
Cc: Jason Wang 
Cc: Philippe Mathieu-Daudé 
Cc: Marcel Apfelbaum 
Cc: Michael S. Tsirkin 
Cc: qemu-devel@nongnu.org
Cc: qemu-...@nongnu.org
Cc: yurov...@gmail.com
Reviewed-by: Peter Maydell 
Signed-off-by: Andrey Smirnov 
---
 hw/misc/Makefile.objs  |   1 +
 hw/misc/imx7_ccm.c | 277 +
 include/hw/misc/imx7_ccm.h | 139 +++
 3 files changed, 417 insertions(+)
 create mode 100644 hw/misc/imx7_ccm.c
 create mode 100644 include/hw/misc/imx7_ccm.h

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index d517f83e81..a28e5e49b0 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -33,6 +33,7 @@ obj-$(CONFIG_IMX) += imx31_ccm.o
 obj-$(CONFIG_IMX) += imx25_ccm.o
 obj-$(CONFIG_IMX) += imx6_ccm.o
 obj-$(CONFIG_IMX) += imx6_src.o
+obj-$(CONFIG_IMX) += imx7_ccm.o
 obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o
 obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o
 obj-$(CONFIG_MAINSTONE) += mst_fpga.o
diff --git a/hw/misc/imx7_ccm.c b/hw/misc/imx7_ccm.c
new file mode 100644
index 00..d90c48bfec
--- /dev/null
+++ b/hw/misc/imx7_ccm.c
@@ -0,0 +1,277 @@
+/*
+ * Copyright (c) 2018, Impinj, Inc.
+ *
+ * i.MX7 CCM, PMU and ANALOG IP blocks emulation code
+ *
+ * Author: Andrey Smirnov 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+
+#include "hw/misc/imx7_ccm.h"
+
+static void imx7_analog_reset(DeviceState *dev)
+{
+IMX7AnalogState *s = IMX7_ANALOG(dev);
+
+memset(s->pmu, 0, sizeof(s->pmu));
+memset(s->analog, 0, sizeof(s->analog));
+
+s->analog[ANALOG_PLL_ARM] = 0x2042;
+s->analog[ANALOG_PLL_DDR] = 0x0060302c;
+s->analog[ANALOG_PLL_DDR_SS]  = 0x00000000;
+s->analog[ANALOG_PLL_DDR_NUM] = 0x06aaac4d;
+s->analog[ANALOG_PLL_DDR_DENOM]   = 0x13ec;
+s->analog[ANALOG_PLL_480] = 0x2000;
+s->analog[ANALOG_PLL_480A]= 0x52605a56;
+s->analog[ANALOG_PLL_480B]= 0x52525216;
+s->analog[ANALOG_PLL_ENET]= 0x1fc0;
+s->analog[ANALOG_PLL_AUDIO]   = 0x0001301b;
+s->analog[ANALOG_PLL_AUDIO_SS]= 0x00000000;
+s->analog[ANALOG_PLL_AUDIO_NUM]   = 0x05f5e100;
+s->analog[ANALOG_PLL_AUDIO_DENOM] = 0x2964619c;
+s->analog[ANALOG_PLL_VIDEO]   = 0x0008201b;
+s->analog[ANALOG_PLL_VIDEO_SS]= 0x00000000;
+s->analog[ANALOG_PLL_VIDEO_NUM]   = 0xf699;
+s->analog[ANALOG_PLL_VIDEO_DENOM] = 0x000f4240;
+s->analog[ANALOG_PLL_MISC0]   = 0x00000000;
+
+/* all PLLs need to be locked */
+s->analog[ANALOG_PLL_ARM]   |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_DDR]   |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_480]   |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_480A]  |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_480B]  |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_ENET]  |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_AUDIO] |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_VIDEO] |= ANALOG_PLL_LOCK;
+s->analog[ANALOG_PLL_MISC0] |= ANALOG_PLL_LOCK;
+
+/*
+ * Since I couldn't find any info about this in the reference
+ * manual the value of this register is based strictly on matching
+ * what Linux kernel expects it to be.
+ */
+s->analog[ANALOG_DIGPROG]  = 0x72;
+/*
+ * Set revision to be 1.0 (Arbitrary choice, no particular
+ * reason).
+ */
+s->analog[ANALOG_DIGPROG] |= 0x10;
+}
+
+static void imx7_ccm_reset(DeviceState *dev)
+{
+IMX7CCMState *s = IMX7_CCM(dev);
+
+memset(s->ccm, 0, sizeof(s->ccm));
+}
+
+#define CCM_INDEX(offset)   (((offset) & ~(hwaddr)0xF) / sizeof(uint32_t))
+#define CCM_BITOP(offset)   ((offset) & (hwaddr)0xF)
+
+enum {
+CCM_BITOP_NONE = 0x00,
+CCM_BITOP_SET  = 0x04,
+CCM_BITOP_CLR  = 0x08,
+CCM_BITOP_TOG  = 0x0C,
+};
+
+static uint64_t imx7_set_clr_tog_read(void *opaque, hwaddr offset,
+  unsigned size)
+{
+const uint32_t *mmio = opaque;
+
+return mmio[CCM_INDEX(offset)];
+}
+
+static void imx7_set_clr_tog_write(void *opaque, hwaddr offset,
+   uint64_t value, unsigned size)
+{
+const uint8_t  bitop = CCM_BITOP(offset);
+const uint32_t index = CCM_INDEX(offset);
+uint32_t *mmio = opaque;
+
+switch (bitop) {
+case CCM_BITOP_NONE:
+mmio[index]  = value;
+break;
+case CCM_BITOP_SET:
+mmio[index] |= value;
+break;
+case CCM_BITOP_CLR:
+mmio[index] &= ~value;
+break;
+case CCM_BITOP_TOG:
+mmio[index] ^= value;
+break;
+
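
The write handler in the truncated hunk above decodes each access into a
register index and one of four operations (plain write, SET, CLR, TOG)
taken from the low nibble of the offset. Below is a minimal standalone
sketch of that decoding, assuming a flat uint32_t register array. The
helper name set_clr_tog_write() and the bounds assertion are illustrative
and not part of the patch. How the patch treats offsets that match none
of the four cases is not visible here, so ignoring them is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The low nibble of the offset selects the operation, as in the patch. */
enum {
    BITOP_NONE = 0x00,
    BITOP_SET  = 0x04,
    BITOP_CLR  = 0x08,
    BITOP_TOG  = 0x0C,
};

/*
 * Hypothetical helper: apply a 32-bit write at byte 'offset' to 'regs'.
 * Each register occupies 16 bytes of address space: the value itself,
 * followed by its SET, CLR and TOG aliases.
 */
static void set_clr_tog_write(uint32_t *regs, size_t nregs,
                              uint64_t offset, uint32_t value)
{
    /* Word index of the base register of this 16-byte group. */
    size_t index = (size_t)((offset & ~(uint64_t)0xF) / sizeof(uint32_t));
    unsigned bitop = (unsigned)(offset & 0xF);

    assert(index < nregs);

    switch (bitop) {
    case BITOP_NONE:
        regs[index] = value;
        break;
    case BITOP_SET:
        regs[index] |= value;
        break;
    case BITOP_CLR:
        regs[index] &= ~value;
        break;
    case BITOP_TOG:
        regs[index] ^= value;
        break;
    default:
        /* Assumption: accesses that hit no alias are ignored. */
        break;
    }
}
```

With this layout a write of 0x0F to offset 0x14 sets the low four bits of
the register whose plain alias lives at offset 0x10, without a
read-modify-write sequence in the guest.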

[Qemu-devel] [PATCH v5 00/14] Initial i.MX7 support

2018-02-06 Thread Andrey Smirnov

Hi everyone,

This is v5 of the patch series containing the work that I've done in
order to enable support for i.MX7 emulation in QEMU.

As the second-to-last commit in the series states, the supported i.MX7
features are:

* up to 2 Cortex A7 cores (SMP works with PSCI)
* A7 MPCORE (identical to A15 MPCORE)
* 4 GPTs modules
* 7 GPIO controllers
* 2 IOMUXC controllers
* 1 CCM module
* 1 SNVS module
* 1 SRC module
* 1 GPCv2 controller
* 4 eCSPI controllers
* 4 I2C controllers
* 7 i.MX UART controllers
* 2 FlexCAN controllers
* 2 Ethernet controllers (FEC)
* 3 SD controllers (USDHC)
* 4 WDT modules
* 1 SDMA module
* 1 GPR module
* 2 USBMISC modules
* 2 ADC modules
* 1 PCIe controller
* 3 USB controllers
* 1 LCD controller
* 1 ARMv7 DAP IP block

Feedback is welcome!

Changes since [v4]:

- Rebased the patchset on top of the latest QEMU master

- Reworked PCIE emulation code to create MemoryRegions
  only once

- Fixed incorrect usages of PCI instead of PCIE

- Fixed device class reported by PCIE bridge

- Added patch to make pci_data_read() and pci_data_write() usable
  for PCIE devices as well

- Converted PCIE code to use pci_data_read() and pci_data_write()

- Added VMStateDescription code for PCIE

- Collected Reviewed-by tag from Philippe

Changes since [v3]:

- Changes to FEC were split into a separate set and merged to master

- Patchset is rebased on the latest master

- Converted to use PSCI DT fixup code that is shared with virt
  platform (now relocated to live in arm/boot.c)

- A large number of dummy blocks were converted to use
  create_unimplemented_device() as opposed to their own dedicated
  types

- Incorporated various small feedback items

- Collected Reviewed-by tags from Peter

Changes since [v2]:

- Added stubs for more blocks that were causing memory
  transactions when booting a Linux guest, as revealed by
  additional testing of the patchset

- Added proper USB emulation code, so now it should be possible to
  emulate the guest's USB bus

Changes since [v1]:

- Patchset no longer relies on "ignore_memory_transaction_failures = false"
  for its functionality

- As a consequence of implementing the above, a number of patches
  implementing dummy IP block emulation, as well as the PCIe emulation
  patches that I alluded to in [v1], are now included in this patch
  series

- "has_el3" property is no longer being set to "false" as a part
  of initialization of the A7 CPU. I couldn't reproduce the issues that
  I thought I was having, so I just dropped that code.

- A number of smaller feedback items from Peter and others have been
  incorporated into the patches.


Thanks,
Andrey Smirnov

[v4] https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg03264.html
[v3] https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg04236.html
[v2] https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg05516.html
[v1] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg04770.html

Andrey Smirnov (14):
  sdhci: Add i.MX specific subtype of SDHCI
  hw: i.MX: Convert i.MX6 to use TYPE_IMX_USDHC
  i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks
  i.MX: Add code to emulate i.MX2 watchdog IP block
  i.MX: Add code to emulate i.MX7 SNVS IP-block
  i.MX: Add code to emulate GPCv2 IP block
  i.MX: Add i.MX7 GPT variant
  i.MX: Add implementation of i.MX7 GPR IP block
  pci: Use pci_config_size in pci_data_* accessors
  pci: Add support for Designware IP block
  usb: Add basic code to emulate Chipidea USB IP
  i.MX: Add i.MX7 SOC implementation.
  hw/arm: Move virt's PSCI DT fixup code to arm/boot.c
  Implement support for i.MX7 Sabre board

 default-configs/arm-softmmu.mak  |   3 +
 hw/arm/Makefile.objs |   3 +
 hw/arm/boot.c|  65 
 hw/arm/fsl-imx6.c|   2 +-
 hw/arm/fsl-imx7.c| 580 ++
 hw/arm/mcimx7d-sabre.c   |  90 +
 hw/arm/virt.c|  61 
 hw/intc/Makefile.objs|   2 +-
 hw/intc/imx_gpcv2.c  | 125 +++
 hw/misc/Makefile.objs|   4 +
 hw/misc/imx2_wdt.c   |  89 +
 hw/misc/imx7_ccm.c   | 277 ++
 hw/misc/imx7_gpr.c   | 124 +++
 hw/misc/imx7_snvs.c  |  83 +
 hw/misc/trace-events |   4 +
 hw/pci-host/Makefile.objs|   2 +
 hw/pci-host/designware.c | 759 +++
 hw/pci/pci_host.c|  13 +-
 hw/sd/sdhci-internal.h   |  20 ++
 hw/sd/sdhci.c| 230 +++-
 hw/timer/imx_gpt.c   |  25 ++
 hw/usb/Makefile.objs |   1 +
 hw/usb/chipidea.c| 176 +
 include/hw/arm/fsl-imx7.h| 221

Re: [Qemu-devel] [PATCH v4 09/14] pci: Add support for Designware IP block

2018-02-06 Thread Andrey Smirnov

On Wed, Jan 31, 2018 at 4:13 AM, Marcel Apfelbaum  wrote:
> On 30/01/2018 19:49, Andrey Smirnov wrote:
>> On Tue, Jan 30, 2018 at 5:18 AM, Marcel Apfelbaum
>>  wrote:
>>> Hi Andrei,
>>>
>>> Sorry for letting you wait,
>>> I have some comments/questions below.
>>>
>>>
>>> On 16/01/2018 3:37, Andrey Smirnov wrote:

 Add code needed to get a functional PCI subsystem when used in
 conjunction with an upstream Linux guest (4.13+). Tested to work against
 "e1000e" (network adapter, using MSI interrupts) as well as
 "usb-ehci" (USB controller, using legacy PCI interrupts).

 Cc: Peter Maydell 
 Cc: Jason Wang 
 Cc: Philippe Mathieu-Daudé 
 Cc: qemu-devel@nongnu.org
 Cc: qemu-...@nongnu.org
 Cc: yurov...@gmail.com
 Signed-off-by: Andrey Smirnov 
 ---
   default-configs/arm-softmmu.mak  |   2 +
   hw/pci-host/Makefile.objs|   2 +
   hw/pci-host/designware.c | 618
 +++
   include/hw/pci-host/designware.h |  93 ++
   include/hw/pci/pci_ids.h |   2 +
   5 files changed, 717 insertions(+)
   create mode 100644 hw/pci-host/designware.c
   create mode 100644 include/hw/pci-host/designware.h

 diff --git a/default-configs/arm-softmmu.mak
 b/default-configs/arm-softmmu.mak
 index b0d6e65038..0c5ae914ed 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
 @@ -132,3 +132,5 @@ CONFIG_GPIO_KEY=y
   CONFIG_MSF2=y
   CONFIG_FW_CFG_DMA=y
   CONFIG_XILINX_AXI=y
 +CONFIG_PCI_DESIGNWARE=y
 +
 diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
 index 9c7909cf44..0e2c0a123b 100644
 --- a/hw/pci-host/Makefile.objs
 +++ b/hw/pci-host/Makefile.objs
 @@ -17,3 +17,5 @@ common-obj-$(CONFIG_PCI_PIIX) += piix.o
   common-obj-$(CONFIG_PCI_Q35) += q35.o
   common-obj-$(CONFIG_PCI_GENERIC) += gpex.o
   common-obj-$(CONFIG_PCI_XILINX) += xilinx-pcie.o
 +
 +common-obj-$(CONFIG_PCI_DESIGNWARE) += designware.o
 diff --git a/hw/pci-host/designware.c b/hw/pci-host/designware.c
 new file mode 100644
 index 00..98fff5e5f3
 --- /dev/null
 +++ b/hw/pci-host/designware.c
 @@ -0,0 +1,618 @@
 +/*
 + * Copyright (c) 2017, Impinj, Inc.
>>>
>>> 2018 :)
>>>
 + *
 + * Designware PCIe IP block emulation
 + *
 + * This library is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU Lesser General Public
 + * License as published by the Free Software Foundation; either
 + * version 2 of the License, or (at your option) any later version.
 + *
 + * This library is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 + * Lesser General Public License for more details.
 + *
 + * You should have received a copy of the GNU Lesser General Public
 + * License along with this library; if not, see
 + * .
 + */
 +
 +#include "qemu/osdep.h"
 +#include "qapi/error.h"
 +#include "hw/pci/msi.h"
 +#include "hw/pci/pci_bridge.h"
 +#include "hw/pci/pci_host.h"
 +#include "hw/pci/pcie_port.h"
 +#include "hw/pci-host/designware.h"
 +
 +#define PCIE_PORT_LINK_CONTROL          0x710
 +
 +#define PCIE_PHY_DEBUG_R1               0x72C
 +#define PCIE_PHY_DEBUG_R1_XMLH_LINK_UP  BIT(4)
 +
 +#define PCIE_LINK_WIDTH_SPEED_CONTROL   0x80C
 +
 +#define PCIE_MSI_ADDR_LO                0x820
 +#define PCIE_MSI_ADDR_HI                0x824
 +#define PCIE_MSI_INTR0_ENABLE           0x828
 +#define PCIE_MSI_INTR0_MASK             0x82C
 +#define PCIE_MSI_INTR0_STATUS           0x830
 +
 +#define PCIE_ATU_VIEWPORT               0x900
 +#define PCIE_ATU_REGION_INBOUND         (0x1 << 31)
 +#define PCIE_ATU_REGION_OUTBOUND        (0x0 << 31)
 +#define PCIE_ATU_REGION_INDEX2          (0x2 << 0)
 +#define PCIE_ATU_REGION_INDEX1          (0x1 << 0)
 +#define PCIE_ATU_REGION_INDEX0          (0x0 << 0)
 +#define PCIE_ATU_CR1                    0x904
 +#define PCIE_ATU_TYPE_MEM               (0x0 << 0)
 +#define PCIE_ATU_TYPE_IO                (0x2 << 0)
 +#define PCIE_ATU_TYPE_CFG0              (0x4 << 0)
 +#define PCIE_ATU_TYPE_CFG1              (0x5 << 0)
 +#define PCIE_ATU_CR2                    0x908
 +#define PCIE_ATU_ENABLE                 (0x1 << 31)
 +#define PCIE_ATU_BAR_MODE_ENABLE        (0x1 << 30)
 +#define PCIE_ATU_LOWER_BASE             0x90C
 +#define PCIE_ATU_UPPER_BASE             0x910
 +#define PCIE_ATU_LIMIT

Re: [Qemu-devel] [RFC PATCH] vfio/pci: Add ioeventfd support

2018-02-06 Thread Alexey Kardashevskiy

On 07/02/18 11:08, Alex Williamson wrote:
> The ioeventfd here is actually irqfd handling of an ioeventfd such as
> supported in KVM.  A user is able to pre-program a device write to
> occur when the eventfd triggers.  This is yet another instance of
> eventfd-irqfd triggering between KVM and vfio.  The impetus for this
> is high frequency writes to pages which are virtualized in QEMU.
> Enabling this near-direct write path for selected registers within
> the virtualized page can improve performance and reduce overhead.
> Specifically this is initially targeted at NVIDIA graphics cards where
> the driver issues a write to an MMIO register within a virtualized
> region in order to allow the MSI interrupt to re-trigger.
> 
> Signed-off-by: Alex Williamson 
> ---
>  drivers/vfio/pci/vfio_pci.c |   33 +++
>  drivers/vfio/pci/vfio_pci_private.h |   14 +++
>  drivers/vfio/pci/vfio_pci_rdwr.c|  165 
> ---
>  include/uapi/linux/vfio.h   |   24 +
>  4 files changed, 224 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index f041b1a6cf66..c8e7297a61a3 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -302,6 +302,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
>  {
>   struct pci_dev *pdev = vdev->pdev;
>   struct vfio_pci_dummy_resource *dummy_res, *tmp;
> + struct vfio_pci_ioeventfd *ioeventfd, *ioeventfd_tmp;
>   int i, bar;
>  
>   /* Stop the device from further DMA */
> @@ -311,6 +312,14 @@ static void vfio_pci_disable(struct vfio_pci_device 
> *vdev)
>   VFIO_IRQ_SET_ACTION_TRIGGER,
>   vdev->irq_type, 0, 0, NULL);
>  
> + /* Device closed, don't need mutex here */
> + list_for_each_entry_safe(ioeventfd, ioeventfd_tmp,
> +  &vdev->ioeventfds_list, next) {
> + vfio_virqfd_disable(&ioeventfd->virqfd);
> + list_del(&ioeventfd->next);
> + kfree(ioeventfd);
> + }
> +
>   vdev->virq_disabled = false;
>  
>   for (i = 0; i < vdev->num_regions; i++)
> @@ -1039,6 +1048,28 @@ static long vfio_pci_ioctl(void *device_data,
>  
>   kfree(groups);
>   return ret;
> + } else if (cmd == VFIO_DEVICE_IOEVENTFD) {
> + struct vfio_device_ioeventfd ioeventfd;
> + int count;
> +
> + minsz = offsetofend(struct vfio_device_ioeventfd, fd);
> +
> + if (copy_from_user(&ioeventfd, (void __user*)arg, minsz))
> + return -EFAULT;
> +
> + if (ioeventfd.argsz < minsz)
> + return -EINVAL;
> +
> + if (ioeventfd.flags & ~VFIO_DEVICE_IOEVENTFD_SIZE_MASK)
> + return -EINVAL;
> +
> + count = ioeventfd.flags & VFIO_DEVICE_IOEVENTFD_SIZE_MASK;
> +
> + if (hweight8(count) != 1 || ioeventfd.fd < -1)
> + return -EINVAL;
> +
> + return vfio_pci_ioeventfd(vdev, ioeventfd.offset,
> +   ioeventfd.data, count, ioeventfd.fd);
>   }
>  
>   return -ENOTTY;
> @@ -1217,6 +1248,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const 
> struct pci_device_id *id)
>   vdev->irq_type = VFIO_PCI_NUM_IRQS;
>   mutex_init(&vdev->igate);
>   spin_lock_init(&vdev->irqlock);
> + mutex_init(&vdev->ioeventfds_lock);
> + INIT_LIST_HEAD(&vdev->ioeventfds_list);
>  
>   ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
>   if (ret) {
> diff --git a/drivers/vfio/pci/vfio_pci_private.h 
> b/drivers/vfio/pci/vfio_pci_private.h
> index f561ac1c78a0..23797622396e 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -29,6 +29,15 @@
>  #define PCI_CAP_ID_INVALID        0xFF /* default raw access */
>  #define PCI_CAP_ID_INVALID_VIRT   0xFE /* default virt access */
>  
> +struct vfio_pci_ioeventfd {
> + struct list_head        next;
> + struct virqfd           *virqfd;
> + loff_t                  pos;
> + uint64_t                data;
> + int                     bar;
> + int                     count;
> +};
> +
>  struct vfio_pci_irq_ctx {
>   struct eventfd_ctx  *trigger;
>   struct virqfd   *unmask;
> @@ -95,6 +104,8 @@ struct vfio_pci_device {
>   struct eventfd_ctx  *err_trigger;
>   struct eventfd_ctx  *req_trigger;
>   struct list_head        dummy_resources_list;
> + struct mutex            ioeventfds_lock;
> + struct list_head        ioeventfds_list;
>  };
>  
>  #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
> @@ -120,6 +131,9 @@ extern ssize_t vfio_pci_bar_rw(struct vfio_pci_device 
> *vdev, char __user *buf,
>  extern ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user 
> *buf,
>  size_t count,
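
The argument checks in the VFIO_DEVICE_IOEVENTFD branch quoted above come
down to three rules: no flag bits outside the size mask, exactly one size
bit set, and an fd of -1 or greater. A standalone sketch of the same
checks follows. The IOEVENTFD_* constants and the function name are
hypothetical stand-ins, and their values (one bit per access width in the
low nibble, so the flag doubles as the byte count) are an assumption
inferred from how the quoted code passes 'count' on as the access size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: one flag bit per access width, in bytes. */
#define IOEVENTFD_8          (1u << 0)
#define IOEVENTFD_16         (1u << 1)
#define IOEVENTFD_32         (1u << 2)
#define IOEVENTFD_64         (1u << 3)
#define IOEVENTFD_SIZE_MASK  0xFu

/* Hypothetical helper mirroring the ioctl's validation order. */
static bool ioeventfd_args_valid(uint32_t flags, int32_t fd)
{
    uint32_t count = flags & IOEVENTFD_SIZE_MASK;

    /* Reject any flag bits the interface does not define. */
    if (flags & ~IOEVENTFD_SIZE_MASK) {
        return false;
    }

    /* Exactly one size bit: nonzero and a power of two. */
    if (count == 0 || (count & (count - 1)) != 0) {
        return false;
    }

    /* Values below -1 are rejected; -1 itself passes this check. */
    if (fd < -1) {
        return false;
    }

    return true;
}
```

The power-of-two test plays the role of hweight8(count) != 1 in the
quoted patch.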

Re: [Qemu-devel] [PATCH v4 00/14] Initial i.MX7 support

2018-02-06 Thread Andrey Smirnov

On Wed, Jan 31, 2018 at 9:03 AM, Philippe Mathieu-Daudé  wrote:
> Hi Peter, Andrey.
>
> On 01/15/2018 10:36 PM, Andrey Smirnov wrote:
>> Hi everyone,
>>
>> This v4 of the patch series containing the work that I've done in
>> order to enable support for i.MX7 emulation in QEMU.
>>
>> *NOTE*: Patches 1 and 2 are provided for the sake of completeness and
>>   are going to have to be adapted once Philippe's SD changes
>>   land in master. As such, they are NOT ready to be
>>   accepted/merged.
>
> Peter:
> Since my series are taking longer, if this series is ready it is
> probably easier to apply Andrey series first and I'll adapt my SDHCI
> series after.
>
> Andrey:
> I only plan to keep the sdhci.c file generic (dealing with quirks) and
> split out the imx usdhci code, similar to this patch:
> https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg01265.html

Yes, I understand that, but I still am not clear how you propose
dealing with the fact that i.MX specific read/write functions need to
call similar functions from parent SDHC class. Are you planning on
adding overridable read()/write() methods to SDHCICommonClass?

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v2 0/3] s390x/pci: fixup and optimize IOTLB code

2018-02-06 Thread Yi Min Zhao




On 2018/2/6 at 6:23 PM, Cornelia Huck wrote:

On Mon,  5 Feb 2018 15:22:55 +0800
Yi Min Zhao  wrote:


This series contains three patches,
1) optimizes the code including walking DMA tables and rpcit handler
2) fixes the issue caused by IOTLB global refresh
3) uses the right pal and pba when registering ioat

The issue mentioned above was found when we tested SMC-r tools. This
behavior has been introduced when linux guest started using a global
refresh to purge the whole IOTLB of invalid entries in a lazy fashion
instead of flushing each entry when invalidating table entries.

The previous QEMU implementation didn't keep track of the mapping,
didn't handle correctly the global flush demand from the guest and a
major part of the IOTLB entries were not flushed.

Consequently linux kernel on the host keeping the previous mapping
reports, as it should, -EEXIST DMA mapping error on the next mapping
with the same IOVA. The second patch fixes this issue.

Introducing a local tracking mechanism still feels a bit awkward to me
(even though it works, of course). If nobody else needs such a thing,
our best choice is to do it like that, though.
Caching IOTLB entries will also help us support 2G mappings in the
future.



During the investigation, we noticed that the current code walking
PCI IOMMU page tables didn't check important flags of table entries,
including:
1) protection bit
2) table length
3) table offset
4) intermediate tables' invalid bit
5) format control bit

We implement the checking in the first patch before handling the
IOTLB global refresh issue. To keep track of the mapped IOTLB entries
and be able to check if the host IOTLB entries need to be refreshed
we implement a IOTLB cache in QEMU, and introduce some helper
functions to check these bits. All S390IOTLBEntry instances are stored
in a new hashtable which are indexed by IOVA. Each PCI device has its
own IOMMU. Therefore each IOMMU also has its own hashtable caching
corresponding PCI device's DMA entries. Finally, we split 1M
contiguous DMA range into 4K pages to do DMA map, and the code about
error notification is also optimized.
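
The last step described above, mapping a contiguous DMA range one 4K page
at a time, can be sketched as follows. This is an illustration only: the
function and callback names are hypothetical, the callback stands in for
whatever per-page map operation the series performs, and the alignment
assertion is an assumption of the sketch rather than something the cover
letter states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ULL

/* Hypothetical per-page callback: one IOVA -> physical mapping. */
typedef void (*map_page_fn)(uint64_t iova, uint64_t paddr, void *opaque);

/* Example callback that only counts how many pages were mapped. */
static void count_page(uint64_t iova, uint64_t paddr, void *opaque)
{
    (void)iova;
    (void)paddr;
    (*(size_t *)opaque)++;
}

/*
 * Walk [iova, iova + len) in 4K steps and hand each page to 'map_page'.
 * Returns the number of pages visited.
 */
static size_t map_range_as_4k_pages(uint64_t iova, uint64_t paddr,
                                    uint64_t len, map_page_fn map_page,
                                    void *opaque)
{
    size_t npages = 0;

    /* Assumption of this sketch: everything is 4K aligned. */
    assert(iova % PAGE_SIZE_4K == 0);
    assert(paddr % PAGE_SIZE_4K == 0);
    assert(len % PAGE_SIZE_4K == 0);

    for (uint64_t off = 0; off < len; off += PAGE_SIZE_4K) {
        map_page(iova + off, paddr + off, opaque);
        npages++;
    }

    return npages;
}
```

A 1M range yields 256 callbacks, each of which could also insert the
page into a per-IOMMU cache keyed by IOVA.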

Change log:
v1->v2:
1) update commit messages
2) move some changes from the 2nd patch to the 1st patch
3) define macros for 'ett' in the 1st patch

Yi Min Zhao (3):
   s390x/pci: fixup the code walking IOMMU tables
   s390x/pci: fixup global refresh
   s390x/pci: use the right pal and pba in reg_ioat()

  hw/s390x/s390-pci-bus.c  | 233 ++-
  hw/s390x/s390-pci-bus.h  |  17 
  hw/s390x/s390-pci-inst.c | 103 ++---
  3 files changed, 275 insertions(+), 78 deletions(-)


I have played with these patches and some virtio-pci devices (since I
don't have access to real zpci cards), and it worked both under kvm and
under tcg. So I'm inclined to apply this (I can't review further due to
missing documentation), unless the pci folks have further comments.

Thanks!

Re: [Qemu-devel] [PATCH v1 1/2] timer: Initial commit of xlnx-pmu-iomod-pit device

2018-02-06 Thread Philippe Mathieu-Daudé

Hi Alistair,

On 02/06/2018 07:23 PM, Alistair Francis wrote:
> Signed-off-by: Alistair Francis 
> ---
> 
>  include/hw/timer/xlnx-pmu-iomod-pit.h |  58 
>  hw/timer/xlnx-pmu-iomod-pit.c | 241 
> ++
>  hw/timer/Makefile.objs|   2 +
>  3 files changed, 301 insertions(+)
>  create mode 100644 include/hw/timer/xlnx-pmu-iomod-pit.h
>  create mode 100644 hw/timer/xlnx-pmu-iomod-pit.c
> 
> diff --git a/include/hw/timer/xlnx-pmu-iomod-pit.h 
> b/include/hw/timer/xlnx-pmu-iomod-pit.h
> new file mode 100644
> index 00..15f9f0dee5
> --- /dev/null
> +++ b/include/hw/timer/xlnx-pmu-iomod-pit.h
> @@ -0,0 +1,58 @@
> +/*
> + * QEMU model of Xilinx I/O Module PIT
> + *
> + * Copyright (c) 2013 Xilinx Inc

I'm glad you are upstreaming this :)

> + * Written by Edgar E. Iglesias 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/ptimer.h"
> +
> +#define TYPE_XLNX_ZYNQMP_IOMODULE_PIT "xlnx.pmu_iomodule"
> +
> +#define XLNX_ZYNQMP_IOMODULE_PIT(obj) \
> + OBJECT_CHECK(XlnxPMUPIT, (obj), TYPE_XLNX_ZYNQMP_IOMODULE_PIT)
> +
> +#define XLNX_ZYNQMP_IOMODULE_PIT_R_MAX (0x08 + 1)

I assume this part is generated, removing 'XLNX_' shortens a bit.

> +
> +typedef struct XlnxPMUPIT {
> +SysBusDevice parent_obj;
> +MemoryRegion iomem;
> +
> +QEMUBH *bh;
> +ptimer_state *ptimer;
> +
> +qemu_irq irq;
> +/* IRQ to pulse out when present timer hits zero */
> +qemu_irq hit_out;
> +
> +/* Counter in Pre-Scalar(ps) Mode */
> +uint32_t ps_counter;
> +/* ps_mode irq-in to enable/disable pre-scalar */
> +bool ps_enable;
> +/* State var to remember hit_in level */
> +bool ps_level;
> +
> +uint32_t frequency;

I personally prefer making that kind of unit explicit when possible:
"frequency_hz".

> +
> +uint32_t regs[XLNX_ZYNQMP_IOMODULE_PIT_R_MAX];
> +RegisterInfo regs_info[XLNX_ZYNQMP_IOMODULE_PIT_R_MAX];
> +} XlnxPMUPIT;
> diff --git a/hw/timer/xlnx-pmu-iomod-pit.c b/hw/timer/xlnx-pmu-iomod-pit.c
> new file mode 100644
> index 00..cdfef1a440
> --- /dev/null
> +++ b/hw/timer/xlnx-pmu-iomod-pit.c
> @@ -0,0 +1,241 @@
> +/*
> + * QEMU model of Xilinx I/O Module PIT
> + *
> + * Copyright (c) 2013 Xilinx Inc
> + * Written by Edgar E. Iglesias 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/sysbus.h"
> +#include "hw/ptimer.h"
> +#include "hw/register.h"
> +#include "qemu/main-loop.h"
> +#include "qemu/log.h"
> +#include "qapi/error.h"
> +#include "hw/timer/xlnx-pmu-iomod-pit.h"
> +
> +#ifndef XLNX_ZYNQMP_IOMODULE_PIT_ERR_DEBUG
> +#define XLNX_ZYNQMP_IOMODULE_PIT_ERR_DEBUG 0
> +#endif
> +
>

Re: [Qemu-devel] [PATCH for-2.9-rc5 v4 2/2] block: Drain BH in bdrv_drained_begin

2018-02-06 Thread Fam Zheng

On Tue, Feb 6, 2018 at 11:32 PM, Kevin Wolf  wrote:
> Am 18.04.2017 um 16:30 hat Fam Zheng geschrieben:
>> During block job completion, nothing is preventing
>> block_job_defer_to_main_loop_bh from being called in a nested
>> aio_poll(), which is a trouble, such as in this code path:
>>
>> qmp_block_commit
>>   commit_active_start
>> bdrv_reopen
>>   bdrv_reopen_multiple
>> bdrv_reopen_prepare
>>   bdrv_flush
>> aio_poll
>>   aio_bh_poll
>> aio_bh_call
>>   block_job_defer_to_main_loop_bh
>> stream_complete
>>   bdrv_reopen
>>
>> block_job_defer_to_main_loop_bh is the last step of the stream job,
>> which should have been "paused" by the bdrv_drained_begin/end in
>> bdrv_reopen_multiple, but it is not done because it's in the form of a
>> main loop BH.
>>
>> Similar to why block jobs should be paused between drained_begin and
>> drained_end, BHs they schedule must be excluded as well.  To achieve
>> this, this patch forces draining the BH in BDRV_POLL_WHILE.
>>
>> As a side effect this fixes a hang in block_job_detach_aio_context
>> during system_reset when a block job is ready:
>>
>> #0  0x55aa79f3 in bdrv_drain_recurse
>> #1  0x55aa825d in bdrv_drained_begin
>> #2  0x55aa8449 in bdrv_drain
>> #3  0x55a9c356 in blk_drain
>> #4  0x55aa3cfd in mirror_drain
>> #5  0x55a66e11 in block_job_detach_aio_context
>> #6  0x55a62f4d in bdrv_detach_aio_context
>> #7  0x55a63116 in bdrv_set_aio_context
>> #8  0x55a9d326 in blk_set_aio_context
>> #9  0x557e38da in virtio_blk_data_plane_stop
>> #10 0x559f9d5f in virtio_bus_stop_ioeventfd
>> #11 0x559fa49b in virtio_bus_stop_ioeventfd
>> #12 0x559f6a18 in virtio_pci_stop_ioeventfd
>> #13 0x559f6a18 in virtio_pci_reset
>> #14 0x559139a9 in qdev_reset_one
>> #15 0x55916738 in qbus_walk_children
>> #16 0x55913318 in qdev_walk_children
>> #17 0x55916738 in qbus_walk_children
>> #18 0x559168ca in qemu_devices_reset
>> #19 0x5581fcbb in pc_machine_reset
>> #20 0x558a4d96 in qemu_system_reset
>> #21 0x5577157a in main_loop_should_exit
>> #22 0x5577157a in main_loop
>> #23 0x5577157a in main
>>
>> The rationale is that the loop in block_job_detach_aio_context cannot
>> make any progress in pausing/completing the job, because bs->in_flight
>> is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
>> BH. With this patch, it does.
>>
>> Reported-by: Jeff Cody 
>> Signed-off-by: Fam Zheng 
>
> Fam, do you remember whether this was really only about drain? Because
> in that case...

Yes I believe so.

>
>> diff --git a/include/block/block.h b/include/block/block.h
>> index 97d4330..5ddc0cf 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -381,12 +381,13 @@ void bdrv_drain_all(void);
>>
>>  #define BDRV_POLL_WHILE(bs, cond) ({   \
>>  bool waited_ = false;  \
>> +bool busy_ = true; \
>>  BlockDriverState *bs_ = (bs);  \
>>  AioContext *ctx_ = bdrv_get_aio_context(bs_);  \
>>  if (aio_context_in_iothread(ctx_)) {   \
>> -while ((cond)) {   \
>> -aio_poll(ctx_, true);  \
>> -waited_ = true;\
>> +while ((cond) || busy_) {  \
>> +busy_ = aio_poll(ctx_, (cond));\
>> +waited_ |= !!(cond) | busy_;   \
>>  }  \
>>  } else {   \
>>  assert(qemu_get_current_aio_context() ==   \
>> @@ -398,11 +399,16 @@ void bdrv_drain_all(void);
>>   */\
>>  assert(!bs_->wakeup);  \
>>  bs_->wakeup = true;\
>> -while ((cond)) {   \
>> -aio_context_release(ctx_); \
>> -aio_poll(qemu_get_aio_context(), true);\
>> -aio_context_acquire(ctx_); \
>> -waited_ = true;\
>> +while (busy_) {\
>> +if ((cond)) {  \
>> +waited_ = busy_ = true;\
>> +aio_context_release(ctx_);

Re: [Qemu-devel] [PATCH v6 00/23] x86: Secure Encrypted Virtualization (AMD)

2018-02-06 Thread Brijesh Singh



On 2/6/18 9:51 AM, Bruce Rogers wrote:
 On 1/29/2018 at 10:41 AM,  wrote:
>> This patch series provides support for AMD's new Secure Encrypted 
>> Virtualization (SEV) feature.
>>
>> SEV is an extension to the AMD‑V architecture which supports running
>> multiple VMs under the control of a hypervisor. The SEV feature allows
>> the memory contents of a virtual machine (VM) to be transparently encrypted
>> with a key unique to the guest VM. The memory controller contains a
>> high performance encryption engine which can be programmed with multiple
>> keys for use by a different VMs in the system. The programming and
>> management of these keys is handled by the AMD Secure Processor firmware
>> which exposes a commands for these tasks.
>>
>> The KVM SEV patch series introduced a new ioctl (KVM_MEMORY_ENCRYPTION_OP)
>> which is used by qemu to issue the SEV commands to assist performing
>> common hypervisor activities such as a launching, running, snapshooting,
>> migration and debugging guests.
>>
> As for the reported failure to build on non-x86 hosts, eg:
> ...
>   LINKi386-softmmu/qemu-system-i386
> target/i386/helper.o: In function `get_me_mask':
> /var/tmp/patchew-tester-tmp-hek3vjny/src/target/i386/helper.c:735: undefined 
> reference to `kvm_arch_get_supported_cpuid'
> target/i386/monitor.o: In function `get_me_mask':
> /var/tmp/patchew-tester-tmp-hek3vjny/src/target/i386/monitor.c:71: undefined 
> reference to `kvm_arch_get_supported_cpuid'
>
> ... I've looked at that a bit and find that in target/i386/kvm-stub.c, if we 
> get rid of
> the #ifndef __OPTIMIZE__ it then builds ok. I'm not sure if the guarding done 
> there
> with the check for  __OPTIMIZE__ is a relic that no longer applies given how 
> qemu
> currently builds, but at least it's something to look at.

Thanks for looking Bruce.

I have reworked the code a bit, and that should fix this build error. I
will be posting the patches for review soon.

> Bruce
>

Re: [Qemu-devel] [PATCH v2 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-02-06 Thread Michael S. Tsirkin

On Tue, Feb 06, 2018 at 07:08:17PM +0800, Wei Wang wrote:
> The new feature enables the virtio-balloon device to receive the hint of
> guest free pages from the free page vq, and clears the corresponding bits
> of the free page from the dirty bitmap, so that those free pages are not
> transferred by the migration thread.
> 
> Signed-off-by: Wei Wang 
> Signed-off-by: Liang Li 
> CC: Michael S. Tsirkin 
> CC: Juan Quintela 
> ---
>  balloon.c   |  39 +--
>  hw/virtio/virtio-balloon.c  | 145 
> +---
>  include/hw/virtio/virtio-balloon.h  |  11 +-
>  include/migration/misc.h|   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 ++
>  include/sysemu/balloon.h|  12 +-
>  migration/ram.c |  10 ++
>  7 files changed, 198 insertions(+), 29 deletions(-)
> 
> diff --git a/balloon.c b/balloon.c
> index 1d720ff..0f0b30c 100644
> --- a/balloon.c
> +++ b/balloon.c
> @@ -36,6 +36,8 @@
>  
>  static QEMUBalloonEvent *balloon_event_fn;
>  static QEMUBalloonStatus *balloon_stat_fn;
> +static QEMUBalloonFreePageSupport *balloon_free_page_support_fn;
> +static QEMUBalloonFreePagePoll *balloon_free_page_poll_fn;
>  static void *balloon_opaque;
>  static bool balloon_inhibited;
>  
> @@ -64,19 +66,34 @@ static bool have_balloon(Error **errp)
>  return true;
>  }
>  
> -int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> - QEMUBalloonStatus *stat_func, void *opaque)
> +bool balloon_free_page_support(void)
>  {
> -if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
> -/* We're already registered one balloon handler.  How many can
> - * a guest really have?
> - */
> -return -1;
> +return balloon_free_page_support_fn &&
> +   balloon_free_page_support_fn(balloon_opaque);
> +}
> +
> +void balloon_free_page_poll(void)
> +{
> +balloon_free_page_poll_fn(balloon_opaque);
> +}
> +
> +void qemu_add_balloon_handler(QEMUBalloonEvent *event_fn,
> +  QEMUBalloonStatus *stat_fn,
> +  QEMUBalloonFreePageSupport 
> *free_page_support_fn,
> +  QEMUBalloonFreePagePoll *free_page_poll_fn,
> +  void *opaque)
> +{
> +if (balloon_event_fn || balloon_stat_fn || balloon_free_page_support_fn 
> ||
> +balloon_free_page_poll_fn || balloon_opaque) {
> +/* We already registered one balloon handler. */
> +return;
>  }
> -balloon_event_fn = event_func;
> -balloon_stat_fn = stat_func;
> +
> +balloon_event_fn = event_fn;
> +balloon_stat_fn = stat_fn;
> +balloon_free_page_support_fn = free_page_support_fn;
> +balloon_free_page_poll_fn = free_page_poll_fn;
>  balloon_opaque = opaque;
> -return 0;
>  }
>  
>  void qemu_remove_balloon_handler(void *opaque)
> @@ -86,6 +103,8 @@ void qemu_remove_balloon_handler(void *opaque)
>  }
>  balloon_event_fn = NULL;
>  balloon_stat_fn = NULL;
> +balloon_free_page_support_fn = NULL;
> +balloon_free_page_poll_fn = NULL;
>  balloon_opaque = NULL;
>  }
>  
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 14e08d2..b424d4e 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -23,6 +23,7 @@
>  #include "hw/virtio/virtio-balloon.h"
>  #include "sysemu/kvm.h"
>  #include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
>  #include "qapi/visitor.h"
>  #include "qapi-event.h"
>  #include "trace.h"
> @@ -30,6 +31,7 @@
>  
>  #include "hw/virtio/virtio-bus.h"
>  #include "hw/virtio/virtio-access.h"
> +#include "migration/misc.h"
>  
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
> @@ -305,6 +307,87 @@ out:
>  }
>  }
>  
> +static void virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
> +{
> +VirtQueueElement *elem;
> +VirtQueue *vq = dev->free_page_vq;
> +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +bool page_poisoning = virtio_vdev_has_feature(vdev,
> +  VIRTIO_BALLOON_F_PAGE_POISON);
> +uint32_t id;
> +
> +/* Poll the vq till a stop cmd id is received */
> +while (dev->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
> +elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> +if (!elem) {
> +continue;
> +}
> +
> +if (elem->out_num) {
> +iov_to_buf(elem->out_sg, elem->out_num, 0, &id, sizeof(uint32_t));
> +virtqueue_push(vq, elem, sizeof(id));
> +g_free(elem);
> +if (id == dev->free_page_report_cmd_id) {
> +dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
> +} else {
> +

[Qemu-devel] [RFC PATCH 5/5] vfio/quirks: Enable ioeventfd quirks to be handled by vfio directly

2018-02-06 Thread Alex Williamson

With vfio ioeventfd support, we can program vfio-pci to perform a
specified BAR write when an eventfd is triggered.  This allows the
KVM ioeventfd to be wired directly to vfio-pci, entirely avoiding
userspace handling for these events.  On the same micro-benchmark
where the ioeventfd got us to almost 90% of performance versus
disabling the GeForce quirks, this gets us to within 95%.
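
As a rough illustration of the request this patch fills in (a sketch, not
the patch's code): the proposed size flags are one bit per width, and each
bit's value equals the width in bytes, which is why the quirk's byte size
can be assigned to flags directly.  The struct below is a local copy of
the layout proposed in patch 4/5, not a released kernel header;
ioeventfd_req and ioeventfd_req_fill() are hypothetical names, and no
ioctl is issued here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: local mirror of the proposed struct vfio_device_ioeventfd. */
struct ioeventfd_req {
    uint32_t argsz;
    uint32_t flags;
    uint64_t offset;   /* device fd offset of the write */
    uint64_t data;     /* data to be written */
    int32_t  fd;       /* -1 for de-assignment */
};

#define REQ_IOEVENTFD_8    (1u << 0)  /* 1-byte write */
#define REQ_IOEVENTFD_16   (1u << 1)  /* 2-byte write */
#define REQ_IOEVENTFD_32   (1u << 2)  /* 4-byte write */
#define REQ_IOEVENTFD_64   (1u << 3)  /* 8-byte write */
#define REQ_SIZE_MASK      0xfu

/*
 * Hypothetical helper: fill a request for a write of 'size' bytes.
 * Returns 0 on success, -1 if the width is not 1, 2, 4 or 8.
 */
static int ioeventfd_req_fill(struct ioeventfd_req *req, unsigned size,
                              uint64_t data, uint64_t offset, int32_t fd)
{
    assert(req != NULL);

    if (size != 1 && size != 2 && size != 4 && size != 8) {
        return -1;
    }

    memset(req, 0, sizeof(*req));
    req->argsz = (uint32_t)sizeof(*req);
    req->flags = size;     /* width in bytes == matching flag bit */
    req->data = data;
    req->offset = offset;
    req->fd = fd;          /* pass -1 to tear the ioeventfd down */
    return 0;
}
```

With this encoding a 4-byte write yields flags == REQ_IOEVENTFD_32, and an
unsupported width such as 3 is rejected before anything reaches the kernel.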

Signed-off-by: Alex Williamson 
---
 hw/vfio/pci-quirks.c |   42 --
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index e739efe601b1..35a4d5197e2d 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -16,6 +16,7 @@
 #include "qemu/range.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
+#include <sys/ioctl.h>
 #include "hw/nvram/fw_cfg.h"
 #include "pci.h"
 #include "trace.h"
@@ -287,13 +288,27 @@ static VFIOQuirk *vfio_quirk_alloc(int nr_mem)
 return quirk;
 }
 
-static void vfio_ioeventfd_exit(VFIOIOEventFD *ioeventfd)
+static void vfio_ioeventfd_exit(VFIOPCIDevice *vdev, VFIOIOEventFD *ioeventfd)
 {
+struct vfio_device_ioeventfd vfio_ioeventfd;
+
 QLIST_REMOVE(ioeventfd, next);
+
 memory_region_del_eventfd(ioeventfd->mr, ioeventfd->addr, ioeventfd->size,
   ioeventfd->match_data, ioeventfd->data,
   &ioeventfd->e);
+
 qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e), NULL, NULL, NULL);
+
+vfio_ioeventfd.argsz = sizeof(vfio_ioeventfd);
+vfio_ioeventfd.flags = ioeventfd->size;
+vfio_ioeventfd.data = ioeventfd->data;
+vfio_ioeventfd.offset = ioeventfd->region->fd_offset +
+ioeventfd->region_addr;
+vfio_ioeventfd.fd = -1;
+
+ioctl(vdev->vbasedev.fd, VFIO_DEVICE_IOEVENTFD, &vfio_ioeventfd);
+
 event_notifier_cleanup(&ioeventfd->e);
 g_free(ioeventfd);
 }
@@ -315,6 +330,8 @@ static VFIOIOEventFD *vfio_ioeventfd_init(VFIOPCIDevice *vdev,
   hwaddr region_addr)
 {
 VFIOIOEventFD *ioeventfd = g_malloc0(sizeof(*ioeventfd));
+struct vfio_device_ioeventfd vfio_ioeventfd;
+char vfio_enabled = '+';
 
 if (event_notifier_init(&ioeventfd->e, 0)) {
 g_free(ioeventfd);
@@ -329,15 +346,28 @@ static VFIOIOEventFD *vfio_ioeventfd_init(VFIOPCIDevice *vdev,
 ioeventfd->region = region;
 ioeventfd->region_addr = region_addr;
 
-qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e),
-vfio_ioeventfd_handler, NULL, ioeventfd);
+vfio_ioeventfd.argsz = sizeof(vfio_ioeventfd);
+vfio_ioeventfd.flags = ioeventfd->size;
+vfio_ioeventfd.data = ioeventfd->data;
+vfio_ioeventfd.offset = ioeventfd->region->fd_offset +
+ioeventfd->region_addr;
+vfio_ioeventfd.fd = event_notifier_get_fd(&ioeventfd->e);
+
+if (ioctl(vdev->vbasedev.fd,
+  VFIO_DEVICE_IOEVENTFD, &vfio_ioeventfd) != 0) {
+qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e),
+vfio_ioeventfd_handler, NULL, ioeventfd);
+vfio_enabled = '-';
+}
+
 memory_region_add_eventfd(ioeventfd->mr, ioeventfd->addr,
   ioeventfd->size, ioeventfd->match_data,
   ioeventfd->data, &ioeventfd->e);
 
 info_report("Enabled automatic ioeventfd acceleration for %s region %d, "
-"offset 0x%"HWADDR_PRIx", data 0x%"PRIx64", size %u",
-vdev->vbasedev.name, region->nr, region_addr, data, size);
+"offset 0x%"HWADDR_PRIx", data 0x%"PRIx64", size %u, vfio%c",
+vdev->vbasedev.name, region->nr, region_addr, data, size,
+vfio_enabled);
 
 return ioeventfd;
 }
@@ -1767,7 +1797,7 @@ void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr)
 
 QLIST_FOREACH(quirk, &bar->quirks, next) {
 while (!QLIST_EMPTY(&quirk->ioeventfds)) {
-vfio_ioeventfd_exit(QLIST_FIRST(&quirk->ioeventfds));
+vfio_ioeventfd_exit(vdev, QLIST_FIRST(&quirk->ioeventfds));
 }
 
 for (i = 0; i < quirk->nr_mem; i++) {

[Qemu-devel] [RFC PATCH 3/5] vfio/quirks: Automatic ioeventfd enabling for NVIDIA BAR0 quirks

2018-02-06 Thread Alex Williamson

Record data writes that come through the NVIDIA BAR0 quirk.  If we get
enough identical writes in a row that we're only passing through,
automatically enable an ioeventfd for them.  The primary target for this
is the MSI-ACK that NVIDIA uses to allow the MSI interrupt to re-trigger,
which is a 4-byte write, data value 0x0 to offset 0x704 into the quirk,
0x88704 into BAR0 MMIO space.  For an interrupt latency sensitive
micro-benchmark, this takes us from 83% of performance versus disabling
the quirk entirely (which GeForce cannot do), to almost 90%.

Signed-off-by: Alex Williamson 
---
 hw/vfio/pci-quirks.c |   89 +-
 hw/vfio/pci.h|2 +
 2 files changed, 89 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index e4cf4ea2dd9c..e739efe601b1 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -203,6 +203,7 @@ typedef struct VFIOConfigMirrorQuirk {
 uint32_t offset;
 uint8_t bar;
 MemoryRegion *mem;
+uint8_t data[];
 } VFIOConfigMirrorQuirk;
 
 static uint64_t vfio_generic_quirk_mirror_read(void *opaque,
@@ -297,6 +298,50 @@ static void vfio_ioeventfd_exit(VFIOIOEventFD *ioeventfd)
 g_free(ioeventfd);
 }
 
+static void vfio_ioeventfd_handler(void *opaque)
+{
+VFIOIOEventFD *ioeventfd = opaque;
+
+if (event_notifier_test_and_clear(&ioeventfd->e)) {
+vfio_region_write(ioeventfd->region, ioeventfd->region_addr,
+  ioeventfd->data, ioeventfd->size);
+}
+}
+
+static VFIOIOEventFD *vfio_ioeventfd_init(VFIOPCIDevice *vdev,
+  MemoryRegion *mr, hwaddr addr,
+  unsigned size, uint64_t data,
+  VFIORegion *region,
+  hwaddr region_addr)
+{
+VFIOIOEventFD *ioeventfd = g_malloc0(sizeof(*ioeventfd));
+
+if (event_notifier_init(&ioeventfd->e, 0)) {
+g_free(ioeventfd);
+return NULL;
+}
+
+ioeventfd->mr = mr;
+ioeventfd->addr = addr;
+ioeventfd->size = size;
+ioeventfd->match_data = true;
+ioeventfd->data = data;
+ioeventfd->region = region;
+ioeventfd->region_addr = region_addr;
+
+qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e),
+vfio_ioeventfd_handler, NULL, ioeventfd);
+memory_region_add_eventfd(ioeventfd->mr, ioeventfd->addr,
+  ioeventfd->size, ioeventfd->match_data,
+  ioeventfd->data, &ioeventfd->e);
+
+info_report("Enabled automatic ioeventfd acceleration for %s region %d, "
+"offset 0x%"HWADDR_PRIx", data 0x%"PRIx64", size %u",
+vdev->vbasedev.name, region->nr, region_addr, data, size);
+
+return ioeventfd;
+}
+
 static void vfio_vga_probe_ati_3c3_quirk(VFIOPCIDevice *vdev)
 {
 VFIOQuirk *quirk;
@@ -732,6 +777,13 @@ static void vfio_probe_nvidia_bar5_quirk(VFIOPCIDevice *vdev, int nr)
 trace_vfio_quirk_nvidia_bar5_probe(vdev->vbasedev.name);
 }
 
+typedef struct LastDataSet {
+hwaddr addr;
+uint64_t data;
+unsigned size;
+int count;
+} LastDataSet;
+
 /*
  * Finally, BAR0 itself.  We want to redirect any accesses to either
  * 0x1800 or 0x88000 through the PCI config space access functions.
@@ -742,6 +794,7 @@ static void vfio_nvidia_quirk_mirror_write(void *opaque, hwaddr addr,
 VFIOConfigMirrorQuirk *mirror = opaque;
 VFIOPCIDevice *vdev = mirror->vdev;
 PCIDevice *pdev = &vdev->pdev;
+LastDataSet *last = (LastDataSet *)&mirror->data;
 
 vfio_generic_quirk_mirror_write(opaque, addr, data, size);
 
@@ -756,6 +809,38 @@ static void vfio_nvidia_quirk_mirror_write(void *opaque, hwaddr addr,
   addr + mirror->offset, data, size);
 trace_vfio_quirk_nvidia_bar0_msi_ack(vdev->vbasedev.name);
 }
+
+/*
+ * Automatically add an ioeventfd to handle any repeated write with the
+ * same data and size above the standard PCI config space header.  This is
+ * primarily expected to accelerate the MSI-ACK behavior, such as noted
+ * above.  Current hardware/drivers should trigger an ioeventfd at config
+ * offset 0x704 (region offset 0x88704), with data 0x0, size 4.
+ */
+if (addr > PCI_STD_HEADER_SIZEOF) {
+if (addr != last->addr || data != last->data || size != last->size) {
+last->addr = addr;
+last->data = data;
+last->size = size;
+last->count = 1;
+} else if (++last->count > 10) {
+VFIOIOEventFD *ioeventfd;
+
+ioeventfd = vfio_ioeventfd_init(vdev, mirror->mem, addr, size, data,
+&vdev->bars[mirror->bar].region,
+mirror->offset + addr);
+if (ioeventfd) {
+VFIOQuirk *quirk;
+
+QLIST_FOREACH(quirk, &vdev->bars[mirror->bar].quirks, next) {
+

[Qemu-devel] [RFC PATCH 4/5] vfio: Update linux header

2018-02-06 Thread Alex Williamson

Update with proposed ioeventfd API.

Signed-off-by: Alex Williamson 
---
 linux-headers/linux/vfio.h |   24 
 1 file changed, 24 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4312e961ffd3..0921994daa6d 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -503,6 +503,30 @@ struct vfio_pci_hot_reset {
 
 #define VFIO_DEVICE_PCI_HOT_RESET  _IO(VFIO_TYPE, VFIO_BASE + 13)
 
+/**
+ * VFIO_DEVICE_IOEVENTFD - _IOW(VFIO_TYPE, VFIO_BASE + 14,
+ *  struct vfio_device_ioeventfd)
+ *
+ * Perform a write to the device at the specified device fd offset, with
+ * the specified data and width when the provided eventfd is triggered.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+struct vfio_device_ioeventfd {
+   __u32   argsz;
+   __u32   flags;
+#define VFIO_DEVICE_IOEVENTFD_8         (1 << 0) /* 1-byte write */
+#define VFIO_DEVICE_IOEVENTFD_16        (1 << 1) /* 2-byte write */
+#define VFIO_DEVICE_IOEVENTFD_32        (1 << 2) /* 4-byte write */
+#define VFIO_DEVICE_IOEVENTFD_64        (1 << 3) /* 8-byte write */
+#define VFIO_DEVICE_IOEVENTFD_SIZE_MASK (0xf)
+   __u64   offset; /* device fd offset of write */
+   __u64   data;   /* data to be written */
+   __s32   fd; /* -1 for de-assignment */
+};
+
+#define VFIO_DEVICE_IOEVENTFD  _IO(VFIO_TYPE, VFIO_BASE + 14)
+
 /*  API for Type1 VFIO IOMMU  */
 
 /**

[Qemu-devel] [RFC PATCH 2/5] vfio/quirks: Add generic support for ioveventfds

2018-02-06 Thread Alex Williamson

We might wish to handle some quirks via ioeventfds, so add a list of
ioeventfds to the quirk.

Signed-off-by: Alex Williamson 
---
 hw/vfio/pci-quirks.c |   17 +
 hw/vfio/pci.h|   11 +++
 2 files changed, 28 insertions(+)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 10af23217292..e4cf4ea2dd9c 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -12,6 +12,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/error-report.h"
+#include "qemu/main-loop.h"
 #include "qemu/range.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
@@ -278,12 +279,24 @@ static const MemoryRegionOps vfio_ati_3c3_quirk = {
 static VFIOQuirk *vfio_quirk_alloc(int nr_mem)
 {
 VFIOQuirk *quirk = g_malloc0(sizeof(*quirk));
+QLIST_INIT(&quirk->ioeventfds);
 quirk->mem = g_new0(MemoryRegion, nr_mem);
 quirk->nr_mem = nr_mem;
 
 return quirk;
 }
 
+static void vfio_ioeventfd_exit(VFIOIOEventFD *ioeventfd)
+{
+QLIST_REMOVE(ioeventfd, next);
+memory_region_del_eventfd(ioeventfd->mr, ioeventfd->addr, ioeventfd->size,
+  ioeventfd->match_data, ioeventfd->data,
+  &ioeventfd->e);
+qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e), NULL, NULL, NULL);
+event_notifier_cleanup(&ioeventfd->e);
+g_free(ioeventfd);
+}
+
 static void vfio_vga_probe_ati_3c3_quirk(VFIOPCIDevice *vdev)
 {
 VFIOQuirk *quirk;
@@ -1668,6 +1681,10 @@ void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr)
 int i;
 
 QLIST_FOREACH(quirk, &bar->quirks, next) {
+while (!QLIST_EMPTY(&quirk->ioeventfds)) {
+vfio_ioeventfd_exit(QLIST_FIRST(&quirk->ioeventfds));
+}
+
 for (i = 0; i < quirk->nr_mem; i++) {
 memory_region_del_subregion(bar->region.mem, &quirk->mem[i]);
 }
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index f4aa13e021fa..146065c2f715 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -24,9 +24,20 @@
 
 struct VFIOPCIDevice;
 
+typedef struct VFIOIOEventFD {
+QLIST_ENTRY(VFIOIOEventFD) next;
+MemoryRegion *mr;
+hwaddr addr;
+unsigned size;
+bool match_data;
+uint64_t data;
+EventNotifier e;
+} VFIOIOEventFD;
+
 typedef struct VFIOQuirk {
 QLIST_ENTRY(VFIOQuirk) next;
 void *data;
+QLIST_HEAD(, VFIOIOEventFD) ioeventfds;
 int nr_mem;
 MemoryRegion *mem;
 } VFIOQuirk;

[Qemu-devel] [RFC PATCH 1/5] vfio/quirks: Add common quirk alloc helper

2018-02-06 Thread Alex Williamson

This will later be used to include list initialization.

Signed-off-by: Alex Williamson 
---
 hw/vfio/pci-quirks.c |   48 +---
 1 file changed, 21 insertions(+), 27 deletions(-)

diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index e5779a7ad35b..10af23217292 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -275,6 +275,15 @@ static const MemoryRegionOps vfio_ati_3c3_quirk = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static VFIOQuirk *vfio_quirk_alloc(int nr_mem)
+{
+VFIOQuirk *quirk = g_malloc0(sizeof(*quirk));
+quirk->mem = g_new0(MemoryRegion, nr_mem);
+quirk->nr_mem = nr_mem;
+
+return quirk;
+}
+
 static void vfio_vga_probe_ati_3c3_quirk(VFIOPCIDevice *vdev)
 {
 VFIOQuirk *quirk;
@@ -288,9 +297,7 @@ static void vfio_vga_probe_ati_3c3_quirk(VFIOPCIDevice *vdev)
 return;
 }
 
-quirk = g_malloc0(sizeof(*quirk));
-quirk->mem = g_new0(MemoryRegion, 1);
-quirk->nr_mem = 1;
+quirk = vfio_quirk_alloc(1);
 
 memory_region_init_io(quirk->mem, OBJECT(vdev), &vfio_ati_3c3_quirk, vdev,
   "vfio-ati-3c3-quirk", 1);
@@ -323,9 +330,7 @@ static void vfio_probe_ati_bar4_quirk(VFIOPCIDevice *vdev, int nr)
 return;
 }
 
-quirk = g_malloc0(sizeof(*quirk));
-quirk->mem = g_new0(MemoryRegion, 2);
-quirk->nr_mem = 2;
+quirk = vfio_quirk_alloc(2);
 window = quirk->data = g_malloc0(sizeof(*window) +
  sizeof(VFIOConfigWindowMatch));
 window->vdev = vdev;
@@ -371,10 +376,9 @@ static void vfio_probe_ati_bar2_quirk(VFIOPCIDevice *vdev, int nr)
 return;
 }
 
-quirk = g_malloc0(sizeof(*quirk));
+quirk = vfio_quirk_alloc(1);
 mirror = quirk->data = g_malloc0(sizeof(*mirror));
-mirror->mem = quirk->mem = g_new0(MemoryRegion, 1);
-quirk->nr_mem = 1;
+mirror->mem = quirk->mem;
 mirror->vdev = vdev;
 mirror->offset = 0x4000;
 mirror->bar = nr;
@@ -548,10 +552,8 @@ static void vfio_vga_probe_nvidia_3d0_quirk(VFIOPCIDevice *vdev)
 return;
 }
 
-quirk = g_malloc0(sizeof(*quirk));
+quirk = vfio_quirk_alloc(2);
 quirk->data = data = g_malloc0(sizeof(*data));
-quirk->mem = g_new0(MemoryRegion, 2);
-quirk->nr_mem = 2;
 data->vdev = vdev;
 
 memory_region_init_io(&quirk->mem[0], OBJECT(vdev), &vfio_nvidia_3d4_quirk,
@@ -667,9 +669,7 @@ static void vfio_probe_nvidia_bar5_quirk(VFIOPCIDevice *vdev, int nr)
 return;
 }
 
-quirk = g_malloc0(sizeof(*quirk));
-quirk->mem = g_new0(MemoryRegion, 4);
-quirk->nr_mem = 4;
+quirk = vfio_quirk_alloc(4);
 bar5 = quirk->data = g_malloc0(sizeof(*bar5) +
(sizeof(VFIOConfigWindowMatch) * 2));
 window = &bar5->window;
@@ -762,10 +762,9 @@ static void vfio_probe_nvidia_bar0_quirk(VFIOPCIDevice *vdev, int nr)
 return;
 }
 
-quirk = g_malloc0(sizeof(*quirk));
+quirk = vfio_quirk_alloc(1);
 mirror = quirk->data = g_malloc0(sizeof(*mirror));
-mirror->mem = quirk->mem = g_new0(MemoryRegion, 1);
-quirk->nr_mem = 1;
+mirror->mem = quirk->mem;
 mirror->vdev = vdev;
 mirror->offset = 0x88000;
 mirror->bar = nr;
@@ -781,10 +780,9 @@ static void vfio_probe_nvidia_bar0_quirk(VFIOPCIDevice *vdev, int nr)
 
 /* The 0x1800 offset mirror only seems to get used by legacy VGA */
 if (vdev->vga) {
-quirk = g_malloc0(sizeof(*quirk));
+quirk = vfio_quirk_alloc(1);
 mirror = quirk->data = g_malloc0(sizeof(*mirror));
-mirror->mem = quirk->mem = g_new0(MemoryRegion, 1);
-quirk->nr_mem = 1;
+mirror->mem = quirk->mem;
 mirror->vdev = vdev;
 mirror->offset = 0x1800;
 mirror->bar = nr;
@@ -945,9 +943,7 @@ static void vfio_probe_rtl8168_bar2_quirk(VFIOPCIDevice *vdev, int nr)
 return;
 }
 
-quirk = g_malloc0(sizeof(*quirk));
-quirk->mem = g_new0(MemoryRegion, 2);
-quirk->nr_mem = 2;
+quirk = vfio_quirk_alloc(2);
 quirk->data = rtl = g_malloc0(sizeof(*rtl));
 rtl->vdev = vdev;
 
@@ -1507,9 +1503,7 @@ static void vfio_probe_igd_bar4_quirk(VFIOPCIDevice *vdev, int nr)
 }
 
 /* Setup our quirk to munge GTT addresses to the VM allocated buffer */
-quirk = g_malloc0(sizeof(*quirk));
-quirk->mem = g_new0(MemoryRegion, 2);
-quirk->nr_mem = 2;
+quirk = vfio_quirk_alloc(2);
 igd = quirk->data = g_malloc0(sizeof(*igd));
 igd->vdev = vdev;
 igd->index = ~0;

[Qemu-devel] [RFC PATCH 0/5] vfio: ioeventfd support

2018-02-06 Thread Alex Williamson

For the matching kernel patch, see:

https://lkml.org/lkml/2018/2/6/866

This series enables ioeventfd support and makes use of a proposed vfio
kernel ioeventfd interface for accelerating high frequency writes
through to the device.  In the specific case addressed, the writes are
to a range of MMIO space virtualized in QEMU for NVIDIA GeForce
support, but which also hosts a register which is used to allow the
MSI interrupt for the device to re-trigger.  Applications which
generate a very high interrupt rate on the GPU can see noticeable
overhead as a result of this trap through QEMU.  We added an option
for users to disable these quirks entirely for non-GeForce cards[1]
for optimal performance, but for GeForce users and users that can't
tweak their VM config, this gets us to within 95% of that performance
for an interrupt intensive micro-benchmark (from 83%).  I'd be
interested in more typical benchmark results to understand if there's
an improvement there as well.  Thanks,

Alex

[1] https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg06878.html

---

Alex Williamson (5):
  vfio/quirks: Add common quirk alloc helper
  vfio/quirks: Add generic support for ioveventfds
  vfio/quirks: Automatic ioeventfd enabling for NVIDIA BAR0 quirks
  vfio: Update linux header
  vfio/quirks: Enable ioeventfd quirks to be handled by vfio directly


 hw/vfio/pci-quirks.c   |  184 +---
 hw/vfio/pci.h  |   13 +++
 linux-headers/linux/vfio.h |   24 ++
 3 files changed, 192 insertions(+), 29 deletions(-)

[Qemu-devel] [RFC PATCH] vfio/pci: Add ioeventfd support

2018-02-06 Thread Alex Williamson

The ioeventfd here is actually irqfd handling of an ioeventfd such as
supported in KVM.  A user is able to pre-program a device write to
occur when the eventfd triggers.  This is yet another instance of
eventfd-irqfd triggering between KVM and vfio.  The impetus for this
is high frequency writes to pages which are virtualized in QEMU.
Enabling this near-direct write path for selected registers within
the virtualized page can improve performance and reduce overhead.
Specifically this is initially targeted at NVIDIA graphics cards where
the driver issues a write to an MMIO register within a virtualized
region in order to allow the MSI interrupt to re-trigger.
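
The argument checking done in the ioctl path can be modeled in user space
as follows.  This is a sketch under stated assumptions, not the kernel
code: demo_ioeventfd_check() is a hypothetical name, the mask value is
copied from the proposed uapi header, and the "exactly one bit set" test
is written out by hand instead of using hweight8().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_IOEVENTFD_SIZE_MASK 0xfu

/*
 * Validate the flags and fd of a request.  Returns the write width in
 * bytes (1, 2, 4 or 8) on success, or -EINVAL if unknown flag bits are
 * set, if zero or several size bits are set, or if fd is below -1.
 */
static int demo_ioeventfd_check(uint32_t flags, int32_t fd)
{
    uint32_t count;

    if (flags & ~DEMO_IOEVENTFD_SIZE_MASK) {
        return -EINVAL;
    }

    count = flags & DEMO_IOEVENTFD_SIZE_MASK;

    /* A non-zero value with a single bit set has count & (count - 1) == 0 */
    if (count == 0 || (count & (count - 1)) != 0 || fd < -1) {
        return -EINVAL;
    }

    assert(count <= 8);
    return (int)count;
}
```

An fd of -1 passes validation because it requests de-assignment of an
existing ioeventfd rather than creation of a new one.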

Signed-off-by: Alex Williamson 
---
 drivers/vfio/pci/vfio_pci.c |   33 +++
 drivers/vfio/pci/vfio_pci_private.h |   14 +++
 drivers/vfio/pci/vfio_pci_rdwr.c|  165 ---
 include/uapi/linux/vfio.h   |   24 +
 4 files changed, 224 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index f041b1a6cf66..c8e7297a61a3 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -302,6 +302,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 {
struct pci_dev *pdev = vdev->pdev;
struct vfio_pci_dummy_resource *dummy_res, *tmp;
+   struct vfio_pci_ioeventfd *ioeventfd, *ioeventfd_tmp;
int i, bar;
 
/* Stop the device from further DMA */
@@ -311,6 +312,14 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
VFIO_IRQ_SET_ACTION_TRIGGER,
vdev->irq_type, 0, 0, NULL);
 
+   /* Device closed, don't need mutex here */
+   list_for_each_entry_safe(ioeventfd, ioeventfd_tmp,
+&vdev->ioeventfds_list, next) {
+   vfio_virqfd_disable(&ioeventfd->virqfd);
+   list_del(&ioeventfd->next);
+   kfree(ioeventfd);
+   }
+
vdev->virq_disabled = false;
 
for (i = 0; i < vdev->num_regions; i++)
@@ -1039,6 +1048,28 @@ static long vfio_pci_ioctl(void *device_data,
 
kfree(groups);
return ret;
+   } else if (cmd == VFIO_DEVICE_IOEVENTFD) {
+   struct vfio_device_ioeventfd ioeventfd;
+   int count;
+
+   minsz = offsetofend(struct vfio_device_ioeventfd, fd);
+
+   if (copy_from_user(&ioeventfd, (void __user*)arg, minsz))
+   return -EFAULT;
+
+   if (ioeventfd.argsz < minsz)
+   return -EINVAL;
+
+   if (ioeventfd.flags & ~VFIO_DEVICE_IOEVENTFD_SIZE_MASK)
+   return -EINVAL;
+
+   count = ioeventfd.flags & VFIO_DEVICE_IOEVENTFD_SIZE_MASK;
+
+   if (hweight8(count) != 1 || ioeventfd.fd < -1)
+   return -EINVAL;
+
+   return vfio_pci_ioeventfd(vdev, ioeventfd.offset,
+ ioeventfd.data, count, ioeventfd.fd);
}
 
return -ENOTTY;
@@ -1217,6 +1248,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 vdev->irq_type = VFIO_PCI_NUM_IRQS;
 mutex_init(&vdev->igate);
 spin_lock_init(&vdev->irqlock);
+   mutex_init(&vdev->ioeventfds_lock);
+   INIT_LIST_HEAD(&vdev->ioeventfds_list);
 
 ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
if (ret) {
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index f561ac1c78a0..23797622396e 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -29,6 +29,15 @@
 #define PCI_CAP_ID_INVALID         0xFF    /* default raw access */
 #define PCI_CAP_ID_INVALID_VIRT    0xFE    /* default virt access */
 
+struct vfio_pci_ioeventfd {
+   struct list_head    next;
+   struct virqfd       *virqfd;
+   loff_t              pos;
+   uint64_t            data;
+   int                 bar;
+   int                 count;
+};
+
 struct vfio_pci_irq_ctx {
struct eventfd_ctx  *trigger;
struct virqfd   *unmask;
@@ -95,6 +104,8 @@ struct vfio_pci_device {
struct eventfd_ctx  *err_trigger;
struct eventfd_ctx  *req_trigger;
    struct list_head    dummy_resources_list;
+   struct mutex        ioeventfds_lock;
+   struct list_head    ioeventfds_list;
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
@@ -120,6 +131,9 @@ extern ssize_t vfio_pci_bar_rw(struct vfio_pci_device *vdev, char __user *buf,
 extern ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf,
   size_t count, loff_t *ppos, bool iswrite);
 
+extern long vfio_pci_ioeventfd(struct vfio_pci_device *vdev, loff_t offset,
+  uint64_t data, int count, int fd);
+

Re: [Qemu-devel] [PATCH v2 0/3] virtio-balloon: free page hint reporting support

2018-02-06 Thread Michael S. Tsirkin

On Tue, Feb 06, 2018 at 07:08:16PM +0800, Wei Wang wrote:
> This is the device part implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not transferred by the migration thread to the destination.
> 
> Please see the driver patch link for test results:
> https://lkml.org/lkml/2018/2/4/60
> 
> ChangeLog:
> v1->v2: 
> 1) virtio-balloon
> - use subsections to save free_page_report_cmd_id;
> - poll the free page vq after sending a cmd id to the driver;
> - change the free page vq size to VIRTQUEUE_MAX_SIZE;
> - virtio_balloon_poll_free_page_hints: handle the corner case
>   that the free page block reported from the driver may cross
>   the RAMBlock boundary.
> 2) migration/ram.c
> - use balloon_free_page_poll to start the optimization
> 
> Wei Wang (3):
>   virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
>   migration: use the free page reporting feature from balloon
>   virtio-balloon: add a timer to limit the free page report waiting time

This feature needs in-tree documentation about possible ways to use it,
tradeoffs involved etc.


>  balloon.c   |  39 ++--
>  hw/virtio/virtio-balloon.c  | 227 
> ++--
>  hw/virtio/virtio-pci.c  |   3 +
>  include/hw/virtio/virtio-balloon.h  |  15 +-
>  include/migration/misc.h|   3 +
>  include/standard-headers/linux/virtio_balloon.h |   7 +
>  include/sysemu/balloon.h|  12 +-
>  migration/ram.c |  34 +++-
>  8 files changed, 307 insertions(+), 33 deletions(-)
> 
> -- 
> 1.8.3.1

Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon

2018-02-06 Thread Michael S. Tsirkin

On Tue, Feb 06, 2018 at 07:08:18PM +0800, Wei Wang wrote:
> Use the free page reporting feature from the balloon device to clear the
> bits corresponding to guest free pages from the dirty bitmap, so that the
> free memory is not sent.
> 
> Signed-off-by: Wei Wang 
> CC: Michael S. Tsirkin 
> CC: Juan Quintela 

What the patch seems to do is stop migration
completely - blocking until guest completes the reporting.

Which makes no sense to me, since it's just an optimization.
Why not proceed with the migration? What do we have to lose?

I imagine some people might want to defer migration until reporting
completes to reduce the load on the network. Fair enough,
but it does not look like you actually measured the reduction
in traffic. So I suggest you work on that as a separate feature.
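
For reference, the optimization itself amounts to clearing a run of bits
in the dirty bitmap for each free page range the guest reports.  A minimal
stand-alone sketch, assuming one bit per page and ignoring locking and
RAMBlock boundaries; the demo_* helpers are hypothetical and are not
QEMU's bitmap API:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Mark pages [start, start + npages) dirty, i.e. still to be sent. */
static void demo_bitmap_set(unsigned long *map, size_t start, size_t npages)
{
    size_t i;

    assert(map != NULL);
    for (i = start; i < start + npages; i++) {
        map[i / DEMO_BITS_PER_WORD] |= 1UL << (i % DEMO_BITS_PER_WORD);
    }
}

/* Drop pages [start, start + npages) from the set of pages to send. */
static void demo_bitmap_clear(unsigned long *map, size_t start, size_t npages)
{
    size_t i;

    assert(map != NULL);
    for (i = start; i < start + npages; i++) {
        map[i / DEMO_BITS_PER_WORD] &= ~(1UL << (i % DEMO_BITS_PER_WORD));
    }
}

/* Count how many of the first npages pages are still marked dirty. */
static size_t demo_bitmap_count(const unsigned long *map, size_t npages)
{
    size_t i, n = 0;

    assert(map != NULL);
    for (i = 0; i < npages; i++) {
        if (map[i / DEMO_BITS_PER_WORD] & (1UL << (i % DEMO_BITS_PER_WORD))) {
            n++;
        }
    }
    return n;
}
```

Because a hint only removes pages from the set to be sent, applying hints
late or not at all affects how much is transferred, not correctness, as
long as pages dirtied afterwards are set again by the normal dirty logging.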


> ---
>  migration/ram.c | 24 
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index d6f462c..4fe16d2 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -49,6 +49,7 @@
>  #include "qemu/rcu_queue.h"
>  #include "migration/colo.h"
>  #include "migration/block.h"
> +#include "sysemu/balloon.h"
>  
>  /***/
>  /* ram save/restore */
> @@ -206,6 +207,10 @@ struct RAMState {
>  uint32_t last_version;
>  /* We are in the first round */
>  bool ram_bulk_stage;
> +/* The feature, skipping the transfer of free pages, is supported */
> +bool free_page_support;
> +/* Skip the transfer of free pages in the bulk stage */
> +bool free_page_done;
>  /* How many times we have dirty too many pages */
>  int dirty_rate_high_cnt;
>  /* these variables are used for bitmap sync */
> @@ -773,7 +778,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, 
> RAMBlock *rb,
>  unsigned long *bitmap = rb->bmap;
>  unsigned long next;
>  
> -if (rs->ram_bulk_stage && start > 0) {
> +if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
>  next = start + 1;
>  } else {
>  next = find_next_bit(bitmap, size, start);
> @@ -1653,6 +1658,8 @@ static void ram_state_reset(RAMState *rs)
>  rs->last_page = 0;
>  rs->last_version = ram_list.version;
>  rs->ram_bulk_stage = true;
> +rs->free_page_support = balloon_free_page_support();
> +rs->free_page_done = false;
>  }
>  
>  #define MAX_WAIT 50 /* ms, half buffered_file limit */
> @@ -2135,7 +2142,7 @@ static int ram_state_init(RAMState **rsp)
>  return 0;
>  }
>  
> -static void ram_list_init_bitmaps(void)
> +static void ram_list_init_bitmaps(RAMState *rs)
>  {
>  RAMBlock *block;
>  unsigned long pages;
> @@ -2145,7 +2152,11 @@ static void ram_list_init_bitmaps(void)
>  QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>  pages = block->max_length >> TARGET_PAGE_BITS;
>  block->bmap = bitmap_new(pages);
> -bitmap_set(block->bmap, 0, pages);
> +if (rs->free_page_support) {
> +bitmap_set(block->bmap, 1, pages);
> +} else {
> +bitmap_set(block->bmap, 0, pages);
> +}
>  if (migrate_postcopy_ram()) {
>  block->unsentmap = bitmap_new(pages);
>  bitmap_set(block->unsentmap, 0, pages);
> @@ -2161,7 +2172,7 @@ static void ram_init_bitmaps(RAMState *rs)
>  qemu_mutex_lock_ramlist();
>  rcu_read_lock();
>  
> -ram_list_init_bitmaps();
> +ram_list_init_bitmaps(rs);
>  memory_global_dirty_log_start();
>  migration_bitmap_sync(rs);
>  
> @@ -2275,6 +2286,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>  
>  ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>  
> +if (rs->free_page_support && !rs->free_page_done) {
> +balloon_free_page_poll();
> +rs->free_page_done = true;
> +}
> +
>  t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>  i = 0;
>  while ((ret = qemu_file_rate_limit(f)) == 0) {
> -- 
> 1.8.3.1

Re: [Qemu-devel] [PATCH v2 3/3] virtio-balloon: add a timer to limit the free page report waiting time

2018-02-06 Thread Michael S. Tsirkin

On Tue, Feb 06, 2018 at 07:08:19PM +0800, Wei Wang wrote:
> This patch adds a timer to limit the time that host waits for the free
> page hints reported by the guest. Users can specify the time in ms via
> "free-page-wait-time" command line option. If a user doesn't specify a
> time, host waits till the guest finishes reporting all the free page
> hints. The policy (wait for all the free page hints to be reported or
> use a time limit) is determined by the orchestration layer.
> 
> Signed-off-by: Wei Wang 
> CC: Michael S. Tsirkin 

Looks like an option the migration command should get,
as opposed to a device feature.

> ---
>  hw/virtio/virtio-balloon.c | 84 
> +-
>  hw/virtio/virtio-pci.c |  3 ++
>  include/hw/virtio/virtio-balloon.h |  4 ++
>  3 files changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index b424d4e..9ee0de4 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -207,6 +207,65 @@ static void balloon_stats_set_poll_interval(Object *obj, 
> Visitor *v,
>  balloon_stats_change_timer(s, 0);
>  }
>  
> +static void balloon_free_page_change_timer(VirtIOBalloon *s, int64_t ms)
> +{
> +timer_mod(s->free_page_timer,
> +  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + ms);
> +}
> +
> +static void balloon_stop_free_page_report(void *opaque)
> +{
> +VirtIOBalloon *dev = opaque;
> +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +
> +timer_del(dev->free_page_timer);
> +timer_free(dev->free_page_timer);
> +dev->free_page_timer = NULL;
> +
> +if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
> +dev->host_stop_free_page = true;
> +virtio_notify_config(vdev);
> +}
> +}
> +
> +static void balloon_free_page_get_wait_time(Object *obj, Visitor *v,
> +const char *name, void *opaque,
> +Error **errp)
> +{
> +VirtIOBalloon *s = opaque;
> +
> +visit_type_int(v, name, &s->free_page_wait_time, errp);
> +}
> +
> +static void balloon_free_page_set_wait_time(Object *obj, Visitor *v,
> +const char *name, void *opaque,
> +Error **errp)
> +{
> +VirtIOBalloon *s = opaque;
> +Error *local_err = NULL;
> +int64_t value;
> +
> +visit_type_int(v, name, &value, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return;
> +}
> +if (value < 0) {
> +error_setg(errp, "free page wait time must not be negative");
> +return;
> +}
> +
> +if (value > UINT32_MAX) {
> +error_setg(errp, "free page wait time value is too big");
> +return;
> +}
> +
> +s->free_page_wait_time = value;
> +g_assert(s->free_page_timer == NULL);
> +s->free_page_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
> +  balloon_stop_free_page_report, s);
> +}
> +
>  static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>  {
>  VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
> @@ -330,6 +389,7 @@ static void 
> virtio_balloon_poll_free_page_hints(VirtIOBalloon *dev)
>  if (id == dev->free_page_report_cmd_id) {
>  dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
>  } else {
> +dev->host_stop_free_page = false;
>  dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
>  break;
>  }
> @@ -385,6 +445,10 @@ static void virtio_balloon_free_page_poll(void *opaque)
>  virtio_notify_config(vdev);
>  s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
>  
> +if (s->free_page_wait_time) {
> +balloon_free_page_change_timer(s, s->free_page_wait_time);
> +}
> +
>  virtio_balloon_poll_free_page_hints(s);
>  }
>  
> @@ -395,7 +459,19 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  
>  config.num_pages = cpu_to_le32(dev->num_pages);
>  config.actual = cpu_to_le32(dev->actual);
> -config.free_page_report_cmd_id = cpu_to_le32(dev->free_page_report_cmd_id);
> +if (dev->host_stop_free_page) {
> +/*
> + * Host is actively requesting to stop the free page reporting, send
> + * the stop sign to the guest. This happens when the migration thread
> + * has reached the phase to send pages to the destination while the
> + * guest hasn't done the reporting.
> + */
> +config.free_page_report_cmd_id =
> +VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID;
> +} else {
> +config.free_page_report_cmd_id =
> +cpu_to_le32(dev->free_page_report_cmd_id);
> +}
>  
>
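
For readers following the range checks in the quoted setter, here is a minimal standalone sketch of the same validation. The helper name check_free_page_wait_time() is hypothetical and not part of the patch; only the two comparisons and the error strings are taken from the quoted code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns NULL when the value
 * would be accepted, otherwise the error text the quoted setter reports.
 */
static const char *check_free_page_wait_time(int64_t value)
{
    if (value < 0) {
        return "free page wait time must be greater than zero";
    }
    if (value > UINT32_MAX) {
        return "free page wait time value is too big";
    }
    /* Zero is accepted here even though the message says "greater than". */
    return NULL;
}
```

Note that zero passes both checks, and the quoted virtio_balloon_free_page_poll() hunk only arms the timer when free_page_wait_time is non-zero, so zero appears to act as "no timeout". The first error message might therefore be better worded as "must not be negative".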

Re: [Qemu-devel] [PATCH v3 00/25] generalize parsing of cpu_model (part 4)

2018-02-06 Thread Eduardo Habkost

I will try to summarize my comments here:

* I suggest squashing patches 2-22 together.  This way we
  shouldn't have any intermediate commits where "make check"
  generates warnings, and the series is shorter.
  * Or, even better: squash the CPU_RESOLVING_TYPE parts of 3-22
into one patch, and the tests/machine-none-test.c parts of
3-22 into patch 2.

* The linux-user/main.c hunk of patch 03/25 looks unnecessary.

* I suggest testing all CPU models in patch 02/25, but this
  shouldn't block the series.  Can be a follow-up patch.

All the rest looks good to me.

Thanks!


On Tue, Jan 23, 2018 at 09:07:59AM +0100, Igor Mammedov wrote:
> 
> v3:
>   - use qtest_startf() instead of qtest_start()
>   - rename tests/machine-none.c to tests/machine-none-test.c
>   - introduce first CPU_RESOLVING_TYPE for all targets and
> only then use it parse_cpu_model() 
>   - stop abusing  mc->default_cpu_type as resolving cpu type,
> move cpu_parse_cpu_model() in to exec.c and embed in
> CPU_RESOLVING_TYPE, so that callers won't have to know
> about unnecessary detail
> 
> v2:
>   - implemented new approach only for x86/ARM (will be done for all targets
> if approach seems acceptable)
>   - add test case for '-M none -cpu FOO' case
>   - redefine TARGET_DEFAULT_CPU_TYPE into CPU_RESOLVING_TYPE
>   - scrape off default cpu_model refactoring, so it would cause
> less conflicts with Laurent's series where he tries to rework
> defaults to use ELF hints of executed program
> 
> Series is finishing work on generalizing cpu_model parsing
> and limiting parts that deal with inconsistent cpu_model
> naming to "-cpu" CLI option in vl.c, bsd|linux-user/main.c
> CLI and default cpu_model processing and FOO_cpu_class_by_name()
> callbacks.
> 
> It introduces CPU_RESOLVING_TYPE which must be defined
> by each target and is used by helper parse_cpu_model()
> (former cpu_parse_cpu_model()) to get access to target
> specific FOO_cpu_class_by_name() callback.
> 
> git tree for testing:
>https://github.com/imammedo/qemu.git cpu_init_removal_v3
> 
> CC: Laurent Vivier 
> CC: Eduardo Habkost 
> CC: qemu-s3...@nongnu.org
> CC: qemu-...@nongnu.org
> CC: qemu-...@nongnu.org
> 
> Igor Mammedov (25):
>   nios2: 10m50_devboard: replace cpu_model with cpu_type
>   tests: add machine 'none' with -cpu test
>   arm: cpu: add CPU_RESOLVING_TYPE macro
>   x86: cpu: add CPU_RESOLVING_TYPE macro
>   alpha: cpu: add CPU_RESOLVING_TYPE macro
>   cris: cpu: add CPU_RESOLVING_TYPE macro
>   lm32: cpu: add CPU_RESOLVING_TYPE macro
>   m68k: cpu: add CPU_RESOLVING_TYPE macro
>   microblaze: cpu: add CPU_RESOLVING_TYPE macro
>   mips: cpu: add CPU_RESOLVING_TYPE macro
>   moxie: cpu: add CPU_RESOLVING_TYPE macro
>   nios2: cpu: add CPU_RESOLVING_TYPE macro
>   openrisc: cpu: add CPU_RESOLVING_TYPE macro
>   ppc: cpu: add CPU_RESOLVING_TYPE macro
>   s390x: cpu: add CPU_RESOLVING_TYPE macro
>   sh4: cpu: add CPU_RESOLVING_TYPE macro
>   sparc: cpu: add CPU_RESOLVING_TYPE macro
>   tricore: cpu: add CPU_RESOLVING_TYPE macro
>   unicore32: cpu: add CPU_RESOLVING_TYPE macro
>   xtensa: cpu: add CPU_RESOLVING_TYPE macro
>   hppa: cpu: add CPU_RESOLVING_TYPE macro
>   tilegx: cpu: add CPU_RESOLVING_TYPE macro
>   Use cpu_create(type) instead of cpu_init(cpu_model)
>   cpu: get rid of unused cpu_init() defines
>   cpu: get rid of cpu_generic_init()
> 
>  include/qom/cpu.h | 16 +---
>  target/alpha/cpu.h|  3 +-
>  target/arm/cpu.h  |  3 +-
>  target/cris/cpu.h |  3 +-
>  target/hppa/cpu.h |  2 +-
>  target/i386/cpu.h |  3 +-
>  target/lm32/cpu.h |  3 +-
>  target/m68k/cpu.h |  3 +-
>  target/microblaze/cpu.h   |  2 +-
>  target/mips/cpu.h |  3 +-
>  target/moxie/cpu.h|  3 +-
>  target/nios2/cpu.h|  2 +-
>  target/openrisc/cpu.h |  3 +-
>  target/ppc/cpu.h  |  3 +-
>  target/s390x/cpu.h|  3 +-
>  target/sh4/cpu.h  |  3 +-
>  target/sparc/cpu.h|  5 +--
>  target/tilegx/cpu.h   |  2 +-
>  target/tricore/cpu.h  |  3 +-
>  target/unicore32/cpu.h|  3 +-
>  target/xtensa/cpu.h   |  3 +-
>  bsd-user/main.c   |  4 +-
>  exec.c| 23 +++
>  hw/core/null-machine.c|  6 +--
>  hw/nios2/10m50_devboard.c |  2 +-
>  linux-user/main.c | 10 +++--
>  qom/cpu.c | 48 +--
>  tests/Makefile.include|  2 +
>  tests/machine-none-test.c | 97 
> +++
>  vl.c  | 10 ++---
>  30 files changed, 162 insertions(+), 114 deletions(-)
>  create mode 100644 tests/machine-none-test.c
> 
> -- 
> 2.7.4
> 
> 

-- 
Eduardo

[Qemu-devel] [PATCH v1 1/2] timer: Initial commit of xlnx-pmu-iomod-pit device

2018-02-06 Thread Alistair Francis

Signed-off-by: Alistair Francis 
---

 include/hw/timer/xlnx-pmu-iomod-pit.h |  58 
 hw/timer/xlnx-pmu-iomod-pit.c | 241 ++
 hw/timer/Makefile.objs|   2 +
 3 files changed, 301 insertions(+)
 create mode 100644 include/hw/timer/xlnx-pmu-iomod-pit.h
 create mode 100644 hw/timer/xlnx-pmu-iomod-pit.c

diff --git a/include/hw/timer/xlnx-pmu-iomod-pit.h b/include/hw/timer/xlnx-pmu-iomod-pit.h
new file mode 100644
index 00..15f9f0dee5
--- /dev/null
+++ b/include/hw/timer/xlnx-pmu-iomod-pit.h
@@ -0,0 +1,58 @@
+/*
+ * QEMU model of Xilinx I/O Module PIT
+ *
+ * Copyright (c) 2013 Xilinx Inc
+ * Written by Edgar E. Iglesias 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/ptimer.h"
+
+#define TYPE_XLNX_ZYNQMP_IOMODULE_PIT "xlnx.pmu_iomodule"
+
+#define XLNX_ZYNQMP_IOMODULE_PIT(obj) \
+ OBJECT_CHECK(XlnxPMUPIT, (obj), TYPE_XLNX_ZYNQMP_IOMODULE_PIT)
+
+#define XLNX_ZYNQMP_IOMODULE_PIT_R_MAX (0x08 + 1)
+
+typedef struct XlnxPMUPIT {
+SysBusDevice parent_obj;
+MemoryRegion iomem;
+
+QEMUBH *bh;
+ptimer_state *ptimer;
+
+qemu_irq irq;
+/* IRQ to pulse out when present timer hits zero */
+qemu_irq hit_out;
+
+/* Counter in Pre-Scalar(ps) Mode */
+uint32_t ps_counter;
+/* ps_mode irq-in to enable/disable pre-scalar */
+bool ps_enable;
+/* State var to remember hit_in level */
+bool ps_level;
+
+uint32_t frequency;
+
+uint32_t regs[XLNX_ZYNQMP_IOMODULE_PIT_R_MAX];
+RegisterInfo regs_info[XLNX_ZYNQMP_IOMODULE_PIT_R_MAX];
+} XlnxPMUPIT;
diff --git a/hw/timer/xlnx-pmu-iomod-pit.c b/hw/timer/xlnx-pmu-iomod-pit.c
new file mode 100644
index 00..cdfef1a440
--- /dev/null
+++ b/hw/timer/xlnx-pmu-iomod-pit.c
@@ -0,0 +1,241 @@
+/*
+ * QEMU model of Xilinx I/O Module PIT
+ *
+ * Copyright (c) 2013 Xilinx Inc
+ * Written by Edgar E. Iglesias 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/ptimer.h"
+#include "hw/register.h"
+#include "qemu/main-loop.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "hw/timer/xlnx-pmu-iomod-pit.h"
+
+#ifndef XLNX_ZYNQMP_IOMODULE_PIT_ERR_DEBUG
+#define XLNX_ZYNQMP_IOMODULE_PIT_ERR_DEBUG 0
+#endif
+
+REG32(PIT_PRELOAD, 0x00)
+REG32(PIT_COUNTER, 0x04)
+REG32(PIT_CONTROL, 0x08)
+FIELD(PIT_CONTROL, PRELOAD, 1, 1)
+FIELD(PIT_CONTROL, EN, 0, 1)
+
+static uint64_t xlnx_iomod_pit_ctr_pr(RegisterInfo *reg, uint64_t val)
+{
+XlnxPMUPIT *s = XLNX_ZYNQMP_IOMODULE_PIT(reg->opaque);
+uint32_t ret;
+
+if (s->ps_enable) {
+ret = s->ps_counter;
+} else {
+ret = ptimer_get_count(s->ptimer);
+}
+
+return ret;
+}
+
+static void xlnx_iomod_pit_control_pw(RegisterInfo

[Qemu-devel] [PATCH v1 0/2] Add and connect the PMU IOModule PIT device

2018-02-06 Thread Alistair Francis

Alistair Francis (2):
  timer: Initial commit of xlnx-pmu-iomod-pit device
  xlnx-zynqmp-pmu: Connect the PMU IOMOD PIT devices

 include/hw/timer/xlnx-pmu-iomod-pit.h |  58 
 hw/microblaze/xlnx-zynqmp-pmu.c   |  35 +
 hw/timer/xlnx-pmu-iomod-pit.c | 241 ++
 hw/timer/Makefile.objs|   2 +
 4 files changed, 336 insertions(+)
 create mode 100644 include/hw/timer/xlnx-pmu-iomod-pit.h
 create mode 100644 hw/timer/xlnx-pmu-iomod-pit.c

-- 
2.14.1

Re: [Qemu-devel] [PATCH v3 24/25] cpu: get rid of unused cpu_init() defines

2018-02-06 Thread Eduardo Habkost

On Tue, Jan 23, 2018 at 09:08:23AM +0100, Igor Mammedov wrote:
> cpu_init(cpu_model) were replaced by cpu_create(cpu_type) so
> no users are left, remove it.
> 
> Signed-off-by: Igor Mammedov 

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

[Qemu-devel] [PATCH v1 2/2] xlnx-zynqmp-pmu: Connect the PMU IOMOD PIT devices

2018-02-06 Thread Alistair Francis

Signed-off-by: Alistair Francis 
---

 hw/microblaze/xlnx-zynqmp-pmu.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/hw/microblaze/xlnx-zynqmp-pmu.c b/hw/microblaze/xlnx-zynqmp-pmu.c
index 999a5657cf..f466c56e45 100644
--- a/hw/microblaze/xlnx-zynqmp-pmu.c
+++ b/hw/microblaze/xlnx-zynqmp-pmu.c
@@ -26,6 +26,7 @@
 
 #include "hw/intc/xlnx-zynqmp-ipi.h"
 #include "hw/intc/xlnx-pmu-iomod-intc.h"
+#include "hw/timer/xlnx-pmu-iomod-pit.h"
 
 /* Define the PMU device */
 
@@ -40,6 +41,7 @@
 #define XLNX_ZYNQMP_PMU_INTC_ADDR   0xFFD4
 
 #define XLNX_ZYNQMP_PMU_NUM_IPIS4
+#define XLNX_ZYNQMP_PMU_NUM_PITS4
 
 static const uint64_t ipi_addr[XLNX_ZYNQMP_PMU_NUM_IPIS] = {
 0xFF34, 0xFF35, 0xFF36, 0xFF37,
@@ -48,6 +50,13 @@ static const uint64_t ipi_irq[XLNX_ZYNQMP_PMU_NUM_IPIS] = {
 19, 20, 21, 22,
 };
 
+static const uint64_t pit_addr[XLNX_ZYNQMP_PMU_NUM_PITS] = {
+0xFFD40040, 0xFFD40050, 0xFFD40060, 0xFFD40070,
+};
+static const uint64_t pit_irq[XLNX_ZYNQMP_PMU_NUM_PITS] = {
+3, 4, 5, 6,
+};
+
 typedef struct XlnxZynqMPPMUSoCState {
 /*< private >*/
 DeviceState parent_obj;
@@ -147,7 +156,9 @@ static void xlnx_zynqmp_pmu_init(MachineState *machine)
 MemoryRegion *pmu_rom = g_new(MemoryRegion, 1);
 MemoryRegion *pmu_ram = g_new(MemoryRegion, 1);
 XlnxZynqMPIPI *ipi[XLNX_ZYNQMP_PMU_NUM_IPIS];
+XlnxPMUPIT *pit[XLNX_ZYNQMP_PMU_NUM_PITS];
 qemu_irq irq[32];
+qemu_irq hit_in;
 int i;
 
 /* Create the ROM */
@@ -186,6 +197,30 @@ static void xlnx_zynqmp_pmu_init(MachineState *machine)
 sysbus_connect_irq(SYS_BUS_DEVICE(ipi[i]), 0, irq[ipi_irq[i]]);
 }
 
+/* Create and connect the IOMOD PIT devices */
+for (i = 0; i < XLNX_ZYNQMP_PMU_NUM_PITS; i++) {
+pit[i] = g_new0(XlnxPMUPIT, 1);
+object_initialize(pit[i], sizeof(XlnxPMUPIT), TYPE_XLNX_ZYNQMP_IOMODULE_PIT);
+qdev_set_parent_bus(DEVICE(pit[i]), sysbus_get_default());
+}
+
+for (i = 0; i < XLNX_ZYNQMP_PMU_NUM_PITS; i++) {
+object_property_set_bool(OBJECT(pit[i]), true, "realized",
+ &error_abort);
+sysbus_mmio_map(SYS_BUS_DEVICE(pit[i]), 0, pit_addr[i]);
+sysbus_connect_irq(SYS_BUS_DEVICE(pit[i]), 0, irq[pit_irq[i]]);
+}
+
+/* PIT1 hits into PIT0 */
+hit_in = qdev_get_gpio_in_named(DEVICE(pit[0]), "ps_hit_in", 0);
+qdev_connect_gpio_out_named(DEVICE(pit[1]), "ps_hit_out", 0, hit_in);
+
+/* PIT3 hits into PIT2 */
+hit_in = qdev_get_gpio_in_named(DEVICE(pit[2]), "ps_hit_in", 0);
+qdev_connect_gpio_out_named(DEVICE(pit[3]), "ps_hit_out", 0, hit_in);
+
+/* TODO: PIT0 and PIT2 "ps_config" GPIO goes to The GPO1 device. */
+
 /* Load the kernel */
 microblaze_load_kernel(&pmu->cpu, XLNX_ZYNQMP_PMU_RAM_ADDR,
machine->ram_size,
-- 
2.14.1
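
To illustrate the cascade wired up above (PIT1 hitting into PIT0, PIT3 into PIT2), here is a rough standalone model of how a downstream PIT in pre-scalar mode might count pulses on its hit input. This is only a sketch under assumptions: the PitModel type and both functions are hypothetical, the field names merely mirror XlnxPMUPIT, and counting on the rising edge with a reload from the preload value is assumed rather than taken from the patch text shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical model of one PIT; names mirror XlnxPMUPIT. */
typedef struct PitModel {
    uint32_t preload;     /* value reloaded when the counter expires */
    uint32_t ps_counter;  /* counter used in pre-scalar mode */
    bool ps_level;        /* last level seen on the hit input */
    unsigned irq_pulses;  /* number of times this PIT has expired */
} PitModel;

/* Assumed behaviour: count down once per rising edge on the hit input. */
static void pit_ps_hit_in(PitModel *s, bool level)
{
    bool rising = !s->ps_level && level;

    s->ps_level = level;
    if (!rising) {
        return;
    }
    if (s->ps_counter) {
        s->ps_counter--;
    }
    if (s->ps_counter == 0) {
        s->irq_pulses++;              /* stands in for raising the IRQ */
        s->ps_counter = s->preload;   /* assumed preload (auto-reload) mode */
    }
}

/* One expiry of the upstream PIT drives its hit output high, then low. */
static void pit_pulse(PitModel *downstream)
{
    pit_ps_hit_in(downstream, true);
    pit_ps_hit_in(downstream, false);
}
```

With a preload of 3, the downstream model expires once for every three upstream pulses, which is the divide-by-N effect the "ps_hit_out" to "ps_hit_in" connection is meant to provide.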

Re: [Qemu-devel] [PATCH v4 23/25] Use cpu_create(type) instead of cpu_init(cpu_model)

2018-02-06 Thread Eduardo Habkost

On Mon, Feb 05, 2018 at 06:08:29PM +0100, Igor Mammedov wrote:
> With all targets defining CPU_RESOLVING_TYPE, refactor
> cpu_parse_cpu_model(type, cpu_model) to parse_cpu_model(cpu_model)
> so that callers won't have to know internal resolving cpu
> type. Place it in exec.c so it could be called from both
> target independent vl.c and *-user/main.c.
> 
> That allows us to stop abusing cpu type from
>   MachineClass::default_cpu_type
> as resolver class in vl.c which were confusing part of
> cpu_parse_cpu_model().
> 
> Also with new parse_cpu_model(), the last users of cpu_init()
> in null-machine.c and bsd/linux-user targets could be switched
> to cpu_create() API and cpu_init() API will be removed by
> follow up patch.
> 
> With no longer users left remove MachineState::cpu_model field,
> new code should use MachineState::cpu_type instead and
> leave cpu_model parsing to generic code in vl.c.
> 
> Signed-off-by: Igor Mammedov 
> ---
> v4:
>   - actually remove no longer used MachineState::cpu_model field
> that I've lost somewhere during respins
> 
>   - squash in [PATCH v3 25/25] cpu: get rid of cpu_generic_init()
> as after rework/rebase cpu_generic_init() is being removed by
> this patch and only check removal was left in 25/25, which
> should be removed together with cpu_generic_init() in this patch
> 
> CC: Richard Henderson 
> CC: "Emilio G. Cota" 
> CC: Paolo Bonzini 
> CC: Eduardo Habkost 
> CC: "Alex Bennée" 
> CC: "Philippe Mathieu-Daudé" 
> 
> fixup: Use cpu_create(type) instead of  cpu_init(cpu_model)
> 
> Signed-off-by: Igor Mammedov 
> ---
>  include/hw/boards.h|  1 -
>  include/qom/cpu.h  | 16 ++--
>  bsd-user/main.c|  4 +++-
>  exec.c | 23 +++
>  hw/core/null-machine.c |  6 +++---
>  linux-user/main.c  |  8 ++--
>  qom/cpu.c  | 48 ++--
>  vl.c   | 10 +++---
>  8 files changed, 42 insertions(+), 74 deletions(-)

32 fewer lines.  Nice.  :)


[...]
> @@ -335,22 +304,9 @@ static ObjectClass *cpu_common_class_by_name(const char *cpu_model)
>  static void cpu_common_parse_features(const char *typename, char *features,
>Error **errp)
>  {
> -char *featurestr; /* Single "key=value" string being parsed */
>  char *val;
> -static bool cpu_globals_initialized;
> -
> -/* TODO: all callers of ->parse_features() need to be changed to
> - * call it only once, so we can remove this check (or change it
> - * to assert(!cpu_globals_initialized).
> - * Current callers of ->parse_features() are:
> - * - cpu_generic_init()
> - */
> -if (cpu_globals_initialized) {
> -return;

Suggestion: replace this with assert(!cpu_globals_initialized)
just to make sure there are no bugs that make us register CPU
globals twice.
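
Roughly what I have in mind, as a standalone sketch rather than the actual QEMU function: parse_features_once() is a made-up stand-in for cpu_common_parse_features(), and it only counts the "key=value" items instead of registering globals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool cpu_globals_initialized;

/*
 * Hypothetical stand-in for cpu_common_parse_features(): it only counts
 * the comma-separated "key=value" items, but shows where the assert goes.
 */
static int parse_features_once(char *features)
{
    int count = 0;
    char *featurestr;

    /* A second call would be a bug, so fail loudly instead of returning. */
    assert(!cpu_globals_initialized);
    cpu_globals_initialized = true;

    featurestr = features ? strtok(features, ",") : NULL;
    while (featurestr) {
        count++;
        featurestr = strtok(NULL, ",");
    }
    return count;
}
```

The flag stays, but the silent early return turns into a hard failure, so a caller that parses features twice gets noticed immediately.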

This shouldn't block the patch, though:

Reviewed-by: Eduardo Habkost 

> -}
> -cpu_globals_initialized = true;
> -
> -featurestr = features ? strtok(features, ",") : NULL;
> +/* Single "key=value" string being parsed */
> +char *featurestr = features ? strtok(features, ",") : NULL;
>  
>  while (featurestr) {
>  val = strchr(featurestr, '=');
[...]

-- 
Eduardo

Re: [Qemu-devel] [RFC PATCH 00/34] Hyper-V / VMBus

2018-02-06 Thread no-reply

Hi,

This series failed docker-quick@centos6 build test. Please find the testing
commands and their output below. If you have Docker installed, you can
probably reproduce it locally.

Type: series
Message-id: 20180206203048.11096-1-rka...@virtuozzo.com
Subject: [Qemu-devel] [RFC PATCH 00/34] Hyper-V / VMBus

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-quick@centos6
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
98adc64f2b hv-net: define default rom file name
32dd25dfe6 vmbus: add support for rom files
983ba13ed2 loader: allow arbitrary basename for fw_cfg file roms
3e4e616672 hv-net: add .bootindex support
9b49f45b61 net: add Hyper-V/VMBus net adapter
8b3b6a7831 net: add Hyper-V/VMBus network protocol definitions
932a14d315 net: add RNDIS definitions
58ff5f3675 tests: hv-scsi: add start-stop test
f81c7ad4b7 hv-scsi: limit the number of requests per notification
62d0ac72b5 scsi: add Hyper-V/VMBus SCSI controller
438b47b468 scsi: add Hyper-V/VMBus SCSI protocol definitions
4baa88ac14 i386: en/disable vmbus by a machine property
7e9adc9af7 i386: Hyper-V VMBus ACPI DSDT entry
1c19b65fab vmbus: build configuration
4471f1afca vmbus: vmbus implementation
6aff6ba275 vmbus: add vmbus protocol definitions
601a95fc0a hyperv: add support for KVM_HYPERV_EVENTFD
9ec270a97e import HYPERV_EVENTFD stuff from kernel
cbfeb5e77c hyperv: update copyright notices
77d9e62db5 hyperv_testdev: add SynIC message and event testmodes
6bf86c8237 hyperv: process POST_MESSAGE hypercall
52d482a30c hyperv: process SIGNAL_EVENT hypercall
be4ed13446 hyperv: add synic event flag signaling
58a94ec55e hyperv: add synic message delivery
921fcaab97 hyperv: make overlay pages for SynIC
8540e6cff0 hyperv: block SynIC use in QEMU in incompatible configurations
0a8a729b0b hyperv: qom-ify SynIC
1f9fba8cb0 hyperv: make HvSintRoute reference-counted
4d649ce401 hyperv: address HvSintRoute by X86CPU pointer
96cc3d2e21 hyperv: allow passing arbitrary data to sint ack callback
93ceab6b34 hyperv: synic: only setup ack notifier if there's a callback
fa7482c5ff hyperv: cosmetic: g_malloc -> g_new
63d46e7b96 hyperv_testdev: refactor for readability
ff514f8890 hyperv: ensure VP index equal to QEMU cpu_index

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-u4db_dad/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   centos6
  GEN 
/var/tmp/patchew-tester-tmp-u4db_dad/src/docker-src.2018-02-06-17.20.48.6483/qemu.tar
Cloning into 
'/var/tmp/patchew-tester-tmp-u4db_dad/src/docker-src.2018-02-06-17.20.48.6483/qemu.tar.vroot'...
done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 
'/var/tmp/patchew-tester-tmp-u4db_dad/src/docker-src.2018-02-06-17.20.48.6483/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 
'/var/tmp/patchew-tester-tmp-u4db_dad/src/docker-src.2018-02-06-17.20.48.6483/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 
'10739aa26051a5d49d88132604539d3ed085e72e'
  COPYRUNNER
RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
bison-2.4.1-5.el6.x86_64
bzip2-devel-1.0.5-7.el6_0.x86_64
ccache-3.1.6-2.el6.x86_64
csnappy-devel-0-6.20150729gitd7bc683.el6.x86_64
flex-2.5.35-9.el6.x86_64
gcc-4.4.7-18.el6.x86_64
gettext-0.17-18.el6.x86_64
git-1.7.1-9.el6_9.x86_64
glib2-devel-2.28.8-9.el6.x86_64
libepoxy-devel-1.2-3.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
librdmacm-devel-1.0.21-0.el6.x86_64
lzo-devel-2.03-3.1.el6_5.1.x86_64
make-3.81-23.el6.x86_64
mesa-libEGL-devel-11.0.7-4.el6.x86_64
mesa-libgbm-devel-11.0.7-4.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
spice-glib-devel-0.26-8.el6.x86_64
spice-server-devel-0.12.4-16.el6.x86_64
tar-1.23-15.el6_8.x86_64
vte-devel-0.25.1-9.el6.x86_64
xen-devel-4.6.6-2.el6.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=bison bzip2-devel ccache csnappy-devel flex g++
 gcc gettext git glib2-devel libepoxy-devel libfdt-devel
 librdmacm-devel lzo-devel make mesa-libEGL-devel 
mesa-libgbm-devel pixman-devel SDL-devel spice-glib-devel 
spice-server-devel tar vte-devel xen-devel zlib-devel
HOSTNAME=b877588de59b
MAKEFLAGS= -j8
J=8
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
TARGET_LIST=
SHLVL=1
HOME=/root

Re: [Qemu-devel] [PATCH v3 02/25] tests: add machine 'none' with -cpu test

2018-02-06 Thread Eduardo Habkost

On Tue, Jan 23, 2018 at 09:08:01AM +0100, Igor Mammedov wrote:
> Check that "$QEMU -M none -cpu FOO" starts QEMU without error
> 
> Signed-off-by: Igor Mammedov 
[...]
> +struct arch2cpu {
> +const char *arch;
> +const char *cpu_model;
> +};
> +
> +static struct arch2cpu cpus_map[] = {
> +/* tested targets list */
> +};

Why are we testing only a single CPU model on each target (and
requiring one entry for each architecture in this table), instead
of just running query-cpu-definitions and testing all CPU models?

-- 
Eduardo

Re: [Qemu-devel] [PATCH v3 03/25] arm: cpu: add CPU_RESOLVING_TYPE macro

2018-02-06 Thread Eduardo Habkost

On Tue, Jan 23, 2018 at 09:08:02AM +0100, Igor Mammedov wrote:
> it will be used for providing to cpu name resolving class for
> parsing cpu model for system and user emulation code.
> 
> Along with the change, add the target to the null-machine test, so
> that when the switch to CPU_RESOLVING_TYPE happens,
> the test would ensure that the null-machine use case still works.
> 
> Signed-off-by: Igor Mammedov 
> ---
[...]
> @@ -4325,8 +4325,6 @@ int main(int argc, char **argv, char **envp)
>  #else
>  cpu_model = "qemu32";
>  #endif
> -#elif defined(TARGET_ARM)
> -cpu_model = "any";

I don't see any explanation for this hunk in the commit message.


>  #elif defined(TARGET_UNICORE32)
>  cpu_model = "any";
>  #elif defined(TARGET_M68K)
[...]

-- 
Eduardo

Re: [Qemu-devel] [PATCH v3 02/25] tests: add machine 'none' with -cpu test

2018-02-06 Thread Eduardo Habkost

On Tue, Jan 23, 2018 at 09:08:01AM +0100, Igor Mammedov wrote:
> Check that "$QEMU -M none -cpu FOO" starts QEMU without error
> 
> Signed-off-by: Igor Mammedov 
> ---
> v2:
>   - rename file to machine-none-test.c (Thomas Huth)
>   - use qtest_startf() instead of qtest_start() (Thomas Huth)
[...]
> +static struct arch2cpu cpus_map[] = {
> +/* tested targets list */
> +};
[...]
> +static void test_machine_cpu_cli(void)
> +{
> +QDict *response;
> +const char *arch = qtest_get_arch();
> +const char *cpu_model = get_cpu_model_by_arch(arch);
> +
> +if (!cpu_model) {
> +fprintf(stderr, "WARNING: cpu name for target '%s' isn't defined,"
> +" add it to cpus_map\n", arch);
> +return; /* TODO: die here to force all targets have a test */
> +}

I'm unsure if it's OK to purposefully have intermediate commits
that will generate warnings on "make check".  It could confuse
people doing bisects.

I would prefer to add this warning only after all targets are
converted.  Or maybe only add this test code after all targets
are converted.
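
For context, the lookup the test relies on presumably looks something like the sketch below. The struct and the get_cpu_model_by_arch() name come from the patch; the function body and the two table entries are my own illustrative assumptions, not the list from the series.

```c
#include <stddef.h>
#include <string.h>

struct arch2cpu {
    const char *arch;
    const char *cpu_model;
};

/* Entries are illustrative assumptions, not the table from the patch. */
static const struct arch2cpu cpus_map[] = {
    { "arm", "cortex-a15" },
    { "x86_64", "qemu64" },
};

/* Returns the CPU model to test for @arch, or NULL if none is listed. */
static const char *get_cpu_model_by_arch(const char *arch)
{
    size_t i;

    for (i = 0; i < sizeof(cpus_map) / sizeof(cpus_map[0]); i++) {
        if (!strcmp(arch, cpus_map[i].arch)) {
            return cpus_map[i].cpu_model;
        }
    }
    return NULL;
}
```

Any target without an entry gets NULL back and hits the warning path, which is why every intermediate commit that has not yet filled in its target would warn during "make check".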

-- 
Eduardo

Re: [Qemu-devel] [RFC PATCH 00/34] Hyper-V / VMBus

2018-02-06 Thread no-reply

Hi,

This series failed docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can
probably reproduce it locally.

Type: series
Message-id: 20180206203048.11096-1-rka...@virtuozzo.com
Subject: [Qemu-devel] [RFC PATCH 00/34] Hyper-V / VMBus

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-ha3h6wkd/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   fedora
  GEN 
/var/tmp/patchew-tester-tmp-ha3h6wkd/src/docker-src.2018-02-06-17.06.55.18307/qemu.tar
Cloning into 
'/var/tmp/patchew-tester-tmp-ha3h6wkd/src/docker-src.2018-02-06-17.06.55.18307/qemu.tar.vroot'...
done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 
'/var/tmp/patchew-tester-tmp-ha3h6wkd/src/docker-src.2018-02-06-17.06.55.18307/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 
'/var/tmp/patchew-tester-tmp-ha3h6wkd/src/docker-src.2018-02-06-17.06.55.18307/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 
'10739aa26051a5d49d88132604539d3ed085e72e'
  COPYRUNNER
RUN test-mingw in qemu:fedora 
Packages installed:
PyYAML-3.11-13.fc25.x86_64
SDL-devel-1.2.15-21.fc24.x86_64
bc-1.06.95-16.fc24.x86_64
bison-3.0.4-4.fc24.x86_64
bzip2-1.0.6-21.fc25.x86_64
ccache-3.3.4-1.fc25.x86_64
clang-3.9.1-2.fc25.x86_64
findutils-4.6.0-8.fc25.x86_64
flex-2.6.0-3.fc25.x86_64
gcc-6.4.1-1.fc25.x86_64
gcc-c++-6.4.1-1.fc25.x86_64
gettext-0.19.8.1-3.fc25.x86_64
git-2.9.5-3.fc25.x86_64
glib2-devel-2.50.3-1.fc25.x86_64
hostname-3.15-8.fc25.x86_64
libaio-devel-0.3.110-6.fc24.x86_64
libasan-6.4.1-1.fc25.x86_64
libfdt-devel-1.4.2-1.fc25.x86_64
libubsan-6.4.1-1.fc25.x86_64
make-4.1-6.fc25.x86_64
mingw32-SDL-1.2.15-7.fc24.noarch
mingw32-bzip2-1.0.6-7.fc24.noarch
mingw32-curl-7.47.0-1.fc24.noarch
mingw32-glib2-2.50.3-1.fc25.noarch
mingw32-gmp-6.1.1-1.fc25.noarch
mingw32-gnutls-3.5.5-2.fc25.noarch
mingw32-gtk2-2.24.31-2.fc25.noarch
mingw32-gtk3-3.22.17-1.fc25.noarch
mingw32-libjpeg-turbo-1.5.1-1.fc25.noarch
mingw32-libpng-1.6.27-1.fc25.noarch
mingw32-libssh2-1.4.3-5.fc24.noarch
mingw32-libtasn1-4.9-1.fc25.noarch
mingw32-nettle-3.3-1.fc25.noarch
mingw32-pixman-0.34.0-1.fc25.noarch
mingw32-pkg-config-0.28-6.fc24.x86_64
mingw64-SDL-1.2.15-7.fc24.noarch
mingw64-bzip2-1.0.6-7.fc24.noarch
mingw64-curl-7.47.0-1.fc24.noarch
mingw64-glib2-2.50.3-1.fc25.noarch
mingw64-gmp-6.1.1-1.fc25.noarch
mingw64-gnutls-3.5.5-2.fc25.noarch
mingw64-gtk2-2.24.31-2.fc25.noarch
mingw64-gtk3-3.22.17-1.fc25.noarch
mingw64-libjpeg-turbo-1.5.1-1.fc25.noarch

Re: [Qemu-devel] [PATCH v3 01/25] nios2: 10m50_devboard: replace cpu_model with cpu_type

2018-02-06 Thread Eduardo Habkost

On Tue, Jan 23, 2018 at 09:08:00AM +0100, Igor Mammedov wrote:
> use cpu_create() instead of being removed cpu_generic_init()
> 
> Signed-off-by: Igor Mammedov 
> ---
> CC: Chris Wulff 
> CC: Marek Vasut 
> ---
>  hw/nios2/10m50_devboard.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/nios2/10m50_devboard.c b/hw/nios2/10m50_devboard.c
> index e4007f6..42053b2 100644
> --- a/hw/nios2/10m50_devboard.c
> +++ b/hw/nios2/10m50_devboard.c
> @@ -75,7 +75,7 @@ static void nios2_10m50_ghrd_init(MachineState *machine)
>  phys_ram_alias);
>  
>  /* Create CPU -- FIXME */
> -cpu = NIOS2_CPU(cpu_generic_init(TYPE_NIOS2_CPU, "nios2"));
> +cpu = NIOS2_CPU(cpu_create(TYPE_NIOS2_CPU));

Matches what nios2_cpu_class_by_name() does.

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [Qemu-devel] [RFC PATCH 00/34] Hyper-V / VMBus

2018-02-06 Thread no-reply

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180206203048.11096-1-rka...@virtuozzo.com
Subject: [Qemu-devel] [RFC PATCH 00/34] Hyper-V / VMBus

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'

=== OUTPUT BEGIN ===
Checking PATCH 1/34: hyperv: ensure VP index equal to QEMU cpu_index...
Checking PATCH 2/34: hyperv_testdev: refactor for readability...
Checking PATCH 3/34: hyperv: cosmetic: g_malloc -> g_new...
Checking PATCH 4/34: hyperv: synic: only setup ack notifier if there's a 
callback...
Checking PATCH 5/34: hyperv: allow passing arbitrary data to sint ack 
callback...
Checking PATCH 6/34: hyperv: address HvSintRoute by X86CPU pointer...
Checking PATCH 7/34: hyperv: make HvSintRoute reference-counted...
Checking PATCH 8/34: hyperv: qom-ify SynIC...
Checking PATCH 9/34: hyperv: block SynIC use in QEMU in incompatible 
configurations...
Checking PATCH 10/34: hyperv: make overlay pages for SynIC...
Checking PATCH 11/34: hyperv: add synic message delivery...
Checking PATCH 12/34: hyperv: add synic event flag signaling...
Checking PATCH 13/34: hyperv: process SIGNAL_EVENT hypercall...
Checking PATCH 14/34: hyperv: process POST_MESSAGE hypercall...
Checking PATCH 15/34: hyperv_testdev: add SynIC message and event testmodes...
Checking PATCH 16/34: hyperv: update copyright notices...
Checking PATCH 17/34: import HYPERV_EVENTFD stuff from kernel...
Checking PATCH 18/34: hyperv: add support for KVM_HYPERV_EVENTFD...
Checking PATCH 19/34: vmbus: add vmbus protocol definitions...
ERROR: do not use C99 // comments
#133: FILE: include/hw/vmbus/vmbus-proto.h:114:
+uint8_t  monitor_flags;  // VMBUS_OFFER_MONITOR_*

ERROR: do not use C99 // comments
#134: FILE: include/hw/vmbus/vmbus-proto.h:115:
+uint16_t interrupt_flags;// VMBUS_OFFER_INTERRUPT_*

ERROR: do not use C99 // comments
#211: FILE: include/hw/vmbus/vmbus-proto.h:192:
+uint32_t feature_bits; // VMBUS_RING_BUFFER_FEAT_*

total: 3 errors, 0 warnings, 222 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 20/34: vmbus: vmbus implementation...
ERROR: open brace '{' following struct go on the same line
#91: FILE: hw/vmbus/vmbus.c:32:
+typedef struct VMBusGpadl
+{

ERROR: open brace '{' following struct go on the same line
#130: FILE: hw/vmbus/vmbus.c:71:
+typedef struct VMBusChannel
+{

ERROR: open brace '{' following struct go on the same line
#171: FILE: hw/vmbus/vmbus.c:112:
+typedef struct VMBus
+{

ERROR: memory barrier without comment
#566: FILE:

Re: [Qemu-devel] [PATCH 08/10] cuda: factor out timebase-derived counter value and load time

2018-02-06 Thread Mark Cave-Ayland


On 05/02/18 19:44, Philippe Mathieu-Daudé wrote:


On 02/03/2018 07:37 AM, Mark Cave-Ayland wrote:

Commit b981289c49 "PPC: Cuda: Use cuda timer to expose tbfreq to guest" altered
the timer calculations from those based upon the hardware CUDA clock frequency
to those based upon the CPU timebase frequency.

In fact we can isolate the differences to 2 simple changes: one to the counter
read value and another to the counter load time. Move these changes into
separate functions so the implementation can be swapped later.

Signed-off-by: Mark Cave-Ayland 
---
  hw/misc/macio/cuda.c | 25 -
  1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/hw/misc/macio/cuda.c b/hw/misc/macio/cuda.c
index 00e71fcd5e..184d151702 100644
--- a/hw/misc/macio/cuda.c
+++ b/hw/misc/macio/cuda.c
@@ -145,21 +145,29 @@ static void cuda_update_irq(CUDAState *s)
  }
  }
  
-static uint64_t get_tb(uint64_t time, uint64_t freq)

+static uint64_t get_counter_value(CUDAState *s, CUDATimer *ti)
  {
-return muldiv64(time, freq, NANOSECONDS_PER_SECOND);
+/* Reverse of the tb calculation algorithm that Mac OS X uses on bootup */
+uint64_t tb_diff = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
+s->tb_frequency, NANOSECONDS_PER_SECOND) -
+   ti->load_time;


Easier to read imho:

uint64_t tb_diff = get_counter_load_time(s, ti) - ti->load_time;


Yes - I'm in two minds about this one, although I feel as if I should 
keep these functions completely separate since they form separate 
virtual methods in the device class (see patch 10). Does looking at that 
particular patch change your opinion at all?
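
To make the "separate virtual methods" idea concrete, here is a minimal standalone sketch of two timer hooks kept as independent function pointers so that either one can be swapped later. It is not the code from patch 10: SketchTimer, ns_to_ticks and the default_* functions are hypothetical names, and the Mac OS X scaling constant from the patch is deliberately left out.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

typedef struct SketchTimer SketchTimer;

/* Hypothetical stand-in for a device class with two swappable hooks. */
struct SketchTimer {
    uint64_t frequency;   /* ticks per second */
    uint64_t load_time;   /* tick count recorded when the counter was loaded */
    uint64_t (*get_counter_value)(SketchTimer *t, uint64_t now_ns);
    uint64_t (*get_load_time)(SketchTimer *t, uint64_t now_ns);
};

/*
 * Convert nanoseconds to ticks.  Whole seconds and the remainder are
 * converted separately so the intermediate product stays within 64 bits
 * for frequencies below roughly 18 GHz.
 */
static uint64_t ns_to_ticks(uint64_t ns, uint64_t freq)
{
    return (ns / NS_PER_SEC) * freq + (ns % NS_PER_SEC) * freq / NS_PER_SEC;
}

/* Default hook: the current time expressed in ticks. */
static uint64_t default_load_time(SketchTimer *t, uint64_t now_ns)
{
    return ns_to_ticks(now_ns, t->frequency);
}

/* Default hook: ticks elapsed since the counter was loaded. */
static uint64_t default_counter_value(SketchTimer *t, uint64_t now_ns)
{
    return ns_to_ticks(now_ns, t->frequency) - t->load_time;
}
```

Because each hook is a separate pointer, a variant device could replace only get_counter_value while keeping the default get_load_time, which is the argument for not expressing one in terms of the other.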



Reviewed-by: Philippe Mathieu-Daudé 


+
+return (tb_diff * 0xBF401675E5DULL) / (s->tb_frequency << 24);
+}
+
+static uint64_t get_counter_load_time(CUDAState *s, CUDATimer *ti)
+{
+uint64_t load_time = muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
+  s->tb_frequency, NANOSECONDS_PER_SECOND);
+return load_time;
  }
  
  static unsigned int get_counter(CUDAState *s, CUDATimer *ti)

  {
  int64_t d;
  unsigned int counter;
-uint64_t tb_diff;
-uint64_t current_time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
  
-/* Reverse of the tb calculation algorithm that Mac OS X uses on bootup. */

-tb_diff = get_tb(current_time, s->tb_frequency) - ti->load_time;
-d = (tb_diff * 0xBF401675E5DULL) / (s->tb_frequency << 24);
+d = get_counter_value(s, ti);
  
  if (ti->index == 0) {

  /* the timer goes down from latch to -1 (period of latch + 2) */
@@ -178,8 +186,7 @@ static unsigned int get_counter(CUDAState *s, CUDATimer *ti)
  static void set_counter(CUDAState *s, CUDATimer *ti, unsigned int val)
  {
  CUDA_DPRINTF("T%d.counter=%d\n", 1 + ti->index, val);
-ti->load_time = get_tb(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
-   s->tb_frequency);
+ti->load_time = get_counter_load_time(s, ti);
  ti->counter_value = val;
  cuda_timer_update(s, ti, ti->load_time);
  }




ATB,

Mark.

Re: [Qemu-devel] [PATCH 01/10] cuda: do not use old_mmio accesses

2018-02-06 Thread Mark Cave-Ayland


On 05/02/18 14:17, Laurent Vivier wrote:


On 03/02/2018 11:37, Mark Cave-Ayland wrote:

Signed-off-by: Mark Cave-Ayland 
---
  hw/misc/macio/cuda.c | 40 
  1 file changed, 8 insertions(+), 32 deletions(-)

diff --git a/hw/misc/macio/cuda.c b/hw/misc/macio/cuda.c
index 008d8bd4d5..23b7e0f5b0 100644
--- a/hw/misc/macio/cuda.c
+++ b/hw/misc/macio/cuda.c
@@ -275,7 +275,7 @@ static void cuda_delay_set_sr_int(CUDAState *s)
  timer_mod(s->sr_delay_timer, expire);
  }
  
-static uint32_t cuda_readb(void *opaque, hwaddr addr)

+static uint64_t cuda_read(void *opaque, hwaddr addr, unsigned size)
  {
  CUDAState *s = opaque;
  uint32_t val;
@@ -350,7 +350,7 @@ static uint32_t cuda_readb(void *opaque, hwaddr addr)
  return val;
  }
  
-static void cuda_writeb(void *opaque, hwaddr addr, uint32_t val)

+static void cuda_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
  {
  CUDAState *s = opaque;
  
@@ -780,38 +780,14 @@ static void cuda_receive_packet_from_host(CUDAState *s,

  }
  }
  
-static void cuda_writew (void *opaque, hwaddr addr, uint32_t value)

-{
-}
-
-static void cuda_writel (void *opaque, hwaddr addr, uint32_t value)
-{
-}
-
-static uint32_t cuda_readw (void *opaque, hwaddr addr)
-{
-return 0;
-}
-
-static uint32_t cuda_readl (void *opaque, hwaddr addr)
-{
-return 0;
-}
-
  static const MemoryRegionOps cuda_ops = {
-.old_mmio = {
-.write = {
-cuda_writeb,
-cuda_writew,
-cuda_writel,
-},
-.read = {
-cuda_readb,
-cuda_readw,
-cuda_readl,
-},
-},
+.read = cuda_read,
+.write = cuda_write,
  .endianness = DEVICE_NATIVE_ENDIAN,


As CUDA Macintoshes are all big-endian, I think you should use
DEVICE_BIG_ENDIAN here, except if you are aware of a little-endian
machine using it in little-endian mode.


Yes that's true, I will give it a test to make sure that everything 
still works as expected. Any thoughts on the rest of the patchset? I 
haven't finished porting Ben's PMU patches over to use it, however 
things are looking fairly good with progress so far.



ATB,

Mark.

Re: [Qemu-devel] [PATCH v2] pci/bus: let it has higher migration priority

2018-02-06 Thread Michael S. Tsirkin

On Tue, Feb 06, 2018 at 03:39:33PM +0800, Peter Xu wrote:
> In the past, we prioritized IOMMU migration so that we have such a
> priority order:
> 
> IOMMU > PCI Devices
> 
> When migrating a guest with both vIOMMU and a pcie-root-port, we'll
> always migrate vIOMMU first, since PCI buses are treated as having the
> same priority as general PCI devices.
> 
> That's problematic.
> 
> The thing is that PCI bus number information is stored in the root port,
> and that is needed by vIOMMU during post_load(), e.g., to figure out
> context entry for a device.  If we don't have correct bus numbers for
> devices, we won't be able to recover device state of the DMAR memory
> regions, and things will be messed up.
> 
> So let's boost the PCIe root ports to an even higher priority:
> 
>PCIe Root Port > IOMMU > PCI Devices
> 
> A smoke test shows that this patch fixes bug 1538953.
> 
> Also, apply this rule to all the PCI bus/bridge devices: ioh3420,
> xio3130_downstream, xio3130_upstream, pcie_pci_bridge, pci-pci bridge,
> i82801b11.
> 
> I noticed that we set pcie_pci_bridge_dev_vmstate twice.  Clean that up
> as well.
> 
> CC: Alex Williamson 
> CC: Marcel Apfelbaum 
> CC: Michael S. Tsirkin 
> CC: Dr. David Alan Gilbert 
> CC: Juan Quintela 
> CC: Laurent Vivier 
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
> Reported-by: Maxime Coquelin 
> Signed-off-by: Peter Xu 

dgilbert, could you pls confirm this looks like a sane approach to you?
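
The ordering argument in the quoted commit message can be sketched in plain C. This is only an illustration, assuming that entries with a higher priority value are handled first, as the commit message describes. The names SketchPriority, SketchDevice and sort_for_load are made up for this sketch, and the numeric values are assumptions rather than QEMU's.

```c
#include <stdlib.h>

/* Assumed values; only the relative order matters for this sketch. */
typedef enum {
    PRI_DEFAULT = 0,   /* ordinary PCI devices */
    PRI_IOMMU,         /* vIOMMU, needs bus numbers in its post_load */
    PRI_PCI_BUS        /* root ports, bridges, switch ports */
} SketchPriority;

typedef struct {
    const char *name;
    SketchPriority priority;
} SketchDevice;

/* qsort comparator: higher priority sorts first. */
static int by_priority_desc(const void *a, const void *b)
{
    const SketchDevice *da = a;
    const SketchDevice *db = b;

    return (int)db->priority - (int)da->priority;
}

/* Order devices so that bus state is restored before the vIOMMU. */
static void sort_for_load(SketchDevice *devs, size_t n)
{
    qsort(devs, n, sizeof(*devs), by_priority_desc);
}
```

With this ordering a root port is processed before the vIOMMU, so the bus numbers the vIOMMU needs are already in place when its state is restored.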

> ---
> v2:
> - add more devices that Marcel mentioned
> - rename to MIG_PRI_PCI_BUS
> - remove one useless line in existing code
> ---
>  hw/pci-bridge/gen_pcie_root_port.c | 1 +
>  hw/pci-bridge/i82801b11.c  | 1 +
>  hw/pci-bridge/ioh3420.c| 1 +
>  hw/pci-bridge/pci_bridge_dev.c | 1 +
>  hw/pci-bridge/pcie_pci_bridge.c| 2 +-
>  hw/pci-bridge/xio3130_downstream.c | 1 +
>  hw/pci-bridge/xio3130_upstream.c   | 1 +
>  include/migration/vmstate.h| 1 +
>  8 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/pci-bridge/gen_pcie_root_port.c 
> b/hw/pci-bridge/gen_pcie_root_port.c
> index 0e2f2e8bf1..435fbaa60e 100644
> --- a/hw/pci-bridge/gen_pcie_root_port.c
> +++ b/hw/pci-bridge/gen_pcie_root_port.c
> @@ -101,6 +101,7 @@ static void gen_rp_realize(DeviceState *dev, Error **errp)
>  
>  static const VMStateDescription vmstate_rp_dev = {
>  .name = "pcie-root-port",
> +.priority = MIG_PRI_PCI_BUS,
>  .version_id = 1,
>  .minimum_version_id = 1,
>  .post_load = pcie_cap_slot_post_load,
> diff --git a/hw/pci-bridge/i82801b11.c b/hw/pci-bridge/i82801b11.c
> index cb522bf30c..60df9b2c96 100644
> --- a/hw/pci-bridge/i82801b11.c
> +++ b/hw/pci-bridge/i82801b11.c
> @@ -80,6 +80,7 @@ err_bridge:
>  
>  static const VMStateDescription i82801b11_bridge_dev_vmstate = {
>  .name = "i82801b11_bridge",
> +.priority = MIG_PRI_PCI_BUS,
>  .fields = (VMStateField[]) {
>  VMSTATE_PCI_DEVICE(parent_obj, PCIBridge),
>  VMSTATE_END_OF_LIST()
> diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
> index 5f56a2feb6..a7bfbdd238 100644
> --- a/hw/pci-bridge/ioh3420.c
> +++ b/hw/pci-bridge/ioh3420.c
> @@ -83,6 +83,7 @@ static void ioh3420_interrupts_uninit(PCIDevice *d)
>  
>  static const VMStateDescription vmstate_ioh3420 = {
>  .name = "ioh-3240-express-root-port",
> +.priority = MIG_PRI_PCI_BUS,
>  .version_id = 1,
>  .minimum_version_id = 1,
>  .post_load = pcie_cap_slot_post_load,
> diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
> index d56f6638c2..b2d861d216 100644
> --- a/hw/pci-bridge/pci_bridge_dev.c
> +++ b/hw/pci-bridge/pci_bridge_dev.c
> @@ -174,6 +174,7 @@ static bool pci_device_shpc_present(void *opaque, int 
> version_id)
>  
>  static const VMStateDescription pci_bridge_dev_vmstate = {
>  .name = "pci_bridge",
> +.priority = MIG_PRI_PCI_BUS,
>  .fields = (VMStateField[]) {
>  VMSTATE_PCI_DEVICE(parent_obj, PCIBridge),
>  SHPC_VMSTATE(shpc, PCIDevice, pci_device_shpc_present),
> diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
> index a4d827c99d..e5ac7974cf 100644
> --- a/hw/pci-bridge/pcie_pci_bridge.c
> +++ b/hw/pci-bridge/pcie_pci_bridge.c
> @@ -129,6 +129,7 @@ static Property pcie_pci_bridge_dev_properties[] = {
>  
>  static const VMStateDescription pcie_pci_bridge_dev_vmstate = {
>  .name = TYPE_PCIE_PCI_BRIDGE_DEV,
> +.priority = MIG_PRI_PCI_BUS,
>  .fields = (VMStateField[]) {
>  VMSTATE_PCI_DEVICE(parent_obj, PCIBridge),
>  SHPC_VMSTATE(shpc, PCIDevice, NULL),
> @@ -178,7 +179,6 @@ static void pcie_pci_bridge_class_init(ObjectClass 
> *klass, void *data)
>  k->config_write = pcie_pci_bridge_write_config;
>  dc->vmsd =

Re: [Qemu-devel] [PATCH v11 00/20] tcg: generic vector operations

2018-02-06 Thread Alex Bennée


Alex Bennée  writes:

> Richard Henderson  writes:
>
>> Changes since v11:
>>   * Use dup_const more.
>>   * Cleanup some gvec 2i and 2s routines.
>>   * Use more helpers and less gotos in target/arm/translate-a64.c.
>
> I just noticed the aarch64 cross build breaks:
>
> In file included from /root/src/github.com/stsquad/qemu/tcg/tcg.c:296:0:
> /root/src/github.com/stsquad/qemu/tcg/aarch64/tcg-target.inc.c: In function 
> 'tcg_out_dupi_vec':
> /root/src/github.com/stsquad/qemu/tcg/aarch64/tcg-target.inc.c:806:9: error: 
> implicit declaration of function 'new_pool_l2' 
> [-Werror=implicit-function-declaration]
>  new_pool_l2(s, R_AARCH64_CONDBR19, s->code_ptr, 0, v64, v64);

Ignore me, bad patch application that only affected that build.

--
Alex Bennée

Re: [Qemu-devel] [PULL 00/47] Misc patches for 2018-02-05

2018-02-06 Thread Paolo Bonzini

On 06/02/2018 20:18, Peter Maydell wrote:
> Hi. I'm afraid this fails to build the all-linux-static config:
> 
>   LINKivshmem-client
> [usual linker gripes about getpwuid  in static binaries deleted]
> /usr/lib/gcc/x86_64-linux-gnu/5/libubsan.a(sanitizer_linux_libcdep.o):
> In function `__sanitizer::SetEnv(cha
> r const*, char const*)':
> (.text+0x41b): undefined reference to `dlsym'
> /usr/lib/gcc/x86_64-linux-gnu/5/libubsan.a(sanitizer_linux_libcdep.o):
> In function `__sanitizer::InitTlsSiz
> e()':
> (.text+0x553): undefined reference to `dlsym'
> collect2: error: ld returned 1 exit status

Uhm, what is all-linux-static?  Is it using --enable-debug?

Paolo

[Qemu-devel] [RFC PATCH 33/34] vmbus: add support for rom files

2018-02-06 Thread Roman Kagan

In order to leverage third-party drivers for VMBus devices in firmware
(in particular, there's a case with iPXE driver for hv-net in SeaBIOS
and OVMF), introduce an infrastructure to supply such drivers as option
ROMs.

To make it easy for the firmware to locate such ROMs, they are stored in
fw_cfg with names "vmbus/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.rom" for
default class ROMs (where xxx... is the class GUID) and
"vmbus/dev/yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy.rom" for per-device
(i.e. specified via .romfile property) ROMs (where yyy... is the device
instance GUID).
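
As a rough illustration of the naming scheme, the following standalone sketch builds both kinds of fw_cfg name from a GUID that is already formatted as a string. vmbus_rom_name and the two macros are hypothetical names for this sketch, not helpers from the patch. The buffer size mirrors the arithmetic used for romname in the patch below.

```c
#include <stdio.h>

#define UUID_STR_LEN 36   /* 8-4-4-4-12 hex digits plus four dashes */
/* "vmbus/dev/" (10) + GUID (36) + ".rom" (4) + terminating NUL (1) */
#define VMBUS_ROM_NAME_MAX (10 + UUID_STR_LEN + 4 + 1)

/*
 * Build the fw_cfg name for a class-wide ROM (per_device == 0) or a
 * per-device ROM (per_device != 0).
 * Returns 0 on success, -1 if the name does not fit in out.
 */
static int vmbus_rom_name(char *out, size_t outlen, const char *guid,
                          int per_device)
{
    int n;

    if (per_device) {
        n = snprintf(out, outlen, "vmbus/dev/%s.rom", guid);
    } else {
        n = snprintf(out, outlen, "vmbus/%s.rom", guid);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Firmware that knows a device's class or instance GUID can compute the same string and look the file up directly, without enumerating fw_cfg entries.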

The format and the calling convention for the ROMs is out of scope for
this patch: QEMU doesn't try to interpret them.

Signed-off-by: Roman Kagan 
---
 include/hw/vmbus/vmbus.h |  3 +++
 hw/vmbus/vmbus.c | 39 +++
 2 files changed, 42 insertions(+)

diff --git a/include/hw/vmbus/vmbus.h b/include/hw/vmbus/vmbus.h
index cdb5180796..847edc08d7 100644
--- a/include/hw/vmbus/vmbus.h
+++ b/include/hw/vmbus/vmbus.h
@@ -49,6 +49,8 @@ typedef struct VMBusDeviceClass {
 int (*open_channel) (VMBusDevice *vdev);
 void (*close_channel) (VMBusDevice *vdev);
 VMBusChannelNotifyCb chan_notify_cb;
+
+const char *romfile;
 } VMBusDeviceClass;
 
 typedef struct VMBusDevice {
@@ -57,6 +59,7 @@ typedef struct VMBusDevice {
 uint16_t num_channels;
 VMBusChannel *channels;
 AddressSpace *dma_as;
+char *romfile;
 } VMBusDevice;
 
 extern const VMStateDescription vmstate_vmbus_dev;
diff --git a/hw/vmbus/vmbus.c b/hw/vmbus/vmbus.c
index 42d12dfdf6..c2aec004e7 100644
--- a/hw/vmbus/vmbus.c
+++ b/hw/vmbus/vmbus.c
@@ -12,6 +12,7 @@
 #include "qapi/error.h"
 #include "hw/vmbus/vmbus.h"
 #include "hw/sysbus.h"
+#include "hw/loader.h"
 #include "trace.h"
 
 #define TYPE_VMBUS "vmbus"
@@ -2061,6 +2062,36 @@ unmap:
 cpu_physical_memory_unmap(int_map, len, 1, is_dirty);
 }
 
+static void vmbus_install_rom(VMBusDevice *vdev)
+{
+VMBusDeviceClass *vdc = VMBUS_DEVICE_GET_CLASS(vdev);
+VMBus *vmbus = VMBUS(qdev_get_parent_bus(DEVICE(vdev)));
+BusChild *child;
+char uuid[UUID_FMT_LEN + 1];
+char romname[10 + UUID_FMT_LEN + 4 + 1];
+
+if (vdev->romfile) {
+/* device-specific rom */
+qemu_uuid_unparse(&vdev->instanceid, uuid);
+snprintf(romname, sizeof(romname), "vmbus/dev/%s.rom", uuid);
+rom_add_file(vdev->romfile, romname, 0, -1, true, NULL, NULL);
+} else if (vdc->romfile) {
+/* class-wide rom */
+QTAILQ_FOREACH(child, &BUS(vmbus)->children, sibling) {
+VMBusDevice *chlddev = VMBUS_DEVICE(child->child);
+
+/* another device of the same class has already installed it */
+if (chlddev != vdev && !chlddev->romfile &&
+VMBUS_DEVICE_GET_CLASS(chlddev) == vdc) {
+return;
+}
+}
+qemu_uuid_unparse(&vdc->classid, uuid);
+snprintf(romname, sizeof(romname), "vmbus/%s.rom", uuid);
+rom_add_file(vdc->romfile, romname, 0, -1, true, NULL, NULL);
+}
+}
+
 static void vmbus_dev_realize(DeviceState *dev, Error **errp)
 {
 VMBusDevice *vdev = VMBUS_DEVICE(dev);
@@ -2098,6 +2129,8 @@ static void vmbus_dev_realize(DeviceState *dev, Error 
**errp)
 goto error_out;
 }
 
+vmbus_install_rom(vdev);
+
 if (vdc->vmdev_realize) {
 vdc->vmdev_realize(vdev, &err);
 if (err) {
@@ -2145,6 +2178,11 @@ static void vmbus_dev_unrealize(DeviceState *dev, Error 
**errp)
 free_channels(vmbus, vdev);
 }
 
+static Property vmbus_dev_props[] = {
+DEFINE_PROP_STRING("romfile", VMBusDevice, romfile),
+DEFINE_PROP_END_OF_LIST()
+};
+
 static void vmbus_dev_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *kdev = DEVICE_CLASS(klass);
@@ -2152,6 +2190,7 @@ static void vmbus_dev_class_init(ObjectClass *klass, void 
*data)
 kdev->realize = vmbus_dev_realize;
 kdev->unrealize = vmbus_dev_unrealize;
 kdev->reset = vmbus_dev_reset;
+kdev->props = vmbus_dev_props;
 }
 
 static int vmbus_dev_post_load(void *opaque, int version_id)
-- 
2.14.3

[Qemu-devel] [RFC PATCH 29/34] net: add Hyper-V/VMBus network protocol definitions

2018-02-06 Thread Roman Kagan

Add a header with data structures and constants defining the protocol
for communication between the guest and the hypervisor implementing the
Hyper-V/VMBus network adapter.

Mostly taken from the corresponding definitions in the Linux kernel.

TODO: move RNDIS stuff to rndis.h
Signed-off-by: Roman Kagan 
---
 hw/net/hvnet-proto.h | 1161 ++
 1 file changed, 1161 insertions(+)
 create mode 100644 hw/net/hvnet-proto.h

diff --git a/hw/net/hvnet-proto.h b/hw/net/hvnet-proto.h
new file mode 100644
index 00..1582c7e5a2
--- /dev/null
+++ b/hw/net/hvnet-proto.h
@@ -0,0 +1,1161 @@
+/*
+ * Hyper-V network device protocol definitions
+ *
+ * Copyright (c) 2011, Microsoft Corporation.
+ * Copyright (c) 2018 Virtuozzo International GmbH.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef _HVNET_PROTO_H_
+#define _HVNET_PROTO_H_
+
+/*
+ * c&p from linux drivers/net/hyperv/hyperv_net.h
+ */
+
+/* RSS related */
+#define OID_GEN_RECEIVE_SCALE_CAPABILITIES 0x00010203  /* query only */
+#define OID_GEN_RECEIVE_SCALE_PARAMETERS 0x00010204  /* query and set */
+
+#define NDIS_OBJECT_TYPE_RSS_CAPABILITIES 0x88
+#define NDIS_OBJECT_TYPE_RSS_PARAMETERS 0x89
+#define NDIS_OBJECT_TYPE_OFFLOAD   0xa7
+
+#define NDIS_RECEIVE_SCALE_CAPABILITIES_REVISION_2 2
+#define NDIS_RECEIVE_SCALE_PARAMETERS_REVISION_2 2
+
+struct ndis_obj_header {
+uint8_t type;
+uint8_t rev;
+uint16_t size;
+} QEMU_PACKED;
+
+/* ndis_recv_scale_cap/cap_flag */
+#define NDIS_RSS_CAPS_MESSAGE_SIGNALED_INTERRUPTS 0x0100
+#define NDIS_RSS_CAPS_CLASSIFICATION_AT_ISR   0x0200
+#define NDIS_RSS_CAPS_CLASSIFICATION_AT_DPC   0x0400
+#define NDIS_RSS_CAPS_USING_MSI_X 0x0800
+#define NDIS_RSS_CAPS_RSS_AVAILABLE_ON_PORTS  0x1000
+#define NDIS_RSS_CAPS_SUPPORTS_MSI_X  0x2000
+#define NDIS_RSS_CAPS_HASH_TYPE_TCP_IPV4  0x0100
+#define NDIS_RSS_CAPS_HASH_TYPE_TCP_IPV6  0x0200
+#define NDIS_RSS_CAPS_HASH_TYPE_TCP_IPV6_EX   0x0400
+
+struct ndis_recv_scale_cap { /* NDIS_RECEIVE_SCALE_CAPABILITIES */
+struct ndis_obj_header hdr;
+uint32_t cap_flag;
+uint32_t num_int_msg;
+uint32_t num_recv_que;
+uint16_t num_indirect_tabent;
+} QEMU_PACKED;
+
+
+/* ndis_recv_scale_param flags */
+#define NDIS_RSS_PARAM_FLAG_BASE_CPU_UNCHANGED 0x0001
+#define NDIS_RSS_PARAM_FLAG_HASH_INFO_UNCHANGED0x0002
+#define NDIS_RSS_PARAM_FLAG_ITABLE_UNCHANGED   0x0004
+#define NDIS_RSS_PARAM_FLAG_HASH_KEY_UNCHANGED 0x0008
+#define NDIS_RSS_PARAM_FLAG_DISABLE_RSS0x0010
+
+/* Hash info bits */
+#define NDIS_HASH_FUNC_TOEPLITZ 0x0001
+#define NDIS_HASH_IPV4  0x0100
+#define NDIS_HASH_TCP_IPV4  0x0200
+#define NDIS_HASH_IPV6  0x0400
+#define NDIS_HASH_IPV6_EX   0x0800
+#define NDIS_HASH_TCP_IPV6  0x1000
+#define NDIS_HASH_TCP_IPV6_EX   0x2000
+
+#define NDIS_RSS_INDIRECTION_TABLE_MAX_SIZE_REVISION_2 (128 * 4)
+#define NDIS_RSS_HASH_SECRET_KEY_MAX_SIZE_REVISION_2   40
+
+#define ITAB_NUM 128
+
+struct ndis_recv_scale_param { /* NDIS_RECEIVE_SCALE_PARAMETERS */
+struct ndis_obj_header hdr;
+
+/* Qualifies the rest of the information */
+uint16_t flag;
+
+/* The base CPU number to do receive processing. not used */
+uint16_t base_cpu_number;
+
+/* This describes the hash function and type being enabled */
+uint32_t hashinfo;
+
+/* The size of indirection table array */
+uint16_t indirect_tabsize;
+
+/* The offset of the indirection table from the beginning of this
+ * structure
+ */
+uint32_t indirect_taboffset;
+
+/* The size of the hash secret key */
+uint16_t hashkey_size;
+
+/* The offset of the secret key from the beginning of this structure */
+uint32_t kashkey_offset;
+
+uint32_t processor_masks_offset;
+uint32_t num_processor_masks;
+uint32_t processor_masks_entry_size;
+};
+
+/* Fwd declaration */
+struct ndis_tcp_ip_checksum_info;
+struct ndis_pkt_8021q_info;
+
+/*
+ * Represent netvsc packet which contains 1 RNDIS and 1 ethernet frame
+ * within the RNDIS
+ *
+ * The size of this structure is less than 48 bytes and we can now
+ * place this structure in the skb->cb field.
+ */
+struct hv_netvsc_packet {
+/* Bookkeeping stuff */
+uint8_t cp_partial; /* partial copy into send buffer */
+
+uint8_t rmsg_size; /* RNDIS header and PPI size */
+uint8_t rmsg_pgcnt; /* page count of RNDIS header and PPI */
+uint8_t page_buf_cnt;
+
+uint16_t q_idx;
+uint16_t total_packets;
+
+uint32_t total_bytes;
+uint32_t send_buf_index;
+uint32_t total_data_buflen;
+};
+
+enum rndis_device_state {
+RNDIS_DEV_UNINITIALIZED = 0,
+RNDIS_DEV_INITIALIZING,
+RNDIS_DEV_INITIALIZED,
+RNDIS_DEV_DATAINITIALIZED,

[Qemu-devel] [RFC PATCH 32/34] loader: allow arbitrary basename for fw_cfg file roms

2018-02-06 Thread Roman Kagan

rom_add_file assumes that the basename of the file roms in fw_cfg should
be the same as the original basename of the rom file on the filesystem.

However, this is not always convenient: the rom basename may bear
certain meaning in the guest firmware context, e.g. contain device ids,
while the filename on the host filesystem may be something more
human-readable.

[In particular, this is how I'm planning to supply roms for Hyper-V
VMBus devices, which don't have a spec-defined way of doing this.]

To cater to such use cases, interpret the corresponding argument of
rom_add_file as a path: if it ends with a slash, it is treated as a
"directory" to which the basename of the rom file is appended; otherwise
it is treated as a full fw_cfg path.
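
The rule described above can be sketched as a small standalone function. This is a simplified illustration under the stated semantics, not the code from the patch: path_basename and resolve_fw_name are hypothetical names, and plain snprintf stands in for the glib allocation the patch uses.

```c
#include <stdio.h>
#include <string.h>

/* Return the part of path after the last '/', or path itself if none. */
static const char *path_basename(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}

/*
 * Resolve the fw_cfg name for a rom file.  A fw_path ending in '/' is a
 * "directory" that gets the basename of file appended; anything else is
 * used verbatim as the full fw_cfg path.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int resolve_fw_name(const char *file, const char *fw_path,
                           char *out, size_t outlen)
{
    size_t len = strlen(fw_path);
    int n;

    if (len > 0 && fw_path[len - 1] == '/') {
        n = snprintf(out, outlen, "%s%s", fw_path, path_basename(file));
    } else {
        n = snprintf(out, outlen, "%s", fw_path);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Under this rule existing callers keep their behaviour by passing "vgaroms/" or "genroms/" with a trailing slash, while a caller that wants a firmware-meaningful name passes the complete path.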

TODO: it may be a better idea to use separate arguments for "directory"
and "filename" instead of interpreting the trailing slash.

Signed-off-by: Roman Kagan 
---
 include/hw/loader.h |  2 +-
 hw/core/loader.c| 43 +--
 2 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/include/hw/loader.h b/include/hw/loader.h
index 355fe0f5a2..a309662fa8 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -186,7 +186,7 @@ void pstrcpy_targphys(const char *name,
 extern bool option_rom_has_mr;
 extern bool rom_file_has_mr;
 
-int rom_add_file(const char *file, const char *fw_dir,
+int rom_add_file(const char *file, const char *fw_path,
  hwaddr addr, int32_t bootindex,
  bool option_rom, MemoryRegion *mr, AddressSpace *as);
 MemoryRegion *rom_add_blob(const char *name, const void *blob, size_t len,
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 91669d65aa..436154de48 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -817,7 +817,6 @@ struct Rom {
 MemoryRegion *mr;
 AddressSpace *as;
 int isrom;
-char *fw_dir;
 char *fw_file;
 
 hwaddr addr;
@@ -882,7 +881,7 @@ static void *rom_set_mr(Rom *rom, Object *owner, const char 
*name, bool ro)
 return data;
 }
 
-int rom_add_file(const char *file, const char *fw_dir,
+int rom_add_file(const char *file, const char *fw_path,
  hwaddr addr, int32_t bootindex,
  bool option_rom, MemoryRegion *mr,
  AddressSpace *as)
@@ -914,10 +913,6 @@ int rom_add_file(const char *file, const char *fw_dir,
 goto err;
 }
 
-if (fw_dir) {
-rom->fw_dir  = g_strdup(fw_dir);
-rom->fw_file = g_strdup(file);
-}
 rom->addr = addr;
 rom->romsize  = lseek(fd, 0, SEEK_END);
 if (rom->romsize == -1) {
@@ -937,20 +932,26 @@ int rom_add_file(const char *file, const char *fw_dir,
 }
 close(fd);
 rom_insert(rom);
-if (rom->fw_file && fw_cfg) {
+if (fw_path && fw_cfg) {
 const char *basename;
-char fw_file_name[FW_CFG_MAX_FILE_PATH];
 void *data;
 
-basename = strrchr(rom->fw_file, '/');
-if (basename) {
-basename++;
+basename = strrchr(fw_path, '/');
+if (basename && basename[1] == '\0') {
+/* given path terminates with '/', append basename(file) */
+basename = strrchr(file, '/');
+if (basename) {
+basename++;
+} else {
+basename = file;
+}
+
+rom->fw_file = g_strdup_printf("%s%s", fw_path, basename);
 } else {
-basename = rom->fw_file;
+rom->fw_file = g_strdup(fw_path);
 }
-snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir,
- basename);
-snprintf(devpath, sizeof(devpath), "/rom@%s", fw_file_name);
+
+snprintf(devpath, sizeof(devpath), "/rom@%s", rom->fw_file);
 
 if ((!option_rom || mc->option_rom_has_mr) && mc->rom_file_has_mr) {
 data = rom_set_mr(rom, OBJECT(fw_cfg), devpath, true);
@@ -958,7 +959,7 @@ int rom_add_file(const char *file, const char *fw_dir,
 data = rom->data;
 }
 
-fw_cfg_add_file(fw_cfg, fw_file_name, data, rom->romsize);
+fw_cfg_add_file(fw_cfg, rom->fw_file, data, rom->romsize);
 } else {
 if (mr) {
 rom->mr = mr;
@@ -978,8 +979,7 @@ err:
 g_free(rom->data);
 g_free(rom->path);
 g_free(rom->name);
-if (fw_dir) {
-g_free(rom->fw_dir);
+if (fw_path) {
 g_free(rom->fw_file);
 }
 g_free(rom);
@@ -1052,12 +1052,12 @@ int rom_add_elf_program(const char *name, void *data, 
size_t datasize,
 
 int rom_add_vga(const char *file)
 {
-return rom_add_file(file, "vgaroms", 0, -1, true, NULL, NULL);
+return rom_add_file(file, "vgaroms/", 0, -1, true, NULL, NULL);
 }
 
 int rom_add_option(const char *file, int32_t bootindex)
 {
-return rom_add_file(file, "genroms", 0, bootindex, true, NULL, NULL);
+return rom_add_file(file, "genroms/", 0, bootindex, true, NULL, NULL);
 }

[Qemu-devel] [RFC PATCH 27/34] tests: hv-scsi: add start-stop test

2018-02-06 Thread Roman Kagan

It's trivial and tests only a tiny fraction of the relevant code, but
it's better than nothing.

Signed-off-by: Roman Kagan 
---
 tests/hv-scsi-test.c   | 57 ++
 tests/Makefile.include |  3 +++
 2 files changed, 60 insertions(+)
 create mode 100644 tests/hv-scsi-test.c

diff --git a/tests/hv-scsi-test.c b/tests/hv-scsi-test.c
new file mode 100644
index 00..9bff0df09c
--- /dev/null
+++ b/tests/hv-scsi-test.c
@@ -0,0 +1,57 @@
+/*
+ * QTest testcase for Hyper-V/VMBus SCSI
+ *
+ * Copyright (c) 2018 Virtuozzo International GmbH.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include 
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "qemu/bswap.h"
+#include "libqos/libqos-pc.h"
+
+static QOSState *qhv_scsi_start(const char *extra_opts)
+{
+const char *arch = qtest_get_arch();
+const char *cmd = "-machine accel=kvm,vmbus "
+  "-cpu kvm64,hv_synic,hv_vpindex "
+  "-drive id=hd0,if=none,file=null-co://,format=raw "
+  "-device hv-scsi,id=scsi0 "
+  "-device scsi-hd,bus=scsi0.0,drive=hd0 %s";
+
+if (strcmp(arch, "i386") && strcmp(arch, "x86_64")) {
+g_printerr("Hyper-V / VMBus are only available on x86\n");
+exit(EXIT_FAILURE);
+}
+
+if (access("/dev/kvm", R_OK | W_OK)) {
+g_printerr("Hyper-V / VMBus can only be used with KVM\n");
+exit(EXIT_FAILURE);
+}
+
+return qtest_pc_boot(cmd, extra_opts ? : "");
+}
+
+static void qhv_scsi_stop(QOSState *qs)
+{
+qtest_shutdown(qs);
+}
+
+static void start_stop(void)
+{
+QOSState *qs;
+
+qs = qhv_scsi_start(NULL);
+qhv_scsi_stop(qs);
+}
+
+int main(int argc, char **argv)
+{
+g_test_init(&argc, &argv, NULL);
+qtest_add_func("/hv-scsi/start-stop", start_stop);
+
+return g_test_run();
+}
diff --git a/tests/Makefile.include b/tests/Makefile.include
index ca82e0c0cc..800f9cca92 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -281,6 +281,8 @@ gcov-files-i386-y += hw/usb/hcd-xhci.c
 check-qtest-i386-y += tests/cpu-plug-test$(EXESUF)
 check-qtest-i386-y += tests/q35-test$(EXESUF)
 check-qtest-i386-y += tests/vmgenid-test$(EXESUF)
+check-qtest-i386-y += tests/hv-scsi-test$(EXESUF)
+gcov-files-i386-y += hw/scsi/hv-scsi.c
 gcov-files-i386-y += hw/pci-host/q35.c
 check-qtest-i386-$(CONFIG_VHOST_USER_NET_TEST_i386) += 
tests/vhost-user-test$(EXESUF)
 ifeq ($(CONFIG_VHOST_USER_NET_TEST_i386),)
@@ -820,6 +822,7 @@ tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o
 tests/test-qapi-util$(EXESUF): tests/test-qapi-util.o $(test-util-obj-y)
 tests/numa-test$(EXESUF): tests/numa-test.o
 tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/boot-sector.o 
tests/acpi-utils.o
+tests/hv-scsi-test$(EXESUF): tests/hv-scsi-test.o $(libqos-pc-obj-y)
 
 tests/migration/stress$(EXESUF): tests/migration/stress.o
$(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
,"LINK","$(TARGET_DIR)$@")
-- 
2.14.3

[Qemu-devel] [RFC PATCH 25/34] scsi: add Hyper-V/VMBus SCSI controller

2018-02-06 Thread Roman Kagan

Add an implementation of Hyper-V/VMBus SCSI controller.

Kudos to Evgeny Yakovlev (formerly eyakov...@virtuozzo.com) for research
and prototyping.

Signed-off-by: Roman Kagan 
---
 hw/scsi/hv-scsi.c | 398 ++
 hw/scsi/Makefile.objs |   2 +
 hw/scsi/trace-events  |   6 +
 3 files changed, 406 insertions(+)
 create mode 100644 hw/scsi/hv-scsi.c

diff --git a/hw/scsi/hv-scsi.c b/hw/scsi/hv-scsi.c
new file mode 100644
index 00..bbfc26bf0a
--- /dev/null
+++ b/hw/scsi/hv-scsi.c
@@ -0,0 +1,398 @@
+/*
+ * QEMU Hyper-V storage device support
+ *
+ * Copyright (c) 2017-2018 Virtuozzo International GmbH.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/vmbus/vmbus.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/dma.h"
+#include "qemu/iov.h"
+#include "hw/scsi/scsi.h"
+#include "scsi/constants.h"
+#include "trace.h"
+#include "hvscsi-proto.h"
+
+#define TYPE_HV_SCSI "hv-scsi"
+#define HV_SCSI_GUID "ba6163d9-04a1-4d29-b605-72e2ffb1dc7f"
+#define HV_SCSI_MAX_TRANSFER_BYTES (IOV_MAX * TARGET_PAGE_SIZE)
+
+typedef struct HvScsi {
+VMBusDevice parent;
+uint16_t num_queues;
+SCSIBus bus;
+enum {
+HV_SCSI_RESET,
+HV_SCSI_INITIALIZING,
+HV_SCSI_INITIALIZED,
+} state;
+uint8_t protocol_major;
+uint8_t protocol_minor;
+} HvScsi;
+
+#define HV_SCSI(obj) OBJECT_CHECK(HvScsi, (obj), TYPE_HV_SCSI)
+
+typedef struct HvScsiReq
+{
+VMBusChanReq vmreq;
+HvScsi *s;
+SCSIRequest *sreq;
+hv_stor_packet *reply;
+} HvScsiReq;
+
+static void hv_scsi_init_req(HvScsi *s, HvScsiReq *req)
+{
+VMBusChanReq *vmreq = &req->vmreq;
+
+req->s = s;
+if (vmreq->comp) {
+req->reply = vmreq->comp;
+}
+}
+
+static void hv_scsi_free_req(HvScsiReq *req)
+{
+vmbus_release_req(req);
+}
+
+static void hv_scsi_save_request(QEMUFile *f, SCSIRequest *sreq)
+{
+HvScsiReq *req = sreq->hba_private;
+
+vmbus_save_req(f, &req->vmreq);
+}
+
+static void *hv_scsi_load_request(QEMUFile *f, SCSIRequest *sreq)
+{
+HvScsiReq *req;
+HvScsi *scsi = container_of(sreq->bus, HvScsi, bus);
+
+req = vmbus_load_req(f, VMBUS_DEVICE(scsi), sizeof(*req));
+if (!req) {
+error_report("failed to load VMBus request from saved state");
+return NULL;
+}
+
+hv_scsi_init_req(scsi, req);
+scsi_req_ref(sreq);
+req->sreq = sreq;
+return req;
+}
+
+static int complete_io(HvScsiReq *req, uint32_t status)
+{
+VMBusChanReq *vmreq = &req->vmreq;
+int res = 0;
+
+if (vmreq->comp) {
+req->reply->operation = HV_STOR_OPERATION_COMPLETE_IO;
+req->reply->flags = 0;
+req->reply->status = status;
+res = vmbus_chan_send_completion(vmreq);
+}
+
+if (req->sreq) {
+scsi_req_unref(req->sreq);
+}
+hv_scsi_free_req(req);
+return res;
+}
+
+static int hv_scsi_complete_req(HvScsiReq *req, uint8_t scsi_status,
+uint32_t srb_status, size_t resid)
+{
+hv_srb_packet *srb = &req->reply->srb;
+
+srb->scsi_status = scsi_status;
+srb->srb_status = srb_status;
+
+assert(resid <= srb->transfer_length);
+srb->transfer_length -= resid;
+
+return complete_io(req, 0);
+}
+
+static void hv_scsi_request_cancelled(SCSIRequest *r)
+{
+HvScsiReq *req = r->hba_private;
+hv_scsi_complete_req(req, GOOD, HV_SRB_STATUS_ABORTED, 0);
+}
+
+static QEMUSGList *hv_scsi_get_sg_list(SCSIRequest *r)
+{
+HvScsiReq *req = r->hba_private;
+return &req->vmreq.sgl;
+}
+
+static void hv_scsi_command_complete(SCSIRequest *r, uint32_t status,
+ size_t resid)
+{
+HvScsiReq *req = r->hba_private;
+hv_srb_packet *srb = &req->reply->srb;
+
+trace_hvscsi_command_complete(r, status, resid);
+
+srb->sense_length = scsi_req_get_sense(r, srb->sense_data,
+   sizeof(srb->sense_data));
+hv_scsi_complete_req(req, status, HV_SRB_STATUS_SUCCESS, resid);
+}
+
+static struct SCSIBusInfo hv_scsi_info = {
+.tcq = true,
+.max_channel = HV_SRB_MAX_CHANNELS - 1,
+.max_target = HV_SRB_MAX_TARGETS - 1,
+.max_lun = HV_SRB_MAX_LUNS_PER_TARGET - 1,
+.complete = hv_scsi_command_complete,
+.cancel = hv_scsi_request_cancelled,
+.get_sg_list = hv_scsi_get_sg_list,
+.save_request = hv_scsi_save_request,
+.load_request = hv_scsi_load_request,
+};
+
+static void handle_missing_target(HvScsiReq *req)
+{
+/*
+ * SRB_STATUS_INVALID_LUN should be enough and it works for windows guests
+ * However linux stor_vsc driver ignores any scsi and srb status errors
+ * for all INQUIRY and MODE_SENSE commands.
+ * So, specifically for those linux clients we also have to fake
+ * an INVALID_LUN sense response.
+

[Qemu-devel] [RFC PATCH 26/34] hv-scsi: limit the number of requests per notification

2018-02-06 Thread Roman Kagan

There's a vague feeling that, if there are too many requests in the
incoming ring buffer, processing and replying to them may usurp the
event loop (main thread) and thus induce lags, soft lockups, etc.

So make sure to yield to the event loop after at most 1024 requests.

TODO: do something smarter than this
Signed-off-by: Roman Kagan 
---
 hw/scsi/hv-scsi.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/hv-scsi.c b/hw/scsi/hv-scsi.c
index bbfc26bf0a..aa055340ef 100644
--- a/hw/scsi/hv-scsi.c
+++ b/hw/scsi/hv-scsi.c
@@ -294,8 +294,9 @@ static void hv_scsi_handle_packet(HvScsiReq *req)
 static void hv_scsi_notify_cb(VMBusChannel *chan)
 {
 HvScsi *scsi = HV_SCSI(vmbus_channel_device(chan));
+int i;
 
-for (;;) {
+for (i = 1024; i; i--) {
 HvScsiReq *req = vmbus_channel_recv(chan, sizeof(*req));
 if (!req) {
 break;
@@ -304,6 +305,10 @@ static void hv_scsi_notify_cb(VMBusChannel *chan)
 hv_scsi_init_req(scsi, req);
 hv_scsi_handle_packet(req);
 }
+
+if (!i) {
+vmbus_notify_channel(chan);
+}
 }
 
 static void hv_scsi_reset(HvScsi *scsi)
-- 
2.14.3
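
As an illustration of the approach in this patch, here is a minimal, self-contained sketch of budgeted request processing. It is not the QEMU code: ToyChannel, toy_recv and toy_notify_cb are hypothetical stand-ins, and incrementing a counter takes the place of the real vmbus_notify_channel() call. It only shows the idea of handling a bounded batch and asking to be called again when the budget runs out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel's incoming ring buffer. */
typedef struct ToyChannel {
    size_t pending;      /* requests still waiting in the ring */
    size_t handled;      /* requests processed so far */
    unsigned renotified; /* how many times a re-run was requested */
} ToyChannel;

/* Take one request off the ring; returns false when the ring is empty. */
static bool toy_recv(ToyChannel *chan)
{
    if (chan->pending == 0) {
        return false;
    }
    chan->pending--;
    return true;
}

/*
 * Process at most `budget` requests.  If the budget is used up there may
 * be more work left, so request another invocation instead of looping on
 * and starving the rest of the event loop.
 */
static void toy_notify_cb(ToyChannel *chan, int budget)
{
    int i;

    assert(budget > 0);

    for (i = budget; i; i--) {
        if (!toy_recv(chan)) {
            break;
        }
        chan->handled++;
    }

    if (!i) {
        /* stands in for vmbus_notify_channel(chan) in the patch */
        chan->renotified++;
    }
}
```

Note that when the ring holds exactly as many requests as the budget, the callback is still rescheduled once and then finds the ring empty; this costs one extra wakeup and keeps the loop simple.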

[Qemu-devel] [RFC PATCH 17/34] [not to commit] import HYPERV_EVENTFD stuff from kernel

2018-02-06 Thread Roman Kagan

Allow building and testing HYPERV_EVENTFD until it comes through the
regular kernel headers import.

Signed-off-by: Roman Kagan 
---
 linux-headers/linux/kvm.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index d92c9b2f0e..47100cb6a3 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -934,6 +934,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_AIS_MIGRATION 150
 #define KVM_CAP_PPC_GET_CPU_CHAR 151
 #define KVM_CAP_S390_BPB 152
+#define KVM_CAP_HYPERV_EVENTFD 153
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1363,6 +1364,9 @@ struct kvm_s390_ucas_mapping {
 #define KVM_S390_GET_CMMA_BITS  _IOWR(KVMIO, 0xb8, struct kvm_s390_cmma_log)
 #define KVM_S390_SET_CMMA_BITS  _IOW(KVMIO, 0xb9, struct kvm_s390_cmma_log)
 
+/* Available with KVM_CAP_HYPERV_EVENTFD */
+#define KVM_HYPERV_EVENTFD        _IOW(KVMIO,  0xbd, struct kvm_hyperv_eventfd)
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX   (1 << 2)
@@ -1423,4 +1427,14 @@ struct kvm_assigned_msix_entry {
 #define KVM_ARM_DEV_EL1_PTIMER (1 << 1)
 #define KVM_ARM_DEV_PMU (1 << 2)
 
+struct kvm_hyperv_eventfd {
+   __u32 conn_id;
+   __s32 fd;
+   __u32 flags;
+   __u32 padding[3];
+};
+
+#define KVM_HYPERV_CONN_ID_MASK 0x00ffffff
+#define KVM_HYPERV_EVENTFD_DEASSIGN (1 << 0)
+
 #endif /* __LINUX_KVM_H */
-- 
2.14.3

Re: [Qemu-devel] [PATCH 1/4] multiboot: Change multiboot_info from array of bytes to a C struct

2018-02-06 Thread Jack Schwartz


Hi Anatol and Kevin.

Kevin and Anatol, thanks for your replies.

A few comments inline close to the bottom...

On 2018-02-05 13:43, Anatol Pomozov wrote:

Hi

On Wed, Jan 31, 2018 at 1:12 AM, Kevin Wolf  wrote:

Am 31.01.2018 um 00:15 hat Jack Schwartz geschrieben:

Hi Anatol and Kevin.

Even though I am new to Qemu, I have a patch to deliver for
multiboot.c (as you know) and so I feel familiar enough to do a review
of this patch.  One comment is probably more for maintainers.

The Multiboot code is essentially unmaintained. It's technically part of
the PC/x86 subsystem, but their maintainers don't seem to care at all.
So in order to make any progress here, I decided that I will send a
pull request for Multiboot once we have the patches ready (as a one-time
thing, without officially making myself the maintainer).

So I am the closest thing to a maintainer that we have here, and while
I'm familiar with some of the Multiboot-specific code, I don't really
know the ELF code and don't have a lot of time to spend here.

Therefore it's very welcome if you review the patches of each other,
even if you're not perfectly familiar with the code, as there is
probably no one else who could do a better review.


On 01/29/18 12:43, Anatol Pomozov wrote:

@@ -253,11 +228,11 @@ int load_multiboot(FWCfgState *fw_cfg,
   mb_load_size = mb_kernel_size;
   }
-/* Valid if mh_flags sets MULTIBOOT_HEADER_HAS_VBE.
-uint32_t mh_mode_type = ldl_p(header+i+32);
-uint32_t mh_width = ldl_p(header+i+36);
-uint32_t mh_height = ldl_p(header+i+40);
-uint32_t mh_depth = ldl_p(header+i+44); */
+/* Valid if mh_flags sets MULTIBOOT_VIDEO_MODE.
+uint32_t mh_mode_type = ldl_p(_header->mode_type);
+uint32_t mh_width = ldl_p(_header->width);
+uint32_t mh_height = ldl_p(_header->height);
+uint32_t mh_depth = ldl_p(_header->depth); */

This question is probably more for maintainers...

In the patch series I submitted, people were OK that I was going to delete
these lines since they were only comments anyway.  Your approach leaves
these lines in and updates them even though they are comments.  Which way is
preferred?

As far as I am concerned, I honestly don't mind either way. It's
trivial code, so we won't lose anything by removing it.

This change is supposed to be a refactoring and tries to avoid functional
changes. I can remove it in a separate change.


The ideal solution would be just implementing support for it, of course.
If we wanted to do that, I think we would have to pass the values
through fw_cfg and then set the VBE mode in the Multiboot option ROM.


   mb_debug("multiboot: mh_header_addr = %#x\n", mh_header_addr);
   mb_debug("multiboot: mh_load_addr = %#x\n", mh_load_addr);
@@ -295,14 +270,15 @@ int load_multiboot(FWCfgState *fw_cfg,
   }
   mbs.mb_buf_size += cmdline_len;
-mbs.mb_buf_size += MB_MOD_SIZE * mbs.mb_mods_avail;
+mbs.mb_buf_size += sizeof(multiboot_module_t) * mbs.mb_mods_avail;
   mbs.mb_buf_size += strlen(bootloader_name) + 1;
   mbs.mb_buf_size = TARGET_PAGE_ALIGN(mbs.mb_buf_size);
   /* enlarge mb_buf to hold cmdlines, bootloader, mb-info structs */
   mbs.mb_buf= g_realloc(mbs.mb_buf, mbs.mb_buf_size);
-mbs.offset_cmdlines   = mbs.offset_mbinfo + mbs.mb_mods_avail * MB_MOD_SIZE;
+mbs.offset_cmdlines   = mbs.offset_mbinfo +
+mbs.mb_mods_avail * sizeof(multiboot_module_t);
   mbs.offset_bootloader = mbs.offset_cmdlines + cmdline_len;
   if (initrd_filename) {
@@ -348,22 +324,22 @@ int load_multiboot(FWCfgState *fw_cfg,
   char kcmdline[strlen(kernel_filename) + strlen(kernel_cmdline) + 2];
   snprintf(kcmdline, sizeof(kcmdline), "%s %s",
kernel_filename, kernel_cmdline);
-stl_p(bootinfo + MBI_CMDLINE, mb_add_cmdline(, kcmdline));
+stl_p(, mb_add_cmdline(, kcmdline));
-stl_p(bootinfo + MBI_BOOTLOADER, mb_add_bootloader(, bootloader_name));
+stl_p(_loader_name, mb_add_bootloader(, 
bootloader_name));
-stl_p(bootinfo + MBI_MODS_ADDR,  mbs.mb_buf_phys + mbs.offset_mbinfo);
-stl_p(bootinfo + MBI_MODS_COUNT, mbs.mb_mods_count); /* mods_count */
+stl_p(_addr,  mbs.mb_buf_phys + mbs.offset_mbinfo);
+stl_p(_count, mbs.mb_mods_count); /* mods_count */
   /* the kernel is where we want it to be now */
-stl_p(bootinfo + MBI_FLAGS, MULTIBOOT_FLAGS_MEMORY
-| MULTIBOOT_FLAGS_BOOT_DEVICE
-| MULTIBOOT_FLAGS_CMDLINE
-| MULTIBOOT_FLAGS_MODULES
-| MULTIBOOT_FLAGS_MMAP
-| MULTIBOOT_FLAGS_BOOTLOADER);
-stl_p(bootinfo + MBI_BOOT_DEVICE, 0x8000ffff); /* XXX: use the -boot switch? */
-stl_p(bootinfo + MBI_MMAP_ADDR,   ADDR_E820_MAP);
+stl_p(, MULTIBOOT_INFO_MEMORY
+   |

[Qemu-devel] [RFC PATCH 13/34] hyperv: process SIGNAL_EVENT hypercall

2018-02-06 Thread Roman Kagan

Add handling of SIGNAL_EVENT hypercall.  For that, provide an interface
to associate an EventNotifier with an event connection number, so that
it's signaled when the SIGNAL_EVENT hypercall with the matching
parameters is called by the guest.

TODO: we should be able to move this to KVM and avoid an expensive exit
to userspace just to look up an eventfd by connection number and signal it.

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.h |   2 +
 target/i386/hyperv.c | 113 +--
 2 files changed, 111 insertions(+), 4 deletions(-)

diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index 3d942e5524..4ce41fe314 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -43,4 +43,6 @@ int hyperv_post_msg(HvSintRoute *sint_route, struct hyperv_message *msg);
 
 int hyperv_set_evt_flag(HvSintRoute *sint_route, unsigned evtno);
 
+int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier);
+
 #endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index b557cd5d5d..9cf1225385 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -19,6 +19,9 @@
 #include "exec/address-spaces.h"
 #include "sysemu/cpus.h"
 #include "qemu/bitops.h"
+#include "qemu/queue.h"
+#include "qemu/rcu.h"
+#include "qemu/rcu_queue.h"
 #include "migration/vmstate.h"
 #include "hyperv.h"
 #include "hyperv-proto.h"
@@ -249,6 +252,106 @@ static void async_synic_update(CPUState *cs, run_on_cpu_data data)
 qemu_mutex_unlock_iothread();
 }
 
+typedef struct EvtHandler {
+struct rcu_head rcu;
+QLIST_ENTRY(EvtHandler) le;
+uint32_t conn_id;
+EventNotifier *notifier;
+} EvtHandler;
+
+static QLIST_HEAD(, EvtHandler) evt_handlers;
+static QemuMutex handlers_mutex;
+
+static void __attribute__((constructor)) hv_init(void)
+{
+QLIST_INIT(&evt_handlers);
+qemu_mutex_init(&handlers_mutex);
+}
+
+int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier)
+{
+int ret;
+EvtHandler *eh;
+
+qemu_mutex_lock(&handlers_mutex);
+QLIST_FOREACH(eh, &evt_handlers, le) {
+if (eh->conn_id == conn_id) {
+if (notifier) {
+ret = -EEXIST;
+} else {
+QLIST_REMOVE_RCU(eh, le);
+g_free_rcu(eh, rcu);
+ret = 0;
+}
+goto unlock;
+}
+}
+
+if (notifier) {
+eh = g_new(EvtHandler, 1);
+eh->conn_id = conn_id;
+eh->notifier = notifier;
+QLIST_INSERT_HEAD_RCU(&evt_handlers, eh, le);
+ret = 0;
+} else {
+ret = -ENOENT;
+}
+unlock:
+qemu_mutex_unlock(&handlers_mutex);
+return ret;
+}
+
+static uint64_t sigevent_params(hwaddr addr, uint32_t *conn_id)
+{
+uint64_t ret;
+hwaddr len;
+struct hyperv_signal_event_input *msg;
+
+if (addr & (__alignof__(*msg) - 1)) {
+return HV_STATUS_INVALID_ALIGNMENT;
+}
+
+len = sizeof(*msg);
+msg = cpu_physical_memory_map(addr, &len, 0);
+if (len < sizeof(*msg)) {
+ret = HV_STATUS_INSUFFICIENT_MEMORY;
+} else {
+*conn_id = (msg->connection_id & HV_CONNECTION_ID_MASK) +
+msg->flag_number;
+ret = 0;
+}
+cpu_physical_memory_unmap(msg, len, 0, 0);
+return ret;
+}
+
+static uint64_t hvcall_signal_event(uint64_t param, bool fast)
+{
+uint64_t ret;
+uint32_t conn_id;
+EvtHandler *eh;
+
+if (likely(fast)) {
+conn_id = (param & 0xffffffff) + ((param >> 32) & 0xffff);
+} else {
+ret = sigevent_params(param, &conn_id);
+if (ret) {
+return ret;
+}
+}
+
+ret = HV_STATUS_INVALID_CONNECTION_ID;
+rcu_read_lock();
+QLIST_FOREACH_RCU(eh, &evt_handlers, le) {
+if (eh->conn_id == conn_id) {
+event_notifier_set(eh->notifier);
+ret = 0;
+break;
+}
+}
+rcu_read_unlock();
+return ret;
+}
+
 int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
 {
 CPUX86State *env = &cpu->env;
@@ -281,16 +384,18 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
   RUN_ON_CPU_HOST_PTR(get_synic(cpu)));
 return 0;
 case KVM_EXIT_HYPERV_HCALL: {
-uint16_t code;
+uint16_t code = exit->u.hcall.input & 0xffff;
+bool fast = exit->u.hcall.input & HV_HYPERCALL_FAST;
+uint64_t param = exit->u.hcall.params[0];
 
-code  = exit->u.hcall.input & 0xffff;
 switch (code) {
-case HV_POST_MESSAGE:
 case HV_SIGNAL_EVENT:
+exit->u.hcall.result = hvcall_signal_event(param, fast);
+break;
 default:
 exit->u.hcall.result = HV_STATUS_INVALID_HYPERCALL_CODE;
-return 0;
 }
+return 0;
 }
 default:
 return -1;
-- 
2.14.3
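
The registration semantics this patch gives hyperv_set_evt_notifier() can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the QEMU implementation: it uses a plain singly linked list with no locking and no RCU, and ToyNotifier, ToyHandler, toy_set_evt_notifier and toy_signal_event are hypothetical names. A counter stands in for event_notifier_set(), and -1 stands in for HV_STATUS_INVALID_CONNECTION_ID.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an EventNotifier: it only counts signals. */
typedef struct ToyNotifier {
    unsigned fired;
} ToyNotifier;

typedef struct ToyHandler {
    struct ToyHandler *next;
    uint32_t conn_id;
    ToyNotifier *notifier;
} ToyHandler;

static ToyHandler *toy_handlers;

/*
 * A non-NULL notifier registers a handler for conn_id (-EEXIST if one is
 * already there); a NULL notifier removes it (-ENOENT if there is none).
 */
static int toy_set_evt_notifier(uint32_t conn_id, ToyNotifier *notifier)
{
    ToyHandler **pp, *eh;

    for (pp = &toy_handlers; (eh = *pp) != NULL; pp = &eh->next) {
        if (eh->conn_id != conn_id) {
            continue;
        }
        if (notifier) {
            return -EEXIST;
        }
        *pp = eh->next;
        free(eh);
        return 0;
    }

    if (!notifier) {
        return -ENOENT;
    }

    eh = malloc(sizeof(*eh));
    if (!eh) {
        return -ENOMEM;
    }
    eh->conn_id = conn_id;
    eh->notifier = notifier;
    eh->next = toy_handlers;
    toy_handlers = eh;
    return 0;
}

/* Signal the notifier bound to conn_id; -1 means "no such connection". */
static int toy_signal_event(uint32_t conn_id)
{
    ToyHandler *eh;

    for (eh = toy_handlers; eh; eh = eh->next) {
        if (eh->conn_id == conn_id) {
            assert(eh->notifier);
            eh->notifier->fired++;
            return 0;
        }
    }
    return -1;
}
```

The patch itself protects writers with a mutex and lets the hypercall path walk the list under RCU, so that signalling does not have to take a lock; the sketch leaves that out to keep the lookup rules visible.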

[Qemu-devel] [RFC PATCH 30/34] net: add Hyper-V/VMBus net adapter

2018-02-06 Thread Roman Kagan

TODO:
 - add MAC filtering
 - add offloads
 - perf tuning
Signed-off-by: Roman Kagan 
---
 hw/net/hv-net.c  | 1440 ++
 hw/net/Makefile.objs |2 +
 2 files changed, 1442 insertions(+)
 create mode 100644 hw/net/hv-net.c

diff --git a/hw/net/hv-net.c b/hw/net/hv-net.c
new file mode 100644
index 00..614922c0fb
--- /dev/null
+++ b/hw/net/hv-net.c
@@ -0,0 +1,1440 @@
+/*
+ * QEMU Hyper-V network device support
+ *
+ * Copyright (c) 2018 Virtuozzo International GmbH.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/iov.h"
+#include "hw/vmbus/vmbus.h"
+#include "net/net.h"
+#include "rndis.h"
+#include "hvnet-proto.h"
+
+#define TYPE_HV_NET "hv-net"
+#define HV_NET_GUID "f8615163-df3e-46c5-913f-f2d2f965ed0e"
+
+/* FIXME: generalize to vmbus.c? */
+typedef struct HvNetRcvPkt {
+QTAILQ_ENTRY(HvNetRcvPkt) link;
+uint32_t off;
+uint32_t len;
+} HvNetRcvPkt;
+
+typedef enum HvNetState {
+HV_NET_RESET,
+HV_NET_INITIALIZING,
+HV_NET_INITIALIZED,
+} HvNetState;
+
+typedef struct HvNet {
+VMBusDevice parent;
+
+NICConf conf;
+NICState *nic;
+
+HvNetState state;
+
+uint16_t sndbuf_id;
+uint32_t sndbuf_gpadl_id;
+VMBusGpadl *sndbuf_gpadl;
+
+uint16_t rcvbuf_id;
+uint32_t rcvbuf_gpadl_id;
+VMBusGpadl *rcvbuf_gpadl;
+int32_t rcvbuf_slot_num;/* int32_t for VMSTATE_STRUCT_VARRAY_ALLOC */
+uint16_t rcvbuf_slot_len;
+unsigned long *rcvbuf_slot_map;
+HvNetRcvPkt *rcvpkts;
+QTAILQ_HEAD(, HvNetRcvPkt) rcvpkts_free;
+
+struct {} reset_start;
+
+uint32_t protocol_ver;
+uint32_t ndis_maj_ver;
+uint32_t ndis_min_ver;
+uint32_t rndis_ctl;
+uint32_t rndis_req_id;
+uint32_t rndis_maj;
+uint32_t rndis_min;
+uint32_t max_xfer_size;
+uint32_t rndis_query_oid;
+#define RNDIS_QUERY_INFO_LEN 32
+uint64_t rndis_query_info[RNDIS_QUERY_INFO_LEN];
+uint32_t rndis_query_info_len;
+uint32_t rndis_set_status;
+uint32_t rndis_packet_filter;
+
+bool link_down;
+
+uint32_t rx_pkts;
+uint32_t tx_pkts;
+} HvNet;
+
+#define HV_NET(obj) OBJECT_CHECK(HvNet, (obj), TYPE_HV_NET)
+
+typedef struct HvNetReq
+{
+VMBusChanReq vmreq;
+HvNet *net;
+unsigned iov_cnt;
+struct iovec iov[64];
+} HvNetReq;
+
+static int hv_net_init_req(HvNet *net, HvNetReq *req)
+{
+int ret;
+QEMUSGList *sgl = &req->vmreq.sgl;
+
+req->net = net;
+
+if (!sgl->dev) {
+return 0;
+}
+
+ret = vmbus_map_sgl(sgl, DMA_DIRECTION_TO_DEVICE, req->iov,
+ARRAY_SIZE(req->iov), -1, 0);
+if (ret >= 0) {
+req->iov_cnt = ret;
+} else {
+error_report("%s: failed to map SGL: %d", __func__, ret);
+}
+return ret;
+}
+
+static void hv_net_free_req(HvNetReq *req)
+{
+vmbus_unmap_sgl(&req->vmreq.sgl, DMA_DIRECTION_TO_DEVICE, req->iov,
+req->iov_cnt, 0);
+vmbus_release_req(req);
+}
+
+static int complete_req(HvNetReq *req)
+{
+int ret = 0;
+#if 0
+VMBusChanReq *vmreq = &req->vmreq;
+struct nvsp_msg_header *nhdr = vmreq->msg;
+
+error_report("%s  msg: %x %lx", __func__,
+ vmreq->msglen ? nhdr->msg_type : -1, vmreq->transaction_id);
+#endif
+if (req->vmreq.comp) {
+ret = vmbus_chan_send_completion(&req->vmreq);
+}
+
+hv_net_free_req(req);
+return ret;
+}
+
+static HvNetRcvPkt *get_rcv_pkt(HvNet *net, size_t len)
+{
+uint32_t nr, start;
+HvNetRcvPkt *pkt;
+
+if (!len) {
+return NULL;
+}
+
+nr = DIV_ROUND_UP(len, net->rcvbuf_slot_len);
+start = bitmap_find_next_zero_area(net->rcvbuf_slot_map,
+   net->rcvbuf_slot_num, 0, nr, 0);
+if (start >= net->rcvbuf_slot_num) {
+return NULL;
+}
+
+bitmap_set(net->rcvbuf_slot_map, start, nr);
+pkt = QTAILQ_FIRST(&net->rcvpkts_free);
+assert(pkt);
+QTAILQ_REMOVE(&net->rcvpkts_free, pkt, link);
+pkt->off = start * net->rcvbuf_slot_len;
+pkt->len = len;
+return pkt;
+}
+
+static void put_rcv_pkt(HvNet *net, HvNetRcvPkt *pkt)
+{
+uint32_t nr, start;
+
+start = pkt->off / net->rcvbuf_slot_len;
+nr = DIV_ROUND_UP(pkt->len, net->rcvbuf_slot_len);
+bitmap_clear(net->rcvbuf_slot_map, start, nr);
+QTAILQ_INSERT_TAIL(&net->rcvpkts_free, pkt, link);
+pkt->len = 0;
+}
+
+static void put_rcv_pkt_by_tr_id(HvNet *net, uint64_t tr_id)
+{
+/* transaction id comes from the guest and can't be trusted blindly */
+HvNetRcvPkt *pkt;
+
+if (tr_id >= net->rcvbuf_slot_num) {
+return;
+}
+pkt = &net->rcvpkts[tr_id];
+if (!pkt->len) {
+return;
+}
+put_rcv_pkt(net, pkt);
+}
+
+static void create_rcvbuf(HvNet *net)
+{
+uint32_t gpadl_len;
+int i;
+
+gpadl_len =

[Qemu-devel] [RFC PATCH 10/34] hyperv: make overlay pages for SynIC

2018-02-06 Thread Roman Kagan

Per Hyper-V spec, SynIC message and event flag pages are to be
implemented as so-called overlay pages.  That is, they are owned by the
hypervisor and, when mapped into the guest physical address space,
overlay the guest physical pages such that

1) the overlaid guest page becomes invisible to the guest CPUs until the
   overlay page is turned off
2) the contents of the overlay page are preserved when it's turned off
   and back on, even at a different address; it's only zeroed at vcpu
   reset

This particular nature of SynIC message and event flag pages is ignored
in the current code, and guest physical pages are used directly instead.
This (mostly) works because the actual guests seem not to depend on the
features listed above.

This patch implements those pages as the spec mandates.

Since the extra RAM regions, which introduce migration incompatibility,
are only added when in_kvm_only == false, no extra compat logic is
necessary.
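
The two overlay properties described above can be illustrated with a toy model. This is only a sketch under assumptions: ToyMachine and the toy_* helpers are hypothetical, pages are tiny byte arrays, and a page index of TOY_NO_PAGE plays the role of "overlay turned off". It does not use QEMU's MemoryRegion API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16
#define TOY_NUM_PAGES 4
#define TOY_NO_PAGE   (-1)

typedef struct ToyMachine {
    uint8_t ram[TOY_NUM_PAGES][TOY_PAGE_SIZE]; /* ordinary guest pages */
    uint8_t overlay[TOY_PAGE_SIZE];            /* hypervisor-owned page */
    int overlay_at;                            /* page index or TOY_NO_PAGE */
} ToyMachine;

/* Resolve a guest page: the overlay, if mapped there, hides the RAM page. */
static uint8_t *toy_page(ToyMachine *m, int idx)
{
    assert(idx >= 0 && idx < TOY_NUM_PAGES);
    return idx == m->overlay_at ? m->overlay : m->ram[idx];
}

/* Move, enable or disable the overlay; its contents are left untouched. */
static void toy_set_overlay(ToyMachine *m, int new_idx)
{
    if (new_idx == m->overlay_at) {
        return;
    }
    m->overlay_at = new_idx;
}

/* Only a reset zeroes the overlay page and turns it off. */
static void toy_reset(ToyMachine *m)
{
    memset(m->overlay, 0, sizeof(m->overlay));
    m->overlay_at = TOY_NO_PAGE;
}
```

Because the overlay has its own backing storage, writes made while it is mapped never reach the guest page underneath, and that guest page reappears unchanged once the overlay is moved away or turned off.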

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.c | 70 +++-
 1 file changed, 64 insertions(+), 6 deletions(-)

diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 933bfe5bcb..514cd27216 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -16,6 +16,9 @@
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "hw/qdev-properties.h"
+#include "exec/address-spaces.h"
+#include "sysemu/cpus.h"
+#include "migration/vmstate.h"
 #include "hyperv.h"
 #include "hyperv-proto.h"
 
@@ -29,6 +32,10 @@ typedef struct SynICState {
 bool enabled;
 hwaddr msg_page_addr;
 hwaddr evt_page_addr;
+MemoryRegion msg_page_mr;
+MemoryRegion evt_page_mr;
+struct hyperv_message_page *msg_page;
+struct hyperv_event_flags_page *evt_page;
 } SynICState;
 
 #define TYPE_SYNIC "hyperv-synic"
@@ -68,6 +75,17 @@ static void synic_update_msg_page_addr(SynICState *synic)
 uint64_t msr = synic->cpu->env.msr_hv_synic_msg_page;
 hwaddr new_addr = (msr & HV_SIMP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
 
+if (new_addr == synic->msg_page_addr) {
+return;
+}
+
+if (synic->msg_page_addr) {
+memory_region_del_subregion(get_system_memory(), &synic->msg_page_mr);
+}
+if (new_addr) {
+memory_region_add_subregion(get_system_memory(), new_addr,
+&synic->msg_page_mr);
+}
 synic->msg_page_addr = new_addr;
 }
 
@@ -76,6 +94,17 @@ static void synic_update_evt_page_addr(SynICState *synic)
 uint64_t msr = synic->cpu->env.msr_hv_synic_evt_page;
 hwaddr new_addr = (msr & HV_SIEFP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
 
+if (new_addr == synic->evt_page_addr) {
+return;
+}
+
+if (synic->evt_page_addr) {
+memory_region_del_subregion(get_system_memory(), &synic->evt_page_mr);
+}
+if (new_addr) {
+memory_region_add_subregion(get_system_memory(), new_addr,
+&synic->evt_page_mr);
+}
 synic->evt_page_addr = new_addr;
 }
 
@@ -90,6 +119,15 @@ static void synic_update(SynICState *synic)
 synic_update_evt_page_addr(synic);
 }
 
+
+static void async_synic_update(CPUState *cs, run_on_cpu_data data)
+{
+SynICState *synic = data.host_ptr;
+qemu_mutex_lock_iothread();
+synic_update(synic);
+qemu_mutex_unlock_iothread();
+}
+
 int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
 {
 CPUX86State *env = &cpu->env;
@@ -100,11 +138,6 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
 return -1;
 }
 
-/*
- * For now just track changes in SynIC control and msg/evt pages msr's.
- * When SynIC messaging/events processing will be added in future
- * here we will do messages queues flushing and pages remapping.
- */
 switch (exit->u.synic.msr) {
 case HV_X64_MSR_SCONTROL:
 env->msr_hv_synic_control = exit->u.synic.control;
@@ -118,7 +151,13 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
 default:
 return -1;
 }
-synic_update(get_synic(cpu));
+/*
+ * this will run in this cpu thread before it returns to KVM, but in a
+ * safe environment (i.e. when all cpus are quiescent) -- this is
+ * necessary because we're changing memory hierarchy
+ */
+async_safe_run_on_cpu(CPU(cpu), async_synic_update,
+  RUN_ON_CPU_HOST_PTR(get_synic(cpu)));
 return 0;
 case KVM_EXIT_HYPERV_HCALL: {
 uint16_t code;
@@ -258,12 +297,29 @@ static void synic_realize(DeviceState *dev, Error **errp)
 {
 Object *obj = OBJECT(dev);
 SynICState *synic = SYNIC(dev);
+char *msgp_name, *evtp_name;
+uint32_t vp_index;
 
 if (synic->in_kvm_only) {
 return;
 }
 
 synic->cpu = X86_CPU(obj->parent);
+
+/* memory region names have to be globally unique */
+vp_index =

[Qemu-devel] [RFC PATCH 20/34] vmbus: vmbus implementation

2018-02-06 Thread Roman Kagan

Add the VMBus infrastructure -- bus, devices, root bridge, vmbus state
machine, vmbus channel interactions, etc.

TODO:
 - split into smaller palatable pieces
 - more comments
 - check and handle corner cases

Kudos to Evgeny Yakovlev (formerly eyakov...@virtuozzo.com) and Andrey
Smetatin (formerly asmeta...@virtuozzo.com) for research and
prototyping.

Signed-off-by: Roman Kagan 
---
 Makefile.objs|1 +
 include/hw/vmbus/vmbus.h |  106 ++
 hw/vmbus/vmbus.c | 2436 ++
 hw/vmbus/Makefile.objs   |1 +
 hw/vmbus/trace-events|8 +
 5 files changed, 2552 insertions(+)
 create mode 100644 include/hw/vmbus/vmbus.h
 create mode 100644 hw/vmbus/vmbus.c
 create mode 100644 hw/vmbus/Makefile.objs
 create mode 100644 hw/vmbus/trace-events

diff --git a/Makefile.objs b/Makefile.objs
index 2efba6d768..14a36a4736 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -159,6 +159,7 @@ trace-events-subdirs += hw/alpha
 trace-events-subdirs += hw/hppa
 trace-events-subdirs += hw/xen
 trace-events-subdirs += hw/ide
+trace-events-subdirs += hw/vmbus
 trace-events-subdirs += ui
 trace-events-subdirs += audio
 trace-events-subdirs += net
diff --git a/include/hw/vmbus/vmbus.h b/include/hw/vmbus/vmbus.h
new file mode 100644
index 00..cdb5180796
--- /dev/null
+++ b/include/hw/vmbus/vmbus.h
@@ -0,0 +1,106 @@
+/*
+ * QEMU Hyper-V VMBus
+ *
+ * Copyright (c) 2017-2018 Virtuozzo International GmbH.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_VMBUS_H
+#define QEMU_VMBUS_H
+
+#include "hw/qdev.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/dma.h"
+#include "target/i386/hyperv.h"
+#include "target/i386/hyperv-proto.h"
+#include "hw/vmbus/vmbus-proto.h"
+#include "qemu/uuid.h"
+
+#define TYPE_VMBUS_DEVICE "vmbus-dev"
+
+#define VMBUS_DEVICE(obj) \
+OBJECT_CHECK(VMBusDevice, (obj), TYPE_VMBUS_DEVICE)
+#define VMBUS_DEVICE_CLASS(klass) \
+OBJECT_CLASS_CHECK(VMBusDeviceClass, (klass), TYPE_VMBUS_DEVICE)
+#define VMBUS_DEVICE_GET_CLASS(obj) \
+OBJECT_GET_CLASS(VMBusDeviceClass, (obj), TYPE_VMBUS_DEVICE)
+
+typedef struct VMBus VMBus;
+typedef struct VMBusChannel VMBusChannel;
+typedef struct VMBusDevice VMBusDevice;
+typedef struct VMBusGpadl VMBusGpadl;
+
+typedef void(*VMBusChannelNotifyCb)(struct VMBusChannel *chan);
+
+typedef struct VMBusDeviceClass {
+DeviceClass parent;
+
+QemuUUID classid;
+QemuUUID instanceid; /* Fixed UUID for singleton devices */
+uint16_t channel_flags;
+uint16_t mmio_size_mb;
+
+void (*vmdev_realize)(VMBusDevice *vdev, Error **errp);
+void (*vmdev_unrealize)(VMBusDevice *vdev, Error **errp);
+void (*vmdev_reset)(VMBusDevice *vdev);
+uint16_t (*num_channels)(VMBusDevice *vdev);
+int (*open_channel) (VMBusDevice *vdev);
+void (*close_channel) (VMBusDevice *vdev);
+VMBusChannelNotifyCb chan_notify_cb;
+} VMBusDeviceClass;
+
+typedef struct VMBusDevice {
+DeviceState parent;
+QemuUUID instanceid;
+uint16_t num_channels;
+VMBusChannel *channels;
+AddressSpace *dma_as;
+} VMBusDevice;
+
+extern const VMStateDescription vmstate_vmbus_dev;
+
+typedef struct VMBusChanReq {
+VMBusChannel *chan;
+uint16_t pkt_type;
+uint32_t msglen;
+void *msg;
+uint64_t transaction_id;
+void *comp;
+QEMUSGList sgl;
+} VMBusChanReq;
+
+VMBusDevice *vmbus_channel_device(VMBusChannel *chan);
+VMBusChannel *vmbus_device_channel(VMBusDevice *dev, uint32_t chan_idx);
+uint32_t vmbus_channel_idx(VMBusChannel *chan);
+void vmbus_notify_channel(VMBusChannel *chan);
+
+void vmbus_create(void);
+bool vmbus_exists(void);
+
+int vmbus_channel_send(VMBusChannel *chan, uint16_t pkt_type,
+   void *desc, uint32_t desclen,
+   void *msg, uint32_t msglen,
+   bool need_comp, uint64_t transaction_id);
+int vmbus_chan_send_completion(VMBusChanReq *req);
+int vmbus_channel_reserve(VMBusChannel *chan,
+  uint32_t desclen, uint32_t msglen);
+void *vmbus_channel_recv(VMBusChannel *chan, uint32_t size);
+void vmbus_release_req(void *req);
+
+void vmbus_save_req(QEMUFile *f, VMBusChanReq *req);
+void *vmbus_load_req(QEMUFile *f, VMBusDevice *dev, uint32_t size);
+
+
+VMBusGpadl *vmbus_get_gpadl(VMBusChannel *chan, uint32_t gpadl_id);
+void vmbus_put_gpadl(VMBusGpadl *gpadl);
+uint32_t vmbus_gpadl_len(VMBusGpadl *gpadl);
+ssize_t vmbus_iov_to_gpadl(VMBusChannel *chan, VMBusGpadl *gpadl, uint32_t off,
+   const struct iovec *iov, size_t iov_cnt);
+int vmbus_map_sgl(QEMUSGList *sgl, DMADirection dir, struct iovec *iov,
+  unsigned iov_cnt, size_t len, size_t off);
+void vmbus_unmap_sgl(QEMUSGList *sgl, DMADirection dir, struct iovec *iov,
+ unsigned iov_cnt, size_t accessed);
+
+#endif
diff --git

[Qemu-devel] [RFC PATCH 09/34] hyperv: block SynIC use in QEMU in incompatible configurations

2018-02-06 Thread Roman Kagan

Certain configurations do not allow SynIC to be used in QEMU.  In
particular,

- when hyperv_vpindex is off, SINT routes can't be used as they refer to
  the destination vCPU by vp_index

- older KVM (which doesn't expose KVM_CAP_HYPERV_SYNIC2) zeroes out
  SynIC message and event pages on every msr load, breaking migration

OTOH in-KVM users of SynIC -- SynIC timers -- do work in those
configurations, and we shouldn't stop the guest from using them.

To cover both scenarios, introduce a (user-invisible) SynIC property
that disallows using the SynIC within QEMU but not in KVM.  The
property is clear by default but is set via compat logic for older
machine types.

As a result, when hv_synic and a modern machine type are specified, QEMU
will refuse to run unless vp_index is on and the kernel is recent
enough.  OTOH with older machine types QEMU will run fine against an
older kernel and/or without vp_index enabled but will refuse the in-QEMU
uses of SynIC (e.g. VMBus).

Also a function is added that allows the devices to query the status of
SynIC support across vCPUs.
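
The resulting behaviour can be summarised as a small decision function. This is a sketch of the policy as described in this message, not code from the patch: ToySynicMode and toy_synic_mode are hypothetical names, and the three booleans stand for the compat property, the hyperv_vpindex setting, and the availability of KVM_CAP_HYPERV_SYNIC2.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToySynicMode {
    TOY_SYNIC_REFUSE_START, /* QEMU refuses to run with this configuration */
    TOY_SYNIC_KVM_ONLY,     /* only in-KVM users (e.g. SynIC timers) work */
    TOY_SYNIC_FULL,         /* in-QEMU users (e.g. VMBus) are allowed too */
} ToySynicMode;

static ToySynicMode toy_synic_mode(bool in_kvm_only, bool vp_index,
                                   bool kernel_has_synic2)
{
    if (in_kvm_only) {
        /* older machine types: keep working, but no in-QEMU SynIC use */
        return TOY_SYNIC_KVM_ONLY;
    }
    if (!vp_index || !kernel_has_synic2) {
        /* modern machine types insist on both prerequisites */
        return TOY_SYNIC_REFUSE_START;
    }
    return TOY_SYNIC_FULL;
}
```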

Signed-off-by: Roman Kagan 
---
 include/hw/i386/pc.h |  5 
 target/i386/hyperv.h |  4 ++-
 target/i386/hyperv.c | 70 +++-
 target/i386/kvm.c|  8 +++---
 4 files changed, 80 insertions(+), 7 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index bb49165fe0..744f6a20d2 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -352,6 +352,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 .property = "extended-tseg-mbytes",\
 .value= stringify(0),\
 },\
+{\
+.driver   = "hyperv-synic",\
+.property = "in-kvm-only",\
+.value= "on",\
+},\
 
 #define PC_COMPAT_2_8 \
 HW_COMPAT_2_8 \
diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index 20bbd7bb29..249bc15232 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -34,8 +34,10 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
 uint32_t hyperv_vp_index(X86CPU *cpu);
 X86CPU *hyperv_find_vcpu(uint32_t vp_index);
 
-void hyperv_synic_add(X86CPU *cpu);
+int hyperv_synic_add(X86CPU *cpu);
 void hyperv_synic_reset(X86CPU *cpu);
 void hyperv_synic_update(X86CPU *cpu);
 
+bool hyperv_synic_usable(void);
+
 #endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index a27d33acb3..933bfe5bcb 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -14,6 +14,7 @@
 #include "qemu/osdep.h"
 #include "qemu/main-loop.h"
 #include "qapi/error.h"
+#include "qemu/error-report.h"
 #include "hw/qdev-properties.h"
 #include "hyperv.h"
 #include "hyperv-proto.h"
@@ -23,6 +24,8 @@ typedef struct SynICState {
 
 X86CPU *cpu;
 
+bool in_kvm_only;
+
 bool enabled;
 hwaddr msg_page_addr;
 hwaddr evt_page_addr;
@@ -78,6 +81,10 @@ static void synic_update_evt_page_addr(SynICState *synic)
 
 static void synic_update(SynICState *synic)
 {
+if (synic->in_kvm_only) {
+return;
+}
+
 synic->enabled = synic->cpu->env.msr_hv_synic_control & HV_SYNIC_ENABLE;
 synic_update_msg_page_addr(synic);
 synic_update_evt_page_addr(synic);
@@ -154,6 +161,7 @@ HvSintRoute *hyperv_sint_route_new(uint32_t vp_index, 
uint32_t sint,
 }
 
 synic = get_synic(cpu);
+assert(!synic->in_kvm_only);
 
 sint_route = g_new0(HvSintRoute, 1);
 r = event_notifier_init(&sint_route->sint_set_notifier, false);
@@ -240,17 +248,32 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route)
 return event_notifier_set(&sint_route->sint_set_notifier);
 }
 
+static Property synic_props[] = {
+/* user-invisible, only used for compat handling */
+DEFINE_PROP_BOOL("in-kvm-only", SynICState, in_kvm_only, false),
+DEFINE_PROP_END_OF_LIST(),
+};
+
 static void synic_realize(DeviceState *dev, Error **errp)
 {
 Object *obj = OBJECT(dev);
 SynICState *synic = SYNIC(dev);
 
+if (synic->in_kvm_only) {
+return;
+}
+
 synic->cpu = X86_CPU(obj->parent);
 }
 
 static void synic_reset(DeviceState *dev)
 {
 SynICState *synic = SYNIC(dev);
+
+if (synic->in_kvm_only) {
+return;
+}
+
 synic_update(synic);
 }
 
@@ -258,19 +281,45 @@ static void synic_class_init(ObjectClass *klass, void 
*data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
+dc->props = synic_props;
 dc->realize = synic_realize;
 dc->reset = synic_reset;
 dc->user_creatable = false;
 }
 
-void hyperv_synic_add(X86CPU *cpu)
+int hyperv_synic_add(X86CPU *cpu)
 {
 Object *obj;
+SynICState *synic;
+uint32_t synic_cap;
+int ret;
 
 obj = object_new(TYPE_SYNIC);
 object_property_add_child(OBJECT(cpu), "synic", obj, &error_abort);
 object_unref(obj);
+
+synic = SYNIC(obj);
+
+if (!synic->in_kvm_only) {
+synic_cap = KVM_CAP_HYPERV_SYNIC2;
+if (!cpu->hyperv_vpindex) {
+error_report("Hyper-V SynIC requires VP_INDEX

Re: [Qemu-devel] [PATCH RFC 17/21] qapi/types qapi/visit: Generate built-in stuff into separate files

2018-02-06 Thread Eric Blake


On 02/02/2018 07:03 AM, Markus Armbruster wrote:

Linking code from multiple separate QAPI schemata into the same
program is possible, but involves some weirdness around built-in
types:

* We generate code for built-in types into .c only with option
   --builtins.  The user is responsible to generate code for exactly
   one QAPI schema per program with --builtins.

* We generate code for them it into .h regardless of --builtins,


s/them it/them/


   guarded by #ifndef QAPI_VISIT_BUILTIN.  Because the code for
   built-in types is exactly the same in all of them, including any
   combination of these headers works.

Replace this contraption by something more conventional: generate code
for built-in types into their very own files: qapi-builtin-types.c,
qapi-builtin-visit.c, qapi-builtin-types.h, qapi-builtin-visit.h, but
only with --builtins.  Obey --output-dir, but ignore --prefix for
them.

Make qapi-types.h include qapi-builtin-types.h.  With multiple
schemata you now have multiple qapi-types.[ch], but only one
qapi-builtin-types.[ch].  Same for qapi-visit.[ch] and
qapi-builtin-visit.[ch].

Bonus: if all you need is built-in stuff, you can include a much
smaller header.  To be exploited shortly.

Signed-off-by: Markus Armbruster 
---


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

[Qemu-devel] [RFC PATCH 22/34] i386: Hyper-V VMBus ACPI DSDT entry

2018-02-06 Thread Roman Kagan

From: Andrey Smetanin 

Guest OS uses ACPI to discover vmbus presence.  Add a corresponding
entry to DSDT in case vmbus has been enabled.

Experiments showed that Windows guests require this entry to include
two IRQ resources, so this patch adds two semi-arbitrarily chosen ones
(7 and 13).  One consequence is that the parallel port conflicts with
vmbus.

TODO: discover and use spare IRQs to avoid conflicts.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Roman Kagan 
---
 hw/i386/acpi-build.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ed78c4ed9f..6f8cd3eb41 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -46,6 +46,7 @@
 #include "sysemu/tpm_backend.h"
 #include "hw/timer/mc146818rtc_regs.h"
 #include "sysemu/numa.h"
+#include "hw/vmbus/vmbus.h"
 
 /* Supported chipsets: */
 #include "hw/acpi/piix4.h"
@@ -1317,6 +1318,43 @@ static Aml *build_com_device_aml(uint8_t uid)
 return dev;
 }
 
+static Aml *build_vmbus_device_aml(void)
+{
+Aml *dev;
+Aml *method;
+Aml *crs;
+
+dev = aml_device("VMBS");
+aml_append(dev, aml_name_decl("STA", aml_int(0xF)));
+aml_append(dev, aml_name_decl("_HID", aml_string("VMBus")));
+aml_append(dev, aml_name_decl("_UID", aml_int(0x0)));
+aml_append(dev, aml_name_decl("_DDN", aml_string("VMBUS")));
+
+method = aml_method("_DIS", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_store(aml_and(aml_name("STA"), aml_int(0xD), NULL),
+ aml_name("STA")));
+aml_append(dev, method);
+
+method = aml_method("_PS0", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_store(aml_or(aml_name("STA"), aml_int(0xF), NULL),
+ aml_name("STA")));
+aml_append(dev, method);
+
+method = aml_method("_STA", 0, AML_NOTSERIALIZED);
+aml_append(method, aml_store(aml_name("STA"), aml_local(0)));
+aml_append(method, aml_return(aml_local(0)));
+aml_append(dev, method);
+
+aml_append(dev, aml_name_decl("_PS3", aml_int(0x0)));
+
+crs = aml_resource_template();
+aml_append(crs, aml_irq_no_flags(7));
+aml_append(crs, aml_irq_no_flags(13));
+aml_append(dev, aml_name_decl("_CRS", crs));
+
+return dev;
+}
+
 static void build_isa_devices_aml(Aml *table)
 {
 ISADevice *fdc = pc_find_fdc0();
@@ -1343,6 +1381,10 @@ static void build_isa_devices_aml(Aml *table)
 build_acpi_ipmi_devices(scope, BUS(obj));
 }
 
+if (vmbus_exists()) {
+aml_append(scope, build_vmbus_device_aml());
+}
+
 aml_append(table, scope);
 }
 
-- 
2.14.3
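
The _DIS and _PS0 methods in the patch above only do bit arithmetic on
the STA value.  A minimal sketch of that arithmetic in plain C, assuming
the usual ACPI _STA bit layout (bit 0 present, bit 1 enabled, bit 2
shown in UI, bit 3 functioning); the helper and macro names are
hypothetical and are not part of QEMU:

```c
#include <stdint.h>

/* ACPI _STA status bits (names are illustrative, not QEMU's). */
#define STA_PRESENT     (1u << 0)
#define STA_ENABLED     (1u << 1)
#define STA_SHOW_IN_UI  (1u << 2)
#define STA_FUNCTIONING (1u << 3)

/* Mirrors _DIS: STA &= 0xD clears only the "enabled" bit. */
static uint32_t vmbus_sta_disable(uint32_t sta)
{
    return sta & 0xDu;
}

/* Mirrors _PS0: STA |= 0xF sets all four low status bits again. */
static uint32_t vmbus_sta_power_on(uint32_t sta)
{
    return sta | 0xFu;
}
```

Starting from the initial value 0xF, disabling yields 0xD (present,
shown, functioning, but not enabled), and _PS0 brings it back to 0xF.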

[Qemu-devel] [RFC PATCH 06/34] hyperv: address HvSintRoute by X86CPU pointer

2018-02-06 Thread Roman Kagan

Use X86CPU pointer to refer to the respective HvSintRoute instead of
vp_index.  This is more convenient and also paves the way for future
enhancements.

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index b2416f9a5b..0ce8a7aa2f 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -18,7 +18,7 @@
 
 struct HvSintRoute {
 uint32_t sint;
-uint32_t vcpu_id;
+X86CPU *cpu;
 int gsi;
 EventNotifier sint_set_notifier;
 EventNotifier sint_ack_notifier;
@@ -97,6 +97,12 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, 
uint32_t sint,
 HvSintRoute *sint_route;
 EventNotifier *ack_notifier;
 int r, gsi;
+X86CPU *cpu;
+
+cpu = hyperv_find_vcpu(vp_index);
+if (!cpu) {
+return NULL;
+}
 
 sint_route = g_new0(HvSintRoute, 1);
 r = event_notifier_init(&sint_route->sint_set_notifier, false);
@@ -128,7 +134,7 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, 
uint32_t sint,
 sint_route->gsi = gsi;
 sint_route->sint_ack_clb = sint_ack_clb;
 sint_route->sint_ack_clb_data = sint_ack_clb_data;
-sint_route->vcpu_id = vp_index;
+sint_route->cpu = cpu;
 sint_route->sint = sint;
 
 return sint_route;
-- 
2.14.3

Re: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

2018-02-06 Thread Alex Bennée


Peter Maydell  writes:

> On 4 February 2018 at 04:11, Richard Henderson
>  wrote:
>> Or there's the code from glibc.  I know Peter didn't like the idea;
>> debugging this code is fairly painful -- the massive preprocessor
>> macros mean that you can't step through anything.  But at least we
>> have a good relationship with glibc, so merging patches back and
>> forth should be easy.
>
> Yeah. I didn't like dealing with this code two decades ago
> when I first encountered it, and it hasn't improved any.
> It's pretty much write-only code, and it isn't going to be
> any fun for debugging.

I think I've managed to pull the performance back on softfloat-v4 thanks
to the attribute(flatten) changes to addsub/div/mul/mulladd.

--
Alex Bennée

[Qemu-devel] [RFC PATCH 19/34] vmbus: add vmbus protocol definitions

2018-02-06 Thread Roman Kagan

Add a header with data structures and constants used in Hyper-V VMBus
hypervisor <-> guest interactions.

Based on the corresponding definitions in the Linux kernel.

Signed-off-by: Roman Kagan 
---
 include/hw/vmbus/vmbus-proto.h | 222 +
 1 file changed, 222 insertions(+)
 create mode 100644 include/hw/vmbus/vmbus-proto.h

diff --git a/include/hw/vmbus/vmbus-proto.h b/include/hw/vmbus/vmbus-proto.h
new file mode 100644
index 00..1a60309650
--- /dev/null
+++ b/include/hw/vmbus/vmbus-proto.h
@@ -0,0 +1,222 @@
+/*
+ * QEMU Hyper-V VMBus support
+ *
+ * Copyright (c) 2017-2018 Virtuozzo International GmbH.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef _INCLUDE_HYPERV_VMBUS_H_
+#define _INCLUDE_HYPERV_VMBUS_H_
+
+#define VMBUS_VERSION_WS2008    ((0 << 16) | (13))
+#define VMBUS_VERSION_WIN7      ((1 << 16) | (1))
+#define VMBUS_VERSION_WIN8      ((2 << 16) | (4))
+#define VMBUS_VERSION_WIN8_1    ((3 << 16) | (0))
+#define VMBUS_VERSION_WIN10     ((4 << 16) | (0))
+#define VMBUS_VERSION_INVAL     -1
+#define VMBUS_VERSION_CURRENT   VMBUS_VERSION_WIN10
+
+#define VMBUS_MESSAGE_CONNECTION_ID 1
+#define VMBUS_EVENT_CONNECTION_ID   2
+#define VMBUS_MONITOR_CONNECTION_ID 3
+#define VMBUS_SINT  2
+
+#define VMBUS_MSG_INVALID               0
+#define VMBUS_MSG_OFFERCHANNEL          1
+#define VMBUS_MSG_RESCIND_CHANNELOFFER  2
+#define VMBUS_MSG_REQUESTOFFERS         3
+#define VMBUS_MSG_ALLOFFERS_DELIVERED   4
+#define VMBUS_MSG_OPENCHANNEL           5
+#define VMBUS_MSG_OPENCHANNEL_RESULT    6
+#define VMBUS_MSG_CLOSECHANNEL          7
+#define VMBUS_MSG_GPADL_HEADER          8
+#define VMBUS_MSG_GPADL_BODY            9
+#define VMBUS_MSG_GPADL_CREATED         10
+#define VMBUS_MSG_GPADL_TEARDOWN        11
+#define VMBUS_MSG_GPADL_TORNDOWN        12
+#define VMBUS_MSG_RELID_RELEASED        13
+#define VMBUS_MSG_INITIATE_CONTACT      14
+#define VMBUS_MSG_VERSION_RESPONSE      15
+#define VMBUS_MSG_UNLOAD                16
+#define VMBUS_MSG_UNLOAD_RESPONSE       17
+#define VMBUS_MSG_COUNT                 18
+
+#define VMBUS_MESSAGE_SIZE_ALIGN        sizeof(uint64_t)
+
+#define VMBUS_PACKET_INVALID                    0x0
+#define VMBUS_PACKET_SYNCH                      0x1
+#define VMBUS_PACKET_ADD_XFER_PAGESET           0x2
+#define VMBUS_PACKET_RM_XFER_PAGESET            0x3
+#define VMBUS_PACKET_ESTABLISH_GPADL            0x4
+#define VMBUS_PACKET_TEARDOWN_GPADL             0x5
+#define VMBUS_PACKET_DATA_INBAND                0x6
+#define VMBUS_PACKET_DATA_USING_XFER_PAGES      0x7
+#define VMBUS_PACKET_DATA_USING_GPADL           0x8
+#define VMBUS_PACKET_DATA_USING_GPA_DIRECT      0x9
+#define VMBUS_PACKET_CANCEL_REQUEST             0xa
+#define VMBUS_PACKET_COMP                       0xb
+#define VMBUS_PACKET_DATA_USING_ADDITIONAL_PKT  0xc
+#define VMBUS_PACKET_ADDITIONAL_DATA            0xd
+
+#define VMBUS_CHANNEL_USER_DATA_SIZE    120
+
+#define VMBUS_OFFER_MONITOR_ALLOCATED   0x1
+#define VMBUS_OFFER_INTERRUPT_DEDICATED 0x1
+
+#define VMBUS_RING_BUFFER_FEAT_PENDING_SZ   (1ul << 0)
+
+#define VMBUS_CHANNEL_ENUMERATE_DEVICE_INTERFACE      0x1
+#define VMBUS_CHANNEL_SERVER_SUPPORTS_TRANSFER_PAGES  0x2
+#define VMBUS_CHANNEL_SERVER_SUPPORTS_GPADLS          0x4
+#define VMBUS_CHANNEL_NAMED_PIPE_MODE                 0x10
+#define VMBUS_CHANNEL_LOOPBACK_OFFER                  0x100
+#define VMBUS_CHANNEL_PARENT_OFFER                    0x200
+#define VMBUS_CHANNEL_REQUEST_MONITORED_NOTIFICATION  0x400
+#define VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER            0x2000
+
+#define VMBUS_PACKET_FLAG_REQUEST_COMPLETION    1
+
+typedef struct vmbus_message_header {
+uint32_t message_type;
+uint32_t _padding;
+} vmbus_message_header;
+
+typedef struct vmbus_message_initiate_contact {
+vmbus_message_header header;
+uint32_t version_requested;
+uint32_t target_vcpu;
+uint64_t interrupt_page;
+uint64_t monitor_page1;
+uint64_t monitor_page2;
+} vmbus_message_initiate_contact;
+
+typedef struct vmbus_message_version_response {
+vmbus_message_header header;
+uint8_t version_supported;
+uint8_t status;
+} vmbus_message_version_response;
+
+typedef struct vmbus_message_offer_channel {
+vmbus_message_header header;
+uint8_t  type_uuid[16];
+uint8_t  instance_uuid[16];
+uint64_t _reserved1;
+uint64_t _reserved2;
+uint16_t channel_flags;
+uint16_t mmio_size_mb;
+uint8_t  user_data[VMBUS_CHANNEL_USER_DATA_SIZE];
+uint16_t sub_channel_index;
+uint16_t _reserved3;
+uint32_t child_relid;
+uint8_t  monitor_id;
+uint8_t  monitor_flags;  // VMBUS_OFFER_MONITOR_*
+uint16_t

[Qemu-devel] [RFC PATCH 04/34] hyperv: synic: only setup ack notifier if there's a callback

2018-02-06 Thread Roman Kagan

There's no point setting up an sint ack notifier if no callback is
specified.

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.c | 33 +++--
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index e762eac79f..f3ffafa4e9 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -77,15 +77,14 @@ static void kvm_hv_sint_ack_handler(EventNotifier *notifier)
 HvSintRoute *sint_route = container_of(notifier, HvSintRoute,
sint_ack_notifier);
 event_notifier_test_and_clear(notifier);
-if (sint_route->sint_ack_clb) {
-sint_route->sint_ack_clb(sint_route);
-}
+sint_route->sint_ack_clb(sint_route);
 }
 
 HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, uint32_t sint,
   HvSintAckClb sint_ack_clb)
 {
 HvSintRoute *sint_route;
+EventNotifier *ack_notifier;
 int r, gsi;
 
 sint_route = g_new0(HvSintRoute, 1);
@@ -94,13 +93,15 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, 
uint32_t sint,
 goto err;
 }
 
-r = event_notifier_init(&sint_route->sint_ack_notifier, false);
-if (r) {
-goto err_sint_set_notifier;
-}
+ack_notifier = sint_ack_clb ? &sint_route->sint_ack_notifier : NULL;
+if (ack_notifier) {
+r = event_notifier_init(ack_notifier, false);
+if (r) {
+goto err_sint_set_notifier;
+}
 
-event_notifier_set_handler(&sint_route->sint_ack_notifier,
-   kvm_hv_sint_ack_handler);
+event_notifier_set_handler(ack_notifier, kvm_hv_sint_ack_handler);
+}
 
 gsi = kvm_irqchip_add_hv_sint_route(kvm_state, vp_index, sint);
 if (gsi < 0) {
@@ -109,7 +110,7 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, 
uint32_t sint,
 
 r = kvm_irqchip_add_irqfd_notifier_gsi(kvm_state,
 &sint_route->sint_set_notifier,
-   &sint_route->sint_ack_notifier, gsi);
+   ack_notifier, gsi);
 if (r) {
 goto err_irqfd;
 }
@@ -123,8 +124,10 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, 
uint32_t sint,
 err_irqfd:
 kvm_irqchip_release_virq(kvm_state, gsi);
 err_gsi:
-event_notifier_set_handler(&sint_route->sint_ack_notifier, NULL);
-event_notifier_cleanup(&sint_route->sint_ack_notifier);
+if (ack_notifier) {
+event_notifier_set_handler(ack_notifier, NULL);
+event_notifier_cleanup(ack_notifier);
+}
 err_sint_set_notifier:
 event_notifier_cleanup(&sint_route->sint_set_notifier);
 err:
@@ -139,8 +142,10 @@ void kvm_hv_sint_route_destroy(HvSintRoute *sint_route)
   &sint_route->sint_set_notifier,
   sint_route->gsi);
 kvm_irqchip_release_virq(kvm_state, sint_route->gsi);
-event_notifier_set_handler(&sint_route->sint_ack_notifier, NULL);
-event_notifier_cleanup(&sint_route->sint_ack_notifier);
+if (sint_route->sint_ack_clb) {
+event_notifier_set_handler(&sint_route->sint_ack_notifier, NULL);
+event_notifier_cleanup(&sint_route->sint_ack_notifier);
+}
 event_notifier_cleanup(&sint_route->sint_set_notifier);
 g_free(sint_route);
 }
-- 
2.14.3
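
The pattern in this patch (set up the ack notifier only when a callback
is given, and tear down only what was set up) can be sketched outside
QEMU as follows.  This is an illustration under stated assumptions, not
the QEMU code: Notifier, Route, route_create() and route_destroy() are
hypothetical stand-ins for EventNotifier, HvSintRoute and the
kvm_hv_sint_route_* functions, and the callback takes no arguments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*AckCallback)(void);

/* Hypothetical stand-in for EventNotifier: it only records its state. */
typedef struct Notifier {
    bool ready;
} Notifier;

typedef struct Route {
    Notifier set_notifier;
    Notifier ack_notifier;   /* set up only when ack_cb is non-NULL */
    AckCallback ack_cb;
} Route;

static int notifier_init(Notifier *n)
{
    n->ready = true;
    return 0;
}

static void notifier_cleanup(Notifier *n)
{
    n->ready = false;
}

static Route *route_create(AckCallback ack_cb)
{
    Route *route = calloc(1, sizeof(*route));
    Notifier *ack;

    if (!route) {
        return NULL;
    }
    if (notifier_init(&route->set_notifier)) {
        goto err;
    }

    /* Same idea as the patch: NULL means "no ack notifier at all". */
    ack = ack_cb ? &route->ack_notifier : NULL;
    if (ack && notifier_init(ack)) {
        goto err_set_notifier;
    }

    route->ack_cb = ack_cb;
    return route;

err_set_notifier:
    notifier_cleanup(&route->set_notifier);
err:
    free(route);
    return NULL;
}

static void route_destroy(Route *route)
{
    /* Teardown mirrors setup: skip what was never initialised. */
    if (route->ack_cb) {
        notifier_cleanup(&route->ack_notifier);
    }
    notifier_cleanup(&route->set_notifier);
    free(route);
}

/* Example callback so that the sketch can be exercised. */
static void demo_ack(void)
{
}
```

Calling route_create(NULL) leaves ack_notifier untouched, while
route_create(demo_ack) initialises both notifiers.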

[Qemu-devel] [RFC PATCH 18/34] hyperv: add support for KVM_HYPERV_EVENTFD

2018-02-06 Thread Roman Kagan

When setting up a notifier for Hyper-V event connection, attempt to use
the KVM-assisted one first, and fall back to userspace handling of the
hypercall if the kernel doesn't provide the requested feature.

Signed-off-by: Roman Kagan 
---
 include/sysemu/kvm.h |  1 +
 accel/kvm/kvm-all.c  | 15 +++
 target/i386/hyperv.c | 21 -
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index bbf12a1723..70ad0a54b7 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -496,6 +496,7 @@ void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, 
int gsi);
 void kvm_pc_gsi_handler(void *opaque, int n, int level);
 void kvm_pc_setup_irq_routing(bool pci_enabled);
 void kvm_init_irq_routing(KVMState *s);
+int kvm_set_hv_event_notifier(KVMState *s, uint32_t conn_id, EventNotifier *n);
 
 /**
  * kvm_arch_irqchip_create:
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f290f487a5..c3ba87b701 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1423,6 +1423,21 @@ static void kvm_irqchip_create(MachineState *machine, 
KVMState *s)
 s->gsimap = g_hash_table_new(g_direct_hash, g_direct_equal);
 }
 
+int kvm_set_hv_event_notifier(KVMState *s, uint32_t conn_id, EventNotifier *n)
+{
+struct kvm_hyperv_eventfd hvevfd = {
+.conn_id = conn_id,
+.fd = n ? event_notifier_get_fd(n) : -1,
+.flags = n ? 0 : KVM_HYPERV_EVENTFD_DEASSIGN,
+};
+
+if (!kvm_check_extension(s, KVM_CAP_HYPERV_EVENTFD)) {
+return -ENOSYS;
+}
+
+return kvm_vm_ioctl(s, KVM_HYPERV_EVENTFD, &hvevfd);
+}
+
 /* Find number of supported CPUs using the recommended
  * procedure from the kernel API documentation to cope with
  * older kernels that may be missing capabilities.
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index e43cbb9322..63dcb23fa8 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -313,7 +313,8 @@ unlock:
 return ret;
 }
 
-int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier)
+static int hyperv_set_evt_notifier_userspace(uint32_t conn_id,
+ EventNotifier *notifier)
 {
 int ret;
 EvtHandler *eh;
@@ -346,6 +347,24 @@ unlock:
 return ret;
 }
 
+static bool hv_evt_notifier_userspace;
+
+int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier)
+{
+if (!hv_evt_notifier_userspace) {
+int ret = kvm_set_hv_event_notifier(kvm_state, conn_id, notifier);
+if (ret != -ENOSYS) {
+return ret;
+}
+
+hv_evt_notifier_userspace = true;
+warn_report("Hyper-V event signaling in KVM not supported; "
+"using slower userspace hypercall processing");
+}
+
+return hyperv_set_evt_notifier_userspace(conn_id, notifier);
+}
+
 static uint64_t hvcall_post_message(uint64_t param, bool fast)
 {
 uint64_t ret;
-- 
2.14.3
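
The fallback logic in hyperv_set_evt_notifier() above is a "try the fast
path once, then stick to the slow path" pattern.  A minimal standalone
sketch, assuming the fast path reports lack of support with -ENOSYS as
in the patch; NotifierBackend, backend_set_notifier() and the demo_*
helpers are hypothetical names used only for illustration:

```c
#include <errno.h>
#include <stdbool.h>

typedef int (*SetNotifierFn)(unsigned conn_id);

typedef struct NotifierBackend {
    SetNotifierFn accelerated;  /* returns -ENOSYS when unsupported */
    SetNotifierFn userspace;    /* always available, but slower */
    bool use_userspace;         /* sticky once the fast path is missing */
} NotifierBackend;

static int backend_set_notifier(NotifierBackend *b, unsigned conn_id)
{
    if (!b->use_userspace) {
        int ret = b->accelerated(conn_id);
        if (ret != -ENOSYS) {
            return ret;         /* success or a genuine error */
        }
        /* Remember the outcome so later calls skip the probe. */
        b->use_userspace = true;
    }
    return b->userspace(conn_id);
}

/* Demo backends that count how often each path is taken. */
static int demo_calls_accel;
static int demo_calls_user;

static int demo_accel_missing(unsigned conn_id)
{
    (void)conn_id;
    demo_calls_accel++;
    return -ENOSYS;
}

static int demo_userspace(unsigned conn_id)
{
    (void)conn_id;
    demo_calls_user++;
    return 0;
}
```

With the demo backends, the accelerated path is probed exactly once and
every call afterwards goes straight to the userspace handler.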

[Qemu-devel] [RFC PATCH 05/34] hyperv: allow passing arbitrary data to sint ack callback

2018-02-06 Thread Roman Kagan

Make the sint ack callback accept an opaque pointer that is stored in
sint_route at creation time.

This allows for more convenient interaction with the callback.

Besides, nothing outside hyperv.c should need to know the layout of
HvSintRoute fields any more so its declaration can be removed from the
header.

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.h | 14 +++---
 hw/misc/hyperv_testdev.c |  2 +-
 target/i386/hyperv.c | 16 ++--
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index 82f4757975..93f7300dd6 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -19,21 +19,13 @@
 #include "qemu/event_notifier.h"
 
 typedef struct HvSintRoute HvSintRoute;
-typedef void (*HvSintAckClb)(HvSintRoute *sint_route);
-
-struct HvSintRoute {
-uint32_t sint;
-uint32_t vcpu_id;
-int gsi;
-EventNotifier sint_set_notifier;
-EventNotifier sint_ack_notifier;
-HvSintAckClb sint_ack_clb;
-};
+typedef void (*HvSintAckClb)(void *data);
 
 int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit);
 
 HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, uint32_t sint,
-  HvSintAckClb sint_ack_clb);
+  HvSintAckClb sint_ack_clb,
+  void *sint_ack_clb_data);
 
 void kvm_hv_sint_route_destroy(HvSintRoute *sint_route);
 
diff --git a/hw/misc/hyperv_testdev.c b/hw/misc/hyperv_testdev.c
index b47af477cb..827a8b1d82 100644
--- a/hw/misc/hyperv_testdev.c
+++ b/hw/misc/hyperv_testdev.c
@@ -55,7 +55,7 @@ static void sint_route_create(HypervTestDev *dev, uint8_t 
vpidx, uint8_t sint)
 sint_route->vpidx = vpidx;
 sint_route->sint = sint;
 
-sint_route->sint_route = kvm_hv_sint_route_create(vpidx, sint, NULL);
+sint_route->sint_route = kvm_hv_sint_route_create(vpidx, sint, NULL, NULL);
 assert(sint_route->sint_route);
 
 QLIST_INSERT_HEAD(&dev->sint_routes, sint_route, le);
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index f3ffafa4e9..b2416f9a5b 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -16,6 +16,16 @@
 #include "hyperv.h"
 #include "hyperv-proto.h"
 
+struct HvSintRoute {
+uint32_t sint;
+uint32_t vcpu_id;
+int gsi;
+EventNotifier sint_set_notifier;
+EventNotifier sint_ack_notifier;
+HvSintAckClb sint_ack_clb;
+void *sint_ack_clb_data;
+};
+
 uint32_t hyperv_vp_index(X86CPU *cpu)
 {
 return CPU(cpu)->cpu_index;
@@ -77,11 +87,12 @@ static void kvm_hv_sint_ack_handler(EventNotifier *notifier)
 HvSintRoute *sint_route = container_of(notifier, HvSintRoute,
sint_ack_notifier);
 event_notifier_test_and_clear(notifier);
-sint_route->sint_ack_clb(sint_route);
+sint_route->sint_ack_clb(sint_route->sint_ack_clb_data);
 }
 
 HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, uint32_t sint,
-  HvSintAckClb sint_ack_clb)
+  HvSintAckClb sint_ack_clb,
+  void *sint_ack_clb_data)
 {
 HvSintRoute *sint_route;
 EventNotifier *ack_notifier;
@@ -116,6 +127,7 @@ HvSintRoute *kvm_hv_sint_route_create(uint32_t vp_index, 
uint32_t sint,
 }
 sint_route->gsi = gsi;
 sint_route->sint_ack_clb = sint_ack_clb;
+sint_route->sint_ack_clb_data = sint_ack_clb_data;
 sint_route->vcpu_id = vp_index;
 sint_route->sint = sint;
 
-- 
2.14.3
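
The opaque-pointer convention introduced here lets a caller hand its own
state to the callback without exposing the route structure.  A minimal
sketch of the idea; only the HvSintAckClb typedef is taken from the
patch, while AckSource, DemoDevice and the functions are hypothetical:

```c
#include <stddef.h>

/* Same shape as the typedef in the patch. */
typedef void (*HvSintAckClb)(void *data);

/* Hypothetical holder of a callback plus its opaque argument. */
typedef struct AckSource {
    HvSintAckClb clb;
    void *clb_data;
} AckSource;

/* Hypothetical caller-side state that the callback wants to reach. */
typedef struct DemoDevice {
    unsigned acks_seen;
} DemoDevice;

static void demo_device_on_ack(void *data)
{
    DemoDevice *dev = data;     /* recover the caller's own type */

    dev->acks_seen++;
}

/* The source never looks inside clb_data; it just passes it through. */
static void ack_source_fire(AckSource *src)
{
    if (src->clb) {
        src->clb(src->clb_data);
    }
}
```

The callback owner decides what the pointer means, so the source side
needs no knowledge of DemoDevice.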

Re: [Qemu-devel] [PATCH RFC 18/21] qapi/common: Fix guardname() for funny filenames

2018-02-06 Thread Eric Blake


On 02/02/2018 07:03 AM, Markus Armbruster wrote:

guardname() fails to return a valid C identifier for arguments
containing anything but [A-Za-z0-9_.-'].  Fix that.

Signed-off-by: Markus Armbruster 
---
  scripts/qapi/common.py | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/qapi/common.py b/scripts/qapi/common.py
index 7c78d9..7d497b5b17 100644
--- a/scripts/qapi/common.py
+++ b/scripts/qapi/common.py
@@ -1860,7 +1860,7 @@ def mcgen(code, **kwds):
  
  
  def guardname(filename):

-return c_name(filename, protect=False).upper()
+return re.sub(r'[^A-Za-z0-9_]', '_', filename).upper()


For some choices of filename, the old code prefixes a q_ (via c_name) 
which gets turned into Q_ in the final guard name.  The new code does 
not.  Then again, the names protected by c_name() all contain 
lower case, while guard names are all upper case; so we aren't 
protecting ourselves from defining a reserved word; the main use for 
c_name() is to protect ourselves where we are not changing case (for 
example, _BOOL is no better than Q__BOOL as a guard name for a file 
named _Bool).


Might be worth mentioning this design consideration in the commit 
message, but the change itself is reasonable.


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
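
For readers following the thread in C rather than Python, the effect of
the new guardname() can be sketched as below.  This is a C analogue
written for illustration only (the real function is the Python one-liner
in the patch); the buffer-based signature and the is_guard_char() helper
are assumptions:

```c
#include <ctype.h>
#include <stddef.h>

/* True for the characters the regex [A-Za-z0-9_] would keep. */
static int is_guard_char(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '_';
}

/*
 * Replace every other character with '_' and upper-case the rest,
 * truncating to fit out_size (including the terminating NUL).
 */
static void guardname(const char *filename, char *out, size_t out_size)
{
    size_t i;

    if (out_size == 0) {
        return;
    }
    for (i = 0; filename[i] != '\0' && i + 1 < out_size; i++) {
        unsigned char c = (unsigned char)filename[i];

        out[i] = is_guard_char(c) ? (char)toupper(c) : '_';
    }
    out[i] = '\0';
}
```

As in the review above, a file named _Bool maps to _BOOL, with no Q_
prefix added.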

[Qemu-devel] [RFC PATCH 14/34] hyperv: process POST_MESSAGE hypercall

2018-02-06 Thread Roman Kagan

Add handling of the POST_MESSAGE hypercall.  For that, add an interface
to register a handler for messages arriving from the guest on a
particular connection id (IOW set up a message connection in Hyper-V
speak).

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.h |  5 +++
 target/i386/hyperv.c | 87 
 2 files changed, 92 insertions(+)

diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index 4ce41fe314..fcc41caf1f 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -43,6 +43,11 @@ int hyperv_post_msg(HvSintRoute *sint_route, struct 
hyperv_message *msg);
 
 int hyperv_set_evt_flag(HvSintRoute *sint_route, unsigned evtno);
 
+struct hyperv_post_message_input;
+typedef uint64_t (*HvMsgHandler)(const struct hyperv_post_message_input *msg,
+ void *data);
+int hyperv_set_msg_handler(uint32_t conn_id, HvMsgHandler handler, void *data);
+
 int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier);
 
 #endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 9cf1225385..3dc8a7acb0 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -252,6 +252,14 @@ static void async_synic_update(CPUState *cs, 
run_on_cpu_data data)
 qemu_mutex_unlock_iothread();
 }
 
+typedef struct MsgHandler {
+struct rcu_head rcu;
+QLIST_ENTRY(MsgHandler) le;
+uint32_t conn_id;
+HvMsgHandler handler;
+void *data;
+} MsgHandler;
+
 typedef struct EvtHandler {
 struct rcu_head rcu;
 QLIST_ENTRY(EvtHandler) le;
@@ -259,15 +267,51 @@ typedef struct EvtHandler {
 EventNotifier *notifier;
 } EvtHandler;
 
+static QLIST_HEAD(, MsgHandler) msg_handlers;
 static QLIST_HEAD(, EvtHandler) evt_handlers;
 static QemuMutex handlers_mutex;
 
 static void __attribute__((constructor)) hv_init(void)
 {
+QLIST_INIT(&msg_handlers);
 QLIST_INIT(&evt_handlers);
 qemu_mutex_init(&handlers_mutex);
 }
 
+int hyperv_set_msg_handler(uint32_t conn_id, HvMsgHandler handler, void *data)
+{
+int ret;
+MsgHandler *mh;
+
+qemu_mutex_lock(&handlers_mutex);
+QLIST_FOREACH(mh, &msg_handlers, le) {
+if (mh->conn_id == conn_id) {
+if (handler) {
+ret = -EEXIST;
+} else {
+QLIST_REMOVE_RCU(mh, le);
+g_free_rcu(mh, rcu);
+ret = 0;
+}
+goto unlock;
+}
+}
+
+if (handler) {
+mh = g_new(MsgHandler, 1);
+mh->conn_id = conn_id;
+mh->handler = handler;
+mh->data = data;
+QLIST_INSERT_HEAD_RCU(&msg_handlers, mh, le);
+ret = 0;
+} else {
+ret = -ENOENT;
+}
+unlock:
+qemu_mutex_unlock(&handlers_mutex);
+return ret;
+}
+
 int hyperv_set_evt_notifier(uint32_t conn_id, EventNotifier *notifier)
 {
 int ret;
@@ -301,6 +345,46 @@ unlock:
 return ret;
 }
 
+static uint64_t hvcall_post_message(uint64_t param, bool fast)
+{
+uint64_t ret;
+hwaddr len;
+struct hyperv_post_message_input *msg;
+MsgHandler *mh;
+
+if (fast) {
+return HV_STATUS_INVALID_HYPERCALL_CODE;
+}
+if (param & (__alignof__(*msg) - 1)) {
+return HV_STATUS_INVALID_ALIGNMENT;
+}
+
+len = sizeof(*msg);
+msg = cpu_physical_memory_map(param, &len, 0);
+if (len < sizeof(*msg)) {
+ret = HV_STATUS_INSUFFICIENT_MEMORY;
+goto unmap;
+}
+if (msg->payload_size > sizeof(msg->payload)) {
+ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
+goto unmap;
+}
+
+ret = HV_STATUS_INVALID_CONNECTION_ID;
+rcu_read_lock();
+QLIST_FOREACH_RCU(mh, &msg_handlers, le) {
+if (mh->conn_id == (msg->connection_id & HV_CONNECTION_ID_MASK)) {
+ret = mh->handler(msg, mh->data);
+break;
+}
+}
+rcu_read_unlock();
+
+unmap:
+cpu_physical_memory_unmap(msg, len, 0, 0);
+return ret;
+}
+
 static uint64_t sigevent_params(hwaddr addr, uint32_t *conn_id)
 {
 uint64_t ret;
@@ -389,6 +473,9 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit 
*exit)
 uint64_t param = exit->u.hcall.params[0];
 
 switch (code) {
+case HV_POST_MESSAGE:
+exit->u.hcall.result = hvcall_post_message(param, fast);
+break;
 case HV_SIGNAL_EVENT:
 exit->u.hcall.result = hvcall_signal_event(param, fast);
 break;
-- 
2.14.3
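
The register/unregister semantics of hyperv_set_msg_handler() above
(-EEXIST when a handler is already set, -ENOENT when removing one that
is not there, a NULL handler meaning "remove") can be sketched without
the locking and RCU machinery.  This is a simplified model, not the
patch's implementation: it uses a fixed-size table instead of a QLIST,
so the -ENOSPC case and all names below are assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLERS 8

typedef uint64_t (*MsgHandlerFn)(const void *msg, void *data);

typedef struct MsgSlot {
    uint32_t conn_id;
    MsgHandlerFn handler;   /* NULL marks a free slot */
    void *data;
} MsgSlot;

static MsgSlot slots[MAX_HANDLERS];

static int set_msg_handler(uint32_t conn_id, MsgHandlerFn handler,
                           void *data)
{
    MsgSlot *free_slot = NULL;
    size_t i;

    for (i = 0; i < MAX_HANDLERS; i++) {
        if (slots[i].handler && slots[i].conn_id == conn_id) {
            if (handler) {
                return -EEXIST;     /* refuse to replace silently */
            }
            slots[i].handler = NULL;    /* NULL handler: unregister */
            return 0;
        }
        if (!slots[i].handler && !free_slot) {
            free_slot = &slots[i];
        }
    }

    if (!handler) {
        return -ENOENT;             /* nothing to remove */
    }
    if (!free_slot) {
        return -ENOSPC;             /* only in this fixed-size model */
    }
    free_slot->conn_id = conn_id;
    free_slot->handler = handler;
    free_slot->data = data;
    return 0;
}

/* Trivial handler so that the sketch can be exercised. */
static uint64_t demo_handler(const void *msg, void *data)
{
    (void)msg;
    (void)data;
    return 0;
}
```

Registering the same connection id twice fails with -EEXIST until the
first handler is removed by passing NULL.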

[Qemu-devel] [RFC PATCH 34/34] hv-net: define default rom file name

2018-02-06 Thread Roman Kagan

Signed-off-by: Roman Kagan 
---
 hw/net/hv-net.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/net/hv-net.c b/hw/net/hv-net.c
index 3d719458ea..b44ab20b0f 100644
--- a/hw/net/hv-net.c
+++ b/hw/net/hv-net.c
@@ -1423,6 +1423,7 @@ static void hv_net_class_init(ObjectClass *klass, void 
*data)
 vdc->num_channels = hv_net_num_channels;
 vdc->close_channel = hv_net_close_channel;
 vdc->chan_notify_cb = hv_net_notify_cb;
+vdc->romfile = "efi-hyperv.rom";
 }
 
 static void hv_net_instance_init(Object *obj)
-- 
2.14.3

[Qemu-devel] [RFC PATCH 00/34] Hyper-V / VMBus

2018-02-06 Thread Roman Kagan

This is a work-in-progress series with Hyper-V / VMBus device emulation.
It's still very raw but it's testable so I'd appreciate feedback on
whether the design is sound.

This stuff can also be seen at https://src.openvz.org/scm/up/qemu.git.

Current status of the components included:

* Hyper-V infrastructure (SynIC, VP_INDEX, events & messages, etc):
  - works, mostly submission-ready, but some issues still need addressing

* VMBus infrastructure (types, vmbus state machine, channel
  communication, etc):
  - works in main scenarios

* SCSI controller:
  - works in main scenarios
  - supports multiqueue
  - iothread-unaware
  - performance competitive with virtio-scsi [*]
  - can be used for booting [**]

* network interface:
  - basic operation only
  - no mac filtering
  - no offloads
  - no multiqueue
  - can be used for booting with limitations [***]

Overall, it badly needs documentation, tests, and a proper patch split.
Migration works but has only been tested very lightly.

[*] basic measurements with fio rw=randread bs=4k iodepth=256 using
null-co backend give
o Windows guests: hv-scsi 10%-30% better than virtio-scsi
o Linux guests: hv-scsi 10%-20% worse than virtio-scsi
No performance analysis has been done yet.

[**] patches for SeaBIOS and OVMF can be found at
 https://src.openvz.org/scm/up/seabios.git and
 https://src.openvz.org/scm/up/edk2.git

[***] patches for iPXE can be found at
  https://src.openvz.org/scm/up/ipxe.git, limitations:
  - OVMF is not yet supported
  - iPXE takes over VMBus state management from SeaBIOS, so if iPXE
    fails to boot off hv-net, booting off hv-scsi doesn't work

Prerequisites:
 - KVM 4.14+
 - HYPERV_EVENTFD support (in kvm/queue) is optional but recommended for
   better performance

How to use:

# qemu \
... \
-machine ...,accel=kvm,vmbus \
-cpu ...,hv_vpindex,hv_synic,hv_stimer,(other hv_* recommended),kvm=off \
-nodefaults \
...
-device hv-scsi,(usual scsi props),(instanceid= optional) \
-drive ... \
-device scsi-hd,... \
...
-netdev ... \
-device hv-net,(usual net props),(instanceid= optional) \
...


Big kudos to Andrey Smetanin and Evgeny Yakovlev for their research and
prototyping.

Andrey Smetanin (1):
  i386: Hyper-V VMBus ACPI DSDT entry

Evgeny Yakovlev (1):
  vmbus: build configuration

Roman Kagan (32):
  hyperv: ensure VP index equal to QEMU cpu_index
  hyperv_testdev: refactor for readability
  hyperv: cosmetic: g_malloc -> g_new
  hyperv: synic: only setup ack notifier if there's a callback
  hyperv: allow passing arbitrary data to sint ack callback
  hyperv: address HvSintRoute by X86CPU pointer
  hyperv: make HvSintRoute reference-counted
  hyperv: qom-ify SynIC
  hyperv: block SynIC use in QEMU in incompatible configurations
  hyperv: make overlay pages for SynIC
  hyperv: add synic message delivery
  hyperv: add synic event flag signaling
  hyperv: process SIGNAL_EVENT hypercall
  hyperv: process POST_MESSAGE hypercall
  hyperv_testdev: add SynIC message and event testmodes
  hyperv: update copyright notices
  [not to commit] import HYPERV_EVENTFD stuff from kernel
  hyperv: add support for KVM_HYPERV_EVENTFD
  vmbus: add vmbus protocol definitions
  vmbus: vmbus implementation
  i386: en/disable vmbus by a machine property
  scsi: add Hyper-V/VMBus SCSI protocol definitions
  scsi: add Hyper-V/VMBus SCSI controller
  hv-scsi: limit the number of requests per notification
  tests: hv-scsi: add start-stop test
  net: add RNDIS definitions
  net: add Hyper-V/VMBus network protocol definitions
  net: add Hyper-V/VMBus net adapter
  hv-net: add .bootindex support
  loader: allow arbitrary basename for fw_cfg file roms
  vmbus: add support for rom files
  hv-net: define default rom file name

 configure  |   11 +
 Makefile.objs  |1 +
 hw/net/hvnet-proto.h   | 1161 +++
 hw/net/rndis.h |  391 +++
 hw/scsi/hvscsi-proto.h |  150 +++
 include/hw/i386/pc.h   |8 +
 include/hw/loader.h|2 +-
 include/hw/vmbus/vmbus-proto.h |  222 
 include/hw/vmbus/vmbus.h   |  109 ++
 include/sysemu/kvm.h   |1 +
 linux-headers/linux/kvm.h  |   14 +
 target/i386/hyperv.h   |   40 +-
 accel/kvm/kvm-all.c|   15 +
 hw/core/loader.c   |   43 +-
 hw/i386/acpi-build.c   |   42 +
 hw/i386/pc.c   |   34 +
 hw/i386/pc_piix.c  |5 +
 hw/i386/pc_q35.c   |5 +
 hw/misc/hyperv_testdev.c   |  267 -
 hw/net/hv-net.c| 1450 +++
 hw/scsi/hv-scsi.c  |  403 +++
 hw/vmbus/vmbus.c   | 2475 
 target/i386/hyperv.c   |  681 ++-
 target/i386/kvm.c  |   63 +-
 target/i386/machine.c  |9 +
 tests/hv-scsi-test.c   |   57 +
 util/qemu-config.c

[Qemu-devel] [RFC PATCH 08/34] hyperv: qom-ify SynIC

2018-02-06 Thread Roman Kagan

Make Hyper-V SynIC a device which is attached as a child to X86CPU.  For
now it only makes SynIC visible in the QOM hierarchy, and maintains its
internal fields in sync with the respective MSRs of the parent CPU (the
fields will be used in followup patches).

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.h  |   4 ++
 target/i386/hyperv.c  | 111 +-
 target/i386/kvm.c |  14 ++-
 target/i386/machine.c |   9 
 4 files changed, 134 insertions(+), 4 deletions(-)
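
To illustrate the MSR decoding that synic_update_msg_page_addr() and
synic_update_evt_page_addr() perform, here is a minimal standalone sketch.
EX_PAGE_MASK, EX_PAGE_ENABLE and ex_overlay_page_addr() are illustrative
stand-ins, not the QEMU definitions; 4 KiB pages and an enable flag in bit 0
are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages, enable flag in bit 0. */
#define EX_PAGE_MASK   (~(uint64_t)0xfff)
#define EX_PAGE_ENABLE ((uint64_t)1 << 0)

/* The enable flag must live outside the address bits. */
static_assert((EX_PAGE_MASK & EX_PAGE_ENABLE) == 0,
              "enable bit overlaps the page address");

/*
 * Return the page-aligned guest-physical address encoded in the MSR
 * value, or 0 when the enable bit is clear.
 */
static uint64_t ex_overlay_page_addr(uint64_t msr)
{
    return (msr & EX_PAGE_ENABLE) ? (msr & EX_PAGE_MASK) : 0;
}
```

Clearing the enable bit therefore drops the cached address to 0, whatever
the upper bits of the MSR contain.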

diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index af5fc05ea4..20bbd7bb29 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -34,4 +34,8 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route);
 uint32_t hyperv_vp_index(X86CPU *cpu);
 X86CPU *hyperv_find_vcpu(uint32_t vp_index);
 
+void hyperv_synic_add(X86CPU *cpu);
+void hyperv_synic_reset(X86CPU *cpu);
+void hyperv_synic_update(X86CPU *cpu);
+
 #endif
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 4d8ef6f2da..a27d33acb3 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -13,12 +13,27 @@
 
 #include "qemu/osdep.h"
 #include "qemu/main-loop.h"
+#include "qapi/error.h"
+#include "hw/qdev-properties.h"
 #include "hyperv.h"
 #include "hyperv-proto.h"
 
+typedef struct SynICState {
+DeviceState parent_obj;
+
+X86CPU *cpu;
+
+bool enabled;
+hwaddr msg_page_addr;
+hwaddr evt_page_addr;
+} SynICState;
+
+#define TYPE_SYNIC "hyperv-synic"
+#define SYNIC(obj) OBJECT_CHECK(SynICState, (obj), TYPE_SYNIC)
+
 struct HvSintRoute {
 uint32_t sint;
-X86CPU *cpu;
+SynICState *synic;
 int gsi;
 EventNotifier sint_set_notifier;
 EventNotifier sint_ack_notifier;
@@ -37,6 +52,37 @@ X86CPU *hyperv_find_vcpu(uint32_t vp_index)
 return X86_CPU(qemu_get_cpu(vp_index));
 }
 
+static SynICState *get_synic(X86CPU *cpu)
+{
+SynICState *synic =
+SYNIC(object_resolve_path_component(OBJECT(cpu), "synic"));
+assert(synic);
+return synic;
+}
+
+static void synic_update_msg_page_addr(SynICState *synic)
+{
+uint64_t msr = synic->cpu->env.msr_hv_synic_msg_page;
+hwaddr new_addr = (msr & HV_SIMP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
+
+synic->msg_page_addr = new_addr;
+}
+
+static void synic_update_evt_page_addr(SynICState *synic)
+{
+uint64_t msr = synic->cpu->env.msr_hv_synic_evt_page;
+hwaddr new_addr = (msr & HV_SIEFP_ENABLE) ? (msr & TARGET_PAGE_MASK) : 0;
+
+synic->evt_page_addr = new_addr;
+}
+
+static void synic_update(SynICState *synic)
+{
+synic->enabled = synic->cpu->env.msr_hv_synic_control & HV_SYNIC_ENABLE;
+synic_update_msg_page_addr(synic);
+synic_update_evt_page_addr(synic);
+}
+
 int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
 {
 CPUX86State *env = &cpu->env;
@@ -65,6 +111,7 @@ int kvm_hv_handle_exit(X86CPU *cpu, struct kvm_hyperv_exit *exit)
 default:
 return -1;
 }
+synic_update(get_synic(cpu));
 return 0;
 case KVM_EXIT_HYPERV_HCALL: {
 uint16_t code;
@@ -95,6 +142,7 @@ HvSintRoute *hyperv_sint_route_new(uint32_t vp_index, uint32_t sint,
HvSintAckClb sint_ack_clb,
void *sint_ack_clb_data)
 {
+SynICState *synic;
 HvSintRoute *sint_route;
 EventNotifier *ack_notifier;
 int r, gsi;
@@ -105,6 +153,8 @@ HvSintRoute *hyperv_sint_route_new(uint32_t vp_index, uint32_t sint,
 return NULL;
 }
 
+synic = get_synic(cpu);
+
 sint_route = g_new0(HvSintRoute, 1);
 r = event_notifier_init(&sint_route->sint_set_notifier, false);
 if (r) {
@@ -135,7 +185,7 @@ HvSintRoute *hyperv_sint_route_new(uint32_t vp_index, uint32_t sint,
 sint_route->gsi = gsi;
 sint_route->sint_ack_clb = sint_ack_clb;
 sint_route->sint_ack_clb_data = sint_ack_clb_data;
-sint_route->cpu = cpu;
+sint_route->synic = synic;
 sint_route->sint = sint;
 sint_route->refcount = 1;
 
@@ -189,3 +239,60 @@ int kvm_hv_sint_route_set_sint(HvSintRoute *sint_route)
 {
 return event_notifier_set(&sint_route->sint_set_notifier);
 }
+
+static void synic_realize(DeviceState *dev, Error **errp)
+{
+Object *obj = OBJECT(dev);
+SynICState *synic = SYNIC(dev);
+
+synic->cpu = X86_CPU(obj->parent);
+}
+
+static void synic_reset(DeviceState *dev)
+{
+SynICState *synic = SYNIC(dev);
+synic_update(synic);
+}
+
+static void synic_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->realize = synic_realize;
+dc->reset = synic_reset;
+dc->user_creatable = false;
+}
+
+void hyperv_synic_add(X86CPU *cpu)
+{
+Object *obj;
+
+obj = object_new(TYPE_SYNIC);
+object_property_add_child(OBJECT(cpu), "synic", obj, &error_abort);
+object_unref(obj);
+object_property_set_bool(obj, true, "realized", &error_abort);
+}
+
+void hyperv_synic_reset(X86CPU *cpu)
+{
+

[Qemu-devel] [RFC PATCH 28/34] net: add RNDIS definitions

2018-02-06 Thread Roman Kagan

Add a header with constants used in the Microsoft RNDIS protocol.

The header is taken unchanged from the Linux kernel.

TODO: reconcile with usb-net

Signed-off-by: Roman Kagan 
---
 hw/net/rndis.h | 391 +
 1 file changed, 391 insertions(+)
 create mode 100644 hw/net/rndis.h
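
The *_C message codes in this header are built by OR-ing a request code with
RNDIS_MSG_COMPLETION. A minimal standalone sketch of how a receiver could use
that, assuming the 32-bit values of the Linux kernel header (0x80000000 for
the completion bit, 0x00000002 for INIT); the ex_* helpers are hypothetical
and not part of this patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values assumed from the Linux kernel header; 'u' suffix added here. */
#define RNDIS_MSG_COMPLETION 0x80000000u
#define RNDIS_MSG_INIT       0x00000002u
#define RNDIS_MSG_INIT_C     (RNDIS_MSG_INIT | RNDIS_MSG_COMPLETION)

static_assert(RNDIS_MSG_INIT_C == 0x80000002u,
              "completion code is request code plus the top bit");

/* True if msg_type is a completion (response) message. */
static bool ex_rndis_is_completion(uint32_t msg_type)
{
    return (msg_type & RNDIS_MSG_COMPLETION) != 0;
}

/* Strip the completion bit to recover the request code being answered. */
static uint32_t ex_rndis_request_of(uint32_t msg_type)
{
    return msg_type & ~RNDIS_MSG_COMPLETION;
}
```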

diff --git a/hw/net/rndis.h b/hw/net/rndis.h
new file mode 100644
index 0000000000..93c0a64aef
--- /dev/null
+++ b/hw/net/rndis.h
@@ -0,0 +1,391 @@
+/*
+ * Remote Network Driver Interface Specification (RNDIS)
+ * definitions of the magic numbers used by this protocol
+ */
+
+/* Remote NDIS Versions */
+#define RNDIS_MAJOR_VERSION            0x00000001
+#define RNDIS_MINOR_VERSION            0x00000000
+
+/* Device Flags */
+#define RNDIS_DF_CONNECTIONLESS        0x00000001U
+#define RNDIS_DF_CONNECTION_ORIENTED   0x00000002U
+#define RNDIS_DF_RAW_DATA              0x00000004U
+
+/*
+ * Codes for "msg_type" field of rndis messages;
+ * only the data channel uses packet messages (maybe batched);
+ * everything else goes on the control channel.
+ */
+#define RNDIS_MSG_COMPLETION   0x80000000
+#define RNDIS_MSG_PACKET       0x00000001  /* 1-N packets */
+#define RNDIS_MSG_INIT         0x00000002
+#define RNDIS_MSG_INIT_C       (RNDIS_MSG_INIT|RNDIS_MSG_COMPLETION)
+#define RNDIS_MSG_HALT         0x00000003
+#define RNDIS_MSG_QUERY        0x00000004
+#define RNDIS_MSG_QUERY_C      (RNDIS_MSG_QUERY|RNDIS_MSG_COMPLETION)
+#define RNDIS_MSG_SET          0x00000005
+#define RNDIS_MSG_SET_C        (RNDIS_MSG_SET|RNDIS_MSG_COMPLETION)
+#define RNDIS_MSG_RESET        0x00000006
+#define RNDIS_MSG_RESET_C      (RNDIS_MSG_RESET|RNDIS_MSG_COMPLETION)
+#define RNDIS_MSG_INDICATE     0x00000007
+#define RNDIS_MSG_KEEPALIVE    0x00000008
+#define RNDIS_MSG_KEEPALIVE_C  (RNDIS_MSG_KEEPALIVE|RNDIS_MSG_COMPLETION)
+/*
+ * Reserved message type for private communication between lower-layer host
+ * driver and remote device, if necessary.
+ */
+#define RNDIS_MSG_BUS          0xff000001
+
+/* codes for "status" field of completion messages */
+#define RNDIS_STATUS_SUCCESS                   0x00000000
+#define RNDIS_STATUS_PENDING                   0x00000103
+
+/*  Status codes */
+#define RNDIS_STATUS_NOT_RECOGNIZED            0x00010001
+#define RNDIS_STATUS_NOT_COPIED                0x00010002
+#define RNDIS_STATUS_NOT_ACCEPTED              0x00010003
+#define RNDIS_STATUS_CALL_ACTIVE               0x00010007
+
+#define RNDIS_STATUS_ONLINE                    0x40010003
+#define RNDIS_STATUS_RESET_START               0x40010004
+#define RNDIS_STATUS_RESET_END                 0x40010005
+#define RNDIS_STATUS_RING_STATUS               0x40010006
+#define RNDIS_STATUS_CLOSED                    0x40010007
+#define RNDIS_STATUS_WAN_LINE_UP               0x40010008
+#define RNDIS_STATUS_WAN_LINE_DOWN             0x40010009
+#define RNDIS_STATUS_WAN_FRAGMENT              0x4001000A
+#define RNDIS_STATUS_MEDIA_CONNECT             0x4001000B
+#define RNDIS_STATUS_MEDIA_DISCONNECT          0x4001000C
+#define RNDIS_STATUS_HARDWARE_LINE_UP          0x4001000D
+#define RNDIS_STATUS_HARDWARE_LINE_DOWN        0x4001000E
+#define RNDIS_STATUS_INTERFACE_UP              0x4001000F
+#define RNDIS_STATUS_INTERFACE_DOWN            0x40010010
+#define RNDIS_STATUS_MEDIA_BUSY                0x40010011
+#define RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION 0x40010012
+#define RNDIS_STATUS_WW_INDICATION             RNDIS_STATUS_MEDIA_SPECIFIC_INDICATION
+#define RNDIS_STATUS_LINK_SPEED_CHANGE         0x40010013L
+#define RNDIS_STATUS_NETWORK_CHANGE            0x40010018
+
+#define RNDIS_STATUS_NOT_RESETTABLE            0x80010001
+#define RNDIS_STATUS_SOFT_ERRORS               0x80010003
+#define RNDIS_STATUS_HARD_ERRORS               0x80010004
+#define RNDIS_STATUS_BUFFER_OVERFLOW           0x80000005
+
+#define RNDIS_STATUS_FAILURE                   0xC0000001
+#define RNDIS_STATUS_RESOURCES                 0xC000009A
+#define RNDIS_STATUS_NOT_SUPPORTED             0xc00000BB
+#define RNDIS_STATUS_CLOSING                   0xC0010002
+#define RNDIS_STATUS_BAD_VERSION               0xC0010004
+#define RNDIS_STATUS_BAD_CHARACTERISTICS       0xC0010005
+#define RNDIS_STATUS_ADAPTER_NOT_FOUND         0xC0010006
+#define RNDIS_STATUS_OPEN_FAILED               0xC0010007
+#define RNDIS_STATUS_DEVICE_FAILED             0xC0010008
+#define RNDIS_STATUS_MULTICAST_FULL            0xC0010009
+#define RNDIS_STATUS_MULTICAST_EXISTS          0xC001000A
+#define RNDIS_STATUS_MULTICAST_NOT_FOUND       0xC001000B
+#define RNDIS_STATUS_REQUEST_ABORTED           0xC001000C
+#define RNDIS_STATUS_RESET_IN_PROGRESS         0xC001000D
+#define RNDIS_STATUS_CLOSING_INDICATING        0xC001000E
+#define RNDIS_STATUS_INVALID_PACKET            0xC001000F
+#define RNDIS_STATUS_OPEN_LIST_FULL            0xC0010010
+#define

[Qemu-devel] [RFC PATCH 23/34] i386: en/disable vmbus by a machine property

2018-02-06 Thread Roman Kagan

Hyper-V VMBus logically belongs to the machine, so control its presence
with a boolean property of the machine.

TODO: consider doing this by adding the vmbus-bridge device instead

Signed-off-by: Roman Kagan 
---
 include/hw/i386/pc.h |  3 +++
 hw/i386/pc.c | 34 ++
 hw/i386/pc_piix.c|  5 +
 hw/i386/pc_q35.c |  5 +
 util/qemu-config.c   |  4 
 5 files changed, 51 insertions(+)
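
The decision made by pc_machine_is_vmbus_enabled() can be summarised by the
standalone sketch below. It returns a tri-state instead of calling exit(1),
purely so that it can be exercised in isolation; ExVmbusDecision and
ex_vmbus_decide() are hypothetical names, not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    EX_VMBUS_OFF,       /* property not set: do not create the bus */
    EX_VMBUS_ON,        /* property set and KVM is in use */
    EX_VMBUS_NEEDS_KVM, /* property set without KVM: fatal in the patch */
} ExVmbusDecision;

/* A zero-initialised machine state must mean "no vmbus by default". */
static_assert(EX_VMBUS_OFF == 0, "default must be off");

static ExVmbusDecision ex_vmbus_decide(bool vmbus_prop, bool kvm_enabled)
{
    if (!vmbus_prop) {
        return EX_VMBUS_OFF;
    }
    return kvm_enabled ? EX_VMBUS_ON : EX_VMBUS_NEEDS_KVM;
}
```

With the patch applied, the property would be switched on from the command
line via the new "vmbus" machine option, e.g. "-machine q35,vmbus=on".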

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 744f6a20d2..62b67cd927 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -46,6 +46,7 @@ struct PCMachineState {
 uint64_t max_ram_below_4g;
 OnOffAuto vmport;
 OnOffAuto smm;
+bool vmbus;
 
 AcpiNVDIMMState acpi_nvdimm_state;
 
@@ -80,6 +81,7 @@ struct PCMachineState {
 #define PC_MACHINE_SMBUS            "smbus"
 #define PC_MACHINE_SATA             "sata"
 #define PC_MACHINE_PIT              "pit"
+#define PC_MACHINE_VMBUS            "vmbus"
 
 /**
  * PCMachineClass:
@@ -209,6 +211,7 @@ void i8042_setup_a20_line(ISADevice *dev, qemu_irq a20_out);
 extern int fd_bootchk;
 
 bool pc_machine_is_smm_enabled(PCMachineState *pcms);
+bool pc_machine_is_vmbus_enabled(PCMachineState *pcms);
 void pc_register_ferr_irq(qemu_irq irq);
 void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ccc50baa85..d37072b575 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2178,6 +2178,34 @@ static void pc_machine_set_smm(Object *obj, Visitor *v, const char *name,
 visit_type_OnOffAuto(v, name, &pcms->smm, errp);
 }
 
+bool pc_machine_is_vmbus_enabled(PCMachineState *pcms)
+{
+if (!pcms->vmbus) {
+return false;
+}
+
+if (!kvm_enabled()) {
+error_report("VMBus requires KVM");
+exit(1);
+}
+
+return true;
+}
+
+static bool pc_machine_get_vmbus(Object *obj, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+return pcms->vmbus;
+}
+
+static void pc_machine_set_vmbus(Object *obj, bool vmbus, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+pcms->vmbus = vmbus;
+}
+
 static bool pc_machine_get_nvdimm(Object *obj, Error **errp)
 {
 PCMachineState *pcms = PC_MACHINE(obj);
@@ -2413,6 +2441,12 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 
 object_class_property_add_bool(oc, PC_MACHINE_PIT,
 pc_machine_get_pit, pc_machine_set_pit, &error_abort);
+
+/* no vmbus by default */
+object_class_property_add_bool(oc, PC_MACHINE_VMBUS,
+pc_machine_get_vmbus, pc_machine_set_vmbus, &error_abort);
+object_class_property_set_description(oc, PC_MACHINE_VMBUS,
+"Enable Hyper-V VMBus", &error_abort);
 }
 
 static const TypeInfo pc_machine_info = {
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index a25619dfbf..4a3cb406d5 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -56,6 +56,7 @@
 #include "migration/misc.h"
 #include "kvm_i386.h"
 #include "sysemu/numa.h"
+#include "hw/vmbus/vmbus.h"
 
 #define MAX_IDE_BUS 2
 
@@ -302,6 +303,10 @@ static void pc_init1(MachineState *machine,
 nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
pcms->fw_cfg, OBJECT(pcms));
 }
+
+if (pc_machine_is_vmbus_enabled(pcms)) {
+vmbus_create();
+}
 }
 
 /* Looking for a pc_compat_2_4() function? It doesn't exist.
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index ed3a0b8ff7..9e5ce429b4 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -50,6 +50,7 @@
 #include "hw/usb.h"
 #include "qemu/error-report.h"
 #include "sysemu/numa.h"
+#include "hw/vmbus/vmbus.h"
 
 /* ICH9 AHCI has 6 ports */
 #define MAX_SATA_PORTS 6
@@ -279,6 +280,10 @@ static void pc_q35_init(MachineState *machine)
 nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
pcms->fw_cfg, OBJECT(pcms));
 }
+
+if (pc_machine_is_vmbus_enabled(pcms)) {
+vmbus_create();
+}
 }
 
 #define DEFINE_Q35_MACHINE(suffix, name, compatfn, optionfn) \
diff --git a/util/qemu-config.c b/util/qemu-config.c
index 029fec53a9..951a6360a0 100644
--- a/util/qemu-config.c
+++ b/util/qemu-config.c
@@ -234,6 +234,10 @@ static QemuOptsList machine_opts = {
 .help = "Up to 8 chars in set of [A-Za-z0-9. ](lower case chars"
 " converted to upper case) to pass to machine"
 " loader, boot manager, and guest kernel",
+},{
+.name = "vmbus",
+.type = QEMU_OPT_BOOL,
+.help = "enable Hyper-V VMBus",
 },
 { /* End of list */ }
 }
-- 
2.14.3

[Qemu-devel] [RFC PATCH 31/34] hv-net: add .bootindex support

2018-02-06 Thread Roman Kagan

Add support for the .bootindex property in hv-net.

This results in a corresponding entry appearing in fw_cfg "bootorder".

In order to actually boot off a hv-net device (via PXE), the firmware
also needs a driver for it (either built-in or supplied via ROM).

Signed-off-by: Roman Kagan 
---
 hw/net/hv-net.c | 9 +
 1 file changed, 9 insertions(+)
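
As background on what a "bootorder" entry amounts to, the standalone sketch
below orders devices by bootindex and emits one firmware path per line,
skipping devices with a negative index. It is a simplified model rather than
QEMU's implementation; ExBootDev, ex_build_bootorder() and the paths used
with it are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int bootindex;       /* negative: device is not in the boot order */
    const char *fw_path; /* hypothetical firmware device path */
} ExBootDev;

static int ex_cmp_bootindex(const void *a, const void *b)
{
    const ExBootDev *x = a, *y = b;

    return (x->bootindex > y->bootindex) - (x->bootindex < y->bootindex);
}

/*
 * Sort devs by bootindex and write the paths of all devices with a
 * non-negative bootindex into out, one per line, lowest index first.
 * Returns the number of entries written; stops early if out is full.
 */
static size_t ex_build_bootorder(ExBootDev *devs, size_t n,
                                 char *out, size_t out_len)
{
    size_t count = 0, used = 0;

    assert(out_len > 0);
    out[0] = '\0';
    qsort(devs, n, sizeof(*devs), ex_cmp_bootindex);

    for (size_t i = 0; i < n; i++) {
        size_t len;

        if (devs[i].bootindex < 0) {
            continue;
        }
        len = strlen(devs[i].fw_path);
        if (len + 2 > out_len - used) { /* path + '\n' + '\0' */
            break;
        }
        memcpy(out + used, devs[i].fw_path, len);
        used += len;
        out[used++] = '\n';
        out[used] = '\0';
        count++;
    }
    return count;
}
```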

diff --git a/hw/net/hv-net.c b/hw/net/hv-net.c
index 614922c0fb..3d719458ea 100644
--- a/hw/net/hv-net.c
+++ b/hw/net/hv-net.c
@@ -1425,11 +1425,20 @@ static void hv_net_class_init(ObjectClass *klass, void *data)
 vdc->chan_notify_cb = hv_net_notify_cb;
 }
 
+static void hv_net_instance_init(Object *obj)
+{
+HvNet *s = HV_NET(obj);
+device_add_bootindex_property(obj, &s->conf.bootindex,
+  "bootindex", "/ethernet-phy@0",
+  DEVICE(obj), NULL);
+}
+
 static const TypeInfo hv_net_type_info = {
 .name = TYPE_HV_NET,
 .parent = TYPE_VMBUS_DEVICE,
 .instance_size = sizeof(HvNet),
 .class_init = hv_net_class_init,
+.instance_init = hv_net_instance_init,
 };
 
 static void hv_net_register_types(void)
-- 
2.14.3

[Qemu-devel] [RFC PATCH 16/34] hyperv: update copyright notices

2018-02-06 Thread Roman Kagan

Signed-off-by: Roman Kagan 
---
 target/i386/hyperv.h | 1 +
 hw/misc/hyperv_testdev.c | 1 +
 target/i386/hyperv.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/target/i386/hyperv.h b/target/i386/hyperv.h
index fcc41caf1f..8b7fcd0b48 100644
--- a/target/i386/hyperv.h
+++ b/target/i386/hyperv.h
@@ -2,6 +2,7 @@
  * QEMU KVM Hyper-V support
  *
  * Copyright (C) 2015 Andrey Smetanin 
+ * Copyright (c) 2015-2018 Virtuozzo International GmbH.
  *
  * Authors:
  *  Andrey Smetanin 
diff --git a/hw/misc/hyperv_testdev.c b/hw/misc/hyperv_testdev.c
index 79077789e1..b9e1f1cc74 100644
--- a/hw/misc/hyperv_testdev.c
+++ b/hw/misc/hyperv_testdev.c
@@ -2,6 +2,7 @@
  * QEMU KVM Hyper-V test device to support Hyper-V kvm-unit-tests
  *
  * Copyright (C) 2015 Andrey Smetanin 
+ * Copyright (c) 2015-2018 Virtuozzo International GmbH.
  *
  * Authors:
  *  Andrey Smetanin 
diff --git a/target/i386/hyperv.c b/target/i386/hyperv.c
index 3dc8a7acb0..e43cbb9322 100644
--- a/target/i386/hyperv.c
+++ b/target/i386/hyperv.c
@@ -2,6 +2,7 @@
  * QEMU KVM Hyper-V support
  *
  * Copyright (C) 2015 Andrey Smetanin 
+ * Copyright (c) 2015-2018 Virtuozzo International GmbH.
  *
  * Authors:
  *  Andrey Smetanin 
-- 
2.14.3

[Qemu-devel] [RFC PATCH 24/34] scsi: add Hyper-V/VMBus SCSI protocol definitions

2018-02-06 Thread Roman Kagan

Add a header with data structures and constants defining the protocol
between the guest and the hypervisor implementing the Hyper-V VMBus SCSI
controller.

Mostly taken from the corresponding definitions in the Linux kernel.

Signed-off-by: Roman Kagan 
---
 hw/scsi/hvscsi-proto.h | 150 +
 1 file changed, 150 insertions(+)
 create mode 100644 hw/scsi/hvscsi-proto.h
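
As a worked example of the version encoding used by this header: the macro
packs the major number into the high byte and the minor number into the low
byte of a 16-bit value, so version 5.1 becomes 0x0501. The sketch below is
standalone and assumes the macro has the same shape as in the Linux storvsc
driver; the ex_proto_* accessors are hypothetical helpers, not part of this
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Major in bits 15:8, minor in bits 7:0. */
#define HV_STOR_PROTO_VERSION(MAJOR_, MINOR_) \
    ((((MAJOR_) & 0xff) << 8) | (((MINOR_) & 0xff)))

static_assert(HV_STOR_PROTO_VERSION(5, 1) == 0x0501,
              "version 5.1 encodes as 0x0501");

/* Extract the major number from an encoded version. */
static uint8_t ex_proto_major(uint16_t version)
{
    return (uint8_t)(version >> 8);
}

/* Extract the minor number from an encoded version. */
static uint8_t ex_proto_minor(uint16_t version)
{
    return (uint8_t)(version & 0xff);
}
```

Because the major number sits in the high byte, encoded versions can be
compared directly as integers when negotiating with the guest.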

diff --git a/hw/scsi/hvscsi-proto.h b/hw/scsi/hvscsi-proto.h
new file mode 100644
index 0000000000..9dd20c9bfa
--- /dev/null
+++ b/hw/scsi/hvscsi-proto.h
@@ -0,0 +1,150 @@
+/*
+ * Hyper-V storage device protocol definitions
+ *
+ * Copyright (c) 2009, Microsoft Corporation.
+ * Copyright (c) 2017-2018 Virtuozzo International GmbH.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef _HVSCSI_PROTO_H_
+#define _HVSCSI_PROTO_H_
+
+#define HV_STOR_PROTO_VERSION(MAJOR_, MINOR_) \
+    ((((MAJOR_) & 0xff) << 8) | (((MINOR_) & 0xff)))
+
+#define HV_STOR_PROTO_VERSION_WIN6   HV_STOR_PROTO_VERSION(2, 0)
+#define HV_STOR_PROTO_VERSION_WIN7   HV_STOR_PROTO_VERSION(4, 2)
+#define HV_STOR_PROTO_VERSION_WIN8   HV_STOR_PROTO_VERSION(5, 1)
+#define HV_STOR_PROTO_VERSION_WIN8_1 HV_STOR_PROTO_VERSION(6, 0)
+#define HV_STOR_PROTO_VERSION_WIN10  HV_STOR_PROTO_VERSION(6, 2)
+#define HV_STOR_PROTO_VERSION_CURRENT HV_STOR_PROTO_VERSION_WIN8
+
+#define HV_STOR_OPERATION_COMPLETE_IO             1
+#define HV_STOR_OPERATION_REMOVE_DEVICE           2
+#define HV_STOR_OPERATION_EXECUTE_SRB             3
+#define HV_STOR_OPERATION_RESET_LUN               4
+#define HV_STOR_OPERATION_RESET_ADAPTER           5
+#define HV_STOR_OPERATION_RESET_BUS               6
+#define HV_STOR_OPERATION_BEGIN_INITIALIZATION    7
+#define HV_STOR_OPERATION_END_INITIALIZATION      8
+#define HV_STOR_OPERATION_QUERY_PROTOCOL_VERSION  9
+#define HV_STOR_OPERATION_QUERY_PROPERTIES        10
+#define HV_STOR_OPERATION_ENUMERATE_BUS           11
+#define HV_STOR_OPERATION_FCHBA_DATA              12
+#define HV_STOR_OPERATION_CREATE_SUB_CHANNELS     13
+
+#define HV_STOR_REQUEST_COMPLETION_FLAG   0x1
+
+#define HV_STOR_PROPERTIES_MULTI_CHANNEL_FLAG 0x1
+
+#define HV_SRB_MAX_CDB_SIZE                 16
+#define HV_SRB_SENSE_BUFFER_SIZE            20
+
+#define HV_SRB_REQUEST_TYPE_WRITE           0
+#define HV_SRB_REQUEST_TYPE_READ            1
+#define HV_SRB_REQUEST_TYPE_UNKNOWN         2
+
+#define HV_SRB_MAX_LUNS_PER_TARGET  255
+#define HV_SRB_MAX_TARGETS  2
+#define HV_SRB_MAX_CHANNELS 8
+
+#define HV_SRB_FLAGS_QUEUE_ACTION_ENABLE        0x00000002
+#define HV_SRB_FLAGS_DISABLE_DISCONNECT         0x00000004
+#define HV_SRB_FLAGS_DISABLE_SYNCH_TRANSFER     0x00000008
+#define HV_SRB_FLAGS_BYPASS_FROZEN_QUEUE        0x00000010
+#define HV_SRB_FLAGS_DISABLE_AUTOSENSE          0x00000020
+#define HV_SRB_FLAGS_DATA_IN                    0x00000040
+#define HV_SRB_FLAGS_DATA_OUT                   0x00000080
+#define HV_SRB_FLAGS_NO_DATA_TRANSFER           0x00000000
+#define HV_SRB_FLAGS_UNSPECIFIED_DIRECTION \
+    (HV_SRB_FLAGS_DATA_IN | HV_SRB_FLAGS_DATA_OUT)
+#define HV_SRB_FLAGS_NO_QUEUE_FREEZE            0x00000100
+#define HV_SRB_FLAGS_ADAPTER_CACHE_ENABLE       0x00000200
+#define HV_SRB_FLAGS_FREE_SENSE_BUFFER          0x00000400
+#define HV_SRB_FLAGS_D3_PROCESSING              0x00000800
+#define HV_SRB_FLAGS_IS_ACTIVE                  0x00010000
+#define HV_SRB_FLAGS_ALLOCATED_FROM_ZONE        0x00020000
+#define HV_SRB_FLAGS_SGLIST_FROM_POOL           0x00040000
+#define HV_SRB_FLAGS_BYPASS_LOCKED_QUEUE        0x00080000
+#define HV_SRB_FLAGS_NO_KEEP_AWAKE              0x00100000
+#define HV_SRB_FLAGS_PORT_DRIVER_ALLOCSENSE     0x00200000
+#define HV_SRB_FLAGS_PORT_DRIVER_SENSEHASPORT   0x00400000
+#define HV_SRB_FLAGS_DONT_START_NEXT_PACKET     0x00800000
+#define HV_SRB_FLAGS_PORT_DRIVER_RESERVED       0x0F000000
+#define HV_SRB_FLAGS_CLASS_DRIVER_RESERVED      0xF0000000
+
+#define HV_SRB_STATUS_AUTOSENSE_VALID   0x80
+#define HV_SRB_STATUS_INVALID_LUN   0x20
+#define HV_SRB_STATUS_SUCCESS   0x01
+#define HV_SRB_STATUS_ABORTED   0x02
+#define HV_SRB_STATUS_ERROR 0x04
+
+#define HV_STOR_PACKET_MAX_LENGTH sizeof(struct hv_stor_packet)
+#define HV_STOR_PACKET_MIN_LENGTH \
+(sizeof(struct hv_stor_packet) - sizeof(struct hv_srb_win8_extentions))
+
+typedef struct hv_stor_properties {
+uint32_t _reserved1;
+uint16_t max_channel_count;
+uint16_t _reserved2;
+uint32_t flags;
+uint32_t max_transfer_bytes;
+uint32_t _reserved3[2];
+} hv_stor_properties;
+
+typedef struct hv_srb_win8_extentions {
+uint16_t _reserved;
+uint8_t  queue_tag;
+uint8_t  queue_action;
+uint32_t srb_flags;
+uint32_t timeout;
+uint32_t
