Re: [Qemu-devel] [PATCH v2] Add optionrom compatible with fw_cfg DMA version

2016-01-18 Thread Stefan Hajnoczi
On Mon, Jan 18, 2016 at 10:59:00AM +0100, Marc Marí wrote:
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 459260b..00339fa 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms,
>  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
>  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
>  
> -option_rom[nb_option_roms].name = "linuxboot.bin";
> -option_rom[nb_option_roms].bootindex = 0;
> +if (fw_cfg_dma_enabled(fw_cfg)) {
> +option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> +option_rom[nb_option_roms].bootindex = 0;
> +} else {
> +option_rom[nb_option_roms].name = "linuxboot.bin";
> +option_rom[nb_option_roms].bootindex = 0;
> +}

Live migration compatibility requires that guest-visible changes to the
machine are only introduced in a new -machine .

This ensures that migrating from an older QEMU version to a newer QEMU
version will not change the guest-visible memory layout or device
behavior.

The Option ROM should not change when migration between different QEMU
versions.

I've CCed Gerd and Juan, I think they know how changes to Option ROMs
affect live migration better than me.  What needs to be done to preserve
live migration compatibility?

> diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
> index ce4852a..076f351 100644
> --- a/pc-bios/optionrom/Makefile
> +++ b/pc-bios/optionrom/Makefile
> @@ -2,6 +2,7 @@ all: build-all
>  # Dummy command so that make thinks it has done something
>   @true
>  
> +BULD_DIR=$(CURDIR)

Is this a typo (BUILD_DIR)?  Is it even needed?

>  include ../../config-host.mak
>  include $(SRC_PATH)/rules.mak
>  
> @@ -11,15 +12,20 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/optionrom)
>  
>  CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin
>  CFLAGS += -I$(SRC_PATH)
> +CFLAGS += -I$(SRC_PATH)/include

Are you using any QEMU headers?  If not, then this change can be
dropped.

> +linuxboot_dma.img: linuxboot_dma.o
> + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -static -Ttext 
> 0 -e _start -s -o $@ $<,"  Building $(TARGET_DIR)$@")

Why is -static necessary?

> +#define BOOT_ROM_PRODUCT "Linux loader"

Please differentiate from linuxboot.S.  Maybe call it "Linux loader
DMA".

> +typedef struct FWCfgDmaAccess {
> +uint32_t control;
> +uint32_t length;
> +uint64_t address;
> +} __attribute__((gcc_struct, packed)) FWCfgDmaAccess;

gcc_struct is for gcc vs Windows compiler compatibility when the struct
contains bitfields.  No bitfields are used and no Windows compiled code
is linked, so it seems unnecessary.

> +static void bios_cfg_read_entry(void *buf, uint16_t entry, uint32_t len)
> +{
> +FWCfgDmaAccess access;
> +uint32_t control = (entry << 16) | BIOS_CFG_DMA_CTL_SELECT
> +| BIOS_CFG_DMA_CTL_READ;
> +
> +access.address = cpu_to_be64((uint64_t)(uint32_t)buf);
> +access.length = cpu_to_be32(len);
> +access.control = cpu_to_be32(control);
> +
> +barrier();
> +
> +outl(cpu_to_be32((uint32_t)), BIOS_CFG_DMA_ADDR_LOW);
> +
> +while(be32_to_cpu(access.control) & ~BIOS_CFG_DMA_CTL_ERROR) {
> +barrier();
> +}
> +}

This function is the unique part of linuxboot_dma.c.  Everything else is
copied and translated from linuxboot.S.  The refactorer in me wonders
how hard it would be to extend linuxboot.S instead of introducing this
new boot ROM.

Was there a technical reason why linuxboot.S cannot be extended (e.g.  a
size limit)?

> +/* As we are changing crytical registers, we cannot leave freedom to the

s/crytical/critical/

> diff --git a/pc-bios/optionrom/optionrom.h b/pc-bios/optionrom/optionrom.h
> index f1a9021..25c765f 100644
> --- a/pc-bios/optionrom/optionrom.h
> +++ b/pc-bios/optionrom/optionrom.h
> @@ -29,7 +29,8 @@
>  #define DEBUG_HERE \
>   jmp 1f; \
>   1:
> - 
> +
> +#ifndef C_CODE
>  /*
>   * Read a variable from the fw_cfg device.
>   * Clobbers: %edx
> @@ -49,6 +50,7 @@
>   inb (%dx), %al
>   bswap   %eax
>  .endm
> +#endif

Why do you need optionrom.h?  You seem to have expanded its macros
anyway (OPTION_ROM_START, BOOT_ROM_START, OPTION_ROM_END, BOOT_ROM_END).

If you need optionrom.h, please actually use its macros instead of
expanding them.

Otherwise, please don't include it.


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PULL 5/5] hw/arm/virt: Support legacy -nic command line syntax

2016-01-18 Thread Eric Auger
On 01/18/2016 03:14 PM, Peter Maydell wrote:
> On 18 January 2016 at 13:57, Eric Auger  wrote:
>> Hi Peter,
>> On 01/18/2016 02:34 PM, Peter Maydell wrote:
>>> Hmm, I guess this is changing things in that we now will have a
>>> virtio PCI device appearing if you use the default (-net nic -net user)
>>> settings. But I don't see why that would particularly interfere
>>> with VFIO passthrough, except in as much as the guest now has
>>> two network cards in it and might be preferring one as eth0
>>> rather than the other...
>> Yes that's what currently happens I think. I get the slirp thing on eth0
>> and my passthrough'ed device on eth1. That's not very straightforward
>> for the end-user to get those 2 NIC's now. In case I passthrough some
>> NIC's I would have expected this default NIC not be instantiated?
> 
> The QEMU networking layer only knows about networking controlled
> by the -net or -netdev options (and in those cases it does disable
> the default network device). Because the back-end for passthrough
> NICs is completely unknown to QEMU (it is all done in hardware),
> I'm not sure we have any way to know that the thing you've passed through
> is a NIC and not some other random PCI device...
> 
>> Alex, how do you manage on x86 platforms with VFIO-PCI NIC?
> 
> ...but presumably the x86 folks have been here before us
> and know how this should work :-)

Yes. maybe we could create a new network backend VFIO and use that kind
of cmd line:

NET_OPTIONS="-netdev VFIO,id=myeth0 \
-device vfio-amd-xgbe,host=e090.xgmac,netdev=myeth0"

In the specialized VFIO-Platform device I can easily add a dummy NICConf
field and add

static Property amd_xgbe_properties[] = {
DEFINE_NIC_PROPERTIES(VFIOAmdXgbeDevice, conf),
DEFINE_PROP_END_OF_LIST(),
};

But it complexifies the user command line quite a lot and not sure it is
worth the candle?

Thanks

Eric


> 
> thanks
> -- PMM
> 




[Qemu-devel] [PATCH 0/2] 9pfs: fsdev: use error_report() instead of fprintf(stderr)

2016-01-18 Thread Greg Kurz
Hi,

This series moves all the 9pfs/fsdev code to use error_report(), with the
notable exception of virtfs-proxy-helper, which doesn't need it.

Markus,

Should this patches go through your tree ? Or can they go through my 9p tree
if you ack them ?

Cheers.

--
Greg

---

Greg Kurz (2):
  9pfs: use error_report() instead of fprintf(stderr)
  fsdev: use error_report() instead of fprintf(stderr)


 fsdev/qemu-fsdev.c  |7 ---
 hw/9pfs/9p-handle.c |5 +++--
 hw/9pfs/9p-local.c  |   15 ---
 hw/9pfs/9p-proxy.c  |   12 ++--
 hw/9pfs/9p.c|2 +-
 5 files changed, 22 insertions(+), 19 deletions(-)




[Qemu-devel] [PATCH 1/2] 9pfs: use error_report() instead of fprintf(stderr)

2016-01-18 Thread Greg Kurz
Signed-off-by: Greg Kurz 
---
 hw/9pfs/9p-handle.c |5 +++--
 hw/9pfs/9p-local.c  |   15 ---
 hw/9pfs/9p-proxy.c  |   12 ++--
 hw/9pfs/9p.c|2 +-
 4 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/hw/9pfs/9p-handle.c b/hw/9pfs/9p-handle.c
index 58b77b4c942d..8ba88775a2b6 100644
--- a/hw/9pfs/9p-handle.c
+++ b/hw/9pfs/9p-handle.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include "qemu/xattr.h"
+#include "qemu/error-report.h"
 #include 
 #include 
 #ifdef CONFIG_LINUX_MAGIC_H
@@ -655,12 +656,12 @@ static int handle_parse_opts(QemuOpts *opts, struct 
FsDriverEntry *fse)
 const char *path = qemu_opt_get(opts, "path");
 
 if (sec_model) {
-fprintf(stderr, "Invalid argument security_model specified with handle 
fsdriver\n");
+error_report("Invalid argument security_model specified with handle 
fsdriver");
 return -1;
 }
 
 if (!path) {
-fprintf(stderr, "fsdev: No path specified.\n");
+error_report("fsdev: No path specified.");
 return -1;
 }
 fse->path = g_strdup(path);
diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index bf63eab729ad..9c25ab2db26b 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include "qemu/xattr.h"
+#include "qemu/error-report.h"
 #include 
 #include 
 #ifdef CONFIG_LINUX_MAGIC_H
@@ -1209,9 +1210,9 @@ static int local_parse_opts(QemuOpts *opts, struct 
FsDriverEntry *fse)
 const char *path = qemu_opt_get(opts, "path");
 
 if (!sec_model) {
-fprintf(stderr, "security model not specified, "
-"local fs needs security model\nvalid options are:"
-"\tsecurity_model=[passthrough|mapped|none]\n");
+error_report("security model not specified, local fs needs security 
model");
+error_printf("valid options are:"
+ 
"\tsecurity_model=[passthrough|mapped-xattr|mapped-file|none]\n");
 return -1;
 }
 
@@ -1225,14 +1226,14 @@ static int local_parse_opts(QemuOpts *opts, struct 
FsDriverEntry *fse)
 } else if (!strcmp(sec_model, "mapped-file")) {
 fse->export_flags |= V9FS_SM_MAPPED_FILE;
 } else {
-fprintf(stderr, "Invalid security model %s specified, valid options 
are"
-"\n\t [passthrough|mapped-xattr|mapped-file|none]\n",
-sec_model);
+error_report("Invalid security model %s specified, valid options are",
+ sec_model);
+error_printf("\t [passthrough|mapped-xattr|mapped-file|none]\n");
 return -1;
 }
 
 if (!path) {
-fprintf(stderr, "fsdev: No path specified.\n");
+error_report("fsdev: No path specified.");
 return -1;
 }
 fse->path = g_strdup(path);
diff --git a/hw/9pfs/9p-proxy.c b/hw/9pfs/9p-proxy.c
index 73d00dd74d11..72b9952d7c8b 100644
--- a/hw/9pfs/9p-proxy.c
+++ b/hw/9pfs/9p-proxy.c
@@ -1100,19 +1100,19 @@ static int connect_namedsocket(const char *path)
 struct sockaddr_un helper;
 
 if (strlen(path) >= sizeof(helper.sun_path)) {
-fprintf(stderr, "Socket name too large\n");
+error_report("Socket name too large");
 return -1;
 }
 sockfd = socket(AF_UNIX, SOCK_STREAM, 0);
 if (sockfd < 0) {
-fprintf(stderr, "failed to create socket: %s\n", strerror(errno));
+error_report("failed to create socket: %s", strerror(errno));
 return -1;
 }
 strcpy(helper.sun_path, path);
 helper.sun_family = AF_UNIX;
 size = strlen(helper.sun_path) + sizeof(helper.sun_family);
 if (connect(sockfd, (struct sockaddr *), size) < 0) {
-fprintf(stderr, "failed to connect to %s: %s\n", path, 
strerror(errno));
+error_report("failed to connect to %s: %s", path, strerror(errno));
 close(sockfd);
 return -1;
 }
@@ -1128,11 +1128,11 @@ static int proxy_parse_opts(QemuOpts *opts, struct 
FsDriverEntry *fs)
 const char *sock_fd = qemu_opt_get(opts, "sock_fd");
 
 if (!socket && !sock_fd) {
-fprintf(stderr, "socket and sock_fd none of the option specified\n");
+error_report("socket and sock_fd none of the option specified");
 return -1;
 }
 if (socket && sock_fd) {
-fprintf(stderr, "Both socket and sock_fd options specified\n");
+error_report("Both socket and sock_fd options specified");
 return -1;
 }
 if (socket) {
@@ -1155,7 +1155,7 @@ static int proxy_init(FsContext *ctx)
 } else {
 sock_id = atoi(ctx->fs_root);
 if (sock_id < 0) {
-fprintf(stderr, "socket descriptor not initialized\n");
+error_report("socket descriptor not initialized");
 }
 }
 if (sock_id < 0) {
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 3ff310605cd4..1f3bd12e0d0c 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -3370,7 +3370,7 @@ static void __attribute__((__constructor__)) 

[Qemu-devel] [PATCH 2/2] fsdev: use error_report() instead of fprintf(stderr)

2016-01-18 Thread Greg Kurz
Only fix the code that gets built into QEMU.

Signed-off-by: Greg Kurz 
---
 fsdev/qemu-fsdev.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
index ccfec139ab2b..55e2f7a0fb58 100644
--- a/fsdev/qemu-fsdev.c
+++ b/fsdev/qemu-fsdev.c
@@ -17,6 +17,7 @@
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "qemu/config-file.h"
+#include "qemu/error-report.h"
 
 static QTAILQ_HEAD(FsDriverEntry_head, FsDriverListEntry) fsdriver_entries =
 QTAILQ_HEAD_INITIALIZER(fsdriver_entries);
@@ -40,7 +41,7 @@ int qemu_fsdev_add(QemuOpts *opts)
 bool ro = qemu_opt_get_bool(opts, "readonly", 0);
 
 if (!fsdev_id) {
-fprintf(stderr, "fsdev: No id specified\n");
+error_report("fsdev: No id specified");
 return -1;
 }
 
@@ -52,11 +53,11 @@ int qemu_fsdev_add(QemuOpts *opts)
 }
 
 if (i == ARRAY_SIZE(FsDrivers)) {
-fprintf(stderr, "fsdev: fsdriver %s not found\n", fsdriver);
+error_report("fsdev: fsdriver %s not found", fsdriver);
 return -1;
 }
 } else {
-fprintf(stderr, "fsdev: No fsdriver specified\n");
+error_report("fsdev: No fsdriver specified");
 return -1;
 }
 




[Qemu-devel] [PATCH] add MAINTAINERS entry for qemu socket code

2016-01-18 Thread Gerd Hoffmann
Cc: Daniel P. Berrange 
Cc: Paolo Bonzini 
Signed-off-by: Gerd Hoffmann 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8f44dca..bc753fc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1260,6 +1260,14 @@ F: io/
 F: include/io/
 F: tests/test-io-*
 
+Sockets
+M: Daniel P. Berrange 
+M: Gerd Hoffmann 
+M: Paolo Bonzini 
+S: Maintained
+F: include/qemu/sockets.h
+F: util/qemu-sockets.c
+
 Usermode Emulation
 --
 Overall
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] hmp: avoid redundant null termination of buffer

2016-01-18 Thread Markus Armbruster
Wolfgang Bumiller  writes:

> On Mon, Jan 18, 2016 at 02:02:07PM +0100, Markus Armbruster wrote:
>> Wolfgang Bumiller  writes:
>> 
>> > On Tue, Jan 12, 2016 at 05:52:38PM +0100, Markus Armbruster wrote:
>> >> Wolfgang Bumiller  writes:
>> >> 
>> >  while (1) {
>> >  separator = strchr(keys, '-');
>> >  keyname_len = separator ? separator - keys : strlen(keys);
>> 
>> Preexisting: I wonder why the compiler doesn't warn here: separator -
>> keys is ptrdiff_t, strlen() is size_t, and the left hand side is int.
>
> I noticed and agree it should warn. We know that separator > keys (ie
> positive), but we also use keyname_len as a '.*' parameter to printf()
> which expects it to be an 'int', so when changing it to size_t we need
> to cast it there. Would have to pass a pretty long key name for this to
> be an issue... can this happen over any sane interface that doesn't
> already give you the power to just 'kill -9 $qemu'?

Merely unclean, not actually broken in a practical sense.

>> > -pstrcpy(keyname_buf, sizeof(keyname_buf), keys);
>> >  
>> >  /* Be compatible with old interface, convert user inputted "<" */
>> > -if (!strncmp(keyname_buf, "<", 1) && keyname_len == 1) {
>> > -pstrcpy(keyname_buf, sizeof(keyname_buf), "less");
>> > +if (!strncmp(keys, "<", 1) && keyname_len == 1) {
>> 
>> This strncmp() is a rather roundabout way to say keys[0] == '<'.  I
>> guess I'd dumb it down while touching it.  Your choice.
>
> Yes, but with the previous pstrcpy() of "less" etc. I thought this was a
> style thing (and the compiler optimizes it out anyway last time I
> checked).
>
>> > +keys = "less";
>> 
>> Works because we're resetting keys to point into the argument string at
>> the end of the loop.
>> 
>> >  keyname_len = 4;
>> >  }
>> > -keyname_buf[keyname_len] = 0;
>> >  
>> >  keylist = g_malloc0(sizeof(*keylist));
>> >  keylist->value = g_malloc0(sizeof(*keylist->value));
>> > @@ -1769,16 +1766,16 @@ void hmp_sendkey(Monitor *mon, const QDict *qdict)
>> >  }
>> >  tmp = keylist;
>> >  
>> > -if (strstart(keyname_buf, "0x", NULL)) {
>> > +if (strstart(keys, "0x", NULL)) {
>> >  char *endp;
>> > -int value = strtoul(keyname_buf, , 0);
>> > -if (*endp != '\0') {
>> > +int value = strtoul(keys, , 0);
>> > +if (*endp != '\0' && *endp != '-') {
>> 
>> strtoul() will not parse beyond keyname_len, because it'll only accept
>> hex digits after 0x, thus the '-' or 0 at keyname_len will make it stop.
>> 
>> I guess I'd throw in assert(endp <= keys + keyname_len), and test
>> endp != keys + keyname_len.  What do you think?
>
> Makes sense, but I doubt it'll ever be hit with sane strtoul()
> implementations, but an assetion can't be harmful here either :-)

The assertion primarily documents we've considered strtoul() reading
beyond the bound.  It might also protects us from hasty, incorrect
changes in the future, but I guess that's secondary in this case.

Preexisting: we don't check errno.  Out of scope for this patch.

>> >  goto err_out;
>> >  }
>> >  keylist->value->type = KEY_VALUE_KIND_NUMBER;
>> >  keylist->value->u.number = value;
>> >  } else {
>> > -int idx = index_from_key(keyname_buf);
>> > +int idx = index_from_key(keys, keyname_len);
>> >  if (idx == Q_KEY_CODE__MAX) {
>> >  goto err_out;
>> >  }
>> > @@ -1800,7 +1797,7 @@ out:
>> >  return;
>> >  
>> >  err_out:
>> > -monitor_printf(mon, "invalid parameter: %s\n", keyname_buf);
>> > +monitor_printf(mon, "invalid parameter: %.*s\n", keyname_len, keys);
>> >  goto out;
>> >  }
>> >  
>> > diff --git a/include/ui/console.h b/include/ui/console.h
>> > index adac36d..116bc2b 100644
>> > --- a/include/ui/console.h
>> > +++ b/include/ui/console.h
>> > @@ -448,7 +448,7 @@ static inline int vnc_display_pw_expire(const char 
>> > *id, time_t expires)
>> >  void curses_display_init(DisplayState *ds, int full_screen);
>> >  
>> >  /* input.c */
>> > -int index_from_key(const char *key);
>> > +int index_from_key(const char *key, size_t key_length);
>> >  
>> >  /* gtk.c */
>> >  void early_gtk_display_init(int opengl);
>> > diff --git a/ui/input-legacy.c b/ui/input-legacy.c
>> > index 35dfc27..3454055 100644
>> > --- a/ui/input-legacy.c
>> > +++ b/ui/input-legacy.c
>> > @@ -57,12 +57,13 @@ struct QEMUPutLEDEntry {
>> >  static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers =
>> >  QTAILQ_HEAD_INITIALIZER(led_handlers);
>> >  
>> > -int index_from_key(const char *key)
>> > +int index_from_key(const char *key, size_t key_length)
>> >  {
>> >  int i;
>> >  
>> >  for (i = 0; QKeyCode_lookup[i] != NULL; i++) {
>> > -if (!strcmp(key, 

Re: [Qemu-devel] [PATCH 5/5] blockjob: add Job parameter to BlockCompletionFunc

2016-01-18 Thread Kevin Wolf
Am 12.01.2016 um 01:36 hat John Snow geschrieben:
> It will no longer be sufficient to rely on the opaque parameter
> containing a BDS, and there's no way to reliably include a
> self-reference to the job we're creating, so always pass the Job
> object forward to any callbacks.
> 
> Signed-off-by: John Snow 
> ---
>  block/backup.c|  2 +-
>  block/commit.c|  2 +-
>  block/mirror.c|  6 +++---
>  block/stream.c|  2 +-
>  blockdev.c| 14 +++---
>  blockjob.c| 13 +++--
>  include/block/block.h |  2 ++
>  include/block/block_int.h | 10 +-
>  include/block/blockjob.h  |  6 --
>  qemu-img.c|  4 ++--
>  tests/test-blockjob-txn.c |  4 ++--
>  11 files changed, 39 insertions(+), 26 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 58c76be..cadb880 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -506,7 +506,7 @@ BlockJob *backup_start(BlockDriverState *bs, 
> BlockDriverState *target,
> BdrvDirtyBitmap *sync_bitmap,
> BlockdevOnError on_source_error,
> BlockdevOnError on_target_error,
> -   BlockCompletionFunc *cb, void *opaque,
> +   BlockJobCompletionFunc *cb, void *opaque,
> BlockJobTxn *txn, Error **errp)
>  {
>  int64_t len;
> diff --git a/block/commit.c b/block/commit.c
> index a5d02aa..ef4fd5a 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -202,7 +202,7 @@ static const BlockJobDriver commit_job_driver = {
>  
>  void commit_start(BlockDriverState *bs, BlockDriverState *base,
>BlockDriverState *top, int64_t speed,
> -  BlockdevOnError on_error, BlockCompletionFunc *cb,
> +  BlockdevOnError on_error, BlockJobCompletionFunc *cb,
>void *opaque, const char *backing_file_str, Error **errp)
>  {
>  CommitBlockJob *s;
> diff --git a/block/mirror.c b/block/mirror.c
> index 92706ab..18134e4 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -714,7 +714,7 @@ static BlockJob *mirror_start_job(BlockDriverState *bs,
>BlockdevOnError on_source_error,
>BlockdevOnError on_target_error,
>bool unmap,
> -  BlockCompletionFunc *cb,
> +  BlockJobCompletionFunc *cb,
>void *opaque, Error **errp,
>const BlockJobDriver *driver,
>bool is_none_mode, BlockDriverState *base)
> @@ -801,7 +801,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
> *target,
>MirrorSyncMode mode, BlockdevOnError on_source_error,
>BlockdevOnError on_target_error,
>bool unmap,
> -  BlockCompletionFunc *cb,
> +  BlockJobCompletionFunc *cb,
>void *opaque, Error **errp)
>  {
>  bool is_none_mode;
> @@ -826,7 +826,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState 
> *target,
>  BlockJob *commit_active_start(BlockDriverState *bs, BlockDriverState *base,
>int64_t speed,
>BlockdevOnError on_error,
> -  BlockCompletionFunc *cb,
> +  BlockJobCompletionFunc *cb,
>void *opaque, Error **errp)
>  {
>  int64_t length, base_length;
> diff --git a/block/stream.c b/block/stream.c
> index 1dfeac0..1bd8220 100644
> --- a/block/stream.c
> +++ b/block/stream.c
> @@ -216,7 +216,7 @@ static const BlockJobDriver stream_job_driver = {
>  BlockJob *stream_start(BlockDriverState *bs, BlockDriverState *base,
>const char *backing_file_str, int64_t speed,
>BlockdevOnError on_error,
> -  BlockCompletionFunc *cb,
> +  BlockJobCompletionFunc *cb,
>void *opaque, Error **errp)
>  {
>  StreamBlockJob *s;
> diff --git a/blockdev.c b/blockdev.c
> index 9b37ace..6713ecb 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2855,28 +2855,28 @@ out:
>  aio_context_release(aio_context);
>  }
>  
> -static void block_job_cb(void *opaque, int ret)
> +static void block_job_cb(BlockJob *job, int ret)
>  {
>  /* Note that this function may be executed from another AioContext 
> besides
>   * the QEMU main loop.  If you need to access anything that assumes the
>   * QEMU global mutex, use a BH or introduce a mutex.
>   */
>  
> -BlockDriverState *bs = opaque;
> +BlockDriverState *bs = job->bs;
>  const char *msg = NULL;
>  
> -trace_block_job_cb(bs, bs->job, ret);
> +trace_block_job_cb(bs, job, ret);

You could 

Re: [Qemu-devel] [PATCH 4/5] block/backup: Add subclassed notifier

2016-01-18 Thread Kevin Wolf
Am 12.01.2016 um 19:02 hat John Snow geschrieben:
> 
> 
> On 01/12/2016 01:01 PM, Paolo Bonzini wrote:
> > 
> > 
> > On 12/01/2016 18:57, John Snow wrote:
> >>
> >>
> >> On 01/12/2016 03:46 AM, Paolo Bonzini wrote:
> >>>
> >>>
> >>> On 12/01/2016 01:36, John Snow wrote:
>  Instead of relying on peeking at bs->job, we want to explicitly get
>  a reference to the job that was involved in this notifier callback.
> 
>  Extend the Notifier to include a job pointer, and include a reference
>  to the job registering the callback. This cuts out a few more cases
>  where we have to rely on bs->job.
> 
>  Signed-off-by: John Snow 
> >>>
> >>> Why don't you just put the NotifierWithReturn inside the BackupBlockJob
> >>> struct, and use container_of to get from NWR to BackupBlockJob?
> >>>
> >>> Paolo
> >>>
> >>
> >> That's another way (including the notifier within the job vs. including
> >> the job within the notifier.) This one simply occurred to me first. Any
> >> strong benefit to that method, or just a matter of style?
> > 
> > It's usually the one that is used with notifiers, no other reason.
> 
> I'll follow convention, I just didn't bump into an example to model.

This means that I should wait for a v2? (Hm, or is this Markus' area,
actually? Or Jeff's?)

Otherwise, this series is:
Reviewed-by: Kevin Wolf 



Re: [Qemu-devel] CMSG_SPACE() causing compile time error on Mac OS X

2016-01-18 Thread Programmingkid

On Jan 18, 2016, at 4:58 AM, Daniel P. Berrange wrote:

> On Sun, Jan 17, 2016 at 05:23:44PM -0500, Programmingkid wrote:
>> I was wondering if you had problems compiling QEMU on Mac OS X recently. On 
>> my system, the channel-socket.c file causes this error:
>> 
>> io/channel-socket.c: In function 'qio_channel_socket_writev':
>> io/channel-socket.c:497:18: error: variable-sized object may not be 
>> initialized
>> char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)] = { 0 };
> 
> Can you try applying this patch:
> 
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index eaa411f..bc117b1 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -493,10 +495,12 @@ static ssize_t qio_channel_socket_writev(QIOChannel 
> *ioc,
> QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
> ssize_t ret;
> struct msghdr msg = { NULL, };
> -char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)] = { 0 };
> +char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)];
> size_t fdsize = sizeof(int) * nfds;
> struct cmsghdr *cmsg;
> 
> +memset(control, 0, CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS));
> +
> msg.msg_iov = (struct iovec *)iov;
> msg.msg_iovlen = niov;
> 
> 
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Your patch solves the problem. Good job!

Reviewed-By: John Arbuckle 




Re: [Qemu-devel] [PATCH 1/1] HMP: Add equivalent to x-blockdev-change

2016-01-18 Thread Markus Armbruster
"Dr. David Alan Gilbert (git)"  writes:

> From: "Dr. David Alan Gilbert" 
>
> x-blockdev-change has no HMP equivalent, so add x_block_change.

Uh, I can find neither QMP command x-blockdev-change nor
qmp_x_blockdev_change() in master.

> Example useages are:
> x_block_change  foo -a bah
>to add the node bah to the parent foo
>
> x_block_change  foo -d bah
>to delete the node bah from the parent foo
[...]
> diff --git a/hmp.c b/hmp.c
> index dc6dc30..631dacb 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1042,6 +1042,26 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
>  }
>  }
>  
> +void hmp_block_change(Monitor *mon, const QDict *qdict)
> +{
> +const char *parent = qdict_get_str(qdict, "parent");
> +const char *child = qdict_get_str(qdict, "child");
> +bool add = qdict_get_try_bool(qdict, "add", false);
> +bool del = qdict_get_try_bool(qdict, "del", false);
> +Error *err = NULL;
> +
> +if (add == del) {
> +error_setg(, "One of -a or -d must be set");
> +hmp_handle_error(mon, );
> +return;
> +}
> +
> +qmp_x_blockdev_change(parent,
> +  del, child,
> +  add, child, );
> +hmp_handle_error(mon, );
> +}
> +
>  void hmp_block_resize(Monitor *mon, const QDict *qdict)
>  {
>  const char *device = qdict_get_str(qdict, "device");
[...]



Re: [Qemu-devel] [PATCH] qemu-img: Speed up comparing empty/zero images

2016-01-18 Thread Kevin Wolf
Am 13.01.2016 um 09:37 hat Fam Zheng geschrieben:
> Two empty raw files are always compared by actually reading data even if
> there is no data, because BDRV_BLOCK_ZERO is considered "allocated" in
> bdrv_is_allocated_above().  That is inefficient.
> 
> Use bdrv_get_block_status_above() for more information, and skip the
> consecutive zero sectors.
> 
> This brings a huge speed up in comparing sparse/empty raw images:
> 
> $ qemu-img create a 1G
> 
> $ time ~/build/master/bin/qemu-img compare a a
> Images are identical.
> 
> real0m6.583s
> user0m0.191s
> sys 0m6.367s
> 
> $ time qemu-img compare a a
> Images are identical.
> 
> real0m0.033s
> user0m0.003s
> sys 0m0.031s
> 
> Signed-off-by: Fam Zheng 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [PULL 5/5] hw/arm/virt: Support legacy -nic command line syntax

2016-01-18 Thread Eric Auger
Hi Peter,
On 01/18/2016 02:34 PM, Peter Maydell wrote:
> On 18 January 2016 at 13:29, Eric Auger  wrote:
>> How is it supposed to live with Passthrough'ed NIC? Current way to
>> instantiate a VFIO platform NIC looks like:
>>
>> -device vfio-amd-xgbe,host=e090.xgmac
>> where vfio-amd-xgbe is the name of the VFIO AMD XGBE platform QEMU
>> device and e090.xgmac is the name of the device in
>> /sys/bus/platform/devices.
> 
>> Before that commit I was able to instantiate this VFIO device and got
>> networking working, testing it with ping. Now ping don't work anymore. I
>> Guess I now use this other default NIC?
>>
>> Curiously it does not seem to prevent networking with upstreamed Calxeda
>> Midway device and I did not figure why yet?
> 
> Hmm, I guess this is changing things in that we now will have a
> virtio PCI device appearing if you use the default (-net nic -net user)
> settings. But I don't see why that would particularly interfere
> with VFIO passthrough, except in as much as the guest now has
> two network cards in it and might be preferring one as eth0
> rather than the other...
Yes that's what currently happens I think. I get the slirp thing on eth0
and my passthrough'ed device on eth1. That's not very straightforward
for the end-user to get those 2 NIC's now. In case I passthrough some
NIC's I would have expected this default NIC not be instantiated?

Alex, how do you manage on x86 platforms with VFIO-PCI NIC?

Thanks

Eric
> 
> thanks
> -- PMM
> 




[Qemu-devel] how to setup a watchdog?

2016-01-18 Thread lejeczek

apologies I bother devel, but..
I tried to get help on libvirt mailing list but not luck, 
then qemu, still nothing

I hope maybe somebody here?

I'm trying Qemu's watchdog.
My understanding was that hardware (here qemu's watchdog) 
would take
action, eg. cold reboot the system if there is no ping from 
the OS watchdog, so I
thought stopping watchdog service in VM should be a quick 
test, right?


I have this in the guest:


  slot='0x08' function='0x0'/>



and I see /dev/watchdog in my guest. Yet nothing happens, 
guest(linux) runs uninterrupted.
I must be missing something, an expert said it's config 
problem, is it really is?


many thanks





Re: [Qemu-devel] [PATCH 1/1] HMP: Add equivalent to x-blockdev-change

2016-01-18 Thread Dr. David Alan Gilbert
* Markus Armbruster (arm...@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)"  writes:
> 
> > From: "Dr. David Alan Gilbert" 
> >
> > x-blockdev-change has no HMP equivalent, so add x_block_change.
> 
> Uh, I can find neither QMP command x-blockdev-change nor
> qmp_x_blockdev_change() in master.

It's not in master yet; it's in Wen Congyang's 'child add/delete support'
series; which I've got running with COLO.

Dave

> 
> > Example useages are:
> > x_block_change  foo -a bah
> >to add the node bah to the parent foo
> >
> > x_block_change  foo -d bah
> >to delete the node bah from the parent foo
> [...]
> > diff --git a/hmp.c b/hmp.c
> > index dc6dc30..631dacb 100644
> > --- a/hmp.c
> > +++ b/hmp.c
> > @@ -1042,6 +1042,26 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
> >  }
> >  }
> >  
> > +void hmp_block_change(Monitor *mon, const QDict *qdict)
> > +{
> > +const char *parent = qdict_get_str(qdict, "parent");
> > +const char *child = qdict_get_str(qdict, "child");
> > +bool add = qdict_get_try_bool(qdict, "add", false);
> > +bool del = qdict_get_try_bool(qdict, "del", false);
> > +Error *err = NULL;
> > +
> > +if (add == del) {
> > +error_setg(, "One of -a or -d must be set");
> > +hmp_handle_error(mon, );
> > +return;
> > +}
> > +
> > +qmp_x_blockdev_change(parent,
> > +  del, child,
> > +  add, child, );
> > +hmp_handle_error(mon, );
> > +}
> > +
> >  void hmp_block_resize(Monitor *mon, const QDict *qdict)
> >  {
> >  const char *device = qdict_get_str(qdict, "device");
> [...]
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PULL 5/5] hw/arm/virt: Support legacy -nic command line syntax

2016-01-18 Thread Eric Auger
Hi,

How is it supposed to live with Passthrough'ed NIC? Current way to
instantiate a VFIO platform NIC looks like:

-device vfio-amd-xgbe,host=e090.xgmac
where vfio-amd-xgbe is the name of the VFIO AMD XGBE platform QEMU
device and e090.xgmac is the name of the device in
/sys/bus/platform/devices.

Before that commit I was able to instantiate this VFIO device and got
networking working, testing it with ping. Now ping don't work anymore. I
Guess I now use this other default NIC?

Curiously it does not seem to prevent networking with upstreamed Calxeda
Midway device and I did not figure why yet?

Thank you in advance

Best Regards

Eric

On 01/11/2016 03:34 PM, Peter Maydell wrote:
> From: Ashok Kumar 
> 
> Support the legacy -nic syntax for creating PCI network devices
> as well as the new-style -device options. This makes life easier
> for people moving from x86 KVM virtualization to ARM KVM virtualization
> and expecting their network configuration options to work the same
> way for both setups.
> 
> We use "virtio" as the default NIC model if the user doesn't specify one.
> 
> Signed-off-by: Ashok Kumar 
> Message-id: 1452091659-17698-1-git-send-email-ash...@broadcom.com
> Reviewed-by: Peter Maydell 
> [PMM: expanded and clarified commit message]
> Signed-off-by: Peter Maydell 
> ---
>  hw/arm/virt.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index acc1fcb..fd52b76 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -808,6 +808,7 @@ static void create_pcie(const VirtBoardInfo *vbi, 
> qemu_irq *pic,
>  DeviceState *dev;
>  char *nodename;
>  int i;
> +PCIHostState *pci;
>  
>  dev = qdev_create(NULL, TYPE_GPEX_HOST);
>  qdev_init_nofail(dev);
> @@ -847,6 +848,19 @@ static void create_pcie(const VirtBoardInfo *vbi, 
> qemu_irq *pic,
>  sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic[irq + i]);
>  }
>  
> +pci = PCI_HOST_BRIDGE(dev);
> +if (pci->bus) {
> +for (i = 0; i < nb_nics; i++) {
> +NICInfo *nd = _table[i];
> +
> +if (!nd->model) {
> +nd->model = g_strdup("virtio");
> +}
> +
> +pci_nic_init_nofail(nd, pci->bus, nd->model, NULL);
> +}
> +}
> +
>  nodename = g_strdup_printf("/pcie@%" PRIx64, base);
>  qemu_fdt_add_subnode(vbi->fdt, nodename);
>  qemu_fdt_setprop_string(vbi->fdt, nodename,
> 




Re: [Qemu-devel] [PULL 5/5] hw/arm/virt: Support legacy -nic command line syntax

2016-01-18 Thread Peter Maydell
On 18 January 2016 at 13:29, Eric Auger  wrote:
> How is it supposed to live with Passthrough'ed NIC? Current way to
> instantiate a VFIO platform NIC looks like:
>
> -device vfio-amd-xgbe,host=e090.xgmac
> where vfio-amd-xgbe is the name of the VFIO AMD XGBE platform QEMU
> device and e090.xgmac is the name of the device in
> /sys/bus/platform/devices.

> Before that commit I was able to instantiate this VFIO device and got
> networking working, testing it with ping. Now ping don't work anymore. I
> Guess I now use this other default NIC?
>
> Curiously it does not seem to prevent networking with upstreamed Calxeda
> Midway device and I did not figure why yet?

Hmm, I guess this is changing things in that we now will have a
virtio PCI device appearing if you use the default (-net nic -net user)
settings. But I don't see why that would particularly interfere
with VFIO passthrough, except in as much as the guest now has
two network cards in it and might be preferring one as eth0
rather than the other...

thanks
-- PMM



Re: [Qemu-devel] [PATCH] hmp: avoid redundant null termination of buffer

2016-01-18 Thread Wolfgang Bumiller
On Mon, Jan 18, 2016 at 02:02:07PM +0100, Markus Armbruster wrote:
> Wolfgang Bumiller  writes:
> 
> > On Tue, Jan 12, 2016 at 05:52:38PM +0100, Markus Armbruster wrote:
> >> Wolfgang Bumiller  writes:
> >> 
> >  while (1) {
> >  separator = strchr(keys, '-');
> >  keyname_len = separator ? separator - keys : strlen(keys);
> 
> Preexisting: I wonder why the compiler doesn't warn here: separator -
> keys is ptrdiff_t, strlen() is size_t, and the left hand side is int.

I noticed and agree it should warn. We know that separator > keys (ie
positive), but we also use keyname_len as a '.*' parameter to printf()
which expects it to be an 'int', so when changing it to size_t we need
to cast it there. Would have to pass a pretty long key name for this to
be an issue... can this happen over any sane interface that doesn't
already give you the power to just 'kill -9 $qemu'?

> > -pstrcpy(keyname_buf, sizeof(keyname_buf), keys);
> >  
> >  /* Be compatible with old interface, convert user inputted "<" */
> > -if (!strncmp(keyname_buf, "<", 1) && keyname_len == 1) {
> > -pstrcpy(keyname_buf, sizeof(keyname_buf), "less");
> > +if (!strncmp(keys, "<", 1) && keyname_len == 1) {
> 
> This strncmp() is a rather roundabout way to say keys[0] == '<'.  I
> guess I'd dumb it down while touching it.  Your choice.

Yes, but with the previous pstrcpy() of "less" etc. I thought this was a
style thing (and the compiler optimizes it out anyway last time I
checked).

> > +keys = "less";
> 
> Works because we're resetting keys to point into the argument string at
> the end of the loop.
> 
> >  keyname_len = 4;
> >  }
> > -keyname_buf[keyname_len] = 0;
> >  
> >  keylist = g_malloc0(sizeof(*keylist));
> >  keylist->value = g_malloc0(sizeof(*keylist->value));
> > @@ -1769,16 +1766,16 @@ void hmp_sendkey(Monitor *mon, const QDict *qdict)
> >  }
> >  tmp = keylist;
> >  
> > -if (strstart(keyname_buf, "0x", NULL)) {
> > +if (strstart(keys, "0x", NULL)) {
> >  char *endp;
> > -int value = strtoul(keyname_buf, , 0);
> > -if (*endp != '\0') {
> > +int value = strtoul(keys, , 0);
> > +if (*endp != '\0' && *endp != '-') {
> 
> strtoul() will not parse beyond keyname_len, because it'll only accept
> hex digits after 0x, thus the '-' or 0 at keyname_len will make it stop.
> 
> I guess I'd throw in assert(endp <= keys + keyname_len), and test
> endp != keys + keyname_len.  What do you think?

Makes sense, but I doubt it'll ever be hit with sane strtoul()
implementations, but an assetion can't be harmful here either :-)

> >  goto err_out;
> >  }
> >  keylist->value->type = KEY_VALUE_KIND_NUMBER;
> >  keylist->value->u.number = value;
> >  } else {
> > -int idx = index_from_key(keyname_buf);
> > +int idx = index_from_key(keys, keyname_len);
> >  if (idx == Q_KEY_CODE__MAX) {
> >  goto err_out;
> >  }
> > @@ -1800,7 +1797,7 @@ out:
> >  return;
> >  
> >  err_out:
> > -monitor_printf(mon, "invalid parameter: %s\n", keyname_buf);
> > +monitor_printf(mon, "invalid parameter: %.*s\n", keyname_len, keys);
> >  goto out;
> >  }
> >  
> > diff --git a/include/ui/console.h b/include/ui/console.h
> > index adac36d..116bc2b 100644
> > --- a/include/ui/console.h
> > +++ b/include/ui/console.h
> > @@ -448,7 +448,7 @@ static inline int vnc_display_pw_expire(const char *id, 
> > time_t expires)
> >  void curses_display_init(DisplayState *ds, int full_screen);
> >  
> >  /* input.c */
> > -int index_from_key(const char *key);
> > +int index_from_key(const char *key, size_t key_length);
> >  
> >  /* gtk.c */
> >  void early_gtk_display_init(int opengl);
> > diff --git a/ui/input-legacy.c b/ui/input-legacy.c
> > index 35dfc27..3454055 100644
> > --- a/ui/input-legacy.c
> > +++ b/ui/input-legacy.c
> > @@ -57,12 +57,13 @@ struct QEMUPutLEDEntry {
> >  static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers =
> >  QTAILQ_HEAD_INITIALIZER(led_handlers);
> >  
> > -int index_from_key(const char *key)
> > +int index_from_key(const char *key, size_t key_length)
> >  {
> >  int i;
> >  
> >  for (i = 0; QKeyCode_lookup[i] != NULL; i++) {
> > -if (!strcmp(key, QKeyCode_lookup[i])) {
> > +if (!strncmp(key, QKeyCode_lookup[i], key_length) &&
> > +!QKeyCode_lookup[i][key_length]) {
> >  break;
> >  }
> >  }
> 
> Could !strncmp(key, QKeyCode_lookup[i], key_length + 1), but that's
> probably overly clever.

That's assuming the key name ends with a \0, which is not the case
coming from a combined key combination where key points to "ctrl-alt-f1"
and should find "ctrl".

> Overall, this is more subtle than a simple g_strndup() 

Re: [Qemu-devel] [PULL] qemu-sparc update

2016-01-18 Thread Peter Maydell
On 16 January 2016 at 12:41, Mark Cave-Ayland
 wrote:
> Hi Peter,
>
> This is simply your VMStateDescription patchset for SPARC with appropriate
> SoBs added. Please pull.
>
>
> ATB,
>
> Mark.
>
>
> The following changes since commit 5a57acb66f19ee52723aa05b8afbbc41c3e9ec99:
>
>   Merge remote-tracking branch 
> 'remotes/pmaydell/tags/pull-target-arm-20160115' into staging (2016-01-15 
> 15:49:43 +)
>
> are available in the git repository at:
>
>
>   https://github.com/mcayland/qemu.git tags/qemu-sparc-signed
>
> for you to fetch changes up to 6d5322442a6904fef51a409ec40f2945581220d6:
>
>   target-sparc: Migrate CWP and PIL for SPARC64 (2016-01-16 12:01:23 +)
>
> 
> qemu-sparc update
>

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v3] qmp: return err msg when powerdown a vm when it isn't in running state

2016-01-18 Thread Markus Armbruster
Qinghua Jin  writes:

> When send system_powerdown to QMP when the vm isn't in RUN_STATE_RUNNING,
> it will be ignored by system. So reply a err msg with the situation.
>
> Signed-off-by: Qinghua Jin 
> ---
>
> This is an update of the patch as per Fam Zheng's suggestion:
>   http://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03949.html
>
>  qmp.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/qmp.c b/qmp.c
> index 0a1fa19..384df56 100644
> --- a/qmp.c
> +++ b/qmp.c
> @@ -114,8 +114,13 @@ void qmp_system_reset(Error **errp)
>  qemu_system_reset_request();
>  }
>  
> -void qmp_system_powerdown(Error **erp)
> +void qmp_system_powerdown(Error **errp)
>  {
> +if (!runstate_check(RUN_STATE_RUNNING)) {
> +error_setg(errp,
> +   "Can not powerdown virtual machine as it is not running");
> +return;
> +}
>  qemu_system_powerdown_request();
>  }

This is the handler for QMP command system_powerdown.  The patch makes
the QMP command fail in run states other than RUN_STATE_RUNNING.

There are other callers: the HMP command handler hmp_system_powerdown(),
GTK's gd_menu_powerdown(), Cocoa's powerDownQEMU().  They all ignore the
error.  Why is that appropriate?

qemu_system_powerdown_request() sets powerdown_requested = 1.  This
makes the next qemu_powerdown_requested() return 1.  Gets called only by
main_loop_should_exit().  I can't quite see how "it will be ignored by
system" when we're in RUN_STATE_RUNNING.  Can you explain?



[Qemu-devel] [PATCH v2 1/4] acpi: take oem_id in build_header(), optionally

2016-01-18 Thread Laszlo Ersek
This patch is the continuation of commit 8870ca0e94f2 ("acpi: support
specified oem table id for build_header"). It will allow us to control the
OEM ID field too in the SDT header.

Cc: "Michael S. Tsirkin"  (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov  (supporter:ACPI/SMBIOS)
Cc: Xiao Guangrong  (maintainer:NVDIMM)
Cc: Shannon Zhao  (maintainer:ARM ACPI Subsystem)
Cc: Paolo Bonzini  (maintainer:X86)
Cc: Richard W.M. Jones 
Cc: Aleksei Kovura 
Cc: Michael Tokarev 
Cc: Steven Newbury 
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek 
---

Notes:
v2:
- no change

 include/hw/acpi/aml-build.h |  2 +-
 hw/acpi/aml-build.c | 11 ---
 hw/acpi/nvdimm.c|  4 ++--
 hw/arm/virt-acpi-build.c| 12 ++--
 hw/i386/acpi-build.c| 20 ++--
 5 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 6d6f705..c460bdd 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -357,7 +357,7 @@ Aml *aml_sizeof(Aml *arg);
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev,
- const char *oem_table_id);
+ const char *oem_id, const char *oem_table_id);
 void *acpi_data_push(GArray *table_data, unsigned size);
 unsigned acpi_data_len(GArray *table);
 void acpi_add_table(GArray *table_offsets, GArray *table_data);
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 78e1290..05b8bd0 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1430,12 +1430,17 @@ Aml *aml_alias(const char *source_object, const char 
*alias_object)
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev,
- const char *oem_table_id)
+ const char *oem_id, const char *oem_table_id)
 {
 memcpy(>signature, sig, 4);
 h->length = cpu_to_le32(len);
 h->revision = rev;
-memcpy(h->oem_id, ACPI_BUILD_APPNAME6, 6);
+
+if (oem_id) {
+strncpy((char *)h->oem_id, oem_id, sizeof h->oem_id);
+} else {
+memcpy(h->oem_id, ACPI_BUILD_APPNAME6, 6);
+}
 
 if (oem_table_id) {
 strncpy((char *)h->oem_table_id, oem_table_id, 
sizeof(h->oem_table_id));
@@ -1510,5 +1515,5 @@ build_rsdt(GArray *table_data, GArray *linker, GArray 
*table_offsets)
sizeof(uint32_t));
 }
 build_header(linker, table_data,
- (void *)rsdt, "RSDT", rsdt_len, 1, NULL);
+ (void *)rsdt, "RSDT", rsdt_len, 1, NULL, NULL);
 }
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index df1b176..73749aa 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -365,7 +365,7 @@ static void nvdimm_build_nfit(GSList *device_list, GArray 
*table_offsets,
 
 build_header(linker, table_data,
  (void *)(table_data->data + header), "NFIT",
- sizeof(NvdimmNfitHeader) + structures->len, 1, NULL);
+ sizeof(NvdimmNfitHeader) + structures->len, 1, NULL, NULL);
 g_array_free(structures, true);
 }
 
@@ -470,7 +470,7 @@ static void nvdimm_build_ssdt(GSList *device_list, GArray 
*table_offsets,
 g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
 build_header(linker, table_data,
 (void *)(table_data->data + table_data->len - ssdt->buf->len),
-"SSDT", ssdt->buf->len, 1, "NVDIMM");
+"SSDT", ssdt->buf->len, 1, NULL, "NVDIMM");
 free_aml_allocator();
 }
 
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0d5c635..985fca4 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -407,7 +407,7 @@ build_spcr(GArray *table_data, GArray *linker, 
VirtGuestInfo *guest_info)
 spcr->pci_vendor_id = 0x;  /* PCI Vendor ID: not a PCI device */
 
 build_header(linker, table_data, (void *)spcr, "SPCR", sizeof(*spcr), 2,
- NULL);
+ NULL, NULL);
 }
 
 static void
@@ -426,7 +426,7 @@ build_mcfg(GArray *table_data, GArray *linker, 
VirtGuestInfo *guest_info)
 mcfg->allocation[0].end_bus_number = (memmap[VIRT_PCIE_ECAM].size
   / PCIE_MMCFG_SIZE_MIN) - 1;
 
-build_header(linker, table_data, (void *)mcfg, "MCFG", len, 1, NULL);
+build_header(linker, table_data, (void *)mcfg, "MCFG", len, 1, NULL, NULL);
 }
 
 /* GTDT */
@@ -452,7 +452,7 @@ build_gtdt(GArray *table_data, GArray *linker)
 
 build_header(linker, table_data,
  (void *)(table_data->data + gtdt_start), "GTDT",
- table_data->len - gtdt_start, 2, 

[Qemu-devel] [PATCH v2 0/4] set the OEM fields in the RSDT and the FADT from the SLIC

2016-01-18 Thread Laszlo Ersek
This is version 2 of
.

Changes in v2 (also annotated per patch):
- introduce acpi_get_slic_oem() [MST],
- drop machine property that disables the SLIC-based override [MST].

Cc: "Michael S. Tsirkin" 
Cc: Aleksei Kovura 
Cc: Igor Mammedov 
Cc: Michael Tokarev 
Cc: Paolo Bonzini 
Cc: Richard W.M. Jones 
Cc: Shannon Zhao 
Cc: Steven Newbury 
Cc: Xiao Guangrong 

Thanks
Laszlo

Laszlo Ersek (4):
  acpi: take oem_id in build_header(), optionally
  acpi: expose oem_id and oem_table_id in build_rsdt()
  acpi: add function to extract oem_id and oem_table_id from the user's
SLIC
  pc: set the OEM fields in the RSDT and the FADT from the SLIC

 include/hw/acpi/acpi.h  |  7 +++
 include/hw/acpi/aml-build.h |  5 +++--
 hw/acpi/aml-build.c | 14 ++
 hw/acpi/core.c  | 16 
 hw/acpi/nvdimm.c|  4 ++--
 hw/arm/virt-acpi-build.c| 14 +++---
 hw/i386/acpi-build.c| 31 ++-
 qemu-options.hx |  4 
 8 files changed, 67 insertions(+), 28 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH v2 2/4] acpi: expose oem_id and oem_table_id in build_rsdt()

2016-01-18 Thread Laszlo Ersek
Since build_rsdt() is implemented as common utility code (in
"hw/acpi/aml-build.c"), it should expose -- and forward -- the oem_id and
oem_table_id parameters between board code and the generic build_header()
function.

Cc: "Michael S. Tsirkin"  (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov  (supporter:ACPI/SMBIOS)
Cc: Shannon Zhao  (maintainer:ARM ACPI Subsystem)
Cc: Paolo Bonzini  (maintainer:X86)
Cc: Richard W.M. Jones 
Cc: Aleksei Kovura 
Cc: Michael Tokarev 
Cc: Steven Newbury 
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek 
---

Notes:
v2:
- no change

 include/hw/acpi/aml-build.h | 3 ++-
 hw/acpi/aml-build.c | 5 +++--
 hw/arm/virt-acpi-build.c| 2 +-
 hw/i386/acpi-build.c| 2 +-
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index c460bdd..aa29d30 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -364,6 +364,7 @@ void acpi_add_table(GArray *table_offsets, GArray 
*table_data);
 void acpi_build_tables_init(AcpiBuildTables *tables);
 void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre);
 void
-build_rsdt(GArray *table_data, GArray *linker, GArray *table_offsets);
+build_rsdt(GArray *table_data, GArray *linker, GArray *table_offsets,
+   const char *oem_id, const char *oem_table_id);
 
 #endif
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 05b8bd0..ce7fe81 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1496,7 +1496,8 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, 
bool mfre)
 
 /* Build rsdt table */
 void
-build_rsdt(GArray *table_data, GArray *linker, GArray *table_offsets)
+build_rsdt(GArray *table_data, GArray *linker, GArray *table_offsets,
+   const char *oem_id, const char *oem_table_id)
 {
 AcpiRsdtDescriptorRev1 *rsdt;
 size_t rsdt_len;
@@ -1515,5 +1516,5 @@ build_rsdt(GArray *table_data, GArray *linker, GArray 
*table_offsets)
sizeof(uint32_t));
 }
 build_header(linker, table_data,
- (void *)rsdt, "RSDT", rsdt_len, 1, NULL, NULL);
+ (void *)rsdt, "RSDT", rsdt_len, 1, oem_id, oem_table_id);
 }
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 985fca4..2e90515 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -642,7 +642,7 @@ void virt_acpi_build(VirtGuestInfo *guest_info, 
AcpiBuildTables *tables)
 
 /* RSDT is pointed to by RSDP */
 rsdt = tables_blob->len;
-build_rsdt(tables_blob, tables->linker, table_offsets);
+build_rsdt(tables_blob, tables->linker, table_offsets, NULL, NULL);
 
 /* RSDP is in FSEG memory, so allocate it separately */
 build_rsdp(tables->rsdp, tables->linker, rsdt);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index e1ebd07..6408362 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2705,7 +2705,7 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 
 /* RSDT is pointed to by RSDP */
 rsdt = tables_blob->len;
-build_rsdt(tables_blob, tables->linker, table_offsets);
+build_rsdt(tables_blob, tables->linker, table_offsets, NULL, NULL);
 
 /* RSDP is in FSEG memory, so allocate it separately */
 build_rsdp(tables->rsdp, tables->linker, rsdt);
-- 
1.8.3.1





Re: [Qemu-devel] [PATCH] hmp: avoid redundant null termination of buffer

2016-01-18 Thread Markus Armbruster
Wolfgang Bumiller  writes:

> On Tue, Jan 12, 2016 at 05:52:38PM +0100, Markus Armbruster wrote:
>> Wolfgang Bumiller  writes:
>> 
>> >> On January 12, 2016 at 5:00 PM Markus Armbruster  
>> >> wrote:
>> >> Will you prepare a revised patch?
>> >
>> > Can do that tomorrow, but which option is the preferred one? If "%.*s" 
>> > works
>> > everywhere then changing index_from_key() and using "%.*s" would be the 
>> > most
>> > optimal I think.
>> >
>> > I don't want to bounce 5 more versions back and forth of something that's
>> > supposed to be rather trivial.
>> 
>> Understandable.
>> 
>> If your patch works and is simple, I won't ask you to redo it using
>> another method just because I might like that better.
>
> Less simple (or at least longer) but gets rid of the static buffer,
> shows the exact keyname in the error message and gets rid of the copying
> of the word "less", too, by adding a length to index_from_key() as per
> your suggestion. Seemed like the cleanest option.
>
> Note that at the end of the loop (not visible in this patch's context
> lines) 'keys' is reassigned to separator+1 or the loop ends if no
> separator was there, which makes the `keys = "less"` assignment valid.
> Though maybe adding an extra `const char *keyname` that becomes
> `keyname = keys` at the beginning of the loop might be better? Not sure
> which style you prefer, I can resend if you like.
>
> ===
>>From 136dd5ac96fc21654a31aff7fa88b86570c8fc72 Mon Sep 17 00:00:00 2001
> From: Wolfgang Bumiller 
> Date: Wed, 13 Jan 2016 08:46:31 +0100
> Subject: [PATCH] hmp: fix sendkey out of bounds write (CVE-2015-8619)
>
> When processing 'sendkey' command, hmp_sendkey routine null
> terminates the 'keyname_buf' array. This results in an OOB
> write issue, if 'keyname_len' was to fall outside of
> 'keyname_buf' array.
>
> Since the keyname's length is known the keyname_buf can be
> removed altogether by adding a length parameter to
> index_from_key() and using it for the error output as well.
>
> Reported-by: Ling Liu 
> Signed-off-by: Wolfgang Bumiller 
> ---
>  hmp.c| 17 +++--
>  include/ui/console.h |  2 +-
>  ui/input-legacy.c|  5 +++--
>  3 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/hmp.c b/hmp.c
> index c2b2c16..066ccf8 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1742,21 +1742,18 @@ void hmp_sendkey(Monitor *mon, const QDict *qdict)
>  int has_hold_time = qdict_haskey(qdict, "hold-time");
>  int hold_time = qdict_get_try_int(qdict, "hold-time", -1);
>  Error *err = NULL;
> -char keyname_buf[16];
>  char *separator;
>  int keyname_len;
>  
>  while (1) {
>  separator = strchr(keys, '-');
>  keyname_len = separator ? separator - keys : strlen(keys);

Preexisting: I wonder why the compiler doesn't warn here: separator -
keys is ptrdiff_t, strlen() is size_t, and the left hand side is int.

> -pstrcpy(keyname_buf, sizeof(keyname_buf), keys);
>  
>  /* Be compatible with old interface, convert user inputted "<" */
> -if (!strncmp(keyname_buf, "<", 1) && keyname_len == 1) {
> -pstrcpy(keyname_buf, sizeof(keyname_buf), "less");
> +if (!strncmp(keys, "<", 1) && keyname_len == 1) {

This strncmp() is a rather roundabout way to say keys[0] == '<'.  I
guess I'd dumb it down while touching it.  Your choice.

> +keys = "less";

Works because we're resetting keys to point into the argument string at
the end of the loop.

>  keyname_len = 4;
>  }
> -keyname_buf[keyname_len] = 0;
>  
>  keylist = g_malloc0(sizeof(*keylist));
>  keylist->value = g_malloc0(sizeof(*keylist->value));
> @@ -1769,16 +1766,16 @@ void hmp_sendkey(Monitor *mon, const QDict *qdict)
>  }
>  tmp = keylist;
>  
> -if (strstart(keyname_buf, "0x", NULL)) {
> +if (strstart(keys, "0x", NULL)) {
>  char *endp;
> -int value = strtoul(keyname_buf, , 0);
> -if (*endp != '\0') {
> +int value = strtoul(keys, , 0);
> +if (*endp != '\0' && *endp != '-') {

strtoul() will not parse beyond keyname_len, because it'll only accept
hex digits after 0x, thus the '-' or 0 at keyname_len will make it stop.

I guess I'd throw in assert(endp <= keys + keyname_len), and test
endp != keys + keyname_len.  What do you think?

>  goto err_out;
>  }
>  keylist->value->type = KEY_VALUE_KIND_NUMBER;
>  keylist->value->u.number = value;
>  } else {
> -int idx = index_from_key(keyname_buf);
> +int idx = index_from_key(keys, keyname_len);
>  if (idx == Q_KEY_CODE__MAX) {
>  goto err_out;
>  }
> @@ -1800,7 +1797,7 @@ out:
>  return;
>  
>  err_out:
> -

Re: [Qemu-devel] [PATCH] misc: spelling

2016-01-18 Thread Markus Armbruster
Michael Tokarev  writes:

> 18.12.2015 13:06, marcandre.lur...@redhat.com wrote:
>> From: Marc-André Lureau 
>
> This patch does not have a required Signed-off-By tag, sorry.

Marc-André, please reply with your S-o-B to unblock this patch.



Re: [Qemu-devel] [PATCH RFC 0/4] ARM SMMUv3 Emulation

2016-01-18 Thread Edgar E. Iglesias
On Mon, Jan 18, 2016 at 06:11:35AM +, Prem (Premachandra) Mallappa wrote:
> > Edgar has done all of the SMMU work for Xilinx, he knows it the best.
> > I'll let him comment on it.
> > 
> > For anyone interested you can see our implementation at:
> > https://github.com/Xilinx/qemu/blob/master/hw/misc/arm-smmu.c. It does
> > use the register API that we have been trying to upstream.
> > 
> Hi,
> I took a quick look at the code, The Xilinx implements mmu-500 which is a 
> SMMU- v2.
> I am not very familiar with V2, however, the architecture and internal 
> workings are different between v2 and v3. 
> hence there are 2 different drivers in Linux for V2 and V3, and the code I 
> posted is for SMMU-v3.
>

Hi,

Yes, we did SMMUv2 but I think there is opportunity to reuse quite a
bit of code.

I don't mind basing the implementation from the broadcom code but
there are a few things that could be taken from our our code or at
least re-implemented.

I haven't looked at the SMMUv3 at all but I'm guessing the translation
tables are the same. We could share code between a v2 and a v3 SMMU
and possibly even some of it with the CPU models.
Our translation code has been tested with S1 only, S2 only and S1 + S2.
We've ran it with the Linux SMMU v2 driver, XEN SMMU v2 driver and
our internal testsuites. This code could be a candidate for merging with
your SMMU code as I noticed that the broadcom implementation is lacking
some of the modes.

Another thing I noticed is that it doesn't seem like you've modelled
the TBUs (these may be named differently in SMMUv3 terminology if they
even exist)? Anyway, we need these as the address-spaces on the ZynqMP
are not all the same for all TBUs.

Cheers,
Edgar



[Qemu-devel] [PATCH v2] monitor: spelling fix

2016-01-18 Thread marcandre . lureau
From: Marc-André Lureau 

Signed-off-by: Marc-André Lureau 
Reviewed-by: Markus Armbruster 
Reviewed-by: Luiz Capitulino 
---
 monitor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/monitor.c b/monitor.c
index c53a453..baa89cb 100644
--- a/monitor.c
+++ b/monitor.c
@@ -311,7 +311,7 @@ static void monitor_flush_locked(Monitor *mon)
 return;
 }
 if (rc > 0) {
-/* partinal write */
+/* partial write */
 QString *tmp = qstring_from_str(buf + rc);
 QDECREF(mon->outbuf);
 mon->outbuf = tmp;
-- 
2.5.0




Re: [Qemu-devel] usb-storage assertions

2016-01-18 Thread Gerd Hoffmann
  Hi,

> > ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
> > arch is this?  i386?  x86_64?
> 
> i386 7.0 for the reference, but I`m sure that this wouldn`t matter in
> any way.

7.0 trace:

[ ... ]
12417@1453119296.506252:usb_ehci_opreg_write wr mmio 0020 [USBCMD] = 0
12417@1453119296.506284:usb_ehci_queue_action q 0x7f94f7a98bb0: free
[ ... ]
12417@1453119296.507336:usb_ehci_state periodic schedule INACTIVE
12417@1453119296.507340:usb_ehci_usbsts usbsts PSS 0
12417@1453119296.507344:usb_ehci_queue_action q 0x7f94f7a98cd0: free
12417@1453119296.507349:usb_ehci_queue_action q 0x7f94f7a98c40: free
12417@1453119296.507353:usb_ehci_queue_action q 0x7f94f749f040: free
12417@1453119296.507357:usb_ehci_queue_action q 0x7f94f749efb0: free
12417@1453119296.507361:usb_ehci_state async schedule INACTIVE
12417@1453119296.507365:usb_ehci_usbsts usbsts ASS 0
12417@1453119296.507369:usb_ehci_usbsts usbsts HALT 1
12417@1453119296.507402:usb_ehci_opreg_write wr mmio 0020 [USBCMD] = 2
12417@1453119296.507413:usb_ehci_reset === RESET ===
12417@1453119296.507418:usb_ehci_port_detach detach port #0, owner ehci
12417@1453119296.507423:usb_ehci_irq level 1, frindex 0x2958, sts
0x1004, mask 0x37
12417@1453119296.507435:usb_ehci_port_attach attach port #0, owner comp,
device QEMU USB MSD
12417@1453119296.507444:usb_ehci_opreg_change ch mmio 0020 [USBCMD] =
8 (old: 0)
12417@1453119296.507466:usb_ehci_opreg_read rd mmio 0024 [USBSTS] = 1000
12417@1453119296.507510:usb_ehci_opreg_read rd mmio 0024 [USBSTS] = 1000
[ ... ]

So, to shutdown ehci netbsd clears the cmd register, then sets the reset
bit in the cmd register.  Fine.  

Then it goes read the status register, in a loop, forever.  No idea why,
and I also can't spot then place in the source code.  Hmm ...

>  Just to mention - ancient 2.6, like 2.6.18, are actually
> doing things quite faster using same frontend+backend combination, may
> be due to lack of proper timeout checks... Actually there is a very
> small chance that the real performance regression was introduced
> during further development, so I instead believe in improper
> interaction of a newer guest EHCI driver and qemu frontend.

Could also be older ehci guest drivers took a shortcut which turned out
to not be correct and not working reliable ...

> Please let
> me know if any countable measurements like fio could be a matter of
> interest - I don`t think that many people are concerned about USB/USB2
> frontend performance at all, since they are bringing in a ton of
> unwelcomed wakeups and the one thing which could be a matter of
> concern in that case is an emulated xHCI, IMHO.

Yes, xhci is clearly the best choice when it comes to performance.

Reasons to use ehci instead basically boils down to (a) lacking/broken
guest drivers (old windows versions, also early linux xhci driver
versions had endian issues, firmware needs xhci support too to boot from
usb) and (b) historical (ehci emulation was there first, so support in
the management stack tends to be better for that, also people are used
to it).

cheers,
  Gerd




[Qemu-devel] [PATCH v2 3/4] acpi: add function to extract oem_id and oem_table_id from the user's SLIC

2016-01-18 Thread Laszlo Ersek
The acpi_get_slic_oem() function stores pointers to these fields in the
(first) SLIC table that the user passes in with the -acpitable switch.

Cc: "Michael S. Tsirkin"  (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov  (supporter:ACPI/SMBIOS)
Cc: Richard W.M. Jones 
Cc: Aleksei Kovura 
Cc: Michael Tokarev 
Cc: Steven Newbury 
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek 
---

Notes:
v2:
- introduce [MST]

 include/hw/acpi/acpi.h |  7 +++
 hw/acpi/core.c | 16 
 2 files changed, 23 insertions(+)

diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h
index b20bd55..2de3021 100644
--- a/include/hw/acpi/acpi.h
+++ b/include/hw/acpi/acpi.h
@@ -196,4 +196,11 @@ unsigned acpi_table_len(void *current);
 void acpi_table_add(const QemuOpts *opts, Error **errp);
 void acpi_table_add_builtin(const QemuOpts *opts, Error **errp);
 
+typedef struct AcpiSlicOem AcpiSlicOem;
+struct AcpiSlicOem {
+  char *id;
+  char *table_id;
+};
+int acpi_get_slic_oem(AcpiSlicOem *oem);
+
 #endif /* !QEMU_HW_ACPI_H */
diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index 21e113d..a9d0f68 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -349,6 +349,22 @@ uint8_t *acpi_table_next(uint8_t *current)
 }
 }
 
+int acpi_get_slic_oem(AcpiSlicOem *oem)
+{
+uint8_t *u;
+
+for (u = acpi_table_first(); u; u = acpi_table_next(u)) {
+struct acpi_table_header *hdr = (void *)(u - sizeof(hdr->_length));
+
+if (memcmp(hdr->sig, "SLIC", 4) == 0) {
+oem->id = hdr->oem_id;
+oem->table_id = hdr->oem_table_id;
+return 0;
+}
+}
+return -1;
+}
+
 static void acpi_notify_wakeup(Notifier *notifier, void *data)
 {
 ACPIREGS *ar = container_of(notifier, ACPIREGS, wakeup);
-- 
1.8.3.1





Re: [Qemu-devel] [vfio-users] [PATCH v2 1/3] input: add qemu_input_qcode_to_linux + qemu_input_linux_to_qcode

2016-01-18 Thread Gerd Hoffmann
On Mo, 2016-01-18 at 11:47 +, Jonathan Scruggs wrote:
> Hi Gerd,
> 
> Would there be a way to add repeating keys back in that doesn't cause
> issues? Maybe slow down the repeat cycle? Or is this strictly a issue
> with how the actual event drivers or the buffers work and would need
> changing to that on the host side?

I don't know ...

> In my mind it seams fairly straightforward in just forwarding these
> events to the VM.

I assumed that as well, it was there initially and only removed after it
turned out to cause problems.

I've added a patch to the git branch bringing it back, but guarded with
a new config option (repeat={on,off}) and turned off by default.

> Would it be different if the keyboard was using the PS/2 versus the
> USB interface on the guest? I have a USB controller added for the
> guest but the keyboard and mouse are on PS/2 interfaces.

Worth testing.  Just add "-device usb-kbd" to the qemu command line and
see what happens ...

> A second thought. What if you made the keyboard and mouse USB only,
> then on the guest, make sure the USB controller is using Message
> Signal-Based interrupts. On Windows, the controller was set to the old
> style Line-Based. The slow downs could be caused by a lake in speed
> with he interrupts and USB polling speed.

In case windows is new enough to have xhci support (win8+) you can try
using a xhci hostadapter, which supports MSI (uhci and ehci don't), then
hook up the usb keyboard to it.

"-device nec-usb-xhci,id=xhci -device usb-kbd,bus=xhci.0"

cheers,
  Gerd




Re: [Qemu-devel] [PULL 5/5] hw/arm/virt: Support legacy -nic command line syntax

2016-01-18 Thread Peter Maydell
On 18 January 2016 at 13:57, Eric Auger  wrote:
> Hi Peter,
> On 01/18/2016 02:34 PM, Peter Maydell wrote:
>> Hmm, I guess this is changing things in that we now will have a
>> virtio PCI device appearing if you use the default (-net nic -net user)
>> settings. But I don't see why that would particularly interfere
>> with VFIO passthrough, except in as much as the guest now has
>> two network cards in it and might be preferring one as eth0
>> rather than the other...
> Yes that's what currently happens I think. I get the slirp thing on eth0
> and my passthrough'ed device on eth1. That's not very straightforward
> for the end-user to get those 2 NIC's now. In case I passthrough some
> NIC's I would have expected this default NIC not be instantiated?

The QEMU networking layer only knows about networking controlled
by the -net or -netdev options (and in those cases it does disable
the default network device). Because the back-end for passthrough
NICs is completely unknown to QEMU (it is all done in hardware),
I'm not sure we have any way to know that the thing you've passed through
is a NIC and not some other random PCI device...

> Alex, how do you manage on x86 platforms with VFIO-PCI NIC?

...but presumably the x86 folks have been here before us
and know how this should work :-)

thanks
-- PMM



[Qemu-devel] [PATCH v2 4/4] pc: set the OEM fields in the RSDT and the FADT from the SLIC

2016-01-18 Thread Laszlo Ersek
The Microsoft spec about the SLIC and MSDM ACPI tables at
 requires the OEM ID and
OEM Table ID fields to be consistent between the SLIC and the RSDT/XSDT.
That further affects the FADT, because a similar match between the FADT
and the RSDT/XSDT is required by the ACPI spec in general.

This patch wires up the previous three patches.

Cc: "Michael S. Tsirkin"  (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov  (supporter:ACPI/SMBIOS)
Cc: Paolo Bonzini  (maintainer:X86)
Cc: Richard W.M. Jones 
Cc: Aleksei Kovura 
Cc: Michael Tokarev 
Cc: Steven Newbury 
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek 
---

Notes:
v2:
- rebase to new acpi_get_slic_oem() [MST]
- drop machine type property to disable this behavior [MST]

 hw/i386/acpi-build.c | 13 +
 qemu-options.hx  |  4 
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6408362..9a74284 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -337,7 +337,8 @@ static void fadt_setup(AcpiFadtDescriptorRev1 *fadt, 
AcpiPmInfo *pm)
 /* FADT */
 static void
 build_fadt(GArray *table_data, GArray *linker, AcpiPmInfo *pm,
-   unsigned facs, unsigned dsdt)
+   unsigned facs, unsigned dsdt,
+   const char *oem_id, const char *oem_table_id)
 {
 AcpiFadtDescriptorRev1 *fadt = acpi_data_push(table_data, sizeof(*fadt));
 
@@ -358,7 +359,7 @@ build_fadt(GArray *table_data, GArray *linker, AcpiPmInfo 
*pm,
 fadt_setup(fadt, pm);
 
 build_header(linker, table_data,
- (void *)fadt, "FACP", sizeof(*fadt), 1, NULL, NULL);
+ (void *)fadt, "FACP", sizeof(*fadt), 1, oem_id, oem_table_id);
 }
 
 static void
@@ -2621,11 +2622,13 @@ void acpi_build(PcGuestInfo *guest_info, 
AcpiBuildTables *tables)
 uint8_t *u;
 size_t aml_len = 0;
 GArray *tables_blob = tables->table_data;
+AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
 
 acpi_get_cpu_info();
 acpi_get_pm_info();
 acpi_get_misc_info();
 acpi_get_pci_info();
+acpi_get_slic_oem(_oem);
 
 table_offsets = g_array_new(false, true /* clear */,
 sizeof(uint32_t));
@@ -2654,7 +2657,8 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 
 /* ACPI tables pointed to by RSDT */
 acpi_add_table(table_offsets, tables_blob);
-build_fadt(tables_blob, tables->linker, , facs, dsdt);
+build_fadt(tables_blob, tables->linker, , facs, dsdt,
+   slic_oem.id, slic_oem.table_id);
 
 ssdt = tables_blob->len;
 acpi_add_table(table_offsets, tables_blob);
@@ -2705,7 +2709,8 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 
 /* RSDT is pointed to by RSDP */
 rsdt = tables_blob->len;
-build_rsdt(tables_blob, tables->linker, table_offsets, NULL, NULL);
+build_rsdt(tables_blob, tables->linker, table_offsets,
+   slic_oem.id, slic_oem.table_id);
 
 /* RSDP is in FSEG memory, so allocate it separately */
 build_rsdp(tables->rsdp, tables->linker, rsdt);
diff --git a/qemu-options.hx b/qemu-options.hx
index b4763ba..0ef79ee 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1472,6 +1472,10 @@ ACPI headers (possible overridden by other options).
 For data=, only data
 portion of the table is used, all header information is specified in the
 command line.
+If a SLIC table is supplied to QEMU, then the SLIC's oem_id and oem_table_id
+fields will override the same in the RSDT and the FADT (a.k.a. FACP), in order
+to ensure the field matches required by the Microsoft SLIC spec and the ACPI
+spec.
 ETEXI
 
 DEF("smbios", HAS_ARG, QEMU_OPTION_smbios,
-- 
1.8.3.1




Re: [Qemu-devel] defining VIXL_DEBUG?

2016-01-18 Thread Paolo Bonzini


On 18/01/2016 12:04, Peter Maydell wrote:
>> >
>> > where the '1x' encodings of bits 22:23 (marked as reserved in the ARMv8
>> > ARM) would cause an abort as far as I can see.
> Isn't this handled by  Decoder::DecodeAddSubImmediate(), which checks
> bit 23?

Yes, that's right.

Paolo

> In any case if we're worried it would be easy to set up a trivial
> test loop that just feeds all 2^32 integers to the disassembler.



Re: [Qemu-devel] [PATCH] hw/misc: slavepci_passthru driver

2016-01-18 Thread Marc-André Lureau
Hi

- Original Message -
> Hi there,
> 
> I'd like to submit this new pci driver ( hw/misc )for inclusion,
> if you think it could be useful to other as well as ourself.
> 
> The driver "worked for our needs" BUT we haven't done extensive
> testing and this is our first attempt to submit a patch so I kindly
> ask for extra-forgiveness .
> 
> The "slavepci_passthru" driver is useful in the scenario described
> below to implement a simplified passthru when the host CPU does not
> support IOMMU and one is interested only in pci target-mode (slave
> devices).

Let's CC Alex, who worked on the most recent framework for something related to 
that (VFIO).

> 
> Embedded system cpu (e.g. Atom, AMD G-Series) often lack the VT-d
> extensions (IOMMU) needed to be able to pass-thru pci peripherals to
> the guest machine (i.e. the pci pass-thru feature cannot be used).
> 
> If one is only interested in using the pci board as a pci-target
> (slave device), this driver mmap(s) the host-pci-bars into the guest
> within a virtual pci-device.
> 
> This is useful in our case for debugging via qemu gsbserver facility
> (i.e. '-s' option in qemu) a system running barebone-executable .
> 
> Currently the driver assumes the custom pci card has four 32-bit bars
> to be mapped (in current patch this is mandatory)
> 
> HowTo:
> To use the new driver one shall:
> - define two environment variables for assigning proper VID and DID to
>   associate to the guest pci card
> - give the host pci bar address to map in the guest.
> 
> Example Usage:
> 
> Let us suppose that we have in the host a slave pci device with the
> following 4 bars (i.e. output of lspci -v -s YOUR-CARD | grep Memory)
>   Memory at db80 (32-bit, non-prefetchable) [size=4K]
>   Memory at db90 (32-bit, non-prefetchable) [size=8K]
>   Memory at dba0 (32-bit, non-prefetchable) [size=4K]
>   Memory at dbb0 (32-bit, non-prefetchable) [size=4K]
> 
> We can map these bars in a guest-pci with VID=0xe33e DID=0x000a using
> 
> SLAVEPASSTHRU_VID="0xe33e" SLAVEPASSTHRU_DID="0xa" qemu-system-x86_64 \
>   YOUR-SET-OF-FLAGS \
>   -device
>   
> slavepassthru,size1=4096,baseaddr1=0xdb90,size2=8192,baseaddr2=0xdba0,size3=4096,baseaddr3=0xdbd0,size4=4096,baseaddr4=0xdbe0
> 
> Please note that if your device has less than four bars you can give
> the same size and baseaddress to the unused bars.
> 
> Thanks
> Francesco Zuliani
> 
> Actual commit patch:
> 
> From 1371bc4e4681f43a2d02b91ec5d7b84f7ccb1f32 Mon Sep 17 00:00:00 2001
> From: Francesco Zuliani 
> Date: Mon, 18 Jan 2016 14:26:54 +0100
> Subject: [PATCH] hw/misc: slave_pci_passthru driver
> Added a slavepci_passthru hw/misc pci-device-driver.
> 
> It enables pass-thru in system missing IOMMU feature (e.g. Intel Atom
> lacks Vt-d extension). It maps hosts target-mode pci-board bars onto
> the guests virtual pci-device bars.
> 
> Signed-off-by: Francesco Zuliani 
> ---
>  default-configs/pci.mak |   1 +
>  hw/misc/Makefile.objs   |   1 +
>  hw/misc/slavepci_passthru.c | 453
>  
>  3 files changed, 455 insertions(+)
>  create mode 100644 hw/misc/slavepci_passthru.c
> 
> diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> index f250119..699a5a5 100644
> --- a/default-configs/pci.mak
> +++ b/default-configs/pci.mak
> @@ -36,4 +36,5 @@ CONFIG_EDU=y
>  CONFIG_VGA=y
>  CONFIG_VGA_PCI=y
>  CONFIG_IVSHMEM=$(CONFIG_POSIX)
> +CONFIG_SLAVEPCIPASSTHRU=$(CONFIG_POSIX)
>  CONFIG_ROCKER=y
> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
> index d4765c2..e346a15 100644
> --- a/hw/misc/Makefile.objs
> +++ b/hw/misc/Makefile.objs
> @@ -20,6 +20,7 @@ common-obj-$(CONFIG_PUV3) += puv3_pm.o
>  common-obj-$(CONFIG_MACIO) += macio/
>  
>  obj-$(CONFIG_IVSHMEM) += ivshmem.o
> +obj-$(CONFIG_SLAVEPCIPASSTHRU) += slavepci_passthru.o
>  
>  obj-$(CONFIG_REALVIEW) += arm_sysctl.o
>  obj-$(CONFIG_NSERIES) += cbus.o
> diff --git a/hw/misc/slavepci_passthru.c b/hw/misc/slavepci_passthru.c
> new file mode 100644
> index 000..ee709b7
> --- /dev/null
> +++ b/hw/misc/slavepci_passthru.c
> @@ -0,0 +1,453 @@
> +/*
> + * Host Device PCI-Card Slave Pass-Thru: based on ivshmem "qemu-device"
> + *
> + *
> + * Author:
> + *  By Francesco Zuliani AT Neat S.r.l. 
> + *
> + * Based On: ivshmem.c
> + *  Original Author Cam Macdonell
> + *
> + * This code is licensed under the GNU GPL v2.
> + *
> + * Contributions after 2012-01-13 are licensed under the terms of the
> + * GNU GPL, version 2 or (at your option) any later version.
> + */
> +
> +#include "hw/hw.h"
> +#include "hw/i386/pc.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci/msix.h"
> +#include "sysemu/kvm.h"
> +#include "migration/migration.h"
> +#include "qemu/error-report.h"
> +#include "qemu/event_notifier.h"
> +#include "sysemu/char.h"
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> 

[Qemu-devel] [PATCH v5 2/8] device_tree: introduce load_device_tree_from_sysfs

2016-01-18 Thread Eric Auger
This function returns the host device tree blob from sysfs
(/proc/device-tree). It uses a recursive function inspired
from dtc read_fstree.

Signed-off-by: Eric Auger 

---
v1 -> v2:
- do not implement/expose read_fstree and load_device_tree_from_sysfs
  if CONFIG_LINUX is not defined (lstat is not implemeted in mingw)
- correct indentation in read_fstree
- use /proc/device-tree symlink instead of /sys/firmware/devicetree/base
  path (kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-ofw)
- use g_file_get_contents in read_fstree
- introduce SYSFS_DT_BASEDIR macro and use strlen
- exit on error in load_device_tree_from_sysfs
- user error_setg

RFC -> v1:
- remove runtime dependency on dtc binary and introduce read_fstree
---
 device_tree.c| 100 +++
 include/sysemu/device_tree.h |   3 ++
 2 files changed, 103 insertions(+)

diff --git a/device_tree.c b/device_tree.c
index a9f5f8e..b262c2d 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -17,6 +17,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_LINUX
+#include 
+#endif
 
 #include "qemu-common.h"
 #include "qemu/error-report.h"
@@ -117,6 +120,103 @@ fail:
 return NULL;
 }
 
+#ifdef CONFIG_LINUX
+
+#define SYSFS_DT_BASEDIR "/proc/device-tree"
+
+/**
+ * read_fstree: this function is inspired from dtc read_fstree
+ * @fdt: preallocated fdt blob buffer, to be populated
+ * @dirname: directory to scan under SYSFS_DT_BASEDIR
+ * the search is recursive and the tree is searched down to the
+ * leafs (property files).
+ *
+ * the function self-asserts in case of error
+ */
+static void read_fstree(void *fdt, const char *dirname)
+{
+DIR *d;
+struct dirent *de;
+struct stat st;
+const char *root_dir = SYSFS_DT_BASEDIR;
+char *parent_node;
+
+if (strstr(dirname, root_dir) != dirname) {
+error_report("%s: %s must be searched within %s",
+ __func__, dirname, root_dir);
+exit(1);
+}
+parent_node = (char *)[strlen(SYSFS_DT_BASEDIR)];
+
+d = opendir(dirname);
+if (!d) {
+error_setg(_fatal, "%s cannot open %s", __func__, dirname);
+}
+
+while ((de = readdir(d)) != NULL) {
+char *tmpnam;
+
+if (!g_strcmp0(de->d_name, ".")
+|| !g_strcmp0(de->d_name, "..")) {
+continue;
+}
+
+tmpnam = g_strjoin("/", dirname, de->d_name, NULL);
+
+if (lstat(tmpnam, ) < 0) {
+error_setg(_fatal, "%s cannot lstat %s", __func__, tmpnam);
+}
+
+if (S_ISREG(st.st_mode)) {
+gchar *val;
+gsize len;
+
+if (!g_file_get_contents(tmpnam, , , NULL)) {
+error_setg(_fatal, "%s not able to extract info from %s",
+   __func__, tmpnam);
+}
+
+if (strlen(parent_node) > 0) {
+qemu_fdt_setprop(fdt, parent_node,
+ de->d_name, val, len);
+} else {
+qemu_fdt_setprop(fdt, "/", de->d_name, val, len);
+}
+g_free(val);
+} else if (S_ISDIR(st.st_mode)) {
+char *node_name;
+
+node_name = g_strdup_printf("%s/%s",
+parent_node, de->d_name);
+qemu_fdt_add_subnode(fdt, node_name);
+g_free(node_name);
+read_fstree(fdt, tmpnam);
+}
+
+g_free(tmpnam);
+}
+
+closedir(d);
+}
+
+/* load_device_tree_from_sysfs: extract the dt blob from host sysfs */
+void *load_device_tree_from_sysfs(void)
+{
+void *host_fdt;
+int host_fdt_size;
+
+host_fdt = create_device_tree(_fdt_size);
+read_fstree(host_fdt, SYSFS_DT_BASEDIR);
+if (fdt_check_header(host_fdt)) {
+error_setg(_fatal,
+   "%s host device tree extracted into memory is invalid",
+   __func__);
+}
+return host_fdt;
+}
+
+#endif /* CONFIG_LINUX */
+
 static int findnode_nofail(void *fdt, const char *node_path)
 {
 int offset;
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 359e143..fdf25a4 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -16,6 +16,9 @@
 
 void *create_device_tree(int *sizep);
 void *load_device_tree(const char *filename_path, int *sizep);
+#ifdef CONFIG_LINUX
+void *load_device_tree_from_sysfs(void);
+#endif
 
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size);
-- 
1.9.1




[Qemu-devel] [PATCH v5 0/8] AMD XGBE KVM platform passthrough

2016-01-18 Thread Eric Auger
This series allows to set up AMD XGBE passthrough. This was tested on AMD
Seattle.

The first upstreamed device supporting KVM platform passthrough was the
Calxeda Midway XGMAC. Compared to this latter, the XGBE XGMAC exposes a
much more complex device tree node.

- First There are 2 device tree node formats:
one where XGBE and PHY are described in separate nodes and another one
that combines both description in a single node (only supported by 4.2
onwards kernels). Only the combined description is supported for passthrough,
meaning the host must be >= 4.2 and must feature a device tree with a combined
description. The guest will also be exposed with a combined description,
meaning only >= 4.2 guest are supported. It is not planned to support
separate node representation since assignment of the PHY is less
straigtforward.

- the XGMAC/PHY node depends on 2 clock nodes (DMA and PTP).
The code checks those clocks are fixed to make sure they cannot be
switched off at some point after the native driver gets unbound.

- there are many property values to populate on guest side. Most of them
cannot be hardcoded. That series implements host device tree blob extraction
from the host /proc/device-tree (inspired from dtc implementation)
and retrieve host property values to populate guest dtb.

- the case where the host uses ACPI is not yet covered since there is
  no usable ACPI description for this HW yet.

The patches can be found at
https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-xgbe-v5

Previous versions can be found at
https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-xgbe-v

HISTORY:

v4 -> v5:
- add Peter's R-b on qemu_fdt_getprop patches + remove comment on
  error_fatal
- renamed inherit_properties* into copied_properties
- qemu_fdt_node_path now returns a char **: it supports the case where
  several nodes match the name/compat
- add_amd_xgbe_fdt_node now checks a single node is found

v3 -> v4:
- explicitly return 0 in node creation functions

v2 -> v3:
- added "device_tree: qemu_fdt_getprop_cell converted to use the error API"

v1 -> v2:
- take into account Peter's comments:
  - add CONFIG_LINUX protection
  - improve error handling and messages
  - no fixed size buffer anymore
  - fix read_fstree error handling
- use /proc/device-tree symlink instead of /sys/firmware/devicetree/base
- added hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check
- see individual commits for full details

RFC -> v1:
- no dependency anymore on dtc binary: load_device_tree_from_sysfs is
  self-contained and implements something similar to dtc read_fstree.
- take into account Alex' comments
- remove qemu_fdt_getprop_optional and use error API instead

Best Regards

Eric


Eric Auger (8):
  hw/vfio/platform: amd-xgbe device
  device_tree: introduce load_device_tree_from_sysfs
  device_tree: introduce qemu_fdt_node_path
  device_tree: qemu_fdt_getprop converted to use the error API
  device_tree: qemu_fdt_getprop_cell converted to use the error API
  hw/arm/sysbus-fdt: helpers for clock node generation
  hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation
  hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check

 device_tree.c   | 177 --
 hw/arm/boot.c   |   6 +-
 hw/arm/sysbus-fdt.c | 320 ++--
 hw/arm/vexpress.c   |   6 +-
 hw/vfio/Makefile.objs   |   1 +
 hw/vfio/amd-xgbe.c  |  55 +++
 include/hw/vfio/vfio-amd-xgbe.h |  51 +++
 include/sysemu/device_tree.h|  44 +-
 8 files changed, 631 insertions(+), 29 deletions(-)
 create mode 100644 hw/vfio/amd-xgbe.c
 create mode 100644 include/hw/vfio/vfio-amd-xgbe.h

-- 
1.9.1




[Qemu-devel] [PATCH v5 5/8] device_tree: qemu_fdt_getprop_cell converted to use the error API

2016-01-18 Thread Eric Auger
This patch aligns the prototype with qemu_fdt_getprop. The caller
can choose whether the function self-asserts on error (passing
_fatal as Error ** argument, corresponding to the legacy behavior),
or behaves differently such as simply output a message.

In this later case the caller can use the new lenp parameter to interpret
the error if any.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Crosthwaite 

---
v4 -> v5:
- Add Peter's R-b
- remove comment about error_fatal

v3 : creation
---
 device_tree.c| 21 ++---
 hw/arm/boot.c|  6 --
 hw/arm/vexpress.c|  6 --
 include/sysemu/device_tree.h | 14 +-
 4 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 45fd76d..5e56f8e 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -350,15 +350,22 @@ const void *qemu_fdt_getprop(void *fdt, const char 
*node_path,
 }
 
 uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
-   const char *property)
+   const char *property, int *lenp, Error **errp)
 {
 int len;
-const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, ,
- _fatal);
-if (len != 4) {
-error_report("%s: %s/%s not 4 bytes long (not a cell?)",
- __func__, node_path, property);
-exit(1);
+const uint32_t *p;
+
+if (!lenp) {
+lenp = 
+}
+p = qemu_fdt_getprop(fdt, node_path, property, lenp, errp);
+if (!p) {
+return 0;
+} else if (*lenp != 4) {
+error_setg(errp, "%s: %s/%s not 4 bytes long (not a cell?)",
+   __func__, node_path, property);
+*lenp = -EINVAL;
+return 0;
 }
 return be32_to_cpu(*p);
 }
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 75f69bf..541b74c 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -386,8 +386,10 @@ static int load_dtb(hwaddr addr, const struct 
arm_boot_info *binfo,
 return 0;
 }
 
-acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
-scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells",
+   NULL, _fatal);
+scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells",
+   NULL, _fatal);
 if (acells == 0 || scells == 0) {
 fprintf(stderr, "dtb file invalid (#address-cells or #size-cells 
0)\n");
 goto fail;
diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index ea9a984..f6e28dc 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -477,8 +477,10 @@ static void vexpress_modify_dtb(const struct arm_boot_info 
*info, void *fdt)
 uint32_t acells, scells, intc;
 const VEDBoardInfo *daughterboard = (const VEDBoardInfo *)info;
 
-acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
-scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells",
+   NULL, _fatal);
+scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells",
+   NULL, _fatal);
 intc = find_int_controller(fdt);
 if (!intc) {
 /* Not fatal, we just won't provide virtio. This will
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 123beb5..7897e54 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -58,8 +58,20 @@ int qemu_fdt_setprop_phandle(void *fdt, const char 
*node_path,
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
  const char *property, int *lenp,
  Error **errp);
+/**
+ * qemu_fdt_getprop_cell: retrieve the value of a given 4 byte property
+ * @fdt: pointer to the device tree blob
+ * @node_path: node path
+ * @property: name of the property to find
+ * @lenp: fdt error if any or -EINVAL if the property size is different from
+ *4 bytes, or 4 (expected length of the property) upon success.
+ * @errp: handle to an error object
+ *
+ * returns the property value on success
+ */
 uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
-   const char *property);
+   const char *property, int *lenp,
+   Error **errp);
 uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
 uint32_t qemu_fdt_alloc_phandle(void *fdt);
 int qemu_fdt_nop_node(void *fdt, const char *node_path);
-- 
1.9.1




[Qemu-devel] [PATCH v5 7/8] hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation

2016-01-18 Thread Eric Auger
This patch allows the instantiation of the vfio-amd-xgbe device
from the QEMU command line (-device vfio-amd-xgbe,host="").

The guest is exposed with a device tree node that combines the description
of both XGBE and PHY (representation supported from 4.2 onwards kernel):
Documentation/devicetree/bindings/net/amd-xgbe.txt.

There are 5 register regions, 6 interrupts including 4 optional
edge-sensitive per-channel interrupts.

Some property values are inherited from host device tree. Host device tree
must feature a combined XGBE/PHY representation (>= 4.2 host kernel).

2 clock nodes (dma and ptp) also are created. It is checked those clocks
are fixed on host side.

AMD XGBE node creation function has a dependency on vfio Linux header and
more generally node creation function for VFIO platform devices only make
sense with CONFIG_LINUX so let's protect this code with #ifdef CONFIG_LINUX.

Signed-off-by: Eric Auger 

---

v4 -> v5:
- use new proto for qemu_fdt_node_path, check there is only 1 node
  matching the name/compat.
- renamed inherit_properties*

v3 -> v4:
- add_amd_xgbe_fdt_node explicitly returns 0. Reword comment to emphasize
  the fact the function self-asserts in case of error
v1 -> v2:
- add CONFIG_LINUX protection
- improves robustness in sysfs_to_dt_name
- output messages to end-user on misc failures and self-exits:
  error management becomes more violent than before assuming if
  the end-user wants passthrough we must exit if the device cannot
  be instantiated
- fix misc style issues
- remove qemu_fdt_setprop returned value check since it self-asserts

RFC -> v1:
- use qemu_fdt_getprop with Error **
- free substrings in sysfs_to_dt_name
- add some comments related to endianess in add_amd_xgbe_fdt_node
- reword commit message (dtc binary dependency has disappeared)
- check the host device has 5 regions meaning this is a combined
  XGBE/PHY device
---
 hw/arm/sysbus-fdt.c | 195 ++--
 1 file changed, 189 insertions(+), 6 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index d85c9e6..bf6ec24 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -22,6 +22,10 @@
  */
 
 #include 
+#include "qemu-common.h"
+#ifdef CONFIG_LINUX
+#include 
+#endif
 #include "hw/arm/sysbus-fdt.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
@@ -29,6 +33,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/vfio/vfio-platform.h"
 #include "hw/vfio/vfio-calxeda-xgmac.h"
+#include "hw/vfio/vfio-amd-xgbe.h"
 #include "hw/arm/fdt.h"
 
 /*
@@ -64,6 +69,8 @@ typedef struct HostProperty {
 bool optional;
 } HostProperty;
 
+#ifdef CONFIG_LINUX
+
 /**
  * copy_properties_from_host
  *
@@ -126,12 +133,9 @@ static HostProperty clock_copied_properties[] = {
  * @host_phandle: phandle of the clock in host device tree
  * @guest_phandle: phandle to assign to the guest node
  */
-void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
- uint32_t host_phandle,
- uint32_t guest_phandle);
-void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
- uint32_t host_phandle,
- uint32_t guest_phandle)
+static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+uint32_t host_phandle,
+uint32_t guest_phandle)
 {
 char *node_path = NULL;
 char *nodename;
@@ -176,6 +180,28 @@ void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
 g_free(node_path);
 }
 
+/**
+ * sysfs_to_dt_name: convert the name found in sysfs into the node name
+ * for instance e090.xgmac is converted into xgmac@e090
+ * @sysfs_name: directory name in sysfs
+ *
+ * returns the device tree name upon success or NULL in case the sysfs name
+ * does not match the expected format
+ */
+static char *sysfs_to_dt_name(const char *sysfs_name)
+{
+gchar **substrings =  g_strsplit(sysfs_name, ".", 2);
+char *dt_name = NULL;
+
+if (!substrings || !substrings[1] || !substrings[0]) {
+goto out;
+}
+dt_name = g_strdup_printf("%s@%s", substrings[1], substrings[0]);
+out:
+g_strfreev(substrings);
+return dt_name;
+}
+
 /* Device Specific Code */
 
 /**
@@ -243,9 +269,166 @@ fail_reg:
 return ret;
 }
 
+
+/* AMD xgbe properties whose values are copied/pasted from host */
+static HostProperty amd_xgbe_copied_properties[] = {
+{"compatible", false},
+{"dma-coherent", true},
+{"amd,per-channel-interrupt", true},
+{"phy-mode", false},
+{"mac-address", true},
+{"amd,speed-set", false},
+{"amd,serdes-blwc", true},
+{"amd,serdes-cdr-rate", true},
+{"amd,serdes-pq-skew", true},
+{"amd,serdes-tx-amp", true},
+{"amd,serdes-dfe-tap-config", true},
+{"amd,serdes-dfe-tap-enable", true},
+{"clock-names", false},
+};
+
+/**
+ * add_amd_xgbe_fdt_node
+ *
+ * Generates the combined xgbe/phy node following 

[Qemu-devel] [PATCH v5 1/8] hw/vfio/platform: amd-xgbe device

2016-01-18 Thread Eric Auger
This patch introduces the amd-xgbe VFIO platform device. It
allows the guest to do passthrough on a device exposing an
"amd,xgbe-seattle-v1a" compat string.

Signed-off-by: Eric Auger 
Reviewed-by: Alex Bennée 

---
RFC -> v1:
- add Alex' R-b
---
 hw/vfio/Makefile.objs   |  1 +
 hw/vfio/amd-xgbe.c  | 55 +
 include/hw/vfio/vfio-amd-xgbe.h | 51 ++
 3 files changed, 107 insertions(+)
 create mode 100644 hw/vfio/amd-xgbe.c
 create mode 100644 include/hw/vfio/vfio-amd-xgbe.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index d324863..ceddbb8 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -3,4 +3,5 @@ obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o pci-quirks.o
 obj-$(CONFIG_SOFTMMU) += platform.o
 obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
+obj-$(CONFIG_SOFTMMU) += amd-xgbe.o
 endif
diff --git a/hw/vfio/amd-xgbe.c b/hw/vfio/amd-xgbe.c
new file mode 100644
index 000..53451eb
--- /dev/null
+++ b/hw/vfio/amd-xgbe.c
@@ -0,0 +1,55 @@
+/*
+ * AMD XGBE VFIO device
+ *
+ * Copyright Linaro Limited, 2015
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-amd-xgbe.h"
+
+static void amd_xgbe_realize(DeviceState *dev, Error **errp)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+VFIOAmdXgbeDeviceClass *k = VFIO_AMD_XGBE_DEVICE_GET_CLASS(dev);
+
+vdev->compat = g_strdup("amd,xgbe-seattle-v1a");
+
+k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_amd_xgbe_vmstate = {
+.name = TYPE_VFIO_AMD_XGBE,
+.unmigratable = 1,
+};
+
+static void vfio_amd_xgbe_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VFIOAmdXgbeDeviceClass *vcxc =
+VFIO_AMD_XGBE_DEVICE_CLASS(klass);
+vcxc->parent_realize = dc->realize;
+dc->realize = amd_xgbe_realize;
+dc->desc = "VFIO AMD XGBE";
+dc->vmsd = _platform_amd_xgbe_vmstate;
+}
+
+static const TypeInfo vfio_amd_xgbe_dev_info = {
+.name = TYPE_VFIO_AMD_XGBE,
+.parent = TYPE_VFIO_PLATFORM,
+.instance_size = sizeof(VFIOAmdXgbeDevice),
+.class_init = vfio_amd_xgbe_class_init,
+.class_size = sizeof(VFIOAmdXgbeDeviceClass),
+};
+
+static void register_amd_xgbe_dev_type(void)
+{
+type_register_static(_amd_xgbe_dev_info);
+}
+
+type_init(register_amd_xgbe_dev_type)
diff --git a/include/hw/vfio/vfio-amd-xgbe.h b/include/hw/vfio/vfio-amd-xgbe.h
new file mode 100644
index 000..9fff65e
--- /dev/null
+++ b/include/hw/vfio/vfio-amd-xgbe.h
@@ -0,0 +1,51 @@
+/*
+ * VFIO AMD XGBE device
+ *
+ * Copyright Linaro Limited, 2015
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_AMD_XGBE_H
+#define HW_VFIO_VFIO_AMD_XGBE_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_AMD_XGBE "vfio-amd-xgbe"
+
+/**
+ * This device exposes:
+ * - 5 MMIO regions: MAC, PCS, SerDes Rx/Tx regs,
+ SerDes Integration Registers 1/2 & 2/2
+ * - 2 level sensitive IRQs and optional DMA channel IRQs
+ */
+struct VFIOAmdXgbeDevice {
+VFIOPlatformDevice vdev;
+};
+
+typedef struct VFIOAmdXgbeDevice VFIOAmdXgbeDevice;
+
+struct VFIOAmdXgbeDeviceClass {
+/*< private >*/
+VFIOPlatformDeviceClass parent_class;
+/*< public >*/
+DeviceRealize parent_realize;
+};
+
+typedef struct VFIOAmdXgbeDeviceClass VFIOAmdXgbeDeviceClass;
+
+#define VFIO_AMD_XGBE_DEVICE(obj) \
+ OBJECT_CHECK(VFIOAmdXgbeDevice, (obj), TYPE_VFIO_AMD_XGBE)
+#define VFIO_AMD_XGBE_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOAmdXgbeDeviceClass, (klass), \
+TYPE_VFIO_AMD_XGBE)
+#define VFIO_AMD_XGBE_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOAmdXgbeDeviceClass, (obj), \
+  TYPE_VFIO_AMD_XGBE)
+
+#endif
-- 
1.9.1




[Qemu-devel] [PATCH v5 4/8] device_tree: qemu_fdt_getprop converted to use the error API

2016-01-18 Thread Eric Auger
Current qemu_fdt_getprop exits if the property is not found. It is
sometimes needed to read an optional property, in which case we do
not wish to exit but simply returns a null value.

This patch converts qemu_fdt_getprop to accept an Error **, and existing
users are converted to pass _fatal. This preserves the existing
behaviour. Then to use the API with your optional semantic a null
parameter can be conveyed.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Crosthwaite 

---
v4 -> v5:
- add Peter's R-b
- remove comment about error_fatal

v1 -> v2:
- add a doc comment in the header file

RFC -> v1:
- get rid of qemu_fdt_getprop_optional and implement Peter's suggestion
  that consists in using the error API

Signed-off-by: Eric Auger 
---
 device_tree.c| 11 ++-
 include/sysemu/device_tree.h | 13 -
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 3c88a37..45fd76d 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -333,18 +333,18 @@ int qemu_fdt_setprop_string(void *fdt, const char 
*node_path,
 }
 
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
- const char *property, int *lenp)
+ const char *property, int *lenp, Error **errp)
 {
 int len;
 const void *r;
+
 if (!lenp) {
 lenp = 
 }
 r = fdt_getprop(fdt, findnode_nofail(fdt, node_path), property, lenp);
 if (!r) {
-error_report("%s: Couldn't get %s/%s: %s", __func__,
- node_path, property, fdt_strerror(*lenp));
-exit(1);
+error_setg(errp, "%s: Couldn't get %s/%s: %s", __func__,
+  node_path, property, fdt_strerror(*lenp));
 }
 return r;
 }
@@ -353,7 +353,8 @@ uint32_t qemu_fdt_getprop_cell(void *fdt, const char 
*node_path,
const char *property)
 {
 int len;
-const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, );
+const uint32_t *p = qemu_fdt_getprop(fdt, node_path, property, ,
+ _fatal);
 if (len != 4) {
 error_report("%s: %s/%s not 4 bytes long (not a cell?)",
  __func__, node_path, property);
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 436b5dd..123beb5 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -45,8 +45,19 @@ int qemu_fdt_setprop_string(void *fdt, const char *node_path,
 int qemu_fdt_setprop_phandle(void *fdt, const char *node_path,
  const char *property,
  const char *target_node_path);
+/**
+ * qemu_fdt_getprop: retrieve the value of a given property
+ * @fdt: pointer to the device tree blob
+ * @node_path: node path
+ * @property: name of the property to find
+ * @lenp: fdt error if any or length of the property on success
+ * @errp: handle to an error object
+ *
+ * returns a pointer to the property on success and NULL on failure
+ */
 const void *qemu_fdt_getprop(void *fdt, const char *node_path,
- const char *property, int *lenp);
+ const char *property, int *lenp,
+ Error **errp);
 uint32_t qemu_fdt_getprop_cell(void *fdt, const char *node_path,
const char *property);
 uint32_t qemu_fdt_get_phandle(void *fdt, const char *path);
-- 
1.9.1




[Qemu-devel] [PATCH v5 6/8] hw/arm/sysbus-fdt: helpers for clock node generation

2016-01-18 Thread Eric Auger
Some passthrough'ed devices depend on clock nodes. Those need to be
generated in the guest device tree. This patch introduces some helpers
to build a clock node from information retrieved in the host device tree.

- copy_properties_from_host copies properties from a host device tree
  node to a guest device tree node
- fdt_build_clock_node builds a guest clock node and checks the host
  fellow clock is a fixed one.

fdt_build_clock_node will become static as soon as it gets used. A
dummy pre-declaration is needed for compilation of this patch.

Signed-off-by: Eric Auger 

---
v4 -> v5:
- renamed inherit_properties

v1 -> v2:
- inherit properties now outputs an error message in case
  qemu_fdt_getprop fails for an existing optional property
- no hardcoded fixed buffer length
- fdt_build_clock_node becomes void and auto-asserts on error
- use boolean values when defining the clock properties

RFC -> v1:
- use the new proto of qemu_fdt_getprop
- remove newline in error_report
- fix some style issues
---
 hw/arm/sysbus-fdt.c | 120 
 1 file changed, 120 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 9d28797..d85c9e6 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -21,6 +21,7 @@
  *
  */
 
+#include 
 #include "hw/arm/sysbus-fdt.h"
 #include "qemu/error-report.h"
 #include "sysemu/device_tree.h"
@@ -56,6 +57,125 @@ typedef struct NodeCreationPair {
 int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+/* helpers */
+
+typedef struct HostProperty {
+const char *name;
+bool optional;
+} HostProperty;
+
+/**
+ * copy_properties_from_host
+ *
+ * copies properties listed in an array from host device tree to
+ * guest device tree. If a non optional property is not found, the
+ * function self-asserts. An optional property is ignored if not found
+ * in the host device tree.
+ * @props: array of HostProperty to copy
+ * @nb_props: number of properties in the array
+ * @host_dt: host device tree blob
+ * @guest_dt: guest device tree blob
+ * @node_path: host dt node path where the property is supposed to be
+  found
+ * @nodename: guest node name the properties should be added to
+ */
+static void copy_properties_from_host(HostProperty *props, int nb_props,
+  void *host_fdt, void *guest_fdt,
+  char *node_path, char *nodename)
+{
+int i, prop_len;
+const void *r;
+Error *err = NULL;
+
+for (i = 0; i < nb_props; i++) {
+r = qemu_fdt_getprop(host_fdt, node_path,
+ props[i].name,
+ _len,
+ props[i].optional ?  : _fatal);
+if (r) {
+qemu_fdt_setprop(guest_fdt, nodename,
+ props[i].name, r, prop_len);
+} else {
+if (prop_len != -FDT_ERR_NOTFOUND) {
+/* optional property not returned although property exists */
+error_report_err(err);
+} else {
+error_free(err);
+}
+}
+}
+}
+
+/* clock properties whose values are copied/pasted from host */
+static HostProperty clock_copied_properties[] = {
+{"compatible", false},
+{"#clock-cells", false},
+{"clock-frequency", true},
+{"clock-output-names", true},
+};
+
+/**
+ * fdt_build_clock_node
+ *
+ * Build a guest clock node, used as a dependency from a passthrough'ed
+ * device. Most information are retrieved from the host clock node.
+ * Also check the host clock is a fixed one.
+ *
+ * @host_fdt: host device tree blob from which info are retrieved
+ * @guest_fdt: guest device tree blob where the clock node is added
+ * @host_phandle: phandle of the clock in host device tree
+ * @guest_phandle: phandle to assign to the guest node
+ */
+void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+ uint32_t host_phandle,
+ uint32_t guest_phandle);
+void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
+ uint32_t host_phandle,
+ uint32_t guest_phandle)
+{
+char *node_path = NULL;
+char *nodename;
+const void *r;
+int ret, node_offset, prop_len, path_len = 16;
+
+node_offset = fdt_node_offset_by_phandle(host_fdt, host_phandle);
+if (node_offset <= 0) {
+error_setg(_fatal,
+   "not able to locate clock handle %d in host device tree",
+   host_phandle);
+}
+node_path = g_malloc(path_len);
+while ((ret = fdt_get_path(host_fdt, node_offset, node_path, path_len))
+== -FDT_ERR_NOSPACE) {
+path_len += 16;
+node_path = g_realloc(node_path, path_len);
+}
+if (ret < 0) {
+error_setg(_fatal,
+   "not able to retrieve node path for clock handle %d",
+  

[Qemu-devel] [PATCH v5 8/8] hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check

2016-01-18 Thread Eric Auger
qemu_fdt_setprop self-asserts in case of error hence no need to check
the returned value.

Signed-off-by: Eric Auger 

---

v3 -> v4: fix returned value
---
 hw/arm/sysbus-fdt.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index bf6ec24..d179214 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -216,7 +216,7 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 PlatformBusDevice *pbus = data->pbus;
 void *fdt = data->fdt;
 const char *parent_node = data->pbus_node_name;
-int compat_str_len, i, ret = -1;
+int compat_str_len, i;
 char *nodename;
 uint32_t *irq_attr, *reg_attr;
 uint64_t mmio_base, irq_number;
@@ -241,12 +241,8 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 reg_attr[2 * i + 1] = cpu_to_be32(
 memory_region_size(>regions[i]->mem));
 }
-ret = qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
-   vbasedev->num_regions * 2 * sizeof(uint32_t));
-if (ret) {
-error_report("could not set reg property of node %s", nodename);
-goto fail_reg;
-}
+qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
+ vbasedev->num_regions * 2 * sizeof(uint32_t));
 
 irq_attr = g_new(uint32_t, vbasedev->num_irqs * 3);
 for (i = 0; i < vbasedev->num_irqs; i++) {
@@ -256,17 +252,12 @@ static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice 
*sbdev, void *opaque)
 irq_attr[3 * i + 1] = cpu_to_be32(irq_number);
 irq_attr[3 * i + 2] = cpu_to_be32(GIC_FDT_IRQ_FLAGS_LEVEL_HI);
 }
-ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+qemu_fdt_setprop(fdt, nodename, "interrupts",
  irq_attr, vbasedev->num_irqs * 3 * sizeof(uint32_t));
-if (ret) {
-error_report("could not set interrupts property of node %s",
- nodename);
-}
 g_free(irq_attr);
-fail_reg:
 g_free(reg_attr);
 g_free(nodename);
-return ret;
+return 0;
 }
 
 
-- 
1.9.1




Re: [Qemu-devel] [PATCH] add MAINTAINERS entry for qemu socket code

2016-01-18 Thread Daniel P. Berrange
On Mon, Jan 18, 2016 at 04:03:23PM +0100, Gerd Hoffmann wrote:
> Cc: Daniel P. Berrange 
> Cc: Paolo Bonzini 
> Signed-off-by: Gerd Hoffmann 
> ---
>  MAINTAINERS | 8 
>  1 file changed, 8 insertions(+)

Reviewed-by: Daniel P. Berrange 

> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8f44dca..bc753fc 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1260,6 +1260,14 @@ F: io/
>  F: include/io/
>  F: tests/test-io-*
>  
> +Sockets
> +M: Daniel P. Berrange 
> +M: Gerd Hoffmann 
> +M: Paolo Bonzini 
> +S: Maintained
> +F: include/qemu/sockets.h
> +F: util/qemu-sockets.c
> +


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH v5 3/8] device_tree: introduce qemu_fdt_node_path

2016-01-18 Thread Eric Auger
This new helper routine returns a NULL terminated array of
node paths matching a node name and a compat string.

Signed-off-by: Eric Auger 

---

v4 -> v5:
- support the case where several nodes exist, ie.
  return an array of node paths. Also add Error **
  parameter

v1 -> v2:
- move doc comment in header file
- do not use a fixed size buffer
- break on errors in while loop
- use strcmp instead of strncmp

RFC -> v1:
- improve error handling according to Alex' comments
---
 device_tree.c| 49 
 include/sysemu/device_tree.h | 14 +
 2 files changed, 63 insertions(+)

diff --git a/device_tree.c b/device_tree.c
index b262c2d..3c88a37 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -231,6 +231,55 @@ static int findnode_nofail(void *fdt, const char 
*node_path)
 return offset;
 }
 
+char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
+  Error **errp)
+{
+int offset, len, ret;
+const char *iter_name;
+unsigned int path_len = 16, n = 0;
+GSList *path_list = NULL, *iter;
+char **path_array;
+
+offset = fdt_node_offset_by_compatible(fdt, -1, compat);
+
+while (offset >= 0) {
+iter_name = fdt_get_name(fdt, offset, );
+if (!iter_name) {
+offset = len;
+break;
+}
+if (!strcmp(iter_name, name)) {
+char *path;
+
+path = g_malloc(path_len);
+while ((ret = fdt_get_path(fdt, offset, path, path_len))
+  == -FDT_ERR_NOSPACE) {
+path_len += 16;
+path = g_realloc(path, path_len);
+}
+path_list = g_slist_prepend(path_list, path);
+n++;
+}
+offset = fdt_node_offset_by_compatible(fdt, offset, compat);
+}
+
+if (offset < 0 && offset != -FDT_ERR_NOTFOUND) {
+error_setg(errp, "%s: abort parsing dt for %s/%s: %s",
+   __func__, name, compat, fdt_strerror(offset));
+}
+
+path_array = g_new(char *, n + 1);
+path_array[n--] = NULL;
+
+for (iter = path_list; iter; iter = iter->next) {
+path_array[n--] = iter->data;
+}
+
+g_slist_free(path_list);
+
+return path_array;
+}
+
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size)
 {
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index fdf25a4..436b5dd 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -20,6 +20,20 @@ void *load_device_tree(const char *filename_path, int 
*sizep);
 void *load_device_tree_from_sysfs(void);
 #endif
 
+/**
+ * qemu_fdt_node_path: return the paths of nodes matching a given
+ * name and compat string
+ * @fdt: pointer to the dt blob
+ * @name: node name
+ * @compat: compatibility string
+ * @errp: handle to an error object
+ *
+ * returns a newly allocated NULL-terminated array of node paths.
+ * Use g_strfreev() to free it.
+ */
+char **qemu_fdt_node_path(void *fdt, const char *name, char *compat,
+  Error **errp);
+
 int qemu_fdt_setprop(void *fdt, const char *node_path,
  const char *property, const void *val, int size);
 int qemu_fdt_setprop_cell(void *fdt, const char *node_path,
-- 
1.9.1




[Qemu-devel] [V4 0/4] AMD IO MMU

2016-01-18 Thread David Kiarie
David Kiarie (4):
  hw/i386: Introduce AMD IO MMU
  hw/core: Add AMD IO MMU to machine properties
  hw/i386: ACPI table for AMD IO MMU
  hw/pci-host: Emulate AMD IO MMU

 hw/core/machine.c |   17 +-
 hw/i386/Makefile.objs |1 +
 hw/i386/acpi-build.c  |   70 ++
 hw/i386/amd_iommu.c   | 1409 +
 hw/i386/amd_iommu.h   |  399 
 hw/pci-host/piix.c|   11 +
 hw/pci-host/q35.c |   14 +-
 include/hw/acpi/acpi-defs.h   |   55 ++
 include/hw/boards.h   |3 +-
 include/hw/i386/intel_iommu.h |1 +
 include/hw/pci/pci.h  |2 +
 qemu-options.hx   |6 +-
 util/qemu-config.c|4 +-
 vl.c  |8 +
 14 files changed, 1984 insertions(+), 16 deletions(-)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

Hi all, 

V4 of IO MMU patches.
Changes since V3
 -Fixed Marcel's comments
 -byte swapping in ACPI code fixed


As for IO MMU MMIO region:

This is the code that sets up the IO MMU base address in coreboot. It seems to 
be reading something from the BUS config region which as per the comment should 
have a value written by BIOS.

case CB_AmdSetMidPostConfig:
nbConfigPtr->pNbConfig->IoApicBaseAddress = 
IO_APIC_ADDR;
#ifndef IOMMU_SUPPORT_DISABLE //TODO enable iommu
/* SBIOS must alloc 16K memory for IOMMU MMIO */
UINT32  MmcfgBarAddress; //using default 
IOmmuBaseAddress
LibNbPciRead(nbConfigPtr->NbPciAddress.AddressValue | 
0x1C,
AccessWidth32,
,
nbConfigPtr);
MmcfgBarAddress &= ~0xf;
if (MmcfgBarAddress != 0) {
nbConfigPtr->IommuBaseAddress = MmcfgBarAddress;
}
nbConfigPtr->IommuBaseAddress = 0; //disable iommu
#endif

I have a feeling that this is getting overly and unnecessary complex - AMD have 
their own BIOS which they, only know what it does and we have ours( which of 
course, we know how it behaves).

If we choose a static address and assign that to IO MMU mmio we could 
hypothetically have two problems.
  -SeaBIOS allocating BAR from the same region.
  -Someone selecting the region for other devices such as HPET.

The first problem can be solved as we know from what addresses seaBIOS allocats 
BARs while as for the second they should know better. I have therefore selected 
an unused IO region just next IOAPIC and HPET region and mapped 16K for IO MMU 
mmio.

David.

-- 
2.1.4




[Qemu-devel] [V4 2/4] hw/core: Add AMD IO MMU to machine properties

2016-01-18 Thread David Kiarie
Add IO MMU as a string to machine properties which
is used to control whether and the type of IO MMU
to emulate

Signed-off-by: David Kiarie 
---
 hw/core/machine.c | 17 +
 include/hw/boards.h   |  3 ++-
 include/hw/i386/intel_iommu.h |  1 +
 qemu-options.hx   |  6 +++---
 util/qemu-config.c|  4 ++--
 vl.c  |  8 
 6 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index c46ddc7..cb309aa 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -283,18 +283,19 @@ static void machine_set_firmware(Object *obj, const char 
*value, Error **errp)
 ms->firmware = g_strdup(value);
 }
 
-static bool machine_get_iommu(Object *obj, Error **errp)
+static char *machine_get_iommu(Object *obj, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
 
-return ms->iommu;
+return g_strdup(ms->iommu);
 }
 
-static void machine_set_iommu(Object *obj, bool value, Error **errp)
+static void machine_set_iommu(Object *obj, const char *value, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
 
-ms->iommu = value;
+g_free(ms->iommu);
+ms->iommu = g_strdup(value);
 }
 
 static void machine_set_suppress_vmdesc(Object *obj, bool value, Error **errp)
@@ -454,11 +455,10 @@ static void machine_initfn(Object *obj)
 object_property_set_description(obj, "firmware",
 "Firmware image",
 NULL);
-object_property_add_bool(obj, "iommu",
- machine_get_iommu,
- machine_set_iommu, NULL);
+object_property_add_str(obj, "iommu",
+machine_get_iommu, machine_set_iommu, NULL);
 object_property_set_description(obj, "iommu",
-"Set on/off to enable/disable Intel IOMMU 
(VT-d)",
+"IOMMU list",
 NULL);
 object_property_add_bool(obj, "suppress-vmdesc",
  machine_get_suppress_vmdesc,
@@ -484,6 +484,7 @@ static void machine_finalize(Object *obj)
 g_free(ms->dumpdtb);
 g_free(ms->dt_compatible);
 g_free(ms->firmware);
+g_free(ms->iommu);
 }
 
 bool machine_usb(MachineState *machine)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 0f30959..b119245 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -36,6 +36,7 @@ bool machine_usb(MachineState *machine);
 bool machine_kernel_irqchip_allowed(MachineState *machine);
 bool machine_kernel_irqchip_required(MachineState *machine);
 bool machine_kernel_irqchip_split(MachineState *machine);
+bool machine_amd_iommu(MachineState *machine);
 int machine_kvm_shadow_mem(MachineState *machine);
 int machine_phandle_start(MachineState *machine);
 bool machine_dump_guest_core(MachineState *machine);
@@ -126,7 +127,7 @@ struct MachineState {
 bool usb_disabled;
 bool igd_gfx_passthru;
 char *firmware;
-bool iommu;
+char *iommu;
 bool suppress_vmdesc;
 
 ram_addr_t ram_size;
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 5dbadb7..0b32bd6 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -27,6 +27,7 @@
 #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu"
 #define INTEL_IOMMU_DEVICE(obj) \
  OBJECT_CHECK(IntelIOMMUState, (obj), TYPE_INTEL_IOMMU_DEVICE)
+#define INTEL_IOMMU_STR "intel"
 
 /* DMAR Hardware Unit Definition address (IOMMU unit) */
 #define Q35_HOST_BRIDGE_IOMMU_ADDR  0xfed9ULL
diff --git a/qemu-options.hx b/qemu-options.hx
index 215d00d..ac327c8 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -38,7 +38,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "kvm_shadow_mem=size of KVM shadow MMU\n"
 "dump-guest-core=on|off include guest memory in a core 
dump (default=on)\n"
 "mem-merge=on|off controls memory merge support (default: 
on)\n"
-"iommu=on|off controls emulated Intel IOMMU (VT-d) support 
(default=off)\n"
+"iommu=amd|intel enables and selects the emulated IO MMU 
(default: off)\n"
 "igd-passthru=on|off controls IGD GFX passthrough support 
(default=off)\n"
 "aes-key-wrap=on|off controls support for AES key wrapping 
(default=on)\n"
 "dea-key-wrap=on|off controls support for DEA key wrapping 
(default=on)\n"
@@ -72,8 +72,8 @@ Include guest memory in a core dump. The default is on.
 Enables or disables memory merge support. This feature, when supported by
 the host, de-duplicates identical memory pages among VMs instances
 (enabled by default).
-@item iommu=on|off
-Enables or disables emulated Intel IOMMU (VT-d) support. The default is off.
+@item iommu=intel|amd
+Enables and selects the emulated IO MMU. The default is 

[Qemu-devel] [PATCH] hw/pci: ensure that only PCI/PCIe bridges can be attached to pxb/pxb-pcie devices

2016-01-18 Thread Marcel Apfelbaum
PCI devices can't be plugged directly into PCI extra root bridges
because their resources can't be computed by firmware before the ACPI
tables are loaded.

Signed-off-by: Marcel Apfelbaum 
---

Hi,

This patch follows the discussion:
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01484.html

Thanks,
Marcel

 hw/pci/pci.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 168b9cc..584f504 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -850,6 +850,13 @@ static PCIDevice *do_pci_register_device(PCIDevice 
*pci_dev, PCIBus *bus,
 DeviceState *dev = DEVICE(pci_dev);
 
 pci_dev->bus = bus;
+/* Only pci bridges can be attached to extra PCI root buses */
+if (pci_bus_is_root(bus) && bus->parent_dev && !pc->is_bridge) {
+error_setg(errp,
+   "PCI: Only PCI/PCIe bridges can be plugged into %s",
+bus->parent_dev->name);
+return NULL;
+}
 
 if (devfn < 0) {
 for(devfn = bus->devfn_min ; devfn < ARRAY_SIZE(bus->devices);
-- 
2.4.3




[Qemu-devel] [V4 3/4] hw/i386: ACPI table for AMD IO MMU

2016-01-18 Thread David Kiarie
Add IVRS table for AMD IO MMU.

Signed-off-by: David Kiarie 
---
 hw/i386/acpi-build.c| 70 +
 include/hw/acpi/acpi-defs.h | 55 +++
 2 files changed, 125 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 78758e2..5c0d6b7 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -52,6 +52,7 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci-host/q35.h"
 #include "hw/i386/intel_iommu.h"
+#include "hw/i386/amd_iommu.h"
 #include "hw/timer/hpet.h"
 
 #include "hw/acpi/aml-build.h"
@@ -2424,6 +2425,70 @@ build_dmar_q35(GArray *table_data, GArray *linker)
 }
 
 static void
+build_amd_iommu(GArray *table_data, GArray *linker)
+{
+int iommu_start = table_data->len;
+bool iommu_ambig;
+
+AcpiAMDIOMMUIVRS *ivrs;
+AcpiAMDIOMMUHardwareUnit *iommu;
+
+/* IVRS definition */
+ivrs = acpi_data_push(table_data, sizeof(*ivrs));
+ivrs->revision = cpu_to_le16(ACPI_IOMMU_IVRS_TYPE);
+ivrs->length = cpu_to_le16((sizeof(*ivrs) + sizeof(*iommu)));
+ivrs->v_common_info = cpu_to_le64(AMD_IOMMU_HOST_ADDRESS_WIDTH << 8);
+
+AMDIOMMUState *s = (AMDIOMMUState *)object_resolve_path_type("",
+TYPE_AMD_IOMMU_DEVICE, _ambig);
+
+/* IVDB definition */
+iommu = acpi_data_push(table_data, sizeof(*iommu));
+if (!iommu_ambig) {
+iommu->type = cpu_to_le16(0x10);
+/* IVHD flags */
+iommu->flags = cpu_to_le16(iommu->flags);
+iommu->flags = cpu_to_le16(IVHD_HT_TUNEN | IVHD_PPRSUP | IVHD_IOTLBSUP
+   | IVHD_PREFSUP);
+iommu->length = cpu_to_le16(sizeof(*iommu));
+iommu->device_id = cpu_to_le16(PCI_DEVICE_ID_RD890_IOMMU);
+iommu->capability_offset = cpu_to_le16(s->capab_offset);
+iommu->mmio_base = cpu_to_le64(s->mmio.addr);
+iommu->pci_segment = 0;
+iommu->interrupt_info = 0;
+/* EFR features */
+iommu->efr_register = cpu_to_le64(IVHD_EFR_GTSUP | IVHD_EFR_HATS
+  | IVHD_EFR_GATS);
+iommu->efr_register = cpu_to_le64(iommu->efr_register);
+/* device entries */
+memset(iommu->dev_entries, 0, 20);
+/* Add device flags here
+ *  create entries for devices to be treated specially by IO MMU,
+ *  currently we report all devices to IO MMU with no special flags
+ *  DTE settings made here apply to all devices
+ *  Refer to AMD IOMMU spec Table 97
+ */
+iommu->dev_entries[12] = 3;
+iommu->dev_entries[16] = 4;
+iommu->dev_entries[17] = 0xff;
+iommu->dev_entries[18] = 0xff;
+}
+
+build_header(linker, table_data, (void *)(table_data->data + iommu_start),
+ "IVRS", table_data->len - iommu_start, 1, NULL);
+}
+
+static bool acpi_has_amd_iommu(void)
+{
+bool ambiguous;
+Object *amd_iommu;
+
+amd_iommu = object_resolve_path_type("", TYPE_AMD_IOMMU_DEVICE,
+ );
+return amd_iommu && !ambiguous;
+}
+
+static void
 build_dsdt(GArray *table_data, GArray *linker,
AcpiPmInfo *pm, AcpiMiscInfo *misc)
 {
@@ -2691,6 +2756,11 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables 
*tables)
 build_dmar_q35(tables_blob, tables->linker);
 }
 
+if (acpi_has_amd_iommu() && !acpi_has_iommu()) {
+acpi_add_table(table_offsets, tables_blob);
+build_amd_iommu(tables_blob, tables->linker);
+}
+
 if (acpi_has_nvdimm()) {
 nvdimm_build_acpi(table_offsets, tables_blob, tables->linker);
 }
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index c7a03d4..a161358 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -570,4 +570,59 @@ typedef struct AcpiDmarHardwareUnit AcpiDmarHardwareUnit;
 /* Masks for Flags field above */
 #define ACPI_DMAR_INCLUDE_PCI_ALL   1
 
+/* IVRS constants */
+#define ACPI_IOMMU_HARDWAREUNIT_TYPE 0x10
+#define ACPI_IOMMU_IVRS_TYPE 0x1
+#define AMD_IOMMU_HOST_ADDRESS_WIDTH 39UL
+
+/* AMD IOMMU IVRS table */
+struct AcpiAMDIOMMUIVRS {
+ACPI_TABLE_HEADER_DEF
+uint32_t v_common_info; /* common virtualization information */
+uint64_t reserved;  /* reserved  */
+} QEMU_PACKED;
+typedef struct AcpiAMDIOMMUIVRS AcpiAMDIOMMUIVRS;
+
+/* flags in the IVHD headers */
+#define IVHD_HT_TUNEN(1UL << 0)
+#define IVHD_PASS_PW (1UL << 1)
+#define IVHD_RESPASS_PW  (1UL << 2)
+#define IVHD_ISOC(1UL << 3)
+#define IVHD_IOTLBSUP(1UL << 4)
+#define IVHD_COHERENT(1UL << 5)
+#define IVHD_PREFSUP (1UL << 6)
+#define IVHD_PPRSUP  (1UL << 7)
+
+/* features in the IVHD headers */
+#define IVHD_EFR_HATS   48
+#define IVHD_EFR_GATS   48
+#define IVHD_EFR_MSI_NUM
+#define IVHD_EFR_PNBANKS
+#define IVHD_EFR_PNCOUNTERS
+#define IVHD_EFR_PASMAX
+#define IVHD_EFR_HESUP  (1UL 

[Qemu-devel] [V4 1/4] hw/i386: Introduce AMD IO MMU

2016-01-18 Thread David Kiarie
Add AMD IO MMU emulation to Qemu in addition to Intel IO MMU.
The IO MMU does basic translation, error checking and has a
minimal IOTLB implementation.

Signed-off-by: David Kiarie 
---
 hw/i386/Makefile.objs |1 +
 hw/i386/amd_iommu.c   | 1409 +
 hw/i386/amd_iommu.h   |  399 ++
 include/hw/pci/pci.h  |2 +
 4 files changed, 1811 insertions(+)
 create mode 100644 hw/i386/amd_iommu.c
 create mode 100644 hw/i386/amd_iommu.h

diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index b52d5b8..2f1a265 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -3,6 +3,7 @@ obj-y += multiboot.o
 obj-y += pc.o pc_piix.o pc_q35.o
 obj-y += pc_sysfw.o
 obj-y += intel_iommu.o
+obj-y += amd_iommu.o
 obj-$(CONFIG_XEN) += ../xenpv/ xen/
 
 obj-y += kvmvapic.o
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
new file mode 100644
index 000..20111fe
--- /dev/null
+++ b/hw/i386/amd_iommu.c
@@ -0,0 +1,1409 @@
+/*
+ * QEMU emulation of AMD IOMMU (AMD-Vi)
+ *
+ * Copyright (C) 2011 Eduard - Gabriel Munteanu
+ * Copyright (C) 2015 David Kiarie, 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see .
+ *
+ * Cache implementation inspired by hw/i386/intel_iommu.c
+ *
+ */
+#include "hw/i386/amd_iommu.h"
+
+//#define DEBUG_AMD_IOMMU
+#ifdef DEBUG_AMD_IOMMU
+enum {
+DEBUG_GENERAL, DEBUG_CAPAB, DEBUG_MMIO, DEBUG_ELOG,
+DEBUG_CACHE, DEBUG_COMMAND, DEBUG_MMU
+};
+
+#define IOMMU_DBGBIT(x)   (1 << DEBUG_##x)
+static int iommu_dbgflags = IOMMU_DBGBIT(MMIO);
+
+#define IOMMU_DPRINTF(what, fmt, ...) do { \
+if (iommu_dbgflags & IOMMU_DBGBIT(what)) { \
+fprintf(stderr, "(amd-iommu)%s: " fmt "\n", __func__, \
+## __VA_ARGS__); } \
+} while (0)
+#else
+#define IOMMU_DPRINTF(what, fmt, ...) do {} while (0)
+#endif
+
+/* configure MMIO registers at startup/reset */
+static void amd_iommu_set_quad(AMDIOMMUState *s, hwaddr addr, uint64_t val,
+   uint64_t romask, uint64_t w1cmask)
+{
+stq_le_p(>mmior[addr], val);
+stq_le_p(>romask[addr], romask);
+stq_le_p(>w1cmask[addr], w1cmask);
+}
+
+static uint16_t amd_iommu_readw(AMDIOMMUState *s, hwaddr addr)
+{
+return lduw_le_p(>mmior[addr]);
+}
+
+static uint32_t amd_iommu_readl(AMDIOMMUState *s, hwaddr addr)
+{
+return ldl_le_p(>mmior[addr]);
+}
+
+static uint64_t amd_iommu_readq(AMDIOMMUState *s, hwaddr addr)
+{
+return ldq_le_p(>mmior[addr]);
+}
+
+/* internal write */
+static void amd_iommu_writeq_raw(AMDIOMMUState *s, uint64_t val, hwaddr addr)
+{
+stq_le_p(>mmior[addr], val);
+}
+
+/* external write */
+static void amd_iommu_writew(AMDIOMMUState *s, hwaddr addr, uint16_t val)
+{
+uint16_t romask = lduw_le_p(>romask[addr]);
+uint16_t w1cmask = lduw_le_p(>w1cmask[addr]);
+uint16_t oldval = lduw_le_p(>mmior[addr]);
+stw_le_p(>mmior[addr], (val & ~(val & w1cmask)) | (romask & oldval));
+}
+
+static void amd_iommu_writel(AMDIOMMUState *s, hwaddr addr, uint32_t val)
+{
+uint32_t romask = ldl_le_p(>romask[addr]);
+uint32_t w1cmask = ldl_le_p(>w1cmask[addr]);
+uint32_t oldval = ldl_le_p(>mmior[addr]);
+stl_le_p(>mmior[addr], (val & ~(val & w1cmask)) | (romask & oldval));
+}
+
+static void amd_iommu_writeq(AMDIOMMUState *s, hwaddr addr, uint64_t val)
+{
+uint64_t romask = ldq_le_p(>romask[addr]);
+uint64_t w1cmask = ldq_le_p(>w1cmask[addr]);
+uint32_t oldval = ldq_le_p(>mmior[addr]);
+stq_le_p(>mmior[addr], (val & ~(val & w1cmask)) | (romask & oldval));
+}
+
+static void amd_iommu_log_event(AMDIOMMUState *s, uint16_t *evt)
+{
+/* event logging not enabled */
+if (!s->evtlog_enabled || *(uint64_t *)>mmior[MMIO_STATUS]
+   | MMIO_STATUS_EVT_OVF) {
+return;
+}
+
+/* event log buffer full */
+if (s->evtlog_tail >= s->evtlog_len) {
+*(uint64_t *)>mmior[MMIO_STATUS] |= MMIO_STATUS_EVT_OVF;
+/* generate interrupt */
+}
+
+if (dma_memory_write(_space_memory, s->evtlog_len + s->evtlog_tail,
+   , EVENT_LEN)) {
+IOMMU_DPRINTF(ELOG, "error: fail to write at address 0x%"PRIx64
+  " + offset 0x%"PRIx32, s->evtlog, s->evtlog_tail);
+}
+
+ s->evtlog_tail += EVENT_LEN;
+ *(uint64_t *)>mmior[MMIO_STATUS] |= MMIO_STATUS_COMP_INT;
+}
+
+/* log an error 

[Qemu-devel] [V4 4/4] hw/pci-host: Emulate AMD IO MMU

2016-01-18 Thread David Kiarie
Support AMD IO MMU emulation in q35 and piix chipsets

Signed-off-by: David Kiarie 
---
 hw/pci-host/piix.c | 11 +++
 hw/pci-host/q35.c  | 14 --
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index b0d7e31..3ba245d 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -35,6 +35,7 @@
 #include "hw/i386/ioapic.h"
 #include "qapi/visitor.h"
 #include "qemu/error-report.h"
+#include "hw/i386/amd_iommu.h"
 
 /*
  * I440FX chipset data sheet.
@@ -297,6 +298,16 @@ static void i440fx_pcihost_realize(DeviceState *dev, Error 
**errp)
 
 sysbus_add_io(sbd, 0xcfc, >data_mem);
 sysbus_init_ioports(sbd, 0xcfc, 4);
+
+/* AMD IOMMU (AMD-Vi) */
+if (g_strcmp0(object_property_get_str(qdev_get_machine(), "iommu", NULL),
+  "amd") == 0) {
+AMDIOMMUState *iommu_state;
+PCIDevice *iommu;
+iommu = pci_create_simple(s->bus, 0x20, TYPE_AMD_IOMMU_DEVICE);
+iommu_state = AMD_IOMMU_DEVICE(iommu);
+pci_setup_iommu(s->bus, bridge_host_amd_iommu, iommu_state);
+}
 }
 
 static void i440fx_realize(PCIDevice *dev, Error **errp)
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 1fb4707..0c60e3c 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -30,6 +30,7 @@
 #include "hw/hw.h"
 #include "hw/pci-host/q35.h"
 #include "qapi/visitor.h"
+#include "hw/i386/amd_iommu.h"
 
 /
  * Q35 host
@@ -505,9 +506,18 @@ static void mch_realize(PCIDevice *d, Error **errp)
  mch->pci_address_space, >pam_regions[i+1],
  PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
 }
-/* Intel IOMMU (VT-d) */
-if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
+
+if (g_strcmp0(MACHINE(qdev_get_machine())->iommu, INTEL_IOMMU_STR) == 0) {
+/* Intel IOMMU (VT-d) */
 mch_init_dmar(mch);
+} else if (g_strcmp0(MACHINE(qdev_get_machine())->iommu, AMD_IOMMU_STR)
+   == 0) {
+AMDIOMMUState *iommu_state;
+PCIDevice *iommu;
+PCIBus *bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
+iommu = pci_create_simple(bus, 0x20, TYPE_AMD_IOMMU_DEVICE);
+iommu_state = AMD_IOMMU_DEVICE(iommu);
+pci_setup_iommu(bus, bridge_host_amd_iommu, iommu_state);
 }
 }
 
-- 
2.1.4




Re: [Qemu-devel] [PATCH 1/8] migration: split hmp_savevm to migrate_savevm and hmp_savevm wrapper

2016-01-18 Thread Markus Armbruster
"Denis V. Lunev"  writes:

> This would be useful in the next step when QMP version of this call will
> be introduced. The patch also moves snapshot name generation to the
> hmp specific code as QMP version of this code will require the name
> on the protocol level.
>
> Addition of migration_savevm to migration/migration.h is temporary. It
> will be removed in the next patch with QMP level change.
>
> Signed-off-by: Denis V. Lunev 
> CC: Juan Quintela 
> CC: Eric Blake 
> CC: Amit Shah 
> CC: Markus Armbruster 
> ---
>  hmp.c | 26 
>  include/migration/migration.h |  3 +++
>  migration/savevm.c| 46 
> +--
>  3 files changed, 47 insertions(+), 28 deletions(-)
>
> diff --git a/hmp.c b/hmp.c
> index c2b2c16..e7bab75 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -18,6 +18,7 @@
>  #include "net/eth.h"
>  #include "sysemu/char.h"
>  #include "sysemu/block-backend.h"
> +#include "sysemu/sysemu.h"
>  #include "qemu/option.h"
>  #include "qemu/timer.h"
>  #include "qmp-commands.h"
> @@ -32,6 +33,7 @@
>  #include "ui/console.h"
>  #include "block/qapi.h"
>  #include "qemu-io.h"
> +#include "migration/migration.h"
>  
>  #ifdef CONFIG_SPICE
>  #include 
> @@ -2386,3 +2388,27 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const 
> QDict *qdict)
>  
>  qapi_free_RockerOfDpaGroupList(list);
>  }
> +
> +void hmp_savevm(Monitor *mon, const QDict *qdict)
> +{
> +Error *local_err = NULL;
> +const char *name = qdict_get_try_str(qdict, "name");
> +char name_buf[64];
> +
> +if (name == NULL) {

I'd prefer !name.

> +qemu_timeval tv;
> +struct tm tm;
> +
> +/* cast below needed for OpenBSD where tv_sec is still 'long' */
> +localtime_r((const time_t *)_sec, );

I'm afraid this uses tv.tv_sec uninitialized.  Not sure how this
survived testing :)

Aside: the ugly cast comes from commit d7d9b52.  It works when the
system conforms to POSIX (tv_sec is time_t), or when tv_sec's type is
essentially the same as time_t (supposedly the case in old OpenBSD).  As
far as I can tell, OpenBSD was fixed in 2013.

> +strftime(name_buf, sizeof(name_buf), "vm-%Y%m%d%H%M%S", );
> +
> +name = name_buf;
> +}
> +
> +migrate_savevm(name, _err);
> +
> +if (local_err != NULL) {
> +error_report_err(local_err);
> +}
> +}
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index d9494b8..73c8bb1 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -277,6 +277,8 @@ int migrate_compress_threads(void);
>  int migrate_decompress_threads(void);
>  bool migrate_use_events(void);
>  
> +void migrate_savevm(const char *name, Error **errp);
> +
>  /* Sending on the return path - generic and then for each message type */
>  void migrate_send_rp_message(MigrationIncomingState *mis,
>   enum mig_rp_message_type message_type,
> @@ -321,4 +323,5 @@ int ram_save_queue_pages(MigrationState *ms, const char 
> *rbname,
>  PostcopyState postcopy_state_get(void);
>  /* Set the state and return the old state */
>  PostcopyState postcopy_state_set(PostcopyState new_state);
> +
>  #endif
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 0ad1b93..308302a 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1905,7 +1905,7 @@ int qemu_loadvm_state(QEMUFile *f)
>  return ret;
>  }
>  
> -void hmp_savevm(Monitor *mon, const QDict *qdict)
> +void migrate_savevm(const char *name, Error **errp)
>  {
>  BlockDriverState *bs, *bs1;
>  QEMUSnapshotInfo sn1, *sn = , old_sn1, *old_sn = _sn1;
> @@ -1914,29 +1914,27 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>  int saved_vm_running;
>  uint64_t vm_state_size;
>  qemu_timeval tv;
> -struct tm tm;
> -const char *name = qdict_get_try_str(qdict, "name");
>  Error *local_err = NULL;
>  AioContext *aio_context;
>  
>  if (!bdrv_all_can_snapshot()) {
> -monitor_printf(mon, "Device '%s' is writable but does not "
> -   "support snapshots.\n", bdrv_get_device_name(bs));
> +error_setg(errp,
> +   "Device '%s' is writable but does not support snapshots",
> +   bdrv_get_device_name(bs));
>  return;
>  }
>  
>  /* Delete old snapshots of the same name */
> -if (name && bdrv_all_delete_snapshot(name, , _err) < 0) {
> -monitor_printf(mon,
> -   "Error while deleting snapshot on device '%s': %s\n",
> -   bdrv_get_device_name(bs1), 
> error_get_pretty(local_err));
> +if (bdrv_all_delete_snapshot(name, , _err) < 0) {
> +error_setg(errp, "Error while deleting snapshot on device '%s': %s",
> +   bdrv_get_device_name(bs1), 

[Qemu-devel] [PATCH] hw/misc: slavepci_passthru driver

2016-01-18 Thread Francesco Zuliani
Hi there,

I'd like to submit this new pci driver ( hw/misc )for inclusion,
if you think it could be useful to other as well as ourself.

The driver "worked for our needs" BUT we haven't done extensive
testing and this is our first attempt to submit a patch so I kindly
ask for extra-forgiveness .

The "slavepci_passthru" driver is useful in the scenario described
below to implement a simplified passthru when the host CPU does not
support IOMMU and one is interested only in pci target-mode (slave
devices).

Embedded system cpu (e.g. Atom, AMD G-Series) often lack the VT-d
extensions (IOMMU) needed to be able to pass-thru pci peripherals to
the guest machine (i.e. the pci pass-thru feature cannot be used).

If one is only interested in using the pci board as a pci-target 
(slave device), this driver mmap(s) the host-pci-bars into the guest
within a virtual pci-device.

This is useful in our case for debugging via qemu gsbserver facility
(i.e. '-s' option in qemu) a system running barebone-executable .

Currently the driver assumes the custom pci card has four 32-bit bars
to be mapped (in current patch this is mandatory)

HowTo:
To use the new driver one shall:
- define two environment variables for assigning proper VID and DID to
  associate to the guest pci card
- give the host pci bar address to map in the guest. 

Example Usage:

Let us suppose that we have in the host a slave pci device with the
following 4 bars (i.e. output of lspci -v -s YOUR-CARD | grep Memory)
  Memory at db80 (32-bit, non-prefetchable) [size=4K]
  Memory at db90 (32-bit, non-prefetchable) [size=8K]
  Memory at dba0 (32-bit, non-prefetchable) [size=4K]
  Memory at dbb0 (32-bit, non-prefetchable) [size=4K]

We can map these bars in a guest-pci with VID=0xe33e DID=0x000a using

SLAVEPASSTHRU_VID="0xe33e" SLAVEPASSTHRU_DID="0xa" qemu-system-x86_64 \
  YOUR-SET-OF-FLAGS \
  -device 
slavepassthru,size1=4096,baseaddr1=0xdb90,size2=8192,baseaddr2=0xdba0,size3=4096,baseaddr3=0xdbd0,size4=4096,baseaddr4=0xdbe0

Please note that if your device has less than four bars you can give
the same size and baseaddress to the unused bars.

Thanks
Francesco Zuliani

Actual commit patch:

>From 1371bc4e4681f43a2d02b91ec5d7b84f7ccb1f32 Mon Sep 17 00:00:00 2001
From: Francesco Zuliani 
Date: Mon, 18 Jan 2016 14:26:54 +0100
Subject: [PATCH] hw/misc: slave_pci_passthru driver
Added a slavepci_passthru hw/misc pci-device-driver.

It enables pass-thru in system missing IOMMU feature (e.g. Intel Atom
lacks Vt-d extension). It maps hosts target-mode pci-board bars onto
the guests virtual pci-device bars.

Signed-off-by: Francesco Zuliani 
---
 default-configs/pci.mak |   1 +
 hw/misc/Makefile.objs   |   1 +
 hw/misc/slavepci_passthru.c | 453 
 3 files changed, 455 insertions(+)
 create mode 100644 hw/misc/slavepci_passthru.c

diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index f250119..699a5a5 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -36,4 +36,5 @@ CONFIG_EDU=y
 CONFIG_VGA=y
 CONFIG_VGA_PCI=y
 CONFIG_IVSHMEM=$(CONFIG_POSIX)
+CONFIG_SLAVEPCIPASSTHRU=$(CONFIG_POSIX)
 CONFIG_ROCKER=y
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index d4765c2..e346a15 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -20,6 +20,7 @@ common-obj-$(CONFIG_PUV3) += puv3_pm.o
 common-obj-$(CONFIG_MACIO) += macio/
 
 obj-$(CONFIG_IVSHMEM) += ivshmem.o
+obj-$(CONFIG_SLAVEPCIPASSTHRU) += slavepci_passthru.o
 
 obj-$(CONFIG_REALVIEW) += arm_sysctl.o
 obj-$(CONFIG_NSERIES) += cbus.o
diff --git a/hw/misc/slavepci_passthru.c b/hw/misc/slavepci_passthru.c
new file mode 100644
index 000..ee709b7
--- /dev/null
+++ b/hw/misc/slavepci_passthru.c
@@ -0,0 +1,453 @@
+/*
+ * Host Device PCI-Card Slave Pass-Thru: based on ivshmem "qemu-device"
+ * 
+ *
+ * Author:
+ *  By Francesco Zuliani AT Neat S.r.l. 
+ *
+ * Based On: ivshmem.c
+ *  Original Author Cam Macdonell 
+ *
+ * This code is licensed under the GNU GPL v2.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ */
+
+#include "hw/hw.h"
+#include "hw/i386/pc.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msix.h"
+#include "sysemu/kvm.h"
+#include "migration/migration.h"
+#include "qemu/error-report.h"
+#include "qemu/event_notifier.h"
+#include "sysemu/char.h"
+
+#include 
+#include 
+#include 
+#include 
+
+#define PCI_VENDOR_ID_SLAVEPASSTHRU_DEFAULT PCI_VENDOR_ID_REDHAT_QUMRANET
+#define PCI_DEVICE_ID_SLAVEPASSTHRU_DEFAULT 0x
+
+#define STRINGIFY(a)  #a
+
+//#define DEBUG_SLAVEPASSTHRU
+#ifdef DEBUG_SLAVEPASSTHRU
+#define SLAVEPASSTHRU_DPRINTF(fmt, ...)\
+do {printf("SLAVEPASSTHRU: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define SLAVEPASSTHRU_DPRINTF(fmt, ...)
+#endif
+
+#define 

Re: [Qemu-devel] [PATCH] cadence_gem: fix buffer overflow

2016-01-18 Thread Jason Wang


On 01/18/2016 03:04 PM, Peter Crosthwaite wrote:
> On Sun, Jan 17, 2016 at 10:50 PM, Jason Wang  wrote:
>>
>> On 01/14/2016 05:43 PM, Michael S. Tsirkin wrote:
>>> gem_receive copies a packet received from network into an rxbuf[2048]
>>> array on stack, with size limited by descriptor length set by guest.  If
>>> guest is malicious and specifies a descriptor length that is too large,
>>> and should packet size exceed array size, this results in a buffer
>>> overflow.
>>>
>>> Reported-by: 刘令 
>>> Signed-off-by: Michael S. Tsirkin 
>>> ---
>>>  hw/net/cadence_gem.c | 8 
>>>  1 file changed, 8 insertions(+)
>> Apply to my -net with tweak on commit log (changing receive to transmit
>> as noticed).
>>
> As this is actually an unimplemented feature you should change the
> message to a LOG_UNIMP rather than a debug printf.
>
> Regards,
> Peter

Thanks for the reminding. But we need know the whether real device could
send packet whose length is greater than 2048. Do you know the link to
the manual? (Haven't fond it in cadence page.) A hint is the linux
driver (drivers/net/ethernet/cadence/macb.c), use 1500 as its MTU.

>
>> Thanks
>>
>>> diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
>>> index 3639fc1..15a0786 100644
>>> --- a/hw/net/cadence_gem.c
>>> +++ b/hw/net/cadence_gem.c
>>> @@ -862,6 +862,14 @@ static void gem_transmit(CadenceGEMState *s)
>>>  break;
>>>  }
>>>
>>> +if (tx_desc_get_length(desc) > sizeof(tx_packet) - (p - 
>>> tx_packet)) {
>>> +DB_PRINT("TX descriptor @ 0x%x too large: size 0x%x space 
>>> 0x%x\n",
>>> + (unsigned)packet_desc_addr,
>>> + (unsigned)tx_desc_get_length(desc),
>>> + sizeof(tx_packet) - (p - tx_packet));
>>> +break;
>>> +}
>>> +
>>>  /* Gather this fragment of the packet from "dma memory" to our 
>>> contig.
>>>   * buffer.
>>>   */




Re: [Qemu-devel] [PATCHv7 0/9] slirp: Adding IPv6 support to Qemu -net user mode

2016-01-18 Thread Jason Wang


On 01/12/2016 12:04 PM, Jason Wang wrote:
> On 01/12/2016 10:22 AM, Hailiang Zhang wrote:
>> > Hi,
>> >
>> > It seems that, Jan Kiszka is maintaining SLIRP (From MAINTAINERS file),
>> > Maybe he could make a help to merge this series.
>> >
>> > TO: J. Kiszka 
>> >
>> > Thanks,
>> > Hailiang
> Right, and I can take this series in my tree if Jan doesn't have time to
> do this.
>

Ok, apply to my -net.

Thanks



Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] target-ppc: use cpu_write_xer() helper in cpu_post_load

2016-01-18 Thread Mark Cave-Ayland
On 18/01/16 03:12, David Gibson wrote:

> On Fri, Jan 08, 2016 at 01:25:32PM +1100, Alexey Kardashevskiy wrote:
>> On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:
>>> Otherwise some internal xer variables fail to get set post-migration.
>>>
>>> Signed-off-by: Mark Cave-Ayland 
>>> ---
>>>  target-ppc/machine.c |2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/target-ppc/machine.c b/target-ppc/machine.c
>>> index 98fc63a..322ce84 100644
>>> --- a/target-ppc/machine.c
>>> +++ b/target-ppc/machine.c
>>> @@ -168,7 +168,7 @@ static int cpu_post_load(void *opaque, int version_id)
>>>  env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
>>>  env->lr = env->spr[SPR_LR];
>>>  env->ctr = env->spr[SPR_CTR];
>>> -env->xer = env->spr[SPR_XER];
>>> +cpu_write_xer(env, env->spr[SPR_XER]);
>>>  #if defined(TARGET_PPC64)
>>>  env->cfar = env->spr[SPR_CFAR];
>>>  #endif
>>>
>>
>> Reviewed-by: Alexey Kardashevskiy    
> 
> I've merged just this patch into ppc-for-2.6.  The others in the
> series have some problems which have been pointed out elsewhere.

Thanks David.

Any chance of being able to merge the macio VMStateDescription patchset
too at
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00558.html?
(this was sent without an explicit CC as I didn't know you were still
handling ppc-next).

With the macio fixes in place then Mac machine migration should start to
work once the respin of this series is applied.


Many thanks,

Mark.




Re: [Qemu-devel] [PATCH] migration: not send zero page header in ram bulk stage

2016-01-18 Thread Hailiang Zhang

On 2016/1/16 2:57, Dr. David Alan Gilbert wrote:

* Liang Li (liang.z...@intel.com) wrote:

Now that VM's RAM pages are initialized to zero, (VM's RAM is allcated
with the mmap() and MAP_ANONYMOUS option, or mmap() without MAP_SHARED
if hugetlbfs is used.) so there is no need to send the zero page header
to destination.

For guest just uses a small portions of RAM, this change can avoid
allocating all the guest's RAM pages in the destination node after
live migration. Another benefit is destination QEMU can save lots of
CPU cycles for zero page checking.


I think this would break postcopy, because the zero pages wouldn't be
filled in, so accessing them would still generate a userfault.
So you'd have to disable this optimisation if postcopy is enabled
(even during the precopy bulk stage).

Also, are you sure about the benefits?
  Destination guests RAM should not be allocated on receiving a zero
page; see ram_handle_compressed, it doesn't write to the page if
it's zero, so it shouldn't cause an allocate.  I think you're probably
correct about the zero page test on the destination, I wonder if we
can speed that up.



Yes, we have already optimize the zero page allocation in destination.
but this patch can reduce the amount of data that transferred and the
time of checking zero page, which can reduce the migration time.



Dave



Signed-off-by: Liang Li 
---
  migration/ram.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 4e606ab..c4821d1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -705,10 +705,12 @@ static int save_zero_page(QEMUFile *f, RAMBlock *block, 
ram_addr_t offset,

  if (is_zero_range(p, TARGET_PAGE_SIZE)) {
  acct_info.dup_pages++;
-*bytes_transferred += save_page_header(f, block,
-   offset | 
RAM_SAVE_FLAG_COMPRESS);
-qemu_put_byte(f, 0);
-*bytes_transferred += 1;
+if (!ram_bulk_stage) {
+*bytes_transferred += save_page_header(f, block, offset |
+   RAM_SAVE_FLAG_COMPRESS);
+qemu_put_byte(f, 0);
+*bytes_transferred += 1;
+}
  pages = 1;
  }

--
1.9.1


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

.







Re: [Qemu-devel] [PATCHv3 8/9] pseries: Clean up error reporting in ppc_spapr_init()

2016-01-18 Thread Thomas Huth
On 18.01.2016 05:24, David Gibson wrote:
> This function includes a number of explicit fprintf()s for errors.
> Change these to use error_report() instead.
> 
> Also replace the single exit(EXIT_FAILURE) with an explicit exit(1), since
> the latter is the more usual idiom in qemu by a large margin.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr.c | 25 +
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 148ca5a..58f26cd 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1789,8 +1789,8 @@ static void ppc_spapr_init(MachineState *machine)
>  }
>  
>  if (spapr->rma_size > node0_size) {
> -fprintf(stderr, "Error: Numa node 0 has to span the RMA 
> (%#08"HWADDR_PRIx")\n",
> -spapr->rma_size);
> +error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
> + spapr->rma_size);
>  exit(1);
>  }
>  
> @@ -1856,10 +1856,10 @@ static void ppc_spapr_init(MachineState *machine)
>  ram_addr_t hotplug_mem_size = machine->maxram_size - 
> machine->ram_size;
>  
>  if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
> -error_report("Specified number of memory slots %" PRIu64
> - " exceeds max supported %d",
> - machine->ram_slots, SPAPR_MAX_RAM_SLOTS);
> -exit(EXIT_FAILURE);
> +error_report("Specified number of memory slots %"
> + PRIu64" exceeds max supported %d",
> +machine->ram_slots, SPAPR_MAX_RAM_SLOTS);

Why did you change the indentation of the "machine->ram_slots, ..." line
here? The original looked better to me.

> +exit(1);

EXIT_FAILURE still seems to be used quite often in the QEMU sources...
All in all, this hunk does not really change anything from a functional
point of view, so I'd like to suggest to omit this hunk completely
instead to avoid code churn here.

 Thomas




Re: [Qemu-devel] usb-storage assertions

2016-01-18 Thread Gerd Hoffmann
On Fr, 2016-01-15 at 21:08 +0300, Andrey Korolyov wrote:
> Just checked, Linux usb driver decided to lose a disk during a
> 'stress-test' over unpacking linux source instead of triggering an
> assertion in 2.5 (and to irreparably damage its ext4 as well),

Ok, so behavior changed here from 2.2 -> 2.5.  I don't see disk
corruption in my tests, but linux guest resets the usb-storage device
now and then.  Seems to be able to resume operations though, there are
no disk errors in the logs.

> NetBSD
> 7.0 reboot action hangs on USB_RESET and NetBSD 5.1 triggers second of
> mentioned asserts. Backend itself is a regular USB2-SATA adapter with
> Intel S3500 SSD, hence it should not trigger first timing-related
> assertion at all.

ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
arch is this?  i386?  x86_64?

cheers,
  Gerd




Re: [Qemu-devel] [PATCH] man: virtfs-proxy-helper: Fix 'capbilities' typo

2016-01-18 Thread Christophe Fergeau
On Fri, Jan 15, 2016 at 04:31:45PM -0700, Eric Blake wrote:
> On 01/12/2016 06:39 AM, Christophe Fergeau wrote:
> > Signed-off-by: Christophe Fergeau 
> > ---
> >  fsdev/virtfs-proxy-helper.texi | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fsdev/virtfs-proxy-helper.texi b/fsdev/virtfs-proxy-helper.texi
> > index e60e3b9..1b905a0 100644
> > --- a/fsdev/virtfs-proxy-helper.texi
> > +++ b/fsdev/virtfs-proxy-helper.texi
> > @@ -29,7 +29,7 @@ driver sends filesystem request to proxy helper and 
> > receives the
> >  response from it.
> >  
> >  Proxy helper is designed so that it can drop the root privilege with
> > -retaining capbilities needed for doing filesystem operations only.
> > +retaining capabilities needed for doing filesystem operations only.
> 
> Still reads awkwardly; better might be:
> 
> The proxy helper is designed so that it can drop root privileges except
> for the capabilities needed for doing filesystem operations.

Ok, I'll send a v2 changing the sentence to your suggestion, thanks!

Christophe


signature.asc
Description: PGP signature


[Qemu-devel] [PATCH v2] Add optionrom compatible with fw_cfg DMA version

2016-01-18 Thread Marc Marí
This optionrom is based on linuxboot.S.

Signed-off-by: Marc Marí 
---
 .gitignore|   4 +
 hw/i386/pc.c  |   9 +-
 hw/nvram/fw_cfg.c |   2 +-
 include/hw/nvram/fw_cfg.h |   1 +
 pc-bios/optionrom/Makefile|   8 +-
 pc-bios/optionrom/linuxboot_dma.c | 262 ++
 pc-bios/optionrom/optionrom.h |   4 +-
 7 files changed, 285 insertions(+), 5 deletions(-)
 create mode 100644 pc-bios/optionrom/linuxboot_dma.c

diff --git a/.gitignore b/.gitignore
index 88a80ff..101d1e0 100644
--- a/.gitignore
+++ b/.gitignore
@@ -94,6 +94,10 @@
 /pc-bios/optionrom/linuxboot.bin
 /pc-bios/optionrom/linuxboot.raw
 /pc-bios/optionrom/linuxboot.img
+/pc-bios/optionrom/linuxboot_dma.asm
+/pc-bios/optionrom/linuxboot_dma.bin
+/pc-bios/optionrom/linuxboot_dma.raw
+/pc-bios/optionrom/linuxboot_dma.img
 /pc-bios/optionrom/multiboot.asm
 /pc-bios/optionrom/multiboot.bin
 /pc-bios/optionrom/multiboot.raw
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 459260b..00339fa 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms,
 fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
 
-option_rom[nb_option_roms].name = "linuxboot.bin";
-option_rom[nb_option_roms].bootindex = 0;
+if (fw_cfg_dma_enabled(fw_cfg)) {
+option_rom[nb_option_roms].name = "linuxboot_dma.bin";
+option_rom[nb_option_roms].bootindex = 0;
+} else {
+option_rom[nb_option_roms].name = "linuxboot.bin";
+option_rom[nb_option_roms].bootindex = 0;
+}
 nb_option_roms++;
 }
 
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index a1d650d..d0a5753 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -546,7 +546,7 @@ static bool is_version_1(void *opaque, int version_id)
 return version_id == 1;
 }
 
-static bool fw_cfg_dma_enabled(void *opaque)
+bool fw_cfg_dma_enabled(void *opaque)
 {
 FWCfgState *s = opaque;
 
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 664eaf6..953e58d 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -219,6 +219,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
  hwaddr dma_addr, AddressSpace *dma_as);
 
 FWCfgState *fw_cfg_find(void);
+bool fw_cfg_dma_enabled(void *opaque);
 
 #endif /* NO_QEMU_PROTOS */
 
diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
index ce4852a..076f351 100644
--- a/pc-bios/optionrom/Makefile
+++ b/pc-bios/optionrom/Makefile
@@ -2,6 +2,7 @@ all: build-all
 # Dummy command so that make thinks it has done something
@true
 
+BULD_DIR=$(CURDIR)
 include ../../config-host.mak
 include $(SRC_PATH)/rules.mak
 
@@ -11,15 +12,20 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/optionrom)
 
 CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin
 CFLAGS += -I$(SRC_PATH)
+CFLAGS += -I$(SRC_PATH)/include
 CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector)
 CFLAGS += $(CFLAGS_NOPIE)
+CFLAGS += -m32
 QEMU_CFLAGS = $(CFLAGS)
 
-build-all: multiboot.bin linuxboot.bin kvmvapic.bin
+build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin
 
 # suppress auto-removal of intermediate files
 .SECONDARY:
 
+linuxboot_dma.img: linuxboot_dma.o
+   $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -static -Ttext 
0 -e _start -s -o $@ $<,"  Building $(TARGET_DIR)$@")
+
 %.img: %.o
$(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e _start -s -o $@ 
$<,"  Building $(TARGET_DIR)$@")
 
diff --git a/pc-bios/optionrom/linuxboot_dma.c 
b/pc-bios/optionrom/linuxboot_dma.c
new file mode 100644
index 000..9c355fd
--- /dev/null
+++ b/pc-bios/optionrom/linuxboot_dma.c
@@ -0,0 +1,262 @@
+/*
+ * Linux Boot Option ROM for fw_cfg DMA
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ *
+ * Copyright (c) 2015 Red Hat Inc.
+ *   Authors: Marc Marí 
+ */
+
+asm(
+".text\n"
+".global _start\n"
+"_start:\n"
+"   .short 0xaa55\n"
+"   .byte (_end - _start) / 512\n"
+"   lret\n"
+"   .org 0x18\n"
+"   .short 0\n"
+"   .short _pnph\n"
+"_pnph:\n"
+"   .ascii \"$PnP\"\n"
+"   .byte 0x01\n"
+"   .byte ( _pnph_len / 16 )\n"
+"   .short 0x\n"
+"   .byte 0x00\n"

Re: [Qemu-devel] CMSG_SPACE() causing compile time error on Mac OS X

2016-01-18 Thread Daniel P. Berrange
On Sun, Jan 17, 2016 at 05:23:44PM -0500, Programmingkid wrote:
> I was wondering if you had problems compiling QEMU on Mac OS X recently. On 
> my system, the channel-socket.c file causes this error:
> 
> io/channel-socket.c: In function 'qio_channel_socket_writev':
> io/channel-socket.c:497:18: error: variable-sized object may not be 
> initialized
>  char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)] = { 0 };

Can you try applying this patch:

diff --git a/io/channel-socket.c b/io/channel-socket.c
index eaa411f..bc117b1 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -493,10 +495,12 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
 QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
 ssize_t ret;
 struct msghdr msg = { NULL, };
-char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)] = { 0 };
+char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)];
 size_t fdsize = sizeof(int) * nfds;
 struct cmsghdr *cmsg;
 
+memset(control, 0, CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS));
+
 msg.msg_iov = (struct iovec *)iov;
 msg.msg_iovlen = niov;
 


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH] qemu-char: avoid leak in qemu_chr_open_pp_fd

2016-01-18 Thread Daniel P. Berrange
On Mon, Jan 18, 2016 at 11:25:51AM +0100, Paolo Bonzini wrote:
> drv leaks if qemu_chr_alloc returns an error.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  qemu-char.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v16 14/14] vfio: add 'aer' property to expose aercap

2016-01-18 Thread Marcel Apfelbaum

On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 

add 'aer' property to let user able to decide whether expose
the aer capability. by default we should disable aer feature,
because it needs configuration restrictions.

Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index efa5e01..9afb5e0 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3119,6 +3119,8 @@ static Property vfio_pci_dev_properties[] = {
 sub_vendor_id, PCI_ANY_ID),
  DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
 sub_device_id, PCI_ANY_ID),
+DEFINE_PROP_BIT("aer", VFIOPCIDevice, features,
+VFIO_FEATURE_ENABLE_AER_BIT, false),
  /*
   * TODO - support passed fds... is this necessary?
   * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),




Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel



Re: [Qemu-devel] [Qemu-block] [PATCH 4/4] block/qapi: Emit tray_open only if there is a tray

2016-01-18 Thread Alberto Garcia
On Tue 12 Jan 2016 04:47:54 PM CET, Max Reitz wrote:

> Signed-off-by: Max Reitz 
> ---
>  block/qapi.c   | 2 +-
>  tests/qemu-iotests/067.out | 4 
>  2 files changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/block/qapi.c b/block/qapi.c
> index 58d3975..12a0f25 100644
> --- a/block/qapi.c
> +++ b/block/qapi.c
> @@ -299,7 +299,7 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo 
> **p_info,
>  info->locked = blk_dev_is_medium_locked(blk);
>  info->removable = blk_dev_has_removable_media(blk);
>  
> -if (blk_dev_has_removable_media(blk)) {
> +if (blk_dev_has_tray(blk)) {
>  info->has_tray_open = true;
>  info->tray_open = blk_dev_is_tray_open(blk);
>  }

It probably makes sense to update the documentation as well:

# @tray_open: #optional True if the device has a tray and it is open
# (only present if removable is true)

Berto



Re: [Qemu-devel] [PATCH 03/10] pseries: Clean up hash page table allocation error handling

2016-01-18 Thread David Gibson
On Mon, Jan 18, 2016 at 05:04:55PM +1100, Alexey Kardashevskiy wrote:
> On 01/18/2016 04:35 PM, David Gibson wrote:
> >On Mon, Jan 18, 2016 at 04:17:08PM +1100, Alexey Kardashevskiy wrote:
> >>On 01/18/2016 03:42 PM, David Gibson wrote:
> >>>On Mon, Jan 18, 2016 at 01:44:00PM +1100, Alexey Kardashevskiy wrote:
> On 01/15/2016 11:00 PM, David Gibson wrote:
> >The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
> >all errors with error_setg(_abort, ...).
> >
> >But really, the callers are really better placed to decide on the error
> >handling.  So, instead make the functions use the error propagation
> >infrastructure.
> >
> >In the callers we change to _fatal instead of _abort, since
> >this can be triggered by a bad configuration or kernel error rather than
> >indicating a programming error in qemu.
> >
> >While we're at it improve the messages themselves a bit, and clean up the
> >indentation a little.
> >
> >Signed-off-by: David Gibson 
> >---
> >  hw/ppc/spapr.c | 24 
> >  1 file changed, 16 insertions(+), 8 deletions(-)
> >
> >diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >index b7fd09a..d28e349 100644
> >--- a/hw/ppc/spapr.c
> >+++ b/hw/ppc/spapr.c
> >@@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU 
> >*cpu)
> >  #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= 
> > tswap64(~HPTE64_V_HPTE_DIRTY))
> >  #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= 
> > tswap64(HPTE64_V_HPTE_DIRTY))
> >
> >-static void spapr_alloc_htab(sPAPRMachineState *spapr)
> >+static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
> >  {
> >  long shift;
> >  int index;
> >@@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState 
> >*spapr)
> >   * For HV KVM, host kernel will return -ENOMEM when requested
> >   * HTAB size can't be allocated.
> >   */
> >-error_setg(_abort, "Failed to allocate HTAB of requested 
> >size, try with smaller maxmem");
> >+error_setg_errno(errp, -shift,
> >+ "Error allocating KVM hash page table, try 
> >smaller maxmem");
> >  } else if (shift > 0) {
> >  /*
> >   * Kernel handles htab, we don't need to allocate one
> >@@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState 
> >*spapr)
> >   * but we don't allow booting of such guests.
> >   */
> >  if (shift != spapr->htab_shift) {
> >-error_setg(_abort, "Failed to allocate HTAB of 
> >requested size, try with smaller maxmem");
> >+error_setg(errp,
> >+"Small allocation for KVM hash page table (%ld < %"
> >+PRIu32 "), try smaller maxmem",
> 
> 
> 
> Even though it is not in the CODING_STYLE, I have not seen anyone 
> objecting
> the very good kernel's "never break user-visible strings" rule or 
> rejecting
> patches with user-visible strings failing to fit 80 chars limit.
> >>>
> >>>I'm not.  Or rather, the string is already broken by the PRIu32, so
> >>>the newline doesn't make it any less greppable.
> >>
> >>
> >>"KVM hash page table.*smaller maxmem" stopped working. Not a big deal but I
> >>do not see any win in breaking strings anyway.
> >
> >The problem is that the current pre-commit hooks don't agree with
> >you.  They seem to allow long unbroken strings, but if there's a break
> >like the PRIu32, they won't permit the commit.
> 
> 
> checkpatch.pl reports it as "WARNING: line over 80 characters", not an
> ERROR, so I'd say the hook has a problem.

Well, maybe, but it's not a problem I'm really inclined to tackle at
this time.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 1/7] target-ppc: kvm: fix floating point registers sync on little-endian hosts

2016-01-18 Thread Greg Kurz
On Mon, 18 Jan 2016 13:16:44 +1100
David Gibson  wrote:

> On Fri, Jan 15, 2016 at 04:00:12PM +0100, Greg Kurz wrote:
> > On VSX capable CPUs, the 32 FP registers are mapped to the high-bits
> > of the 32 first VSX registers. So if you have:
> > 
> > VSR31 = (uint128) 0x0102030405060708090a0b0c0d0e0f00
> > 
> > then
> > 
> > FPR31 = (uint64) 0x0102030405060708
> > 
> > The kernel stores the VSX registers in the fp_state struct following the
> > host endian element ordering.
> > 
> > On big-endian:
> > 
> > fp_state.fpr[31][0] = 0x0102030405060708
> > fp_state.fpr[31][1] = 0x090a0b0c0d0e0f00
> > 
> > On little-endian:
> > 
> > fp_state.fpr[31][0] = 0x090a0b0c0d0e0f00
> > fp_state.fpr[31][1] = 0x0102030405060708
> > 
> > The KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctls preserve this ordering, but
> > QEMU considers it as big-endian and always copies element [0] to the
> > fpr[] array and element [1] to the vsr[] array. This does not work with
> > little-endian hosts, and you will get:
> > 
> > (qemu) p $f31
> > 0x90a0b0c0d0e0f00
> > 
> > instead of:
> > 
> > (qemu) p $f31
> > 0x102030405060708
> > 
> > This patch fixes the element ordering for little-endian hosts.
> > 
> > Signed-off-by: Greg Kurz   
> 
> If I'm understanding correctly, the only reason this bug didn't affect
> things other than the gdbstub is because the get and put routines had

Well it is not only gdbstub actually... as showed in the changelog, it also
affects the QEMU monitor which outputs wrong values since it calls kvm_get_fpu()
as well.

> mirrored bugs.  So although qemu ended up with definitely wrong
> information in its internal state, it reshuffled it to be right on
> setting it back into KVM.
> 
> Is that correct?
> 

My guess is that the bug only affects gdbstub and ppc_cpu_dump_state(), because
these are the only cases where QEMU parses the state of FP registers... this
is indeed confirmed by the KVM bug you are referring to, that had no visible
effect for more than a year BTW.

> > ---
> >  target-ppc/kvm.c |   12 
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> > index 9940a9046220..45249990bda1 100644
> > --- a/target-ppc/kvm.c
> > +++ b/target-ppc/kvm.c
> > @@ -650,8 +650,13 @@ static int kvm_put_fp(CPUState *cs)
> >  for (i = 0; i < 32; i++) {
> >  uint64_t vsr[2];
> >  
> > +#ifdef HOST_WORDS_BIGENDIAN
> >  vsr[0] = float64_val(env->fpr[i]);
> >  vsr[1] = env->vsr[i];
> > +#else
> > +vsr[0] = env->vsr[i];
> > +vsr[1] = float64_val(env->fpr[i]);
> > +#endif
> >  reg.addr = (uintptr_t) 
> >  reg.id = vsx ? KVM_REG_PPC_VSR(i) : KVM_REG_PPC_FPR(i);
> >  
> > @@ -721,10 +726,17 @@ static int kvm_get_fp(CPUState *cs)
> >  vsx ? "VSR" : "FPR", i, strerror(errno));
> >  return ret;
> >  } else {
> > +#ifdef HOST_WORDS_BIGENDIAN
> >  env->fpr[i] = vsr[0];
> >  if (vsx) {
> >  env->vsr[i] = vsr[1];
> >  }
> > +#else
> > +env->fpr[i] = vsr[1];
> > +if (vsx) {
> > +env->vsr[i] = vsr[0];
> > +}
> > +#endif
> >  }
> >  }
> >  }
> >   
> 




Re: [Qemu-devel] [PATCH] migration: not send zero page header in ram bulk stage

2016-01-18 Thread Hailiang Zhang

Hi,

On 2016/1/15 18:24, Li, Liang Z wrote:

It seems that this patch is incorrect, if the no-zero pages are zeroed again
during !ram_bulk_stage, we didn't send the new zeroed page, there will be
an error.



If not in ram_bulk_stage, still send the header, could you explain why it's 
wrong?

Liang



I have made a mistake, and yes, this patch can speed up the live migration time,
especially when there are many zero pages, it will be more obvious.
I like this idea. Did you test it with postcopy ? Does it break postcopy ?

Thanks,
zhanghailiang


For guest just uses a small portions of RAM, this change can avoid
allocating all the guest's RAM pages in the destination node after
live migration. Another benefit is destination QEMU can save lots of
CPU cycles for zero page checking.

Signed-off-by: Liang Li 
---
   migration/ram.c | 10 ++
   1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c index 4e606ab..c4821d1
100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -705,10 +705,12 @@ static int save_zero_page(QEMUFile *f,

RAMBlock

*block, ram_addr_t offset,

   if (is_zero_range(p, TARGET_PAGE_SIZE)) {
   acct_info.dup_pages++;
-*bytes_transferred += save_page_header(f, block,
-   offset | 
RAM_SAVE_FLAG_COMPRESS);
-qemu_put_byte(f, 0);
-*bytes_transferred += 1;
+if (!ram_bulk_stage) {
+*bytes_transferred += save_page_header(f, block, offset |
+   RAM_SAVE_FLAG_COMPRESS);
+qemu_put_byte(f, 0);
+*bytes_transferred += 1;
+}
   pages = 1;
   }








.







Re: [Qemu-devel] [PATCH] cadence_gem: fix buffer overflow

2016-01-18 Thread Peter Crosthwaite
On Mon, Jan 18, 2016 at 12:12 AM, Jason Wang  wrote:
>
>
> On 01/18/2016 03:04 PM, Peter Crosthwaite wrote:
>> On Sun, Jan 17, 2016 at 10:50 PM, Jason Wang  wrote:
>>>
>>> On 01/14/2016 05:43 PM, Michael S. Tsirkin wrote:
 gem_receive copies a packet received from network into an rxbuf[2048]
 array on stack, with size limited by descriptor length set by guest.  If
 guest is malicious and specifies a descriptor length that is too large,
 and should packet size exceed array size, this results in a buffer
 overflow.

 Reported-by: 刘令 
 Signed-off-by: Michael S. Tsirkin 
 ---
  hw/net/cadence_gem.c | 8 
  1 file changed, 8 insertions(+)
>>> Apply to my -net with tweak on commit log (changing receive to transmit
>>> as noticed).
>>>
>> As this is actually an unimplemented feature you should change the
>> message to a LOG_UNIMP rather than a debug printf.
>>
>> Regards,
>> Peter
>
> Thanks for the reminding. But we need know the whether real device could
> send packet whose length is greater than 2048. Do you know the link to
> the manual? (Haven't fond it in cadence page.) A hint is the linux

Xilinx UG585 has details:

http://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf

Regards,
Peter

> driver (drivers/net/ethernet/cadence/macb.c), use 1500 as its MTU.
>
>>
>>> Thanks
>>>
 diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
 index 3639fc1..15a0786 100644
 --- a/hw/net/cadence_gem.c
 +++ b/hw/net/cadence_gem.c
 @@ -862,6 +862,14 @@ static void gem_transmit(CadenceGEMState *s)
  break;
  }

 +if (tx_desc_get_length(desc) > sizeof(tx_packet) - (p - 
 tx_packet)) {
 +DB_PRINT("TX descriptor @ 0x%x too large: size 0x%x space 
 0x%x\n",
 + (unsigned)packet_desc_addr,
 + (unsigned)tx_desc_get_length(desc),
 + sizeof(tx_packet) - (p - tx_packet));
 +break;
 +}
 +
  /* Gather this fragment of the packet from "dma memory" to our 
 contig.
   * buffer.
   */
>



Re: [Qemu-devel] [PATCHv3 5/9] pseries: Cleanup error handling in spapr_vga_init()

2016-01-18 Thread Thomas Huth
On 18.01.2016 05:24, David Gibson wrote:
> Use error_setg() to return an error rather than an explicit exit().
> Previously it was an exit(0) instead of a non-zero exit code, which was
> simply a bug.  Also improve the error message.
> 
> While we're at it change the type of spapr_vga_init() to bool since that's
> how we're using it anyway.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 87097bc..bb5eaa5 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1246,7 +1246,7 @@ static void spapr_rtc_create(sPAPRMachineState *spapr)
>  }
>  
>  /* Returns whether we want to use VGA or not */
> -static int spapr_vga_init(PCIBus *pci_bus)
> +static bool spapr_vga_init(PCIBus *pci_bus, Error **errp)
>  {
>  switch (vga_interface_type) {
>  case VGA_NONE:
> @@ -1257,9 +1257,9 @@ static int spapr_vga_init(PCIBus *pci_bus)
>  case VGA_VIRTIO:
>  return pci_vga_init(pci_bus) != NULL;
>  default:
> -fprintf(stderr, "This vga model is not supported,"
> -"currently it only supports -vga std\n");
> -exit(0);
> +error_setg(errp,
> +   "Unsupported VGA mode, only -vga std or -vga virtio is 
> supported");
> +return false;
>  }
>  }
>  
> @@ -1934,7 +1934,7 @@ static void ppc_spapr_init(MachineState *machine)
>  }
>  
>  /* Graphics */
> -if (spapr_vga_init(phb->bus)) {
> +if (spapr_vga_init(phb->bus, _fatal)) {
>  spapr->has_graphics = true;
>  machine->usb |= defaults_enabled() && !machine->usb_disabled;
>  }

Reviewed-by: Thomas Huth 




Re: [Qemu-devel] usb-storage assertions

2016-01-18 Thread Andrey Korolyov
On Mon, Jan 18, 2016 at 12:38 PM, Gerd Hoffmann  wrote:
> On Fr, 2016-01-15 at 21:08 +0300, Andrey Korolyov wrote:
>> Just checked, Linux usb driver decided to lose a disk during a
>> 'stress-test' over unpacking linux source instead of triggering an
>> assertion in 2.5 (and to irreparably damage its ext4 as well),
>
> Ok, so behavior changed here from 2.2 -> 2.5.  I don't see disk
> corruption in my tests, but linux guest resets the usb-storage device
> now and then.  Seems to be able to resume operations though, there are
> no disk errors in the logs.

Thanks for checking, I`ve seen a couple of bus resets as well and
suddenly guest froze with 'fs remounted as ro' message splat over
physical console. Of course I didn`t set any external kernel logging
or pstore at a time and I could only assume that same timeouts was a
general cause of a corruption itself (more likely some inflights was
lucky enough to finish while others did not made it, but I don`t see a
way to corrupt a superblock *that bad*).

>
>> NetBSD
>> 7.0 reboot action hangs on USB_RESET and NetBSD 5.1 triggers second of
>> mentioned asserts. Backend itself is a regular USB2-SATA adapter with
>> Intel S3500 SSD, hence it should not trigger first timing-related
>> assertion at all.
>
> ok.  Had no trouble with freebsd, will go fetch netbsd images.  What
> arch is this?  i386?  x86_64?

i386 7.0 for the reference, but I`m sure that this wouldn`t matter in
any way. Just to mention - ancient 2.6, like 2.6.18, are actually
doing things quite faster using same frontend+backend combination, may
be due to lack of proper timeout checks... Actually there is a very
small chance that the real performance regression was introduced
during further development, so I instead believe in improper
interaction of a newer guest EHCI driver and qemu frontend. Please let
me know if any countable measurements like fio could be a matter of
interest - I don`t think that many people are concerned about USB/USB2
frontend performance at all, since they are bringing in a ton of
unwelcomed wakeups and the one thing which could be a matter of
concern in that case is an emulated xHCI, IMHO.

>
> cheers,
>   Gerd
>



Re: [Qemu-devel] [PATCH] cadence_gem: fix buffer overflow

2016-01-18 Thread Jason Wang


On 01/18/2016 05:08 PM, Peter Crosthwaite wrote:
> On Mon, Jan 18, 2016 at 12:12 AM, Jason Wang  wrote:
>>
>> On 01/18/2016 03:04 PM, Peter Crosthwaite wrote:
>>> On Sun, Jan 17, 2016 at 10:50 PM, Jason Wang  wrote:
 On 01/14/2016 05:43 PM, Michael S. Tsirkin wrote:
> gem_receive copies a packet received from network into an rxbuf[2048]
> array on stack, with size limited by descriptor length set by guest.  If
> guest is malicious and specifies a descriptor length that is too large,
> and should packet size exceed array size, this results in a buffer
> overflow.
>
> Reported-by: 刘令 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/net/cadence_gem.c | 8 
>  1 file changed, 8 insertions(+)
 Apply to my -net with tweak on commit log (changing receive to transmit
 as noticed).

>>> As this is actually an unimplemented feature you should change the
>>> message to a LOG_UNIMP rather than a debug printf.
>>>
>>> Regards,
>>> Peter
>> Thanks for the reminding. But we need know the whether real device could
>> send packet whose length is greater than 2048. Do you know the link to
>> the manual? (Haven't fond it in cadence page.) A hint is the linux
> Xilinx UG585 has details:
>
> http://www.xilinx.com/support/documentation/user_guides/ug585-Zynq-7000-TRM.pdf
>
> Regards,
> Peter
>
>

Thanks for the pointer.

In section 16.1.5, it said

"Jumbo frames are not supported."

So it was in fact not an unimplemented feature?



Re: [Qemu-devel] [Qemu-arm] [PATCH] cadence_gem: fix buffer overflow

2016-01-18 Thread Peter Maydell
On 18 January 2016 at 09:57, Jason Wang  wrote:
> Thanks for the pointer.
>
> In section 16.1.5, it said
>
> "Jumbo frames are not supported."
>
> So it was in fact not an unimplemented feature?

I have a vague feeling from the last time I looked at something like
this that what often happens is the hardware happily goes on dumping
data out on the wire but this is a violation of the ethernet protocol,
ie a guest error. But I'm no ethernet expert...

thanks
-- PMM



Re: [Qemu-devel] [PATCHv3 8/9] pseries: Clean up error reporting in ppc_spapr_init()

2016-01-18 Thread Markus Armbruster
Thomas Huth  writes:

> On 18.01.2016 05:24, David Gibson wrote:
>> This function includes a number of explicit fprintf()s for errors.
>> Change these to use error_report() instead.
>> 
>> Also replace the single exit(EXIT_FAILURE) with an explicit exit(1), since
>> the latter is the more usual idiom in qemu by a large margin.
>> 
>> Signed-off-by: David Gibson 
>> ---
>>  hw/ppc/spapr.c | 25 +
>>  1 file changed, 13 insertions(+), 12 deletions(-)
>> 
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 148ca5a..58f26cd 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1789,8 +1789,8 @@ static void ppc_spapr_init(MachineState *machine)
>>  }
>>  
>>  if (spapr->rma_size > node0_size) {
>> -fprintf(stderr, "Error: Numa node 0 has to span the RMA 
>> (%#08"HWADDR_PRIx")\n",
>> -spapr->rma_size);
>> +error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
>> + spapr->rma_size);
>>  exit(1);
>>  }
>>  
>> @@ -1856,10 +1856,10 @@ static void ppc_spapr_init(MachineState *machine)
>>  ram_addr_t hotplug_mem_size = machine->maxram_size - 
>> machine->ram_size;
>>  
>>  if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
>> -error_report("Specified number of memory slots %" PRIu64
>> - " exceeds max supported %d",
>> - machine->ram_slots, SPAPR_MAX_RAM_SLOTS);
>> -exit(EXIT_FAILURE);
>> +error_report("Specified number of memory slots %"
>> + PRIu64" exceeds max supported %d",
>> +machine->ram_slots, SPAPR_MAX_RAM_SLOTS);
>
> Why did you change the indentation of the "machine->ram_slots, ..." line
> here? The original looked better to me.

Agreed.

>> +exit(1);
>
> EXIT_FAILURE still seems to be used quite often in the QEMU sources...
> All in all, this hunk does not really change anything from a functional
> point of view, so I'd like to suggest to omit this hunk completely
> instead to avoid code churn here.

It makes the code locally consistent, so I'd keep it.



Re: [Qemu-devel] [PATCH v2] s390: use FILE instead of QEMUFile for creating text file

2016-01-18 Thread Cornelia Huck
On Mon, 18 Jan 2016 09:50:32 +
"Daniel P. Berrange"  wrote:

> On Fri, Jan 15, 2016 at 04:29:32PM -0700, Eric Blake wrote:
> > On 01/12/2016 05:59 AM, Daniel P. Berrange wrote:

> > > @@ -124,8 +120,14 @@ void qmp_dump_skeys(const char *filename, Error 
> > > **errp)
> > >  return;
> > >  }
> > >  
> > > -f = qemu_fopen(filename, "wb");
> > > +fd = qemu_open(filename, O_WRONLY|O_CREAT|O_TRUNC, 0600);
> > 
> > A strict conversion should probably include O_BINARY for mingw (where
> > "wb" turns on binary mode).  But maybe we should just make qemu_open()
> > itself _always_ provide O_BINARY so that callers don't have to worry
> > about it - do we really have a reason to open a file on mingw where we
> > want \r\n munged into \n due to text mode?
> 
> We're writing text data to the file, so I figured that using binary
> mode would be wrong.
> 
> > 
> > > +if (fd < 0) {
> > > +error_setg_file_open(errp, errno, filename);
> > > +return;
> > > +}
> > > +f = fdopen(fd, "wb");
> > >  if (!f) {
> > > +close(fd);
> > >  error_setg_file_open(errp, errno, filename);
> > 
> > close() may corrupt errno, resulting in a report of the wrong message.
> > Swap these two lines.
> 
> Ok

Daniel, if you send a respin, could you please put Jason on cc: as
well? He can test this quicker than I can :)




Re: [Qemu-devel] [Qemu-block] [PATCH 1/4] block: Add blk_dev_has_tray()

2016-01-18 Thread Alberto Garcia
On Tue 12 Jan 2016 04:47:51 PM CET, Max Reitz  wrote:
> Pull out the check whether a block device has a tray from
> blk_dev_is_tray_open() into an own function so both attributes (whether
> there is a tray vs. whether that tray is open) can be queried
> independently.
>
> Cc: qemu-stable 
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH v4] linux-user: add signalfd/signalfd4 syscalls

2016-01-18 Thread Paolo Bonzini


On 02/10/2015 14:48, Laurent Vivier wrote:
> +target_fd_trans = g_realloc(target_fd_trans,
> +target_fd_max * sizeof(TargetFdTrans));

This should be TargetFdTrans * (reported by Coverity).  Even better you
could use g_renew.

It's harmless because sizeof(TargetFdTrans) > sizeof(TargetFdTrans *),
but it should be fixed nevertheless.

Paolo

> +memset((void *)(target_fd_trans + oldmax), 0,
> +   (target_fd_max - oldmax) * sizeof(TargetFdTrans *));



Re: [Qemu-devel] [PATCH v4] linux-user: add signalfd/signalfd4 syscalls

2016-01-18 Thread Laurent Vivier


Le 18/01/2016 11:30, Paolo Bonzini a écrit :
> 
> 
> On 02/10/2015 14:48, Laurent Vivier wrote:
>> +target_fd_trans = g_realloc(target_fd_trans,
>> +target_fd_max * sizeof(TargetFdTrans));
> 
> This should be TargetFdTrans * (reported by Coverity).  Even better you
> could use g_renew.
> 
> It's harmless because sizeof(TargetFdTrans) > sizeof(TargetFdTrans *),
> but it should be fixed nevertheless.

OK, thank you Paolo.

I will fix that.

Laurent




Re: [Qemu-devel] [PATCH v16 13/14] vfio-pci: pass the aer error to guest

2016-01-18 Thread Marcel Apfelbaum

On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 

when the vfio device encounters an uncorrectable error in host,
the vfio_pci driver will signal the eventfd registered by this
vfio device, the results in the qemu eventfd handler getting


Maybe "the results in" -> resulting in


invoked.

this patch is to pass the error to guest and have the guest driver
recover from the error.


Maybe "Pass the error to... and let the ... "



Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 53 +++--
  1 file changed, 47 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index da4815e..efa5e01 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2553,18 +2553,59 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
  static void vfio_err_notifier_handler(void *opaque)
  {
  VFIOPCIDevice *vdev = opaque;
+PCIDevice *dev = >pdev;
+PCIEAERMsg msg = {
+.severity = 0,
+.source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
+};

  if (!event_notifier_test_and_clear(>err_notifier)) {
  return;
  }

  /*
- * TBD. Retrieve the error details and decide what action
- * needs to be taken. One of the actions could be to pass
- * the error to the guest and have the guest driver recover
- * from the error. This requires that PCIe capabilities be
- * exposed to the guest. For now, we just terminate the
- * guest to contain the error.
+ * in case the real hardware configration has been changed,


configration -> configuration



+ * here we should recheck the bus reset capability.
+ */
+if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
+vfio_check_host_bus_reset(vdev)) {
+goto stop;
+}
+/*
+ * we should read the error details from the real hardware
+ * configuration spaces, here we only need to do is signaling
+ * to guest an uncorrectable error has occurred.
+ */
+if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
+dev->exp.aer_cap) {


Why do we need dev->exp.aer_cap check here? In patch 7/14 we fail the device 
init
process if this happens, right?


+uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
+uint32_t uncor_status;
+bool isfatal;
+
+uncor_status = vfio_pci_read_config(dev,
+   dev->exp.aer_cap + PCI_ERR_UNCOR_STATUS, 4);
+
+/*
+ * if we receive the error signal but not this device, we can


maybe "if the error is not emitted by this device..."


Thanks,
Marcel


+ * just ignore it.
+ */
+if (!(uncor_status & ~0UL)) {
+return;
+}
+
+isfatal = uncor_status & pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
+
+msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
+ PCI_ERR_ROOT_CMD_NONFATAL_EN;
+
+pcie_aer_msg(dev, );
+return;
+}
+
+stop:
+/*
+ * If the aer capability is not exposed to the guest. we just
+ * terminate the guest to contain the error.
   */

  error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected.  "






Re: [Qemu-devel] [PATCH v4 8/9] trace: [tcg] Add per-vCPU tracing states for events with the 'vcpu' property

2016-01-18 Thread Paolo Bonzini


On 15/01/2016 17:38, Lluís Vilanova wrote:
> Each event with the 'vcpu' property gets a per-vCPU dynamic tracing state.
> 
> The set of enabled events with the 'vcpu' and 'tcg' properties is used
> to select a per-vCPU physical TB cache.  The number of events with both
> properties is used to select the number of physical TB caches, and a
> bitmap of the identifiers of such enabled events is used to select a
> physical TB cache.
> 
> Signed-off-by: Lluís Vilanova 

I may be wrong, but this patch and patches 3+5 seem overengineered.  Why
can't you just flush the TB cache entirely whenever the set of trace
events changes, independent of whether they use the vcpu attribute?
That's a very rare event.

Paolo



Re: [Qemu-devel] defining VIXL_DEBUG?

2016-01-18 Thread Peter Maydell
On 18 January 2016 at 10:21, Paolo Bonzini  wrote:
> I've gotten four new Coverity reports in libvixl.  All of them are
> caused by Coverity assuming that LaneSizeInBitsFromFormat can return 0.

Yeah, I fed these back to upstream.

(There are also some others do do with integer overflow etc which
I think are unrelated to LaneSizeInBitsFromFormat.)

>  The actual code in the function is
>
> default: VIXL_UNREACHABLE(); return 0;
>
> so this is obviously a false positive.  Defining VIXL_DEBUG would cause
> VIXL_UNREACHABLE() to call abort().  Any opinion about whether/where to
> do so?

Does defining it to call abort() result in unreachable-code warnings
for the "return 0;" ?

thanks
-- PMM



Re: [Qemu-devel] [PATCHv3 3/9] pseries: Clean up hash page table allocation error handling

2016-01-18 Thread Thomas Huth
On 18.01.2016 05:24, David Gibson wrote:
> The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
> all errors with error_setg(_abort, ...).
> 
> But really, the callers are really better placed to decide on the error
> handling.  So, instead make the functions use the error propagation
> infrastructure.
> 
> In the callers we change to _fatal instead of _abort, since
> this can be triggered by a bad configuration or kernel error rather than
> indicating a programming error in qemu.
> 
> While we're at it improve the messages themselves a bit, and clean up the
> indentation a little.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr.c | 24 
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b7fd09a..d28e349 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
>  #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= 
> tswap64(~HPTE64_V_HPTE_DIRTY))
>  #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= 
> tswap64(HPTE64_V_HPTE_DIRTY))
>  
> -static void spapr_alloc_htab(sPAPRMachineState *spapr)
> +static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
>  {
>  long shift;
>  int index;
> @@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>   * For HV KVM, host kernel will return -ENOMEM when requested
>   * HTAB size can't be allocated.
>   */
> -error_setg(_abort, "Failed to allocate HTAB of requested size, 
> try with smaller maxmem");
> +error_setg_errno(errp, -shift,
> + "Error allocating KVM hash page table, try smaller 
> maxmem");
>  } else if (shift > 0) {
>  /*
>   * Kernel handles htab, we don't need to allocate one
> @@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>   * but we don't allow booting of such guests.
>   */
>  if (shift != spapr->htab_shift) {
> -error_setg(_abort, "Failed to allocate HTAB of requested 
> size, try with smaller maxmem");
> +error_setg(errp,
> +"Small allocation for KVM hash page table (%ld < %"
> +PRIu32 "), try smaller maxmem",
> +shift, spapr->htab_shift);

Maybe you should add an "return" statement here - theoretically you do
not want to continue with "kvmppc_kern_htab = true" in case of errors.
(practically this does not happen because errp = error_fatal, but in
case the caller gets changed, this might introduce subtle errors otherwise)

>  }
>  
>  spapr->htab_shift = shift;
> @@ -1064,17 +1068,21 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>   * If host kernel has allocated HTAB, KVM_PPC_ALLOCATE_HTAB ioctl is
>   * used to clear HTAB. Otherwise QEMU-allocated HTAB is cleared manually.
>   */
> -static void spapr_reset_htab(sPAPRMachineState *spapr)
> +static void spapr_reset_htab(sPAPRMachineState *spapr, Error **errp)
>  {
>  long shift;
>  int index;
>  
>  shift = kvmppc_reset_htab(spapr->htab_shift);
>  if (shift < 0) {
> -error_setg(_abort, "Failed to reset HTAB");
> +error_setg_errno(errp, -shift,
> +   "Error resetting KVM hash page table, try smaller 
> maxmem");

dito, better do an "return" here...

>  } else if (shift > 0) {
>  if (shift != spapr->htab_shift) {
> -error_setg(_abort, "Requested HTAB allocation failed 
> during reset");
> +error_setg(errp,
> +"Reduced size on reset of KVM hash page table (%ld < %"
> +PRIu32 "), try smaller maxmem",
> +shift, spapr->htab_shift);

... and here.

>  }
>  
>  /* Tell readers to update their file descriptor */
> @@ -1145,7 +1153,7 @@ static void ppc_spapr_reset(void)
>  foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
>  
>  /* Reset the hash table & recalc the RMA */
> -spapr_reset_htab(spapr);
> +spapr_reset_htab(spapr, _fatal);
>  
>  qemu_devices_reset();
>  
> @@ -1792,7 +1800,7 @@ static void ppc_spapr_init(MachineState *machine)
>  }
>  spapr->htab_shift++;
>  }
> -spapr_alloc_htab(spapr);
> +spapr_alloc_htab(spapr, _fatal);
>  
>  /* Set up Interrupt Controller before we create the VCPUs */
>  spapr->icp = xics_system_init(machine,
> 

 Thomas




Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

2016-01-18 Thread Jike Song
On 01/18/2016 12:47 PM, Alex Williamson wrote:
> Hi Jike,
> 
> On Mon, 2016-01-18 at 10:39 +0800, Jike Song wrote:
>> Hi Alex, let's continue with a new thread :)
>>
>> Basically we agree with you: exposing vGPU via VFIO can make
>> QEMU share as much code as possible with pcidev(PF or VF) assignment.
>> And yes, different vGPU vendors can share quite a lot of the
>> QEMU part, which will do good for upper layers such as libvirt.
>>
>>
>> To achieve this, there are quite a lot to do, I'll summarize
>> it below. I dived into VFIO for a while but still may have
>> things misunderstood, so please correct me :)
>>
>>
>>
>> First, let me illustrate my understanding of current VFIO
>> framework used to pass through a pcidev to guest:
>>
>>
>>  +--+
>>  |vfio qemu |
>>  +-++---+
>>|DMA  ^  |CFG
>> QEMU   |map   IRQ|  |
>> ---|-|--|---
>> KERNEL+|-|--|--+
>>   | VFIO   | |  |  |
>>   |v |  v  |
>>   |  +---+ +-+---+ |
>> IOMMU |  | vfio iommu driver | | vfio bus driver | |
>> API  <---+   | | | |
>> Layer |  | e.g. type1| | e.g. vfio_pci   | |
>>   |  +---+ +-+ |
>>   ++
>>
>>
>> Here when a particular pcidev is passed-through to a KVM guest,
>> it is attached to vfio_pci driver in host, and guest memory
>> is mapped into IOMMU via the type1 iommu driver.
>>
>>
>> Then, the draft infrastructure of future VFIO-based vgpu:
>>
>>
>>
>>  +-+
>>  |  vfio qemu  |
>>  ++-+--+
>>   |DMA   ^  |CFG
>> QEMU  |mapIRQ|  |
>> --|--|--|---
>> KERNEL|  |  |
>>  +|--|--|--+
>>  |VFIO|  |  |  |
>>  |v  |  v  |
>>  | ++  +-+---+ |
>> DMA  | | vfio iommu driver  |  | vfio bus driver | |
>> API <--+|  | | |
>> Layer| |  e.g. vfio_type2   |  |  e.g. vfio_vgpu | |
>>  | ++  +-+ |
>>  | |  ^  |  ^  |
>>  +-|--|--|--|--+
>>|  |  |  |
>>|  |  v  |
>>  +-|--|--+   +-+
>>  | +---v---+ |   | |
>>  | |   | |   | |
>>  | |  KVMGT| |   | |
>>  | |   | |   |   host gfx driver   |
>>  | +---+ |   | |
>>  |   |   | |
>>  |KVM hypervisor |   | |
>>  +---+   +-+
>>
>> NOTEvfio_type2 and vfio_vgpu are only *logically* parts
>> of VFIO, they may be implemented in KVM hypervisor
>> or host gfx driver.
>>
>>
>>
>> Here we need to implement a new vfio IOMMU driver instead of type1,
>> let's call it vfio_type2 temporarily. The main difference from pcidev
>> assignment is, vGPU doesn't have its own DMA requester id, so it has
>> to share mappings with host and other vGPUs.
>>
>> - type1 iommu driver maps gpa to hpa for passing through;
>>   whereas type2 maps iova to hpa;
>>
>> - hardware iommu is always needed by type1, whereas for
>>   type2, hardware iommu is optional;
>>
>> - type1 will invoke low-level IOMMU API (iommu_map et al) to
>>   setup IOMMU page table directly, whereas type2 dosen't (only
>>   need to invoke higher level DMA API like dma_map_page);
> 
> Yes, the current type1 implementation is not compatible with vgpu since
> there are not separate requester IDs on the bus and you probably don't
> want or need to pin all of guest memory like we do for direct
> assignment.  However, let's separate the type1 user API from the
> current implementation.  It's quite easy within the vfio code to
> consider "type1" to be an API specification that may have multiple
> 

Re: [Qemu-devel] [PATCH v16 07/14] vfio: add aer support for vfio device

2016-01-18 Thread Marcel Apfelbaum

On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 

Calling pcie_aer_init to initilize aer related registers for
vfio device, then reload physical related registers to expose
device capability.

Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 81 ---
  hw/vfio/pci.h |  3 +++
  2 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 64b0867..38b0aa5 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1832,6 +1832,62 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t 
pos)
  return 0;
  }

+static int vfio_setup_aer(VFIOPCIDevice *vdev, uint8_t cap_ver,
+  int pos, uint16_t size)
+{
+PCIDevice *pdev = >pdev;
+PCIDevice *dev_iter;
+uint8_t type;
+uint32_t errcap;
+
+if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
+pcie_add_capability(pdev, PCI_EXT_CAP_ID_ERR,
+cap_ver, pos, size);
+return 0;
+}
+
+dev_iter = pci_bridge_get_device(pdev->bus);
+if (!dev_iter) {
+goto error;
+}
+
+while (dev_iter) {
+type = pcie_cap_get_type(dev_iter);
+if ((type != PCI_EXP_TYPE_ROOT_PORT &&
+ type != PCI_EXP_TYPE_UPSTREAM &&
+ type != PCI_EXP_TYPE_DOWNSTREAM)) {
+goto error;
+}
+
+if (!dev_iter->exp.aer_cap) {
+goto error;
+}
+
+dev_iter = pci_bridge_get_device(dev_iter->bus);
+}
+
+errcap = vfio_pci_read_config(pdev, pos + PCI_ERR_CAP, 4);
+/*
+ * The ability to record multiple headers is depending on
+ * the state of the Multiple Header Recording Capable bit and
+ * enabled by the Multiple Header Recording Enable bit.
+ */
+if ((errcap & PCI_ERR_CAP_MHRC) &&
+(errcap & PCI_ERR_CAP_MHRE)) {
+pdev->exp.aer_log.log_max = PCIE_AER_LOG_MAX_DEFAULT;
+} else {
+pdev->exp.aer_log.log_max = 0;
+}
+
+pcie_cap_deverr_init(pdev);
+return pcie_aer_init(pdev, pos, size);
+
+error:
+error_report("vfio: Unable to enable AER for device %s, parent bus "
+ "does not support AER signaling", vdev->vbasedev.name);
+return -1;
+}
+
  static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
  {
  PCIDevice *pdev = >pdev;
@@ -1839,6 +1895,7 @@ static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
  uint16_t cap_id, next, size;
  uint8_t cap_ver;
  uint8_t *config;
+int ret = 0;

  /*
   * In order to avoid breaking config space, create a copy to
@@ -1860,16 +1917,29 @@ static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
   */
  size = vfio_ext_cap_max_size(config, next);

-pcie_add_capability(pdev, cap_id, cap_ver, next, size);
-pci_set_long(dev->config + next, PCI_EXT_CAP(cap_id, cap_ver, 0));
+switch (cap_id) {
+case PCI_EXT_CAP_ID_ERR:
+ret = vfio_setup_aer(vdev, cap_ver, next, size);
+break;
+default:
+pcie_add_capability(pdev, cap_id, cap_ver, next, size);
+break;
+}
+
+if (ret) {
+goto out;
+}
+
+pci_set_long(pdev->config + next, PCI_EXT_CAP(cap_id, cap_ver, 0));

  /* Use emulated next pointer to allow dropping extended caps */
  pci_long_test_and_set_mask(vdev->emulated_config_bits + next,
 PCI_EXT_CAP_NEXT_MASK);
  }

+out:
  g_free(config);
-return 0;
+return ret;
  }

  static int vfio_add_capabilities(VFIOPCIDevice *vdev)
@@ -2624,6 +2694,11 @@ static int vfio_initfn(PCIDevice *pdev)
  goto out_teardown;
  }

+if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
+!pdev->exp.aer_cap) {


Hi,

I think we need an error_report here, otherwise the init
will fail without knowing the reason.


Thanks,
Marcel



+goto out_teardown;
+}
+
  /* QEMU emulates all of MSI & MSIX */
  if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
  memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index f004d52..48c1f69 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -15,6 +15,7 @@
  #include "qemu-common.h"
  #include "exec/memory.h"
  #include "hw/pci/pci.h"
+#include "hw/pci/pci_bridge.h"
  #include "hw/vfio/vfio-common.h"
  #include "qemu/event_notifier.h"
  #include "qemu/queue.h"
@@ -127,6 +128,8 @@ typedef struct VFIOPCIDevice {
  #define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
  #define VFIO_FEATURE_ENABLE_REQ_BIT 1
  #define VFIO_FEATURE_ENABLE_REQ (1 << VFIO_FEATURE_ENABLE_REQ_BIT)
+#define VFIO_FEATURE_ENABLE_AER_BIT 2
+#define VFIO_FEATURE_ENABLE_AER (1 << VFIO_FEATURE_ENABLE_AER_BIT)
  int32_t bootindex;
  uint8_t pm_cap;
  bool has_vga;






Re: [Qemu-devel] [PATCHv3 7/9] pseries: Clean up error handling in xics_system_init()

2016-01-18 Thread Thomas Huth
On 18.01.2016 05:24, David Gibson wrote:
> Use the error handling infrastructure to pass an error out from
> try_create_xics() instead of assuming _abort - the caller is in a
> better position to decide on error handling policy.
> 
> Also change the error handling from an _abort to _fatal, since
> this occurs during the initial machine construction and could be triggered
> by bad configuration rather than a program error.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index bb5eaa5..148ca5a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -111,7 +111,7 @@ static XICSState *try_create_xics(const char *type, int 
> nr_servers,
>  }
>  
>  static XICSState *xics_system_init(MachineState *machine,
> -   int nr_servers, int nr_irqs)
> +   int nr_servers, int nr_irqs, Error **errp)
>  {
>  XICSState *icp = NULL;
>  
> @@ -130,7 +130,7 @@ static XICSState *xics_system_init(MachineState *machine,
>  }
>  
>  if (!icp) {
> -icp = try_create_xics(TYPE_XICS, nr_servers, nr_irqs, _abort);
> +icp = try_create_xics(TYPE_XICS, nr_servers, nr_irqs, errp);
>  }
>  
>  return icp;
> @@ -1813,7 +1813,7 @@ static void ppc_spapr_init(MachineState *machine)
>  spapr->icp = xics_system_init(machine,
>DIV_ROUND_UP(max_cpus * 
> kvmppc_smt_threads(),
> smp_threads),
> -  XICS_IRQS);
> +  XICS_IRQS, _fatal);
>  
>  if (smc->dr_lmb_enabled) {
>  spapr_validate_node_memory(machine, _fatal);

Reviewed-by: Thomas Huth 




Re: [Qemu-devel] [PULL v2 00/15] NBD, chardev, SCSI patches for 2015-01-15

2016-01-18 Thread Peter Maydell
On 15 January 2016 at 18:00, Paolo Bonzini  wrote:
> The following changes since commit 5a57acb66f19ee52723aa05b8afbbc41c3e9ec99:
>
>   Merge remote-tracking branch 
> 'remotes/pmaydell/tags/pull-target-arm-20160115' into staging (2016-01-15 
> 15:49:43 +)
>
> are available in the git repository at:
>
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to fefd749ce29837d399a38d6052ca9968fa7352e7:
>
>   qemu-char: do not leak QemuMutex when freeing a character device 
> (2016-01-15 18:58:02 +0100)
>
> 
> * qemu-char logfile facility
> * NBD coroutine based negotiation
> * bugfixes
>
> 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] migration: not send zero page header in ram bulk stage

2016-01-18 Thread Dr. David Alan Gilbert
* Li, Liang Z (liang.z...@intel.com) wrote:
> > * Liang Li (liang.z...@intel.com) wrote:
> > > Now that VM's RAM pages are initialized to zero, (VM's RAM is allcated
> > > with the mmap() and MAP_ANONYMOUS option, or mmap() without
> > MAP_SHARED
> > > if hugetlbfs is used.) so there is no need to send the zero page
> > > header to destination.
> > >
> > > For guest just uses a small portions of RAM, this change can avoid
> > > allocating all the guest's RAM pages in the destination node after
> > > live migration. Another benefit is destination QEMU can save lots of
> > > CPU cycles for zero page checking.
> > 
> > I think this would break postcopy, because the zero pages wouldn't be filled
> > in, so accessing them would still generate a userfault.
> > So you'd have to disable this optimisation if postcopy is enabled (even 
> > during
> > the precopy bulk stage).
> > 
> > Also, are you sure about the benefits?
> >  Destination guests RAM should not be allocated on receiving a zero page;
> > see ram_handle_compressed, it doesn't write to the page if it's zero, so it
> > shouldn't cause an allocate.  I think you're probably correct about the zero
> > page test on the destination, I wonder if we can speed that up.
> > 
> > Dave
> 
> I have test the performance, with a 8G guest just booted, this patch can 
> reduce total live migration time about 10%.
> Unfortunately, Paolo said this patch would break LM in some case 
> 
> For the zero page test on the destination, if the page is really a zero page, 
> test is faster than writing a whole page of zero.

There shouldn't be a write on the destination though; it does a check if
the page is already zero and only if it's none-zero does it do the write;
it should rarely be non-zero.

Dave

> 
> Liang
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [PATCH v2] man: virtfs-proxy-helper: Rework awkward sentence

2016-01-18 Thread Christophe Fergeau
There was a 'capbilities' typo in this man page. This commit
reformulates the sentence the typo was in to make it easier to grasp.
This is based on a suggestion from Eric Blake.

Signed-off-by: Christophe Fergeau 
---
Changes since v1:
- rather than just fixing the typo, reformulate the sentence it's in as
  suggested by Eric.


 fsdev/virtfs-proxy-helper.texi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fsdev/virtfs-proxy-helper.texi b/fsdev/virtfs-proxy-helper.texi
index e60e3b9..a2f4ecc 100644
--- a/fsdev/virtfs-proxy-helper.texi
+++ b/fsdev/virtfs-proxy-helper.texi
@@ -28,8 +28,8 @@ QEMU and proxy helper communicate using this socket. QEMU 
proxy fs
 driver sends filesystem request to proxy helper and receives the
 response from it.
 
-Proxy helper is designed so that it can drop the root privilege with
-retaining capbilities needed for doing filesystem operations only.
+The proxy helper is designed so that it can drop root privileges except
+for the capabilities needed for doing filesystem operations.
 
 @end table
 @c man end
-- 
2.5.0




Re: [Qemu-devel] [PATCH v2] s390: use FILE instead of QEMUFile for creating text file

2016-01-18 Thread Daniel P. Berrange
On Fri, Jan 15, 2016 at 04:29:32PM -0700, Eric Blake wrote:
> On 01/12/2016 05:59 AM, Daniel P. Berrange wrote:
> > The s390 skeys monitor command needs to write out a plain text
> > file. Currently it is using the QEMUFile class for this, but
> > work is ongoing to refactor QEMUFile and eliminate much code
> > related to it. The only feature qemu_fopen() gives over fopen()
> > is support for QEMU FD passing, but this can be achieved with
> > qemu_open() + fdopen() too. Switching to regular stdio FILE
> > APIs avoids the need to sprintf via an intermedia buffer which
> 
> s/intermedia/intermediate/
> 
> > slightly simplifies the code.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  hw/s390x/s390-skeys.c | 26 ++
> >  1 file changed, 14 insertions(+), 12 deletions(-)
> > 
> 
> > @@ -124,8 +120,14 @@ void qmp_dump_skeys(const char *filename, Error **errp)
> >  return;
> >  }
> >  
> > -f = qemu_fopen(filename, "wb");
> > +fd = qemu_open(filename, O_WRONLY|O_CREAT|O_TRUNC, 0600);
> 
> A strict conversion should probably include O_BINARY for mingw (where
> "wb" turns on binary mode).  But maybe we should just make qemu_open()
> itself _always_ provide O_BINARY so that callers don't have to worry
> about it - do we really have a reason to open a file on mingw where we
> want \r\n munged into \n due to text mode?

We're writing text data to the file, so I figured that using binary
mode would be wrong.

> 
> > +if (fd < 0) {
> > +error_setg_file_open(errp, errno, filename);
> > +return;
> > +}
> > +f = fdopen(fd, "wb");
> >  if (!f) {
> > +close(fd);
> >  error_setg_file_open(errp, errno, filename);
> 
> close() may corrupt errno, resulting in a report of the wrong message.
> Swap these two lines.

Ok

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCHv3 3/9] pseries: Clean up hash page table allocation error handling

2016-01-18 Thread Markus Armbruster
Thomas Huth  writes:

> On 18.01.2016 05:24, David Gibson wrote:
>> The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
>> all errors with error_setg(_abort, ...).
>> 
>> But really, the callers are really better placed to decide on the error
>> handling.  So, instead make the functions use the error propagation
>> infrastructure.
>> 
>> In the callers we change to _fatal instead of _abort, since
>> this can be triggered by a bad configuration or kernel error rather than
>> indicating a programming error in qemu.
>> 
>> While we're at it improve the messages themselves a bit, and clean up the
>> indentation a little.
>> 
>> Signed-off-by: David Gibson 
>> ---
>>  hw/ppc/spapr.c | 24 
>>  1 file changed, 16 insertions(+), 8 deletions(-)
>> 
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index b7fd09a..d28e349 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
>>  #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= 
>> tswap64(~HPTE64_V_HPTE_DIRTY))
>>  #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= 
>> tswap64(HPTE64_V_HPTE_DIRTY))
>>  
>> -static void spapr_alloc_htab(sPAPRMachineState *spapr)
>> +static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
>>  {
>>  long shift;
>>  int index;
>> @@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>>   * For HV KVM, host kernel will return -ENOMEM when requested
>>   * HTAB size can't be allocated.
>>   */
>> -error_setg(_abort, "Failed to allocate HTAB of requested 
>> size, try with smaller maxmem");
>> +error_setg_errno(errp, -shift,
>> + "Error allocating KVM hash page table, try smaller 
>> maxmem");
>>  } else if (shift > 0) {
>>  /*
>>   * Kernel handles htab, we don't need to allocate one
>> @@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>>   * but we don't allow booting of such guests.
>>   */
>>  if (shift != spapr->htab_shift) {
>> -error_setg(_abort, "Failed to allocate HTAB of requested 
>> size, try with smaller maxmem");
>> +error_setg(errp,
>> +"Small allocation for KVM hash page table (%ld < %"
>> +PRIu32 "), try smaller maxmem",
>> +shift, spapr->htab_shift);
>
> Maybe you should add an "return" statement here - theoretically you do
> not want to continue with "kvmppc_kern_htab = true" in case of errors.
> (practically this does not happen because errp = error_fatal, but in
> case the caller gets changed, this might introduce subtle errors otherwise)

Good point.

With abort() / exit(), we don't have to worry about recovery.  In
particular, we don't have to revert half-done changes.

Conversions away from abort() / exit() need to consider error recovery.
We have to make sure the function leaves things in a sane state on
error.  This normally means taking an early return, and often means
reverting some state changes.

[...]



[Qemu-devel] defining VIXL_DEBUG?

2016-01-18 Thread Paolo Bonzini
I've gotten four new Coverity reports in libvixl.  All of them are
caused by Coverity assuming that LaneSizeInBitsFromFormat can return 0.
 The actual code in the function is

default: VIXL_UNREACHABLE(); return 0;

so this is obviously a false positive.  Defining VIXL_DEBUG would cause
VIXL_UNREACHABLE() to call abort().  Any opinion about whether/where to
do so?

Thanks,

Paolo



Re: [Qemu-devel] [PATCHv3 4/9] pseries: Clean up error handling in spapr_validate_node_memory()

2016-01-18 Thread Thomas Huth
On 18.01.2016 05:24, David Gibson wrote:
> Use error_setg() and return an error, rather than using an explicit exit().
> 
> Also improve messages, and be more explicit about which constraint failed.
> 
> Signed-off-by: David Gibson 
> Reviewed-by: Bharata B Rao 
> ---
>  hw/ppc/spapr.c | 37 ++---
>  1 file changed, 22 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d28e349..87097bc 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1699,27 +1699,34 @@ static void 
> spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>   * to SPAPR_MEMORY_BLOCK_SIZE(256MB), then refuse to start the guest
>   * since we can't support such unaligned sizes with DRCONF_MEMORY.
>   */
> -static void spapr_validate_node_memory(MachineState *machine)
> +static void spapr_validate_node_memory(MachineState *machine, Error **errp)
>  {
>  int i;
>  
> -if (machine->maxram_size % SPAPR_MEMORY_BLOCK_SIZE ||
> -machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) {
> -error_report("Can't support memory configuration where RAM size "
> - "0x" RAM_ADDR_FMT " or maxmem size "
> - "0x" RAM_ADDR_FMT " isn't aligned to %llu MB",
> - machine->ram_size, machine->maxram_size,
> - SPAPR_MEMORY_BLOCK_SIZE/M_BYTE);
> -exit(EXIT_FAILURE);
> +if (machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) {
> +error_setg(errp, "Memory size 0x" RAM_ADDR_FMT
> +   " is not aligned to %llu MiB",
> +   machine->ram_size,
> +   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
> +return;
> +}
> +
> +if (machine->maxram_size % SPAPR_MEMORY_BLOCK_SIZE) {
> +error_setg(errp, "Maximum memory size 0x" RAM_ADDR_FMT
> +   " is not aligned to %llu MiB",
> +   machine->ram_size,
> +   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
> +return;
>  }
>  
>  for (i = 0; i < nb_numa_nodes; i++) {
>  if (numa_info[i].node_mem % SPAPR_MEMORY_BLOCK_SIZE) {
> -error_report("Can't support memory configuration where memory 
> size"
> - " %" PRIx64 " of node %d isn't aligned to %llu MB",
> - numa_info[i].node_mem, i,
> - SPAPR_MEMORY_BLOCK_SIZE/M_BYTE);
> -exit(EXIT_FAILURE);
> +error_setg(errp,
> +   "Node %d memory size 0x" RAM_ADDR_FMT
> +   " is not aligned to %llu MiB",
> +   i, numa_info[i].node_mem,
> +   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
> +return;
>  }
>  }
>  }
> @@ -1809,7 +1816,7 @@ static void ppc_spapr_init(MachineState *machine)
>XICS_IRQS);
>  
>  if (smc->dr_lmb_enabled) {
> -spapr_validate_node_memory(machine);
> +spapr_validate_node_memory(machine, _fatal);
>  }
>  
>  /* init CPUs */
> 

Reviewed-by: Thomas Huth 




Re: [Qemu-devel] [PATCHv3 6/9] pseries: Clean up error handling in spapr_rtas_register()

2016-01-18 Thread Thomas Huth
On 18.01.2016 05:24, David Gibson wrote:
> The errors detected in this function necessarily indicate bugs in the rest
> of the qemu code, rather than an external or configuration problem.
> 
> So, a simple assert() is more appropriate than any more complex error
> reporting.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr_rtas.c | 12 +++-
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 34b12a3..0be52ae 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -648,17 +648,11 @@ target_ulong spapr_rtas_call(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  
>  void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
>  {
> -if (!((token >= RTAS_TOKEN_BASE) && (token < RTAS_TOKEN_MAX))) {
> -fprintf(stderr, "RTAS invalid token 0x%x\n", token);
> -exit(1);
> -}
> +assert((token >= RTAS_TOKEN_BASE) && (token < RTAS_TOKEN_MAX));

While you're at it, you could also get rid of some superfluous
parentheses in that statement:

assert(token >= RTAS_TOKEN_BASE && token < RTAS_TOKEN_MAX);

>  token -= RTAS_TOKEN_BASE;
> -if (rtas_table[token].name) {
> -fprintf(stderr, "RTAS call \"%s\" is registered already as 0x%x\n",
> -rtas_table[token].name, token);
> -exit(1);
> -}
> +
> +assert(!rtas_table[token].name);
>  
>  rtas_table[token].name = name;
>  rtas_table[token].fn = fn;
> 

Anyway, patch sounds reasonable,
Reviewed-by: Thomas Huth 




[Qemu-devel] [PATCH v6 8/8] block: allow to skip block driver in selection of one for VM state saving

2016-01-18 Thread Denis V. Lunev
Some block drives like PFLASH ones in OVFM setup are not suitable for
VM state saving. Allow option to ban them in this selection.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Markus Armbruster 
CC: Eric Blake 
CC: Juan Quintela 
CC: Amit Shah 
---
Changes from v5:
- added to vmstate option to blkdev JSON

 block.c   |  7 +++
 block/snapshot.c  | 10 ++
 include/block/block_int.h |  3 +++
 qapi/block-core.json  |  5 -
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 01655de..08c1db6 100644
--- a/block.c
+++ b/block.c
@@ -895,6 +895,12 @@ static QemuOptsList bdrv_runtime_opts = {
 .type = QEMU_OPT_BOOL,
 .help = "Ignore flush requests",
 },
+{
+.name = "vmstate",
+.type = QEMU_OPT_BOOL,
+.help = "Allow to to save VM state to this block device",
+.def_value_str = "on",
+},
 { /* end of list */ }
 },
 };
@@ -957,6 +963,7 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild 
*file,
 bs->zero_beyond_eof = true;
 open_flags = bdrv_open_flags(bs, flags);
 bs->read_only = !(open_flags & BDRV_O_RDWR);
+bs->enable_vmstate = qemu_opt_get_bool(opts, "vmstate", true);
 
 if (use_bdrv_whitelist && !bdrv_is_whitelisted(drv, bs->read_only)) {
 error_setg(errp,
diff --git a/block/snapshot.c b/block/snapshot.c
index 237a015..eb73c76 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -498,6 +498,12 @@ BlockDriverState *bdrv_all_find_vmstate_bs(const char 
*device, Error **errp)
device);
 return NULL;
 }
+if (!bs->enable_vmstate) {
+error_setg(errp,
+   "It is not allowed to save VM state to the device '%s'",
+   device);
+return NULL;
+}
 
 ctx = bdrv_get_aio_context(bs);
 
@@ -515,6 +521,10 @@ BlockDriverState *bdrv_all_find_vmstate_bs(const char 
*device, Error **errp)
 while (not_found && (bs = bdrv_next(bs))) {
 ctx = bdrv_get_aio_context(bs);
 
+if (!bs->enable_vmstate) {
+continue;
+}
+
 aio_context_acquire(ctx);
 not_found = !bdrv_can_snapshot(bs);
 aio_context_release(ctx);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 256609d..a30b576 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -438,6 +438,9 @@ struct BlockDriverState {
 /* do we need to tell the quest if we have a volatile write cache? */
 int enable_write_cache;
 
+/* allow to save VM state on snapshot here */
+bool enable_vmstate;
+
 /* the following member gives a name to every node on the bs graph. */
 char node_name[32];
 /* element of the list of named nodes building the graph */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0a915ed..d04453f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1583,6 +1583,8 @@
 #   statistics, in seconds (default: none) (Since 2.5)
 # @detect-zeroes: #optional detect and optimize zero writes (Since 2.1)
 # (default: off)
+# @vmstate: #optional allow to save VM state to this device (Since 2.6)
+#   (default: on)
 #
 # Since: 1.7
 ##
@@ -1599,7 +1601,8 @@
 '*stats-account-invalid': 'bool',
 '*stats-account-failed': 'bool',
 '*stats-intervals': ['int'],
-'*detect-zeroes': 'BlockdevDetectZeroesOptions' } }
+'*detect-zeroes': 'BlockdevDetectZeroesOptions',
+'*vmstate': 'bool' } }
 
 ##
 # @BlockdevOptionsFile
-- 
2.5.0




Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-18 Thread Jason Wang


On 01/18/2016 03:05 PM, Zhang Chen wrote:
>
>
> On 01/06/2016 01:16 PM, Jason Wang wrote:
>>
>> On 01/04/2016 07:17 PM, Zhang Chen wrote:
>>>
>>> On 01/04/2016 05:46 PM, Jason Wang wrote:
 On 01/04/2016 04:16 PM, Zhang Chen wrote:
> On 01/04/2016 01:37 PM, Jason Wang wrote:
>> On 12/31/2015 04:40 PM, Zhang Chen wrote:
>>> On 12/31/2015 10:36 AM, Jason Wang wrote:
 On 12/22/2015 06:42 PM, Zhang Chen wrote:
> From: zhangchen 
>
> Hi,all
>
> This patch add an colo-proxy object, COLO-Proxy is a part of
> COLO,
> based on qemu netfilter and it's a plugin for qemu netfilter. the
> function
> keep Secondary VM connect normal to Primary VM and compare
> packets
> sent by PVM to sent by SVM.if the packet difference,notify
> COLO do
> checkpoint and send all primary packet has queued.
 Thanks for the work. I don't object this method but still not
 convinced
 that qemu is the best place for this.

 As been raised in the past discussion, it's almost impossible to
 cooperate with vhost backends. If we want this to be used in
 production
 environment, need to think of a solution for vhost. There's no
 such
 worry if we decouple this from qemu.

> You can also get the series from:
>
> https://github.com/zhangckid/qemu/tree/colo-v2.2-periodic-mode-with-colo-proxyV2
>
>
>
>
>
> Usage:
>
> primary:
> -netdev tap,id=bn0 -device e1000,netdev=bn0
> -object
> colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=host:port
>
> secondary:
> -netdev tap,id=bn0 -device e1000,netdev=bn0
> -object
> colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=host:port
>
 Have a quick glance at how secondary mode work. What it does is
 just
 forwarding packets between a nic and a socket, qemu socket
 backend did
 exact the same job. You could even use socket in primary node and
 let
 packet compare module talk to both primary and secondary node.
>>> If we use qemu socket backend , the same netdev will used by qemu
>>> socket and
>>> qemu netfilter. this will against qemu net design. and then, when
>>> colo
>>> do failover,
>>> secondary do not have backend to use. that's the real problem.
>> Then, maybe it's time to implement changing the netdev of a nic. The
>> point here is that what secondary mode did is in fact a netdev
>> backend
>> instead of a filter ...
> Currently, you are right. in colo-proxy V2 code, I just compare IP
> packet to
> decide whether to do checkpoint.
> But, in colo-proxy V3 I will compare tcp,icmp,udp packet to decide
> it.
> because that can reduce frequency of checkpoint and improve
> performance. To keep tcp connection well, colo secondary need to
> record
> primary guest's init seq and adjust secondary guest's ack. if colo do
> failover,
> secondary also need do this to old tcp connection. qemu socket
> can't do this job.
 So a question here: is it a must to do things (e.g TCP analysis
 stuffs)
 at secondary? Looks like we could do this at primary node. And I saw
 you're doing packet comparing in primary node, any advantages of doing
 this in primary instead of secondary?
>>> We think must  to do this in secondary, because if colo do
>>> failover,secondary
>>> must continues do TCP analysis stuffs to before tcp connection(if not,
>>> tcp connection
>>> will disconnect in that time), in this time primary already down or
>>> disconnect to
>>> secondary.so we can't make primary do this  TCP analysis stuffs.it can
>>> not ensure
>>> FT function.
>>>
>>> Thanks
>>> zhangchen
>> Makes sense.
>>
>> Thanks
>
> Hi~, Jason.
> No news for a week.
> Can you give me some comments for code.
> Let's make colo-proxy work well.

Sure.

Two main comments/suggestions:

- TCP analysis is missed in current version, maybe you point a git tree
(or another version of RFC) to me for a better understanding of the
design. (Just a skeleton for TCP should be sufficient to discuss).
- I prefer to make the code as reusable as possible. So it's better to
split/decouple the reusable parts from the codes. So a vague idea is:

1) Decouple the packet comparing from the netfilter. You've achieved
this 99% since the work has been done in a thread. Just let the thread
poll sockets directly, then the comparing have the possibility to be
reused by other kinds of dataplane.
2) Implement traffic mirror/redirector as filter.
3) Implement TCP seq rewriting as a filter.

Then, in primary node, you need just a traffic mirror, which did:
- mirror ingress traffic to secondary node
- mirror 

[Qemu-devel] RFC: running the user interface in a thread ...

2016-01-18 Thread Gerd Hoffmann
  Hi folks,

I'm starting to investigate if and how we can move the user interface
code into its own thread instead of running it in the iothread and
therefore avoid blocking the guest in case some UI actions take a little
longer.

opengl and toolkits tend to be bad at multithreading.  So my idea is to
have a single thread dedicated to all the UI + rendering stuff, possibly
let even the virglrenderer run in ui thread context.

I think we have to make that opt-in per user interface, so we can go
forward step by step.

The ui thread will need quite some stuff provided by the mainloop.  Wait
for all kinds of events (from vnc socket, x11 connection, ...).
Probably timers too.  Wait for events from other threads (guest screen
updates).

Suggestions how to tackle that?
Can I reuse the qemu mainloop code outside of the iothread?
Maybe it'll be better to go straight for a glib main loop?
Is it possible to wait for file handle events and posix condition
variables at the same time?

Other notes / hints / suggestions / ideas?

thanks,
  Gerd





Re: [Qemu-devel] [PATCH] Fix corner-case when using VNC+SASL+SPICE

2016-01-18 Thread Gerd Hoffmann
On Fr, 2016-01-15 at 16:33 -0700, Eric Blake wrote:
> > This commit unconditionnally calls spice_server_set_sasl_appname()
> 
> s/unconditionnally/unconditionally/

Fixed up, no need to resend.

cheers,
  Gerd


signature.asc
Description: This is a digitally signed message part


Re: [Qemu-devel] [PATCH v2] Add optionrom compatible with fw_cfg DMA version

2016-01-18 Thread Paolo Bonzini


On 18/01/2016 10:59, Marc Marí wrote:
> This optionrom is based on linuxboot.S.
> 
> Signed-off-by: Marc Marí 

Reviewed-by: Paolo Bonzini 

> ---
>  .gitignore|   4 +
>  hw/i386/pc.c  |   9 +-
>  hw/nvram/fw_cfg.c |   2 +-
>  include/hw/nvram/fw_cfg.h |   1 +
>  pc-bios/optionrom/Makefile|   8 +-
>  pc-bios/optionrom/linuxboot_dma.c | 262 
> ++
>  pc-bios/optionrom/optionrom.h |   4 +-
>  7 files changed, 285 insertions(+), 5 deletions(-)
>  create mode 100644 pc-bios/optionrom/linuxboot_dma.c
> 
> diff --git a/.gitignore b/.gitignore
> index 88a80ff..101d1e0 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -94,6 +94,10 @@
>  /pc-bios/optionrom/linuxboot.bin
>  /pc-bios/optionrom/linuxboot.raw
>  /pc-bios/optionrom/linuxboot.img
> +/pc-bios/optionrom/linuxboot_dma.asm
> +/pc-bios/optionrom/linuxboot_dma.bin
> +/pc-bios/optionrom/linuxboot_dma.raw
> +/pc-bios/optionrom/linuxboot_dma.img
>  /pc-bios/optionrom/multiboot.asm
>  /pc-bios/optionrom/multiboot.bin
>  /pc-bios/optionrom/multiboot.raw
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 459260b..00339fa 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms,
>  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
>  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
>  
> -option_rom[nb_option_roms].name = "linuxboot.bin";
> -option_rom[nb_option_roms].bootindex = 0;
> +if (fw_cfg_dma_enabled(fw_cfg)) {
> +option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> +option_rom[nb_option_roms].bootindex = 0;
> +} else {
> +option_rom[nb_option_roms].name = "linuxboot.bin";
> +option_rom[nb_option_roms].bootindex = 0;
> +}
>  nb_option_roms++;
>  }
>  
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index a1d650d..d0a5753 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -546,7 +546,7 @@ static bool is_version_1(void *opaque, int version_id)
>  return version_id == 1;
>  }
>  
> -static bool fw_cfg_dma_enabled(void *opaque)
> +bool fw_cfg_dma_enabled(void *opaque)
>  {
>  FWCfgState *s = opaque;
>  
> diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
> index 664eaf6..953e58d 100644
> --- a/include/hw/nvram/fw_cfg.h
> +++ b/include/hw/nvram/fw_cfg.h
> @@ -219,6 +219,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
>   hwaddr dma_addr, AddressSpace *dma_as);
>  
>  FWCfgState *fw_cfg_find(void);
> +bool fw_cfg_dma_enabled(void *opaque);
>  
>  #endif /* NO_QEMU_PROTOS */
>  
> diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
> index ce4852a..076f351 100644
> --- a/pc-bios/optionrom/Makefile
> +++ b/pc-bios/optionrom/Makefile
> @@ -2,6 +2,7 @@ all: build-all
>  # Dummy command so that make thinks it has done something
>   @true
>  
> +BULD_DIR=$(CURDIR)
>  include ../../config-host.mak
>  include $(SRC_PATH)/rules.mak
>  
> @@ -11,15 +12,20 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/optionrom)
>  
>  CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin
>  CFLAGS += -I$(SRC_PATH)
> +CFLAGS += -I$(SRC_PATH)/include
>  CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector)
>  CFLAGS += $(CFLAGS_NOPIE)
> +CFLAGS += -m32
>  QEMU_CFLAGS = $(CFLAGS)
>  
> -build-all: multiboot.bin linuxboot.bin kvmvapic.bin
> +build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin
>  
>  # suppress auto-removal of intermediate files
>  .SECONDARY:
>  
> +linuxboot_dma.img: linuxboot_dma.o
> + $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -static -Ttext 
> 0 -e _start -s -o $@ $<,"  Building $(TARGET_DIR)$@")
> +
>  %.img: %.o
>   $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e _start -s -o $@ 
> $<,"  Building $(TARGET_DIR)$@")
>  
> diff --git a/pc-bios/optionrom/linuxboot_dma.c 
> b/pc-bios/optionrom/linuxboot_dma.c
> new file mode 100644
> index 000..9c355fd
> --- /dev/null
> +++ b/pc-bios/optionrom/linuxboot_dma.c
> @@ -0,0 +1,262 @@
> +/*
> + * Linux Boot Option ROM for fw_cfg DMA
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see .
> + *
> + * Copyright (c) 2015 Red Hat Inc.
> + * 

Re: [Qemu-devel] [Qemu-block] [PATCH 2/4] blockdev: Fix 'change' for slot devices

2016-01-18 Thread Alberto Garcia
On Tue 12 Jan 2016 04:47:52 PM CET, Max Reitz wrote:
> 'change' and related operations did not work when used on guest devices
> featuring removable media but no actual tray, because
> blk_dev_is_tray_open() always returned false for them and the
> blockdev-{insert,remove}-medium commands required it to return true.
>
> Fix this by making blockdev-{insert,remove}-medium work on tray-less
> devices. Also, blockdev-{open,close}-tray are now explicitly no-ops when
> invoked on such devices, and blk_dev_change_media_cb() is instead
> called by blockdev-{insert,remove}-medium (for tray-less devices only).
>
> Reported-by: Peter Maydell 
> Cc: qemu-stable 
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH] qdev: free qemu-opts when the QOM path goes away

2016-01-18 Thread Markus Armbruster
Paolo Bonzini  writes:

> On 15/01/2016 18:03, Andreas Färber wrote:
>> Am 05.11.2015 um 13:47 schrieb Markus Armbruster:
>>> Paolo Bonzini  writes:
 On 05/11/2015 13:06, Andreas Färber wrote:
>> 1. Wouldn't it be cleaner to delete dev-opts *before* sending
>>DEVICE_DELETED?  Like this:
>>
>> +++ b/hw/core/qdev.c
>> @@ -1244,6 +1244,9 @@ static void device_unparent(Object *obj)
>>  dev->parent_bus = NULL;
>>  }
>>
>> +qemu_opts_del(dev->opts);
>> +dev->opts = NULL;
>> +
>>  /* Only send event if the device had been completely realized */
>>  if (dev->pending_deleted_event) {
>>  gchar *path = object_get_canonical_path(OBJECT(dev));
>
> To me this proposal sounds sane, but I did not get to tracing the code
> flow here. Paolo, which approach do you prefer and why?

 It doesn't really matter, because the BQL is being held here.

 On the other hand, if the opts are deleted in finalize, there is an
 arbitrary delay because finalize is typically called after a
 synchronize_rcu period.

>>> 2. If the device is a block device, then unplugging it also deletes its
>>>backend (ugly wart we keep for backward compatibility; *not* for
>>>blockdev-add, though).  This backend also has a QemuOpts.  It gets
>>>deleted in drive_info_del().  Just like device_finalize(), it runs
>>>within object_unref(), i.e. after DEVICE_DELETED is sent.  Same race,
>>>different ID, or am I missing something?
>>>
>>>See also https://bugzilla.redhat.com/show_bug.cgi?id=1256044
>
> If we can leave this patch decoupled from block layer and decide soonish
> on the desired approach, I'd be happy to include it in my upcoming
> qom-devices pull.

 I agree with you, the block layer bug is separate.
>>>
>>> Related, but clearly separate.  Mentioning it in the commit message
>>> would be nice, though.
>> 
>> Paolo, pong: I gathered that I should queue the original patch without
>> Markus's proposed change, correct? And do you want to add some sentence
>> to the commit message as requested by Markus?
>
> Yes, thanks:
>
> 
> Note that similar races exist for other QemuOpts, which this patch
> does not attempt to fix.
>
> For example, if the device is a block device, then unplugging it also
> deletes its backend.  However, this backend's get deleted in
> drive_info_del(), which is only called when properties are
> destroyed.  Just like device_finalize(), drive_info_del() is called
> some time after DEVICE_DELETED is sent.  A separate patch series has
> been sent to plug this other bug.  Character devices also have yet to
> be fixed.

With this addition to the commit message:

Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] Using directory as initrd

2016-01-18 Thread KONRAD Frederic

Hi Kasper,

The normal approach is to use git send email to send your patch on the list
instead of attaching the patch to the email so people can do inline comment.

You can use scripts/checkpatch.pl to check the coding style before sending.
(eg: you miss the signed off line and the description in the patch).

And better use the qemu master branch (you seems to be using qemu-2.0?)

Fred



Le 17/01/2016 22:04, Kasper Dupont a écrit :

I would like to use a directory as initrd file without
having to write it to an initrd file each time I have
changed anything in that directory.

I have written code to pipe an initrd directly from cpio
to qemu. Do you have any feedback on the attached patch?






  1   2   3   4   >