Re: [Qemu-devel] [PATCH 1/2] monitor: Reset HMP mon->rs on CHR_EVENT_CLOSED

2014-09-11 Thread Markus Armbruster
Stratos Psomadakis  writes:

> Commit cdaa86a54 ("Add G_IO_HUP handler for socket chardev") exposed a
> bug in the way the HMP monitor handles its input.  When a client closes
> the connection to the monitor, tcp_chr_read() will catch the HUP
> 'signal' and call tcp_chr_disconnect() to close the server-side
> connection too.

Your wording suggests SIGUP, but that's misleading.  Suggest
"tcp_chr_read() will detect the G_IO_HUP condition, and call".

> Due to the fact that monitor reads 1 byte at a time (for
> each tcp_chr_read()), the monitor readline state / buffers can be left
> in an inconsistent state (i.e. a half-finished command).

The state is not really inconsistent, there's just junk left in
rs->cmd_buf[].

>  Thus, without
> calling readline_restart() on mon->rs upon CHR_EVENT_CLOSED, future HMP
> commands will fail.

To make sure I understand you correctly: when you connect again, any
leftover junk is prepended to your input, which messes up your first
command.  Correct?

> Signed-off-by: Stratos Psomadakis 
> Signed-off-by: Dimitris Aragiorgis 
> ---
>  monitor.c |1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/monitor.c b/monitor.c
> index 34cee74..7857300 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -5252,6 +5252,7 @@ static void monitor_event(void *opaque, int event)
>  break;
>  
>  case CHR_EVENT_CLOSED:
> +readline_restart(mon->rs);
>  mon_refcount--;
>  monitor_fdsets_cleanup();
>  break;

Patch looks good to me.



Re: [Qemu-devel] [PATCH 0/2] Monitor-related fixes

2014-09-11 Thread Markus Armbruster
You neglected to cc maintainers.

Stratos Psomadakis  writes:

> Hi,
>
> the first patch fixes an issue with HMP monitors, which was exposed
> with v2.1.0 (commits cdaa86a and 812c10).

Copying Luiz.

> The second one fixes a typo in a helper C program used in
> qemu-iotests.

Copying Kevin and Stefan.

> We think that they should be cherry-picked for the next stable release.

Copying qemu-sta...@nongnu.org.



Re: [Qemu-devel] [PATCH 1/2] vl: Fix the confused logic for '-m' option

2014-09-11 Thread Li Liu


On 2014/9/12 13:58, zhanghailiang wrote:
> It should be valid for the follow configure:
> -m 256,slots=0
> -m 256,maxmem=256M
> -m 256,slots=0,maxmem=256M
> -m 256,slots=x,maxmem=y  where x > 0 and y > 256M
> 
> Fix the confused code logic and use error_report instead of fprintf.
> 
> Printing the maxmem in hex, same with ram_size.
> 
> Signed-off-by: zhanghailiang 
> ---
>  vl.c | 46 +++---
>  1 file changed, 27 insertions(+), 19 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index 9c9acf5..f547405 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -3306,6 +3306,7 @@ int main(int argc, char **argv, char **envp)
>  break;
>  case QEMU_OPTION_m: {
>  uint64_t sz;
> +uint64_t slots;
>  const char *mem_str;
>  const char *maxmem_str, *slots_str;
>  
> @@ -3353,40 +3354,47 @@ int main(int argc, char **argv, char **envp)
>  
>  maxmem_str = qemu_opt_get(opts, "maxmem");
>  slots_str = qemu_opt_get(opts, "slots");
> -if (maxmem_str && slots_str) {
> -uint64_t slots;
> -
> +if (maxmem_str) {
>  sz = qemu_opt_get_size(opts, "maxmem", 0);
> +}
> +if (slots_str) {
> +slots = qemu_opt_get_number(opts, "slots", 0);
> +}
> +if (maxmem_str && slots_str) {
>  if (sz < ram_size) {
> -fprintf(stderr, "qemu: invalid -m option value: 
> maxmem "
> -"(%" PRIu64 ") <= initial memory ("
> +error_report("qemu: invalid -m option value: maxmem "
> +"(%" PRIx64 ") < initial memory ("
>  RAM_ADDR_FMT ")\n", sz, ram_size);

error_report will add a '\n' automatically. Below lines have the same issue.

>  exit(EXIT_FAILURE);
>  }
> -
> -slots = qemu_opt_get_number(opts, "slots", 0);
> -if ((sz > ram_size) && !slots) {
> -fprintf(stderr, "qemu: invalid -m option value: 
> maxmem "
> -"(%" PRIu64 ") more than initial memory ("
> +if (!slots && (sz != ram_size)) {
> +error_report("qemu: invalid -m option value: maxmem "
> +"(%" PRIx64 ") more than initial memory ("
>  RAM_ADDR_FMT ") but no hotplug slots where "
>  "specified\n", sz, ram_size);
>  exit(EXIT_FAILURE);
>  }
> -
> -if ((sz <= ram_size) && slots) {
> -fprintf(stderr, "qemu: invalid -m option value:  %"
> +if (slots && (sz == ram_size)) {
> +error_report("qemu: invalid -m option value:  %"
>  PRIu64 " hotplug slots where specified but "
> -"maxmem (%" PRIu64 ") <= initial memory ("
> +"maxmem (%" PRIx64 ") = initial memory ("
>  RAM_ADDR_FMT ")\n", slots, sz, ram_size);
>  exit(EXIT_FAILURE);
>  }
>  maxram_size = sz;
>  ram_slots = slots;
> -} else if ((!maxmem_str && slots_str) ||
> -   (maxmem_str && !slots_str)) {
> -fprintf(stderr, "qemu: invalid -m option value: missing "
> -"'%s' option\n", slots_str ? "maxmem" : "slots");
> -exit(EXIT_FAILURE);
> +} else if (!maxmem_str && slots_str) {
> +if (slots > 0) {
> +error_report("qemu: invalid -m option value: missing 
> "
> +"'maxmem' option\n");
> +exit(EXIT_FAILURE);
> +}
> +} else if (maxmem_str && !slots_str) {
> +if (sz != ram_size) {
> +error_report("qemu: invalid -m option value: missing 
> "
> +"'slot' option\n");
> +exit(EXIT_FAILURE);
> +}
>  }
>  break;
>  }
> 




Re: [Qemu-devel] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Gerd Hoffmann
 Hi,

> With OVMF, I have EHCI with the high-speed (and working)
> keyboard+mouse, but ONLY ONE UHCI device (UHCI3). UHCI1 UHCI2 do not
> get detected.

Now *that* is really strange, especially as UHCI1 is pci function 0,
without probing that successfully you wouldn't see the other pci
functions (1+2+7 for uhci2+uhci3+ehci) in the same slot in the first
place.

> Obviously, since QEMU was routing the slow keyboard+mouse to UHCI1,
> which for some reason gets masked when OVMF is used, things weren't
> working.

Try '-device usb-mouse,port=5 -device usb-kbd,port=6'.  That links mouse
+kbd to the ports which are routed to uhci3.  Which of course also only
papers over the root cause.  But should do as workaround for the time
being and is also a useful data point.

cheers,
  Gerd





Re: [Qemu-devel] [PATCH v8 00/30] modify boot order of guest, and take effect after rebooting

2014-09-11 Thread Gonglei (Arei)
> Subject: Re: [Qemu-devel] [PATCH v8 00/30] modify boot order of guest, and
> take effect after rebooting
> 
> Am 11.09.2014 um 07:58 schrieb Gonglei (Arei):
> >>> ./qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 -name redhat6.2
> -drive \
> >>> file=/home/win7_32_2U,if=none,id=drive-ide0-0-0 -device
> ide-hd,bus=ide.0,\
> >>> unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive \
> >>> file=/home/rhel-server-7.0-x86_64-dvd.iso,if=none,id=drive-ide0-0-1 \
> -device
> >>> ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=4 \ 
> >>> -vnc
> >>> 0.0.0.0:10 -netdev type=user,id=net0 -device
> >>> virtio-net-pci,netdev=net0,bootindex=3,id=nic1 \ -drive
> >>>
> file=/mnt/sdb/gonglei/image/virtio-win-1.5.3.vfd,if=none,id=drive-fdc0-0-0,for
> >>> mat=raw \ -device isa-fdc,driveA=drive-fdc0-0-0,bootindexA=5,id=floppy1
> >> -qmp
> >>> unix:/tmp/qmp,server,nowait \ -monitor stdio -netdev type=user,id=net1
> >>> -device e1000,netdev=net1,bootindex=2,id=nic \ -boot menu=on -device
> >>> virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32,if=none,\
> >>> id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \ -device
> >>>
> >>
> scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-
> >>> 0-0-0,bootindex=8
> >>> QEMU 2.1.50 monitor - type 'help' for more information
> >>> (qemu) qom-get nic1 bootindex
> >>> 3 (0x3)
> >>> (qemu) qom-set nic1 bootindex 3
> >>> The bootindex 3 has already been used
> >>> (qemu) qom-set nic1 bootindex 0
> >>> (qemu) qom-set floppy1 bootindexA 3
> >>> (qemu) system_reset
> >>> (qemu) qom-get nic1 bootindex
> >>> 0 (0x0)
> >>> (qemu) qom-get scsi0-0-0-0 bootindex
> >>> 8 (0x8)
> >>> (qemu) qom-set scsi0-0-0-0 bootindex 0
> >>> The bootindex 0 has already been used
> >>> (qemu) qom-set nic1 bootindex -1
> >>> (qemu) qom-set scsi0-0-0-0 bootindex 0
> >>> (qemu) qom-get scsi0-0-0-0 bootindex
> >>> 0 (0x0)
> >>> (qemu)
> >>>
> >>
> >> Hmm..., seems we also need something like this:
> >> (qemu) qom-get bootindex
> >> dev0 bootindex 0
> >> dev1 bootindex 1
> >> dev2 bootindex 2
> >> ...
> > I don't think so. Qom-get interface is ready-to-wear, we must
> > provide both QOM path and QOM property name. This interface
> > is not added by me. Thanks. :)
> 
> Yeah, I don't think we need wildcards for qom-get here. I wouldn't
> oppose a new HMP command though if someone really sees a need.
> 
> > BTW, you can look at with Andreas's patch series:
> > http://thread.gmane.org/gmane.comp.emulators.qemu/271513
> >
> > which haven't applied in mater tree.
> 
> Sigh, many things on the to-do list... :(
> 

Looking forward to your news about this series. :)

Best regards,
-Gonglei

> Regards,
> Andreas
> 
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


Re: [Qemu-devel] New requirement for getting block layer patches merged

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

>> EOF
>> ---
>> If you have feedback or questions, let us know.  The process can be
>> tweaked as time goes on so we can continue to improve.
>
> Great mail.

Yup.  Let's see how it works out.

> Now we need a wiki entry describing the process.
> Also we need something reminding who is the maintainer of the current week.

Usually obvious from the "applied to my tree" messages.



Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Markus Armbruster
Eric Blake  writes:

> On 09/11/2014 01:34 PM, Benoît Canet wrote:
>> The Thursday 11 Sep 2014 à 21:12:44 (+0200), Markus Armbruster wrote :
>>> Benoît Canet  writes:
>>>
> +   blk_bs(blk_by_legacy_dinfo(dinfo)));

 This seems to be a fairly common pattern: "blk_bs(blk_by_legacy_dinfo())".
 How about a helper function ?
>>>
>>> Yes, except the pattern is going to evaporate in patch 14 :)
>> 
>> haha
>
> Still, worth mentioning this fact in the commit message, rather than
> making reviewers guess at it (that is, add a sentence something like:
> This commit mechanically introduces long lines via repetitive pattern
> replacement that will be shortened in a later patch

I can do that.



Re: [Qemu-devel] [RFC PATCH 04/17] COLO info: use colo info to tell migration target colo is enabled

2014-09-11 Thread Hongyang Yang



在 08/01/2014 10:43 PM, Dr. David Alan Gilbert 写道:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:

migrate colo info to migration target to tell the target colo is
enabled.


If I understand this correctly this means that you send a 'colo info' device
information for migrations that don't have COLO enabled; that's bad because
it breaks migration unless the destination has it; I guess it's OK if you
were to guard it with a thing so it didn't do it for old machine-types.

You could use the QEMU_VM_COMMAND sections I've created for postcopy;
( http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00889.html ) and
add a QEMU_VM_CMD_COLO to indicate you want the destination to become an SVM,
   then check the capability near the start of migration and send the command.


Thank you for the reference, I've read part of your Postcopy patches, but
haven't into detailed implementation. I will use QEMUSizedBuffer/QEMUFile in
next version. For QEMU_VM_COMMAND sections, can you separate it out so that I
can make use of it? Do you have a public git tree or something?



Or perhaps there's a way to add the colo-info device on the command line so it's
not always there.

Dave


Signed-off-by: Yang Hongyang 
---
  Makefile.objs  |  1 +
  include/migration/migration-colo.h |  3 ++
  migration-colo-comm.c  | 68 ++
  vl.c   |  4 +++
  4 files changed, 76 insertions(+)
  create mode 100644 migration-colo-comm.c

diff --git a/Makefile.objs b/Makefile.objs
index cab5824..1836a68 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -50,6 +50,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
  common-obj-$(CONFIG_LINUX) += fsdev/

  common-obj-y += migration.o migration-tcp.o
+common-obj-y += migration-colo-comm.o
  common-obj-$(CONFIG_COLO) += migration-colo.o
  common-obj-y += vmstate.o
  common-obj-y += qemu-file.o
diff --git a/include/migration/migration-colo.h 
b/include/migration/migration-colo.h
index 35b384c..e3735d8 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -12,6 +12,9 @@
  #define QEMU_MIGRATION_COLO_H

  #include "qemu-common.h"
+#include "migration/migration.h"
+
+void colo_info_mig_init(void);

  bool colo_supported(void);

diff --git a/migration-colo-comm.c b/migration-colo-comm.c
new file mode 100644
index 000..ccbc246
--- /dev/null
+++ b/migration-colo-comm.c
@@ -0,0 +1,68 @@
+/*
+ *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ *  (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ *  Copyright (C) 2014 FUJITSU LIMITED
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#include 
+
+#define DEBUG_COLO
+
+#ifdef DEBUG_COLO
+#define DPRINTF(fmt, ...) \
+do { fprintf(stdout, "COLO: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+do { } while (0)
+#endif
+
+static bool colo_requested;
+
+/* save */
+
+static bool migrate_use_colo(void)
+{
+MigrationState *s = migrate_get_current();
+return s->enabled_capabilities[MIGRATION_CAPABILITY_COLO];
+}
+
+static void colo_info_save(QEMUFile *f, void *opaque)
+{
+qemu_put_byte(f, migrate_use_colo());
+}
+
+/* restore */
+
+static int colo_info_load(QEMUFile *f, void *opaque, int version_id)
+{
+int value = qemu_get_byte(f);
+
+if (value && !colo_supported()) {
+fprintf(stderr, "COLO is not supported\n");
+return -EINVAL;
+}
+
+if (value && !colo_requested) {
+DPRINTF("COLO requested!\n");
+}
+
+colo_requested = value;
+
+return 0;
+}
+
+static SaveVMHandlers savevm_colo_info_handlers = {
+.save_state = colo_info_save,
+.load_state = colo_info_load,
+};
+
+void colo_info_mig_init(void)
+{
+register_savevm_live(NULL, "colo info", -1, 1,
+ &savevm_colo_info_handlers, NULL);
+}
diff --git a/vl.c b/vl.c
index fe451aa..1a282d8 100644
--- a/vl.c
+++ b/vl.c
@@ -89,6 +89,7 @@ int main(int argc, char **argv)
  #include "sysemu/dma.h"
  #include "audio/audio.h"
  #include "migration/migration.h"
+#include "migration/migration-colo.h"
  #include "sysemu/kvm.h"
  #include "qapi/qmp/qjson.h"
  #include "qemu/option.h"
@@ -4339,6 +4340,9 @@ int main(int argc, char **argv, char **envp)

  blk_mig_init();
  ram_mig_init();
+if (colo_supported()) {
+colo_info_mig_init();
+}

  /* open the virtual block devices */
  if (snapshot)
--
1.9.1


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
.



--
Thanks,
Yang.



Re: [Qemu-devel] ui causes latest master build failure on rhel6

2014-09-11 Thread Gerd Hoffmann
> > pixman-0.21.8 is the oldest release which has PIXMAN_TYPE_RGBA.
> > We don't check for a minimum version.
> > RHEL-6 is probably older.

Uhm, well, it isn't (any more):

[root@rhel6 ~]# rpm -q pixman
pixman-0.26.2-5.1.el6_5.x86_64

Seems to be rebased somewhen (used to be 0.16 for RHEL-6.0 IIRC).
Looks like you should simply update your RHEL-6 machine.

> > Try 'configure --without-system-pixman' as workaround (after checking
> > out the pixman submodule).
> 
> I guess but is it a must?
> We used to have this within an ifdef:
> +#if PIXMAN_VERSION >= PIXMAN_VERSION_ENCODE(0, 21, 8)
> +type = PIXMAN_TYPE_RGBA;
> +#endif
> +}

That ifdef is still there, we got new PIXMAN_TYPE_RGBA references
though.

> can't we keep system pixman working a bit longer?

I'd much prefer to raise the minimum pixman version and ditch the
#ifdefs.  The common kvm case will work fine without RGBA, you may run
into trouble though when emulating bigendian guests on a little endian
host (or visa versa), because pixman will have to deal with uncommon
pixel formats then.

cheers,
  Gerd





Re: [Qemu-devel] [RFC PATCH 16/17] COLO ram cache: implement colo ram cache on slaver

2014-09-11 Thread Hongyang Yang



在 08/01/2014 11:10 PM, Dr. David Alan Gilbert 写道:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:

The ram cache was initially the same as PVM's memory. At
checkpoint, we cache the dirty memory of PVM into ram cache
(so that ram cache always the same as PVM's memory at every
checkpoint), flush cached memory to SVM after we received
all PVM dirty memory(only needed to flush memory that was
both dirty on PVM and SVM since last checkpoint).


(Typo: 'r' on the end of the title)

I think I understand the need for the cache, to be able to restore pages
that the SVM has modified that the PVM hadn't; however, if I understand
the change here, (to host_from_stream_offset) the SVM will load the
snapshot into the ram_cache rather than directly into host memory - why
is this necessary?  If the SVMs CPU is stopped at this point couldn't
it load snapshot pages directly into host memory, clearing pages in the SVMs
bitmap, so that the only pages that then get copied in flush_cache are
the pages that the SVM modified but the PVM *didn't* include in the snapshot?
I can see that you would need to do it the way you've done it if the
snapshot-load could fail (at the sametime the PVM failed) and thus the old SVM
state would be the surviving state, but how could it fail at this point
given the whole stream is in the colo-buffer?


I can see your confusion. Yes, you are right, we can do as what you said, but
at last, we still need to copy the dirty pages into ram cache as well (because
the ram cache is a snapshot and we need to keep this updated). So the question
is whether we load the dirty pages into snapshot first or into host memory
first. I think both methods can work and make no difference...





+static void ram_flush_cache(void);
  static int ram_load(QEMUFile *f, void *opaque, int version_id)
  {
  ram_addr_t addr;
  int flags, ret = 0;
  static uint64_t seq_iter;
+bool need_flush = false;


Probably better as 'ram_cache_needs_flush'

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
.



--
Thanks,
Yang.



Re: [Qemu-devel] [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol

2014-09-11 Thread Hongyang Yang



在 08/01/2014 11:03 PM, Dr. David Alan Gilbert 写道:

* Yang Hongyang (yan...@cn.fujitsu.com) wrote:

implement colo checkpoint protocol.

Checkpoint synchronzing points.

   Primary Secondary
   NEW @
   Suspend
   SUSPENDED   @
   Suspend&Save state
   SEND@
   Send state  Receive state
   RECEIVED@
   Flush network   Load state
   LOADED  @
   Resume  Resume

   Start Comparing
NOTE:
  1) '@' who sends the message
  2) Every sync-point is synchronized by two sides with only
 one handshake(single direction) for low-latency.
 If more strict synchronization is required, a opposite direction
 sync-point should be added.
  3) Since sync-points are single direction, the remote side may
 go forward a lot when this side just receives the sync-point.

Signed-off-by: Yang Hongyang 
---
  migration-colo.c | 268 +--
  1 file changed, 262 insertions(+), 6 deletions(-)

diff --git a/migration-colo.c b/migration-colo.c
index 2699e77..a708872 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -24,6 +24,41 @@
   */
  #define CHKPOINT_TIMER 1

+enum {
+COLO_READY = 0x46,
+
+/*
+ * Checkpoint synchronzing points.
+ *
+ *  Primary Secondary
+ *  NEW @
+ *  Suspend
+ *  SUSPENDED   @
+ *  Suspend&Save state
+ *  SEND@
+ *  Send state  Receive state
+ *  RECEIVED@
+ *  Flush network   Load state
+ *  LOADED  @
+ *  Resume  Resume
+ *
+ *  Start Comparing
+ * NOTE:
+ * 1) '@' who sends the message
+ * 2) Every sync-point is synchronized by two sides with only
+ *one handshake(single direction) for low-latency.
+ *If more strict synchronization is required, a opposite direction
+ *sync-point should be added.
+ * 3) Since sync-points are single direction, the remote side may
+ *go forward a lot when this side just receives the sync-point.
+ */
+COLO_CHECKPOINT_NEW,
+COLO_CHECKPOINT_SUSPENDED,
+COLO_CHECKPOINT_SEND,
+COLO_CHECKPOINT_RECEIVED,
+COLO_CHECKPOINT_LOADED,
+};
+
  static QEMUBH *colo_bh;

  bool colo_supported(void)
@@ -185,30 +220,161 @@ static const QEMUFileOps colo_read_ops = {
  .close = colo_close,
  };

+/* colo checkpoint control helper */
+static bool is_master(void);
+static bool is_slave(void);
+
+static void ctl_error_handler(void *opaque, int err)
+{
+if (is_slave()) {
+/* TODO: determine whether we need to failover */
+/* FIXME: we will not failover currently, just kill slave */
+error_report("error: colo transmission failed!\n");
+exit(1);
+} else if (is_master()) {
+/* Master still alive, do not failover */
+error_report("error: colo transmission failed!\n");
+return;
+} else {
+error_report("COLO: Unexpected error happend!\n");
+exit(EXIT_FAILURE);
+}
+}
+
+static int colo_ctl_put(QEMUFile *f, uint64_t request)
+{
+int ret = 0;
+
+qemu_put_be64(f, request);
+qemu_fflush(f);
+
+ret = qemu_file_get_error(f);
+if (ret < 0) {
+ctl_error_handler(f, ret);
+return 1;
+}
+
+return ret;
+}
+
+static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
+{
+int ret = 0;
+uint64_t temp;
+
+temp = qemu_get_be64(f);
+
+ret = qemu_file_get_error(f);
+if (ret < 0) {
+ctl_error_handler(f, ret);
+return 1;
+}
+
+*value = temp;
+return 0;
+}
+
+static int colo_ctl_get(QEMUFile *f, uint64_t require)
+{
+int ret;
+uint64_t value;
+
+ret = colo_ctl_get_value(f, &value);
+if (ret) {
+return ret;
+}
+
+if (value != require) {
+error_report("unexpected state received!\n");


I find it useful to print the expected/received state to
be able to figure out what went wrong.


Good idea!




+exit(1);
+}
+
+return ret;
+}
+
  /* save */

-static __attribute__((unused)) bool is_master(void)
+static bool is_master(void)
  {
  MigrationState *s = migrate_get_current();
  return (s->state == MIG_STATE_COLO);
  }

+static int do_colo_transaction(MigrationState *s, QEMUFile *control,
+   QEMUFile *trans)
+{
+int ret;
+
+ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
+if (ret) {
+goto out;
+}
+
+ret = colo_ctl_get(control, COLO_C

[Qemu-devel] [PATCH 1/2] vl: Fix the confused logic for '-m' option

2014-09-11 Thread zhanghailiang
It should be valid for the follow configure:
-m 256,slots=0
-m 256,maxmem=256M
-m 256,slots=0,maxmem=256M
-m 256,slots=x,maxmem=y  where x > 0 and y > 256M

Fix the confused code logic and use error_report instead of fprintf.

Printing the maxmem in hex, same with ram_size.

Signed-off-by: zhanghailiang 
---
 vl.c | 46 +++---
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/vl.c b/vl.c
index 9c9acf5..f547405 100644
--- a/vl.c
+++ b/vl.c
@@ -3306,6 +3306,7 @@ int main(int argc, char **argv, char **envp)
 break;
 case QEMU_OPTION_m: {
 uint64_t sz;
+uint64_t slots;
 const char *mem_str;
 const char *maxmem_str, *slots_str;
 
@@ -3353,40 +3354,47 @@ int main(int argc, char **argv, char **envp)
 
 maxmem_str = qemu_opt_get(opts, "maxmem");
 slots_str = qemu_opt_get(opts, "slots");
-if (maxmem_str && slots_str) {
-uint64_t slots;
-
+if (maxmem_str) {
 sz = qemu_opt_get_size(opts, "maxmem", 0);
+}
+if (slots_str) {
+slots = qemu_opt_get_number(opts, "slots", 0);
+}
+if (maxmem_str && slots_str) {
 if (sz < ram_size) {
-fprintf(stderr, "qemu: invalid -m option value: maxmem 
"
-"(%" PRIu64 ") <= initial memory ("
+error_report("qemu: invalid -m option value: maxmem "
+"(%" PRIx64 ") < initial memory ("
 RAM_ADDR_FMT ")\n", sz, ram_size);
 exit(EXIT_FAILURE);
 }
-
-slots = qemu_opt_get_number(opts, "slots", 0);
-if ((sz > ram_size) && !slots) {
-fprintf(stderr, "qemu: invalid -m option value: maxmem 
"
-"(%" PRIu64 ") more than initial memory ("
+if (!slots && (sz != ram_size)) {
+error_report("qemu: invalid -m option value: maxmem "
+"(%" PRIx64 ") more than initial memory ("
 RAM_ADDR_FMT ") but no hotplug slots where "
 "specified\n", sz, ram_size);
 exit(EXIT_FAILURE);
 }
-
-if ((sz <= ram_size) && slots) {
-fprintf(stderr, "qemu: invalid -m option value:  %"
+if (slots && (sz == ram_size)) {
+error_report("qemu: invalid -m option value:  %"
 PRIu64 " hotplug slots where specified but "
-"maxmem (%" PRIu64 ") <= initial memory ("
+"maxmem (%" PRIx64 ") = initial memory ("
 RAM_ADDR_FMT ")\n", slots, sz, ram_size);
 exit(EXIT_FAILURE);
 }
 maxram_size = sz;
 ram_slots = slots;
-} else if ((!maxmem_str && slots_str) ||
-   (maxmem_str && !slots_str)) {
-fprintf(stderr, "qemu: invalid -m option value: missing "
-"'%s' option\n", slots_str ? "maxmem" : "slots");
-exit(EXIT_FAILURE);
+} else if (!maxmem_str && slots_str) {
+if (slots > 0) {
+error_report("qemu: invalid -m option value: missing "
+"'maxmem' option\n");
+exit(EXIT_FAILURE);
+}
+} else if (maxmem_str && !slots_str) {
+if (sz != ram_size) {
+error_report("qemu: invalid -m option value: missing "
+"'slot' option\n");
+exit(EXIT_FAILURE);
+}
 }
 break;
 }
-- 
1.7.12.4





[Qemu-devel] [PATCH 2/2] rdma: Fix incorrect description in comments

2014-09-11 Thread zhanghailiang
Since we have supported memory hotplug, VM's ram include pc.ram
and hotplug-memory.

Fix the confused description for rdma migration: pc.ram -> VM's ram

Signed-off-by: zhanghailiang 
---
 docs/rdma.txt| 6 +++---
 migration-rdma.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/rdma.txt b/docs/rdma.txt
index 1f5d9e9..2bdd0a5 100644
--- a/docs/rdma.txt
+++ b/docs/rdma.txt
@@ -18,7 +18,7 @@ Contents:
 * RDMA Migration Protocol Description
 * Versioning and Capabilities
 * QEMUFileRDMA Interface
-* Migration of pc.ram
+* Migration of VM's ram
 * Error handling
 * TODO
 
@@ -149,7 +149,7 @@ The only difference between a SEND message and an RDMA
 message is that SEND messages cause notifications
 to be posted to the completion queue (CQ) on the
 infiniband receiver side, whereas RDMA messages (used
-for pc.ram) do not (to behave like an actual DMA).
+for VM's ram) do not (to behave like an actual DMA).
 
 Messages in infiniband require two things:
 
@@ -355,7 +355,7 @@ If the buffer is empty, then we follow the same steps
 listed above and issue another "QEMU File" protocol command,
 asking for a new SEND message to re-fill the buffer.
 
-Migration of pc.ram:
+Migration of VM's ram:
 
 
 At the beginning of the migration, (migration-rdma.c),
diff --git a/migration-rdma.c b/migration-rdma.c
index d99812c..b32dbdf 100644
--- a/migration-rdma.c
+++ b/migration-rdma.c
@@ -2523,7 +2523,7 @@ static void *qemu_rdma_data_init(const char *host_port, 
Error **errp)
 /*
  * QEMUFile interface to the control channel.
  * SEND messages for control only.
- * pc.ram is handled with regular RDMA messages.
+ * VM's ram is handled with regular RDMA messages.
  */
 static int qemu_rdma_put_buffer(void *opaque, const uint8_t *buf,
 int64_t pos, int size)
@@ -2539,7 +2539,7 @@ static int qemu_rdma_put_buffer(void *opaque, const 
uint8_t *buf,
 
 /*
  * Push out any writes that
- * we're queued up for pc.ram.
+ * we're queued up for VM's ram.
  */
 ret = qemu_rdma_write_flush(f, rdma);
 if (ret < 0) {
-- 
1.7.12.4





Re: [Qemu-devel] [RESEND PATCH v3 5/8] pc-dimm: Add pc_dimm_unrealize() for memory hot unplug support.

2014-09-11 Thread zhanghailiang

On 2014/9/4 21:28, Igor Mammedov wrote:

On Wed, 27 Aug 2014 16:08:36 +0800
Tang Chen  wrote:


From: Hu Tao

Implement unrealize function for pc-dimm device. It delete subregion from

s/delete/removes/


hotplug region, and delete ram address range from guest ram list.

Signed-off-by: Hu Tao
Signed-off-by: Tang Chen
---
  hw/mem/pc-dimm.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 20fe0dc..34109a2 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -270,12 +270,22 @@ static MemoryRegion 
*pc_dimm_get_memory_region(PCDIMMDevice *dimm)
  return host_memory_backend_get_memory(dimm->hostmem,&error_abort);
  }

+static void pc_dimm_unrealize(DeviceState *dev, Error **errp)
+{
+PCDIMMDevice *dimm = PC_DIMM(dev);
+MemoryRegion *mr = pc_dimm_get_memory_region(dimm);
+
+memory_region_del_subregion(mr->container, mr);

usually MemoryRegion is treated as opaque and it's fields
accessed via memory_region_foo() helpers.


+vmstate_unregister_ram(mr, dev);

these 2 lines look like a job for PCMachine which did original mapping/vmstate 
registration



Actually, this patch also fix the bug *when hotplug memory failing in
the place where after pc_dimm_plug but before the end of device_set_realized,
it does not clear the work done by pc_dimm_plug*.

For there is no callback like pc_dimm_plug_fail_cb for us to call when fail,
Maybe pc_dimm_unrealize is the only place where we can do the clear work...

Thanks,
zhanghailiang


+}
+
  static void pc_dimm_class_init(ObjectClass *oc, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(oc);
  PCDIMMDeviceClass *ddc = PC_DIMM_CLASS(oc);

  dc->realize = pc_dimm_realize;
+dc->unrealize = pc_dimm_unrealize;
  dc->props = pc_dimm_properties;

  ddc->get_memory_region = pc_dimm_get_memory_region;











Re: [Qemu-devel] [PATCH] Skip vfio mmap bar regions during memory dump

2014-09-11 Thread Alex Williamson
On Fri, 2014-09-12 at 10:12 +0530, Nikunj A Dadhania wrote:
> The PCI MMIO might be disabled or the device in the reset state.
> Make sure we do not dump these memory regions.
> 
> Signed-off-by: Nikunj A Dadhania 
> ---

If you don't want to dump a memory region then add the ability to set a
no-dump flag on the region and have vfio call it, don't make something
as specific as a vfio mmap region.  Thanks,

Alex

>  hw/misc/vfio.c|  2 +-
>  include/exec/memory.h | 27 +++
>  memory.c  | 16 
>  memory_mapping.c  |  3 ++-
>  4 files changed, 46 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index d69bb29..fd6dbe9 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -2721,7 +2721,7 @@ static int vfio_mmap_bar(VFIODevice *vdev, VFIOBAR *bar,
>  goto empty_region;
>  }
>  
> -memory_region_init_ram_ptr(submem, OBJECT(vdev), name, size, *map);
> +memory_region_init_vfio_mmap(submem, OBJECT(vdev), name, size, *map);
>  } else {
>  empty_region:
>  /* Create a zero sized sub-region to make cleanup easy. */
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index fc6e93d..e184df6 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -144,6 +144,7 @@ struct MemoryRegion {
>  bool terminates;
>  bool romd_mode;
>  bool ram;
> +bool vfio_mmap;
>  bool readonly; /* For RAM regions */
>  bool enabled;
>  bool rom_device;
> @@ -329,6 +330,23 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
>  void *ptr);
>  
>  /**
> + * memory_region_init_vfio_mmap:  Initialize VFIO mmap memory region from a
> + *user-provided pointer.  Accesses into the
> + *region will modify memory directly.
> + *
> + * @mr: the #MemoryRegion to be initialized.
> + * @owner: the object that tracks the region's reference count
> + * @name: the name of the region.
> + * @size: size of the region.
> + * @ptr: memory to be mapped; must contain at least @size bytes.
> + */
> +void memory_region_init_vfio_mmap(MemoryRegion *mr,
> +  struct Object *owner,
> +  const char *name,
> +  uint64_t size,
> +  void *ptr);
> +
> +/**
>   * memory_region_init_alias: Initialize a memory region that aliases all or a
>   *   part of another memory region.
>   *
> @@ -434,6 +452,15 @@ uint64_t memory_region_size(MemoryRegion *mr);
>  bool memory_region_is_ram(MemoryRegion *mr);
>  
>  /**
> + * memory_region_is_vfio_mmap: check whether a memory region is VFIO mmap
> + *
> + * Returns %true is a memory region is VFIO.
> + *
> + * @mr: the memory region being queried
> + */
> +bool memory_region_is_vfio_mmap(MemoryRegion *mr);
> +
> +/**
>   * memory_region_is_romd: check whether a memory region is in ROMD mode
>   *
>   * Returns %true if a memory region is a ROM device and currently set to 
> allow
> diff --git a/memory.c b/memory.c
> index 664d3e6..63eadaa 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -847,6 +847,7 @@ void memory_region_init(MemoryRegion *mr,
>  mr->enabled = true;
>  mr->terminates = false;
>  mr->ram = false;
> +mr->vfio_mmap = false;
>  mr->romd_mode = true;
>  mr->readonly = false;
>  mr->rom_device = false;
> @@ -1032,6 +1033,16 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
>  mr->ram_addr = qemu_ram_alloc_from_ptr(size, ptr, mr);
>  }
>  
> +void memory_region_init_vfio_mmap(MemoryRegion *mr,
> +  Object *owner,
> +  const char *name,
> +  uint64_t size,
> +  void *ptr)
> +{
> +memory_region_init_ram_ptr(mr, owner, name, size, ptr);
> +mr->vfio_mmap = true;
> +}
> +
>  void memory_region_init_alias(MemoryRegion *mr,
>Object *owner,
>const char *name,
> @@ -1129,6 +1140,11 @@ bool memory_region_is_ram(MemoryRegion *mr)
>  return mr->ram;
>  }
>  
> +bool memory_region_is_vfio_mmap(MemoryRegion *mr)
> +{
> +return mr->vfio_mmap;
> +}
> +
>  bool memory_region_is_logging(MemoryRegion *mr)
>  {
>  return mr->dirty_log_mask;
> diff --git a/memory_mapping.c b/memory_mapping.c
> index 87a6ed5..64ef46d 100644
> --- a/memory_mapping.c
> +++ b/memory_mapping.c
> @@ -203,7 +203,8 @@ static void guest_phys_blocks_region_add(MemoryListener 
> *listener,
>  GuestPhysBlock *predecessor;
>  
>  /* we only care about RAM */
> -if (!memory_region_is_ram(section->mr)) {
> +if (!memory_region_is_ram(section->mr) ||
> +memory_region_is_vfio_mmap(section->mr)) {
>  return;
>  }
>  






[Qemu-devel] [PATCH] Skip vfio mmap bar regions during memory dump

2014-09-11 Thread Nikunj A Dadhania
The PCI MMIO might be disabled or the device in the reset state.
Make sure we do not dump these memory regions.

Signed-off-by: Nikunj A Dadhania 
---
 hw/misc/vfio.c|  2 +-
 include/exec/memory.h | 27 +++
 memory.c  | 16 
 memory_mapping.c  |  3 ++-
 4 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index d69bb29..fd6dbe9 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -2721,7 +2721,7 @@ static int vfio_mmap_bar(VFIODevice *vdev, VFIOBAR *bar,
 goto empty_region;
 }
 
-memory_region_init_ram_ptr(submem, OBJECT(vdev), name, size, *map);
+memory_region_init_vfio_mmap(submem, OBJECT(vdev), name, size, *map);
 } else {
 empty_region:
 /* Create a zero sized sub-region to make cleanup easy. */
diff --git a/include/exec/memory.h b/include/exec/memory.h
index fc6e93d..e184df6 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -144,6 +144,7 @@ struct MemoryRegion {
 bool terminates;
 bool romd_mode;
 bool ram;
+bool vfio_mmap;
 bool readonly; /* For RAM regions */
 bool enabled;
 bool rom_device;
@@ -329,6 +330,23 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
 void *ptr);
 
 /**
+ * memory_region_init_vfio_mmap:  Initialize VFIO mmap memory region from a
+ *user-provided pointer.  Accesses into the
+ *region will modify memory directly.
+ *
+ * @mr: the #MemoryRegion to be initialized.
+ * @owner: the object that tracks the region's reference count
+ * @name: the name of the region.
+ * @size: size of the region.
+ * @ptr: memory to be mapped; must contain at least @size bytes.
+ */
+void memory_region_init_vfio_mmap(MemoryRegion *mr,
+  struct Object *owner,
+  const char *name,
+  uint64_t size,
+  void *ptr);
+
+/**
  * memory_region_init_alias: Initialize a memory region that aliases all or a
  *   part of another memory region.
  *
@@ -434,6 +452,15 @@ uint64_t memory_region_size(MemoryRegion *mr);
 bool memory_region_is_ram(MemoryRegion *mr);
 
 /**
+ * memory_region_is_vfio_mmap: check whether a memory region is VFIO mmap
+ *
+ * Returns %true is a memory region is VFIO.
+ *
+ * @mr: the memory region being queried
+ */
+bool memory_region_is_vfio_mmap(MemoryRegion *mr);
+
+/**
  * memory_region_is_romd: check whether a memory region is in ROMD mode
  *
  * Returns %true if a memory region is a ROM device and currently set to allow
diff --git a/memory.c b/memory.c
index 664d3e6..63eadaa 100644
--- a/memory.c
+++ b/memory.c
@@ -847,6 +847,7 @@ void memory_region_init(MemoryRegion *mr,
 mr->enabled = true;
 mr->terminates = false;
 mr->ram = false;
+mr->vfio_mmap = false;
 mr->romd_mode = true;
 mr->readonly = false;
 mr->rom_device = false;
@@ -1032,6 +1033,16 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
 mr->ram_addr = qemu_ram_alloc_from_ptr(size, ptr, mr);
 }
 
+void memory_region_init_vfio_mmap(MemoryRegion *mr,
+  Object *owner,
+  const char *name,
+  uint64_t size,
+  void *ptr)
+{
+memory_region_init_ram_ptr(mr, owner, name, size, ptr);
+mr->vfio_mmap = true;
+}
+
 void memory_region_init_alias(MemoryRegion *mr,
   Object *owner,
   const char *name,
@@ -1129,6 +1140,11 @@ bool memory_region_is_ram(MemoryRegion *mr)
 return mr->ram;
 }
 
+bool memory_region_is_vfio_mmap(MemoryRegion *mr)
+{
+return mr->vfio_mmap;
+}
+
 bool memory_region_is_logging(MemoryRegion *mr)
 {
 return mr->dirty_log_mask;
diff --git a/memory_mapping.c b/memory_mapping.c
index 87a6ed5..64ef46d 100644
--- a/memory_mapping.c
+++ b/memory_mapping.c
@@ -203,7 +203,8 @@ static void guest_phys_blocks_region_add(MemoryListener 
*listener,
 GuestPhysBlock *predecessor;
 
 /* we only care about RAM */
-if (!memory_region_is_ram(section->mr)) {
+if (!memory_region_is_ram(section->mr) ||
+memory_region_is_vfio_mmap(section->mr)) {
 return;
 }
 
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 4/3] qemu-iotests: Test missing "driver" key for blockdev-add

2014-09-11 Thread Eric Blake
On 09/11/2014 09:04 PM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> ---
>  tests/qemu-iotests/087 | 17 +
>  tests/qemu-iotests/087.out | 13 +
>  2 files changed, 30 insertions(+)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] ohci: Stop OHCI bus when PCI bus master is disabled

2014-09-11 Thread Alexey Kardashevskiy
On 09/11/2014 09:02 PM, Alexey Kardashevskiy wrote:
> On 09/11/2014 08:38 PM, Gerd Hoffmann wrote:
>>   Hi,
>>
>>> Another question - I noticed that XHCI migration is broken in quite recent
>>> upstream QEMU, smells like memory corruption. Is it just me or just PPC or
>>> is it known issue?
>>
>> 2.0 -> 2.1 migration being broken is a known issue (patch for that one
>> was on the list earlier this week, unfortunately missed 2.1.1).
>>
>> Other that that I'm not aware of any issues.
> 
> My bad, it was me.
> 
> I enabled 64bit DMA on pseries (the guest ram is mapped at
> 8000... on the pci bus) and somehow this causes migration
> errors. I thought there is no 64bit DMA-capable device in QEMU, and I was
> wrong :)
> 
> /me is debugging


After all, my issue has nothing to do with XHCI itself. I am implementing
another DMA window (64bit, maps entire guest) which is yet another child
object on the PHB device but it is dynamic - it is created by request from
the guest. And XHCI driver does it so I end up with source QEMU which has
this object and destination QEMU which does not and when I try creating
this object during migration (and this adds yet another SaveStateEntry),
something goes very wrong. QOM puzzle.



-- 
Alexey



Re: [Qemu-devel] [PATCH] block: Make op blockers recursive

2014-09-11 Thread Fam Zheng
On Tue, 09/09 14:28, Benoît Canet wrote:
> On Tue, Sep 09, 2014 at 01:56:46PM +0200, Kevin Wolf wrote:
> > Am 22.08.2014 um 18:11 hat Benoît Canet geschrieben:
> > > Since the block layer code is starting to modify the BDS graph right in 
> > > the
> > > middle of BDS chains (block-mirror's replace parameter for example) QEMU 
> > > needs
> > > to properly block and unblock whole BDS subtrees; recursion is a neat way 
> > > to
> > > achieve this task.
> > > 
> > > This patch also takes care of modifying the op blockers users.
> > > 
> > > Signed-off-by: Benoit Canet 
> > > ---
> > >  block.c   | 69 
> > > ---
> > >  block/blkverify.c | 21 +++
> > >  block/commit.c|  3 +++
> > >  block/mirror.c| 17 
> > >  block/quorum.c| 25 +
> > >  block/stream.c|  3 +++
> > >  block/vmdk.c  | 34 +++
> > >  include/block/block.h |  2 +-
> > >  include/block/block_int.h |  6 +
> > >  9 files changed, 171 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/block.c b/block.c
> > > index 6fa0201..d964b6c 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -5446,7 +5446,9 @@ bool bdrv_op_is_blocked(BlockDriverState *bs, 
> > > BlockOpType op, Error **errp)
> > >  return false;
> > >  }
> > >  
> > > -void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
> > > +/* do the real work of blocking a BDS */
> > > +static void bdrv_do_op_block(BlockDriverState *bs, BlockOpType op,
> > > + Error *reason)
> > >  {
> > >  BdrvOpBlocker *blocker;
> > >  assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
> > > @@ -5456,7 +5458,9 @@ void bdrv_op_block(BlockDriverState *bs, 
> > > BlockOpType op, Error *reason)
> > >  QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
> > >  }
> > >  
> > > -void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
> > > +/* do the real work of unblocking a BDS */
> > > +static void bdrv_do_op_unblock(BlockDriverState *bs, BlockOpType op,
> > > +   Error *reason)
> > >  {
> > >  BdrvOpBlocker *blocker, *next;
> > >  assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
> > > @@ -5468,6 +5472,65 @@ void bdrv_op_unblock(BlockDriverState *bs, 
> > > BlockOpType op, Error *reason)
> > >  }
> > >  }
> > >  
> > > +static bool bdrv_op_is_blocked_by(BlockDriverState *bs, BlockOpType op,
> > > +  Error *reason)
> > > +{
> > > +BdrvOpBlocker *blocker, *next;
> > > +assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
> > > +QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
> > 
> > This doesn't actually need the SAFE version.
> > 
> > > +if (blocker->reason == reason) {
> > > +return true;
> > > +}
> > > +}
> > > +return false;
> > > +}
> > > +
> > > +/* block recursively a BDS */
> > > +void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
> > > +{
> > > +if (!bs) {
> > > +return;
> > > +}
> > > +
> > > +/* prevent recursion loop */
> > > +if (bdrv_op_is_blocked_by(bs, op, reason)) {
> > > +return;
> > > +}
> > > +
> > > +/* block first for recursion loop protection to work */
> > > +bdrv_do_op_block(bs, op, reason);
> > > +
> > > +bdrv_op_block(bs->file, op, reason);
> > > +bdrv_op_block(bs->backing_hd, op, reason);
> > > +
> > > +if (bs->drv && bs->drv->bdrv_op_recursive_block) {
> > > +bs->drv->bdrv_op_recursive_block(bs, op, reason);
> > > +}
> > 
> > Here you block bs->file/bs->backing_hd automatically, no matter whether
> > the block driver implements the callback or not. I'm not sure if we can
> > do it like this for all times, but for now that should be okay.
> > 
> > > +}
> > > +
> > > +/* unblock recursively a BDS */
> > > +void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
> > > +{
> > > +if (!bs) {
> > > +return;
> > > +}
> > > +
> > > +/* prevent recursion loop */
> > > +if (!bdrv_op_is_blocked_by(bs, op, reason)) {
> > > +return;
> > > +}
> > > +
> > > +/* unblock first for recursion loop protection to work */
> > > +bdrv_do_op_unblock(bs, op, reason);
> > > +
> > > +bdrv_op_unblock(bs->file, op, reason);
> > > +bdrv_op_unblock(bs->backing_hd, op, reason);
> > > +
> > > +if (bs->drv && bs->drv->bdrv_op_recursive_unblock) {
> > > +bs->drv->bdrv_op_recursive_unblock(bs, op, reason);
> > > +}
> > > +}
> > > +
> > >  void bdrv_op_block_all(BlockDriverState *bs, Error *reason)
> > >  {
> > >  int i;
> > > @@ -5848,7 +5911,7 @@ BlockDriverState *check_to_replace_node(const char 
> > > *node_name, Error **errp)
> > >  return NULL;
> > >  }
> > >  
> > > -if (bdrv_op_is_blocked(to_replace_bs, BLOCK_OP_TYPE_REPLACE,

Re: [Qemu-devel] [question] virtio-blk performance degradation happened with virito-serial

2014-09-11 Thread Zhang Haoyu
>>> > > If virtio-blk and virtio-serial share an IRQ, the guest operating 
>>> > > system has to check each virtqueue for activity. Maybe there is some 
>>> > > inefficiency doing that.
>>> > > AFAIK virtio-serial registers 64 virtqueues (on 31 ports + console) 
>>> > > even if everything is unused.
>>> > 
>>> > That could be the case if MSI is disabled.
>>> 
>>> Do the windows virtio drivers enable MSIs, in their inf file?
>>
>>It depends on the version of the drivers, but it is a reasonable guess
>>at what differs between Linux and Windows.  Haoyu, can you give us the
>>output of lspci from a Linux guest?
>>
>I made a test with fio on rhel-6.5 guest, the same degradation happened too,  
>this degradation can be reproduced on rhel6.5 guest 100%.
>virtio_console module installed:
>64K-write-sequence: 285 MBPS, 4380 IOPS
>virtio_console module uninstalled:
>64K-write-sequence: 370 MBPS, 5670 IOPS
>
I use top -d 1 -H -p  to monitor the cpu usage, and found that,
virtio_console module installed:
qemu main thread cpu usage: 98%
virtio_console module uninstalled:
qemu main thread cpu usage: 60%

perf top -p  result,
virtio_console module installed:
   PerfTop:9868 irqs/sec  kernel:76.4%  exact:  0.0% [4000Hz cycles],  
(target_pid: 88381)
--

11.80%  [kernel] [k] _raw_spin_lock_irqsave
 8.42%  [kernel] [k] _raw_spin_unlock_irqrestore
 7.33%  [kernel] [k] fget_light
 6.28%  [kernel] [k] fput
 3.61%  [kernel] [k] do_sys_poll
 3.30%  qemu-system-x86_64   [.] qcow2_check_metadata_overlap
 3.10%  [kernel] [k] __pollwait
 2.15%  qemu-system-x86_64   [.] qemu_iohandler_poll
 1.44%  libglib-2.0.so.0.3200.4  [.] g_array_append_vals
 1.36%  libc-2.13.so [.] 0x0011fc2a
 1.31%  libpthread-2.13.so   [.] pthread_mutex_lock
 1.24%  libglib-2.0.so.0.3200.4  [.] 0x0001f961
 1.20%  libpthread-2.13.so   [.] __pthread_mutex_unlock_usercnt
 0.99%  [kernel] [k] eventfd_poll
 0.98%  [vdso]   [.] 0x0771
 0.97%  [kernel] [k] remove_wait_queue
 0.96%  qemu-system-x86_64   [.] qemu_iohandler_fill
 0.95%  [kernel] [k] add_wait_queue
 0.69%  [kernel] [k] __srcu_read_lock
 0.58%  [kernel] [k] poll_freewait
 0.57%  [kernel] [k] _raw_spin_lock_irq
 0.54%  [kernel] [k] __srcu_read_unlock
 0.47%  [kernel] [k] copy_user_enhanced_fast_string
 0.46%  [kvm_intel]  [k] vmx_vcpu_run
 0.46%  [kvm][k] vcpu_enter_guest
 0.42%  [kernel] [k] tcp_poll
 0.41%  [kernel] [k] system_call_after_swapgs
 0.40%  libglib-2.0.so.0.3200.4  [.] g_slice_alloc
 0.40%  [kernel] [k] system_call
 0.38%  libpthread-2.13.so   [.] 0xe18d
 0.38%  libglib-2.0.so.0.3200.4  [.] g_slice_free1
 0.38%  qemu-system-x86_64   [.] address_space_translate_internal
 0.38%  [kernel] [k] _raw_spin_lock
 0.37%  qemu-system-x86_64   [.] phys_page_find
 0.36%  [kernel] [k] get_page_from_freelist
 0.35%  [kernel] [k] sock_poll
 0.34%  [kernel] [k] fsnotify
 0.31%  libglib-2.0.so.0.3200.4  [.] g_main_context_check
 0.30%  [kernel] [k] do_direct_IO
 0.29%  libpthread-2.13.so   [.] pthread_getspecific

virtio_console module uninstalled:
   PerfTop:9138 irqs/sec  kernel:71.7%  exact:  0.0% [4000Hz cycles],  
(target_pid: 88381)
--

 5.72%  qemu-system-x86_64   [.] qcow2_check_metadata_overlap
 4.51%  [kernel] [k] fget_light
 3.98%  [kernel] [k] _raw_spin_lock_irqsave
 2.55%  [kernel] [k] fput
 2.48%  libpthread-2.13.so   [.] pthread_mutex_lock
 2.46%  [kernel] [k] _raw_spin_unlock_irqrestore
 2.21%  libpthread-2.13.so   [.] __pthread_mutex_unlock_usercnt
 1.71%  [vdso]   [.] 0x060c
 1.68%  libc-2.13.so [.] 0x000e751f
 1.64%  libglib-2.0.so.0.3200.4  [.] 0x0004fca0
 1.20%  [kernel] [k] __srcu_read_lock
 1.14%  [kernel] [k] do_sys_poll
 0.96%  [kernel] [k] _raw_spin_lock_irq
 0.95%  [kernel] [k] __pollwait
 0.91%  [kernel] [k] __srcu_read_unlock
 0.78%  [kernel] [k] tcp_poll
 0.74%  [

Re: [Qemu-devel] [PATCH 4/5] pc: add cpu hotplug handler to PC_MACHINE

2014-09-11 Thread Gu Zheng
Hi Igor,
On 09/10/2014 09:55 PM, Igor Mammedov wrote:

> On Wed,  3 Sep 2014 17:06:16 +0800
> Gu Zheng  wrote:
> 
>> Add cpu hotplug handler to PC_MACHINE, which will perform the acpi
>> cpu hotplug callback via hotplug_handler API.
>>
>> Signed-off-by: Gu Zheng 
>> ---
>>  hw/i386/pc.c |   26 +-
>>  qom/cpu.c|1 -
>>  2 files changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 8fa8d2f..c2956f9 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -1607,11 +1607,34 @@ out:
>>  error_propagate(errp, local_err);
>>  }
>>  
>> +static void pc_cpu_plug(HotplugHandler *hotplug_dev,
>> + DeviceState *dev, Error **errp)
>> +{
>> +HotplugHandlerClass *hhc;
>> +Error *local_err = NULL;
>> +PCMachineState *pcms = PC_MACHINE(hotplug_dev);
>> +
>> +if (dev->hotplugged) {
> for startup CPUs we set gpe_cpu status bits in AcpiCpuHotplug_init()
> and for hotpluged in AcpiCpuHotplug_add() which is duplicating
> essentially the same code.
> 
> Could you drop above check and make AcpiCpuHotplug_init() take care
> about startup CPUs as well, keeping bit setting in one place.

Yeah, with the above change, code will be more neat, but I think it is not
a serious problem.:)
And pcms->acpi_dev is set when PC hardware initialisation (e.g. pc_init1),
for start up cpus, the plug handler is not valid, so the check is needed
here to avoid trigger error by start cpu.

> 
>> +if (!pcms->acpi_dev) {
>> +error_setg(&local_err,
>> +   "cpu hotplug is not enabled: missing acpi device");
>> +goto out;
>> +}
>> +
>> +hhc = HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev);
>> +hhc->plug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err);
>> +out:
>> +error_propagate(errp, local_err);
>> +}
>> +}
>> +
>>  static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>>DeviceState *dev, Error **errp)
>>  {
>>  if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>>  pc_dimm_plug(hotplug_dev, dev, errp);
>> +} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
>> +pc_cpu_plug(hotplug_dev, dev, errp);
>>  }
>>  }
>>  
>> @@ -1620,7 +1643,8 @@ static HotplugHandler 
>> *pc_get_hotpug_handler(MachineState *machine,
>>  {
>>  PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
>>  
>> -if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>> +if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)
>> +|| object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
>>  return HOTPLUG_HANDLER(machine);
>>  }
>>  
>> diff --git a/qom/cpu.c b/qom/cpu.c
>> index b32dd0a..af8e83f 100644
>> --- a/qom/cpu.c
>> +++ b/qom/cpu.c
>> @@ -304,7 +304,6 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
>> **errp)
>>  if (dev->hotplugged) {
>>  cpu_synchronize_post_init(cpu);
>>  notifier_list_notify(&cpu_added_notifiers, dev);
>> -cpu_resume(cpu);
>>  }
>>  }
>>  
> 
> I don't see what sets PCMachine as hotplug_handler for CPU,
> maybe series is missing a patch?

Previous memory hotplug has built the frame work, here just adding the case
to handle CPU.

 static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev,
   DeviceState *dev, Error **errp)
 {
 if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
 pc_dimm_plug(hotplug_dev, dev, errp);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+pc_cpu_plug(hotplug_dev, dev, errp);
 }

Thanks,
Gu

> 
> .
> 





Re: [Qemu-devel] [PATCH 2/2] Revert "virtio: don't call device on !vm_running"

2014-09-11 Thread Jason Wang
On 09/12/2014 12:18 AM, Michael S. Tsirkin wrote:
> This reverts commit a1bc7b827e422e1ff065640d8ec5347c4aadfcd8.
> virtio: don't call device on !vm_running
> It turns out that virtio net assumes that vm_running
> is updated before device status callback in many places,
> so this change leads to asserts.
> Previous commit fixes the root issue that motivated
> a1bc7b827e422e1ff065640d8ec5347c4aadfcd8 differently,
> so there's no longer a need for this change.
>
> In the future, we might be able to drop checking vm_running
> completely, and check vm state directly.

Acked-by: Jason Wang 
> Reported-by: Dietmar Maurer 
> Cc: qemu-sta...@nongnu.org
> Cc: Jason Wang 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/virtio/virtio.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index ac22238..5c98180 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1108,10 +1108,7 @@ static void virtio_vmstate_change(void *opaque, int 
> running, RunState state)
>  BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
>  VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
>  bool backend_run = running && (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK);
> -
> -if (running) {
> -vdev->vm_running = running;
> -}
> +vdev->vm_running = running;
>  
>  if (backend_run) {
>  virtio_set_status(vdev, vdev->status);
> @@ -1124,10 +1121,6 @@ static void virtio_vmstate_change(void *opaque, int 
> running, RunState state)
>  if (!backend_run) {
>  virtio_set_status(vdev, vdev->status);
>  }
> -
> -if (!running) {
> -vdev->vm_running = running;
> -}
>  }
>  
>  void virtio_init(VirtIODevice *vdev, const char *name,




Re: [Qemu-devel] [PATCH 1/2] virtio-net: drop assert on vm stop

2014-09-11 Thread Jason Wang
On 09/12/2014 12:18 AM, Michael S. Tsirkin wrote:
> On vm stop, vm_running state set to stopped
> before device is notified, so callbacks can get envoked with
> vm_running = false; and this is not an error.

This is consistent with virtio-blk which also has such kinds of
callbacks. This fixes the issue of qemu crashing when stop during
transmission.

Acked-by: Jason Wang 
> Cc: qemu-sta...@nongnu.org
> Cc: Jason Wang 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/net/virtio-net.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 826a2a5..2040eac 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1125,8 +1125,6 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>  return num_packets;
>  }
>  
> -assert(vdev->vm_running);
> -
>  if (q->async_tx.elem.out_num) {
>  virtio_queue_set_notification(q->tx_vq, 0);
>  return num_packets;




Re: [Qemu-devel] [PATCH 0/3] qapi: fix crash in dealloc visitor for union types

2014-09-11 Thread Fam Zheng
On Thu, 09/11 18:20, Michael Roth wrote:
> This series introduces visit_start_enum and visit_end_enum as a way
> of allowing visitors to trigger generated code to bail out on visiting
> union fields if the visitor implementation deems doing so to be unsafe.
> 
> See patch 1 for the circumstances that cause the segfault in the
> dealloc visitor.
> 
> This is a spin-off of a patch submitted by Fam Zheng earlier. See the
> thread for additional background on why we're taking this approach:
> 
>   http://thread.gmane.org/gmane.comp.emulators.qemu/296090
> 
> Fam, if you'd like to break out your iotest into another patch I
> can include it as part of this series, otherwise it can be sent as
> a follow-up.

It's a good exercise of this bug from the user interface, I'll reply my patch
to this series.

Thanks for taking care of this, Michael.

Fam



[Qemu-devel] [PATCH 4/3] qemu-iotests: Test missing "driver" key for blockdev-add

2014-09-11 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/087 | 17 +
 tests/qemu-iotests/087.out | 13 +
 2 files changed, 30 insertions(+)

diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087
index 82c56b1..d7454d1 100755
--- a/tests/qemu-iotests/087
+++ b/tests/qemu-iotests/087
@@ -218,6 +218,23 @@ run_qemu <

Re: [Qemu-devel] [PATCH 2/3] qapi: dealloc visitor, implement visit_start_union

2014-09-11 Thread Eric Blake
On 09/11/2014 05:20 PM, Michael Roth wrote:
> If the .data field of a QAPI Union is NULL, we don't need to free
> any of the union fields..

intentional double dot?

> 
> Make use of the new visit_start_union interface to access this
> information and instruct the generated code to not visit these
> fields when this occurs.
> 
> Cc: qemu-sta...@nongnu.org
> Reported-by: Fam Zheng 
> Suggested-by: Paolo Bonzini 
> Signed-off-by: Michael Roth 
> ---
>  qapi/qapi-dealloc-visitor.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
> index dc53545..eb77078 100644
> --- a/qapi/qapi-dealloc-visitor.c
> +++ b/qapi/qapi-dealloc-visitor.c
> @@ -162,6 +162,11 @@ static void qapi_dealloc_type_enum(Visitor *v, int *obj, 
> const char *strings[],
>  {
>  }
>  
> +static bool qapi_dealloc_start_union(Visitor *v, bool data_present, Error 
> **errp)

Long line.

> +{
> +return data_present;

Okay, so this is a case where you can return false without setting errp.
 It would be nice to document the interactions and contract of this
visitor here or in patch 1.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/3] qapi: add visit_start_union and visit_end_union

2014-09-11 Thread Eric Blake
On 09/11/2014 05:20 PM, Michael Roth wrote:
> In some cases an input visitor might bail out on filling out a
> struct for various reasons, such as missing fields when running
> in strict mode. In the case of a QAPI Union type, this may lead
> to cases where the .kind field which encodes the union type
> is uninitialized. Subsequently, other visitors, such as the
> dealloc visitor, may use this .kind value as if it were
> initialized, leading to assumptions about the union type which
> in this case may lead to segfaults. For example, freeing an
> integer value.
> 

>  
> +bool visit_start_union(Visitor *v, bool data_present, Error **errp)
> +{
> +if (v->start_union) {
> +return v->start_union(v, data_present, errp);
> +}
> +return true;
> +}

Any rules on whether errp must be set if returning false, and must not
be set if returning true?  If so, do we need a bool return, or is errp
sufficient?


> +++ b/scripts/qapi-visit.py
> @@ -357,6 +357,9 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, 
> const char *name, Error **e
>  if (err) {
>  goto out_obj;
>  }
> +if (!visit_start_union(m, !!(*obj)->data, &err)) {
> +goto out_obj;
> +}
>  switch ((*obj)->kind) {

and if there aren't rules, then a visitor that sets err but still
returns true would result in this code not exiting early, but passing an
already-set error into the switch, which is probably not desirable.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/5] acpi/cpu: add cpu hotplug callback function to match hotplug_handler API

2014-09-11 Thread Gu Zheng
Hi Igor,
On 09/11/2014 06:12 PM, Igor Mammedov wrote:

> On Thu, 11 Sep 2014 11:04:10 +0800
> Gu Zheng  wrote:
> 
>> Hi Igor,
>> On 09/10/2014 09:28 PM, Igor Mammedov wrote:
>>
>>> On Wed,  3 Sep 2014 17:06:13 +0800
>>> Gu Zheng  wrote:
>>>

 Signed-off-by: Gu Zheng 
 ---
  hw/acpi/cpu_hotplug.c |   17 +
  include/hw/acpi/cpu_hotplug.h |3 +++
  2 files changed, 20 insertions(+), 0 deletions(-)

 diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
 index 2ad83a0..92c189b 100644
 --- a/hw/acpi/cpu_hotplug.c
 +++ b/hw/acpi/cpu_hotplug.c
 @@ -36,6 +36,23 @@ static const MemoryRegionOps AcpiCpuHotplug_ops = {
  },
  };
  
 +void acpi_cpu_plug_cb(ACPIREGS *ar, qemu_irq irq,
 +  AcpiCpuHotplug *g, DeviceState *dev)
>>> wrong indentation ^^^
>>>
>>> it wouldn't hurt to add errp argument here ...
>>>
 +{
 +CPUState *cpu = CPU(dev);
 +CPUClass *k = CPU_GET_CLASS(cpu);
 +int64_t cpu_id;
 +
 +ar->gpe.sts[0] |= ACPI_CPU_HOTPLUG_STATUS;
 +cpu_id = k->get_arch_id(cpu);
 +g_assert((cpu_id / 8) < ACPI_GPE_PROC_LEN);
>>> ... and return error from here instead of aborting
>>
>> Got it.
>>
>>>
 +g->sts[cpu_id / 8] |= (1 << (cpu_id % 8));
 +
 +acpi_update_sci(ar, irq);
 +
 +cpu_resume(cpu);
>>> Why are you adding cpu_resume() here?
>>> check cpu_common_realizefn() which already does it.
>>
>> Because hot added callback is called after cpu_common_realizefn(), so I moved
>> cpu_resume(cpu) from cpu_common_realizefn() here to ensure the guest has 
>> already
>> hot added cpu before we resume it.
> cpu.realize() should create a fully working CPU and moving CPU internals to 
> ACPI
> is not correct. CPU hot-add shouyld not depend on whether guest OS supports 
> it or not.
> 
> More over there is no need to resume CPU when guest OS hot-added it,
> cpu.realize() creates secondary CPU in RESET state, so it does nothing and 
> waits
> for INIT/SIPI sequence (i.e. when guest OS decides to online CPU).

Seems I had a misreading on this, thanks for your explanation.

Best regards,
Gu

> 
>>
>>>
 +}
 +
  void AcpiCpuHotplug_add(ACPIGPE *gpe, AcpiCpuHotplug *g, CPUState *cpu)
  {
  CPUClass *k = CPU_GET_CLASS(cpu);
 diff --git a/include/hw/acpi/cpu_hotplug.h b/include/hw/acpi/cpu_hotplug.h
 index 9e5d30c..d025731 100644
 --- a/include/hw/acpi/cpu_hotplug.h
 +++ b/include/hw/acpi/cpu_hotplug.h
 @@ -20,6 +20,9 @@ typedef struct AcpiCpuHotplug {
  uint8_t sts[ACPI_GPE_PROC_LEN];
  } AcpiCpuHotplug;
  
 +void acpi_cpu_plug_cb(ACPIREGS *ar, qemu_irq irq,
 +  AcpiCpuHotplug *g, DeviceState *dev);
>>> wrong indentation ^^^
>>
>> OK, will fix this too.
>>
>> Thanks,
>> Gu
>>
>>>
 +
  void AcpiCpuHotplug_add(ACPIGPE *gpe, AcpiCpuHotplug *g, CPUState *cpu);
  
  void AcpiCpuHotplug_init(MemoryRegion *parent, Object *owner,
>>>
>>>
>>> .
>>>
>>
>>
> 
> .
> 





Re: [Qemu-devel] [PATCH 10/10] aio-win32: add support for sockets

2014-09-11 Thread TeLeMan
On Wed, Jul 9, 2014 at 5:53 PM, Paolo Bonzini  wrote:
> Uses the same select/WSAEventSelect scheme as main-loop.c.
> WSAEventSelect() is edge-triggered, so it cannot be used
> directly, but it is still used as a way to exit from a
> blocking g_poll().
>
> Before g_poll() is called, we poll sockets with a non-blocking
> select() to achieve the level-triggered semantics we require:
> if a socket is ready, the g_poll() is made non-blocking too.
>
> Based on a patch from Or Goshen.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  aio-win32.c | 150 
> ++--
>  block/Makefile.objs |   2 -
>  include/block/aio.h |   2 -
>  3 files changed, 145 insertions(+), 9 deletions(-)
>
> diff --git a/aio-win32.c b/aio-win32.c
> index 4542270..61e3d2d 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c
>
> @@ -149,10 +279,15 @@ bool aio_poll(AioContext *ctx, bool blocking)
>  {
>  AioHandler *node;
>  HANDLE events[MAXIMUM_WAIT_OBJECTS + 1];
> -bool was_dispatching, progress, first;
> +bool was_dispatching, progress, have_select_revents, first;
have_select_revents has no initial value.

>  int count;
>  int timeout;
>
> +if (aio_prepare(ctx)) {
> +blocking = false;
> +have_select_revents = true;
> +}
> +
>  was_dispatching = ctx->dispatching;
>  progress = false;
>
> @@ -183,6 +318,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
>
>  /* wait until next event */
>  while (count > 0) {
> +HANDLE event;
>  int ret;
>
>  timeout = blocking
> @@ -196,13 +332,17 @@ bool aio_poll(AioContext *ctx, bool blocking)
>  first = false;
>
>  /* if we have any signaled events, dispatch event */
> -if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
> +event = NULL;
> +if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
> +event = events[ret - WAIT_OBJECT_0];
> +} else if (!have_select_revents) {
>  break;
>  }
>
> +have_select_revents = false;
>  blocking = false;
>
> -progress |= aio_dispatch_handlers(ctx, events[ret - WAIT_OBJECT_0]);
> +progress |= aio_dispatch_handlers(ctx, event);
>
>  /* Try again, but only call each handler once.  */
>  events[ret - WAIT_OBJECT_0] = events[--count];
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index fd88c03..07eabf7 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -10,7 +10,6 @@ block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
>  block-obj-$(CONFIG_POSIX) += raw-posix.o
>  block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>
> -ifeq ($(CONFIG_POSIX),y)
>  block-obj-y += nbd.o nbd-client.o sheepdog.o
>  block-obj-$(CONFIG_LIBISCSI) += iscsi.o
>  block-obj-$(CONFIG_LIBNFS) += nfs.o
> @@ -18,7 +17,6 @@ block-obj-$(CONFIG_CURL) += curl.o
>  block-obj-$(CONFIG_RBD) += rbd.o
>  block-obj-$(CONFIG_GLUSTERFS) += gluster.o
>  block-obj-$(CONFIG_LIBSSH2) += ssh.o
> -endif
>
>  common-obj-y += stream.o
>  common-obj-y += commit.o
> diff --git a/include/block/aio.h b/include/block/aio.h
> index d129e22..78fa2ee 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -239,7 +239,6 @@ bool aio_dispatch(AioContext *ctx);
>   */
>  bool aio_poll(AioContext *ctx, bool blocking);
>
> -#ifdef CONFIG_POSIX
>  /* Register a file descriptor and associated callbacks.  Behaves very 
> similarly
>   * to qemu_set_fd_handler2.  Unlike qemu_set_fd_handler2, these callbacks 
> will
>   * be invoked when using aio_poll().
> @@ -252,7 +251,6 @@ void aio_set_fd_handler(AioContext *ctx,
>  IOHandler *io_read,
>  IOHandler *io_write,
>  void *opaque);
> -#endif
>
>  /* Register an event notifier and associated callbacks.  Behaves very 
> similarly
>   * to event_notifier_set_handler.  Unlike event_notifier_set_handler, these 
> callbacks
> --
> 1.9.3
>
>



Re: [Qemu-devel] [PATCH 10/10] aio-win32: add support for sockets

2014-09-11 Thread TeLeMan
On Wed, Jul 9, 2014 at 5:53 PM, Paolo Bonzini  wrote:
> Uses the same select/WSAEventSelect scheme as main-loop.c.
> WSAEventSelect() is edge-triggered, so it cannot be used
> directly, but it is still used as a way to exit from a
> blocking g_poll().
>
> Before g_poll() is called, we poll sockets with a non-blocking
> select() to achieve the level-triggered semantics we require:
> if a socket is ready, the g_poll() is made non-blocking too.
>
> Based on a patch from Or Goshen.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  aio-win32.c | 150 
> ++--
>  block/Makefile.objs |   2 -
>  include/block/aio.h |   2 -
>  3 files changed, 145 insertions(+), 9 deletions(-)
>
> diff --git a/aio-win32.c b/aio-win32.c
> index 4542270..61e3d2d 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c

> @@ -183,6 +318,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
>
>  /* wait until next event */
>  while (count > 0) {
> +HANDLE event;
>  int ret;
>
>  timeout = blocking
> @@ -196,13 +332,17 @@ bool aio_poll(AioContext *ctx, bool blocking)
>  first = false;
>
>  /* if we have any signaled events, dispatch event */
> -if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
> +event = NULL;
> +if ((DWORD) (ret - WAIT_OBJECT_0) < count) {
> +event = events[ret - WAIT_OBJECT_0];
> +} else if (!have_select_revents) {

when (ret - WAIT_OBJECT_0) >= count and have_select_revents is true,
the following events[ret - WAIT_OBJECT_0] will be overflowed.

>  break;
>  }
>
> +have_select_revents = false;
>  blocking = false;
>
> -progress |= aio_dispatch_handlers(ctx, events[ret - WAIT_OBJECT_0]);
> +progress |= aio_dispatch_handlers(ctx, event);
>
>  /* Try again, but only call each handler once.  */
>  events[ret - WAIT_OBJECT_0] = events[--count];



Re: [Qemu-devel] [RFC V2 10/10] cpus: reclaim allocated vCPU objects

2014-09-11 Thread Gu Zheng
Hi Bharata,

On 09/11/2014 08:37 PM, Bharata B Rao wrote:

> On Thu, Sep 11, 2014 at 3:23 PM, Gu Zheng  wrote:
>> On 09/11/2014 05:35 PM, Bharata B Rao wrote:
>>
>>> On Thu, Aug 28, 2014 at 9:06 AM, Gu Zheng  wrote:
 After ACPI get a signal to eject a vCPU, the vCPU must be
 removed from CPU list,before the vCPU really removed,  then
 release the all related vCPU objects.
 But we do not close KVM vcpu fd, just record it into a list, in
 order to reuse it.
>>>
>>> After I add and delete a CPU, "info cpus" from monitor still lists the
>>> removed CPU. Is this expected ?
>>
>> No, you can not see the removed cpu from QEMU (e.g. monitor) or guest side,
>> but from the kernel side it is still there, just no one uses it any more.
> 
> I am trying your patches, but still see the removed CPU after deletion.
> 
> (qemu) info cpus
> * CPU #0: pc=0x8100b82c (halted) thread_id=7812
>   CPU #1: pc=0x8100b82c (halted) thread_id=7813
>   CPU #2: pc=0x8100b82c (halted) thread_id=7814
>   CPU #3: pc=0x8100b82c (halted) thread_id=7815
>   CPU #4: pc=0x8100b82c (halted) thread_id=7816
>   CPU #5: pc=0x8100b82c (halted) thread_id=7817
>   CPU #6: pc=0x8100b82c (halted) thread_id=7818
>   CPU #7: pc=0x8100b82c (halted) thread_id=7819
> (qemu) device_add qemu64-x86_64-cpu,id=cpu8
> (qemu) info cpus
> * CPU #0: pc=0x8100b82c (halted) thread_id=7812
>   CPU #1: pc=0x8100b82c (halted) thread_id=7813
>   CPU #2: pc=0x8100b82c (halted) thread_id=7814
>   CPU #3: pc=0x8100b82c (halted) thread_id=7815
>   CPU #4: pc=0x8100b82c (halted) thread_id=7816
>   CPU #5: pc=0x8100b82c (halted) thread_id=7817
>   CPU #6: pc=0x8100b82c (halted) thread_id=7818
>   CPU #7: pc=0x8100b82c (halted) thread_id=7819
>   CPU #8: pc=0x8100b82c (halted) thread_id=8041
> (qemu) device_del cpu8
> (qemu) info cpus
> * CPU #0: pc=0x8100b82c (halted) thread_id=7812
>   CPU #1: pc=0x8100b82c (halted) thread_id=7813
>   CPU #2: pc=0x8100b82c (halted) thread_id=7814
>   CPU #3: pc=0x8100b82c (halted) thread_id=7815
>   CPU #4: pc=0x8100b82c (halted) thread_id=7816
>   CPU #5: pc=0x8100b82c (halted) thread_id=7817
>   CPU #6: pc=0x8100b82c (halted) thread_id=7818
>   CPU #7: pc=0x8100b82c (halted) thread_id=7819
>   CPU #8: pc=0x81031722 (halted) thread_id=8041
> 
> I applied your patchset on commit 69f87f713069f1f70, may be I should
> try with latest QEMU ?

Is guest os enabled acpi cpu hotplug? What's the guest's cpu info?
Please try latest QEMU, and any feedback is welcome.

Thanks,
Gu

> 
> Regards,
> Bharata.
> .
> 





Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-09-11 Thread Hongyang Yang



在 09/11/2014 09:50 AM, Michael R. Hines 写道:

On 09/10/2014 11:43 PM, Walid Nouri wrote:

Hello Michael, Hello Paolo
i have „studied“ the available documentation/Information and tried to get an
idea of the QEMU live block operation possibilities.

I think the MC protocol doesn’t need synchronous block device replication
because primary and secondary VM are not synchronous. The state of the primary
is allays ahead of the state of the secondary. When the primary is in epoch(n)
the secondary is in epoch(n-1).

What MC needs is a block device agnostic, controlled and asynchronous approach
for replicating the contents of block devices and its state changes to the
secondary VM while the primary VM is running. Asynchronous block transfer is
important to allow maximum performance for the primary VM, while keeping the
secondary VM updated with state changes.

The block device replication should be possible in two stages or modes.

The first stage is the live copy of all block devices of the primary to the
secondary. This is necessary if the secondary doesn’t have an existing image
which is in sync with the primary at the time MC has started. This is not very
convenient but as far as I know actually there is no mechanism for persistent
dirty bitmap in QEMU.

The second stage (mode) is the replication of block device state changes
(modified blocks) to keep the image on the secondary in sync with the primary.
The mirrored blocks must be buffered in ram (block buffer) until the complete
Checkpoint (RAM, vCPU, device state) can be committed.

For keeping the complete system state consistent on the secondary system there
must be a possibility for MC to commit/discard block device state changes. In
normal operation the mirrored block device state changes (block buffer) are
committed to disk when the complete checkpoint is committed. In case of a
crash of the primary system while transferring a checkpoint the data in the
block buffer corresponding to the failed Checkpoint must be discarded.

The storage architecture should be “shared nothing” so that no shared storage
is required and primary/secondary can have separate block device images.

I think this can be achieved by drive-mirror and a filter block driver.
Another approach could be to exploit the block migration functionality of live
migration with a filter block driver.

The drive-mirror (and live migration) does not rely on shared storage and
allow live block device copy and incremental syncing.

A block buffer can be implemented with a QEMU filter block driver. It should
sit at the same position as the Quorum driver in the block driver hierarchy.
When using block filter approach MC will be transparent and block device
agnostic.

The block buffer filter must have an Interface which allows MC control the
commits or discards of block device state changes. I have no idea where to put
such an interface to stay conform with QEMU coding style.


I’m sure there are alternative and better approaches and I’m open for any ideas


Walid

Am 17.08.2014 11:52, schrieb Paolo Bonzini:

Il 11/08/2014 22:15, Michael R. Hines ha scritto:

Excellent question: QEMU does have a feature called "drive-mirror"
in block/mirror.c that was introduced a couple of years ago. I'm not
sure what the
adoption rate of the feature is, but I would start with that one.


block/mirror.c is asynchronous, and there's no support for communicating
checkpoints back to the master. However, the quorum disk driver could
be what you need.

There's also a series on the mailing list that lets quorum read only
from the primary, so that quorum can still do replication and fault
tolerance, but skip fault detection.

Paolo


There is also a second fault tolerance implementation that works a
little differently called
"COLO" - you may have seen those emails on the list too, but their
method does not require a disk replication solution, if I recall correctly.






Nice description of the problem - would you like to put this information on the
MC wiki page? (Just send an email to the list that says "request for wiki
account, please" in the subject - and they will make an account for you.

A drive-mirror + filter driver solution sounds like a good plan overall,
of course the devil is in the details =)


If I understand correctly, this is similar to our approach. Disk replication
is definitely on our plan and we will post RFC patches that include Disk
replication, you can keep an eye on COLO patches:)



I don't know how much time you have to spend on actual code, but even a
description of what a "theoretical" interface between MC and drive-mirror would
look like would go a long way even without code.

Your investigations would also help "drive" a solution to this problem for the
COLO team as well - I believe they need the same thing

- Michael

.



--
Thanks,
Yang.



Re: [Qemu-devel] [PATCH 0/2] vfio: Another try to fix ROM BAR endianness

2014-09-11 Thread Alexey Kardashevskiy
On 09/12/2014 07:20 AM, Alex Williamson wrote:
> On Tue, 2014-09-09 at 21:34 +1000, Alexey Kardashevskiy wrote:
>> I did some tests on LE/BE guest/host combinations and came up with
>> these 2 patches. Please comment. Thanks.
> 
> Have you tried BE guest/LE host?  

Yes.


> IIRC, the g3beige tcg guest on x86
> host used to work with a vfio-pci NIC.  The ROM is of course typically
> useless in this mode, but it would be nice to make sure we haven't
> broken BAR access for that combination.  Do we have any better BE tcg
> targets with PCI support these days?  Thanks,

pseries? PPC64 TCG can do VFIO.


-- 
Alexey



Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-09-11 Thread Hongyang Yang



在 09/12/2014 01:44 AM, Dr. David Alan Gilbert 写道:

(I've cc'd in Fam, Stefan, and Kevin for Block stuff, and
   Yang and Eddie for Colo)

* Walid Nouri (walid.no...@gmail.com) wrote:

Hello Michael, Hello Paolo
i have ???studied??? the available documentation/Information and tried to
get an idea of the QEMU live block operation possibilities.

I think the MC protocol doesn???t need synchronous block device replication
because primary and secondary VM are not synchronous. The state of the
primary is allays ahead of the state of the secondary. When the primary is
in epoch(n) the secondary is in epoch(n-1).

What MC needs is a block device agnostic, controlled and asynchronous
approach for replicating the contents of block devices and its state changes
to the secondary VM while the primary VM is running. Asynchronous block
transfer is important to allow maximum performance for the primary VM, while
keeping the secondary VM updated with state changes.

The block device replication should be possible in two stages or modes.

The first stage is the live copy of all block devices of the primary to the
secondary. This is necessary if the secondary doesn???t have an existing
image which is in sync with the primary at the time MC has started. This is
not very convenient but as far as I know actually there is no mechanism for
persistent dirty bitmap in QEMU.

The second stage (mode) is the replication of block device state changes
(modified blocks)  to keep the image on the secondary in sync with the
primary. The mirrored blocks must be buffered in ram (block buffer) until
the complete Checkpoint (RAM, vCPU, device state) can be committed.

For keeping the complete system state consistent on the secondary system
there must be a possibility for MC to commit/discard block device state
changes. In normal operation the mirrored block device state changes (block
buffer) are committed to disk when the complete checkpoint is committed. In
case of a crash of the primary system while transferring a checkpoint the
data in the block buffer corresponding to the failed Checkpoint must be
discarded.


I think for COLO there's a requirement that the secondary can do reads/writes
in parallel with the primary, and the secondary can discard those reads/writes
- and that doesn't happen in MC (Yang or Eddie should be able to confirm that).


Exactly, COLO need this functionality to ensure consistency.




The storage architecture should be ???shared nothing??? so that no shared
storage is required and primary/secondary can have separate block device
images.


MC/COLO with shared storage still needs some stuff like this; but it's subtely
different.   They still need to be able to buffer/release modifications
to the shared storage; if any of this code can also be used in the
shared-storage configurations it would be good.


Shared-storage is more complicated, we don't support shared-storage currently...




I think this can be achieved by drive-mirror and a filter block driver.
Another approach could be to exploit the block migration functionality of
live migration with a filter block driver.

The drive-mirror (and live migration) does not rely on shared storage and
allow live block device copy and incremental syncing.

A block buffer can be implemented with a QEMU filter block driver. It should
sit at the same position as the Quorum driver in the block driver hierarchy.
When using block filter approach MC will be transparent and block device
agnostic.

The block buffer filter must have an Interface which allows MC control the
commits or discards of block device state changes. I have no idea where to
put such an interface to stay conform with QEMU coding style.


I???m sure there are alternative and better approaches and I???m open for
any ideas


Walid

Am 17.08.2014 11:52, schrieb Paolo Bonzini:

Il 11/08/2014 22:15, Michael R. Hines ha scritto:

Excellent question: QEMU does have a feature called "drive-mirror"
in block/mirror.c that was introduced a couple of years ago. I'm not
sure what the
adoption rate of the feature is, but I would start with that one.


block/mirror.c is asynchronous, and there's no support for communicating
checkpoints back to the master.  However, the quorum disk driver could
be what you need.

There's also a series on the mailing list that lets quorum read only

>from the primary, so that quorum can still do replication and fault

tolerance, but skip fault detection.

Paolo


There is also a second fault tolerance implementation that works a
little differently called
"COLO" - you may have seen those emails on the list too, but their
method does not require a disk replication solution, if I recall correctly.






--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
.



--
Thanks,
Yang.



Re: [Qemu-devel] [PATCH] cpu-exec: Don't mask out external interrupts when single-stepping an invalid instruction.

2014-09-11 Thread Richard Henderson
On 09/11/2014 03:13 PM, Peter Maydell wrote:
>> The long story: Qemu generates an exception_with_syndrome instruction
>> when it realizes the instruction it's trying to translate is invalid.
>> That instruction in turn modifies cs->exception_index and calls 
>> cpu_loop_exit.
>> Normally, the value in cs->exception_index would cause do_interrupt to set
>> the PC to point to the corresponding exception handler.
>> However, since we're masking out IRQs, the PC will never be set correctly.
>> Even worse, since Qemu will have generated an internal exception
>> to return control back to the remote Gdb *after* it generated the syndrome 
>> one,
>> its code will appear after the call to cpu_loop_exit.
>> Since the PC won't have changed, we'll try to excecute
>> the same translation block as before, thus calling cpu_loop_exit
>> again, and so on.

This suggests that target-arm (64 or 32-bit?) isn't implementing illegal
instruction traping in the way that I'd expect.

In particular, I'd expect the invalid exception to be recognized, and then the
cpu loop exited, before the single-step exception could overwrite it.  E.g. how
things work on alpha:

$ cat z.s
.globl _start
_start:
nop
.long (1 << 26)
nop
$ alphaev67-linux-as -o z.o z.s
$ alphaev67-linux-ld -Ttext-segment 0xfc10 -o z z.o
$ ./run/bin/qemu-system-alpha -kernel z -S -gdb tcp::12345 &
$ alphaev67-linux-gdb ./z

Set a breakpoint at _start, continue, swap over to qemu monitor and enable
logging of int,op,in_asm.  Now single-step a couple of times and we get:

IN:
0xfc100078:  nop

OP:
 ld_i32 tmp0,env,$0xfffc
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$0x0
  0xfc100078
 movi_i64 pc,$0xfc10007c
 movi_i32 tmp0,$0x10002
 movi_i32 tmp1,$0x0
 call excp,$0x0,$0,env,tmp0,tmp1
 set_label $0x0
 exit_tb $0x7f54d4fee013

Notice here that we end the single-step of the insn with a call to excp with
0x10002=EXCP_DEBUG.

IN:
0xfc10007c:  .long 0x400

OP:
 ld_i32 tmp0,env,$0xfffc
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$0x0
  0xfc10007c
 movi_i64 pc,$0xfc100080
 movi_i32 tmp0,$0x7
 movi_i32 tmp1,$0x0
 call excp,$0x0,$0,env,tmp0,tmp1
 set_label $0x0
 exit_tb $0x7f54d4fee013

Notice here that we end the single-step of the invalid insn with a call to excp
with 7=EXCP_OPCDEC.  We never attempt to single-step, because we know that
single-step isn't going to be reachable.

INT  1: opcdec(0) pc=fc100080 sp=fc01
IN: Pal_OpcDec
0xfc000380:  hw_mfprt7,0

But gdb does in fact stop at the very next insn, which is of course the PALcode
(bios) OPCDEC exception entry point.

>> +#define SSTEP_EXCEPTION 0x8  /* Don't mask out exception-related
>> IRQs. Set only if
>> +  * we have to process an exception while 
>> single-
>> +  * stepping (such as when
>> single-stepping an invalid
>> +  * instruction).
>> +  */

I'm really not sure what you're doing with this, or why you'd need it.
Probably target-arm wants fixing in some way, but I don't have time to look
into it this evening.


r~



Re: [Qemu-devel] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Gabriel L. Somlo
On Thu, Sep 11, 2014 at 11:34:03PM +0200, Alexander Graf wrote:
> >> With Chameleon (and SeaBIOS), I see all three UHCIs (with no attached
> >> devices), and EHCI (with the now-high-speed keyboard and mouse).
> >> 
> >> With OVMF, I have EHCI with the high-speed (and working)
> >> keyboard+mouse, but ONLY ONE UHCI device (UHCI3). UHCI1 UHCI2 do not
> >> get detected.
> >> 
> >> Obviously, since QEMU was routing the slow keyboard+mouse to UHCI1,
> >> which for some reason gets masked when OVMF is used, things weren't
> >> working.
> >> 
> >> So now the question is, how come Fedora can find UHCI 1&2 on Q35+OVMF,
> >> but OS X can't, and what can we do to OVMF (and/or QEMU) to fix it.
> >> 
> >> Obviously, Jan's patch set woud paper over the issue, as the keyboard
> >> and mouse would go to ehci, but there's still the issue of the
> >> disappearing UHCI's :)
> > 
> 
> XNU also populates its device tree based on the DSDT. Maybe there's a
> subtle difference there?

This was the low hanging fruit, so I checked it first :) Pulled the
DSDT using the OS X version of "DSDTEditor" (found on insanelymac)
on both the Chameleon q35 version which had all three UHCIs and from
the OVMF q35 version which only showed UHCI3. The two DSDTs looked
absolutely identical (no output from diff)...

Thanks,
--Gabriel



[Qemu-devel] [PATCH 0/3] qapi: fix crash in dealloc visitor for union types

2014-09-11 Thread Michael Roth
This series introduces visit_start_enum and visit_end_enum as a way
of allowing visitors to trigger generated code to bail out on visiting
union fields if the visitor implementation deems doing so to be unsafe.

See patch 1 for the circumstances that cause the segfault in the
dealloc visitor.

This is a spin-off of a patch submitted by Fam Zheng earlier. See the
thread for additional background on why we're taking this approach:

  http://thread.gmane.org/gmane.comp.emulators.qemu/296090

Fam, if you'd like to break out your iotest into another patch I
can include it as part of this series, otherwise it can be sent as
a follow-up.

 include/qapi/visitor-impl.h |  2 ++
 include/qapi/visitor.h  |  2 ++
 qapi/qapi-dealloc-visitor.c |  6 ++
 qapi/qapi-visit-core.c  | 15 +++
 scripts/qapi-visit.py   |  6 ++
 tests/qapi-schema/qapi-schema-test.json | 10 ++
 tests/qapi-schema/qapi-schema-test.out  |  3 +++
 tests/test-qmp-input-strict.c   | 17 +
 8 files changed, 61 insertions(+)




[Qemu-devel] [PATCH 3/3] tests: add QMP input visitor test for unions with no discriminator

2014-09-11 Thread Michael Roth
This more of an exercise of the dealloc visitor, where it may
erroneously use an uninitialized discriminator field as indication
that union fields corresponding to that discriminator field/type are
present, which can lead to attempts to free random chunks of heap
memory.

Cc: qemu-sta...@nongnu.org
Reviewed-by: Eric Blake 
Signed-off-by: Michael Roth 
---
 tests/qapi-schema/qapi-schema-test.json | 10 ++
 tests/qapi-schema/qapi-schema-test.out  |  3 +++
 tests/test-qmp-input-strict.c   | 17 +
 3 files changed, 30 insertions(+)

diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index ab4d3d9..d43b5fd 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -33,6 +33,9 @@
 { 'type': 'UserDefB',
   'data': { 'integer': 'int' } }
 
+{ 'type': 'UserDefC',
+  'data': { 'string1': 'str', 'string2': 'str' } }
+
 { 'union': 'UserDefUnion',
   'base': 'UserDefZero',
   'data': { 'a' : 'UserDefA', 'b' : 'UserDefB' } }
@@ -47,6 +50,13 @@
 # FIXME generated struct UserDefFlatUnion has members for direct base
 # UserDefOne, but lacks members for indirect base UserDefZero
 
+# this variant of UserDefFlatUnion defaults to a union that uses fields with
+# allocated types to test corner cases in the cleanup/dealloc visitor
+{ 'union': 'UserDefFlatUnion2',
+  'base': 'UserDefUnionBase',
+  'discriminator': 'enum1',
+  'data': { 'value1' : 'UserDefC', 'value2' : 'UserDefB', 'value3' : 
'UserDefA' } }
+
 { 'union': 'UserDefAnonUnion',
   'discriminator': {},
   'data': { 'uda': 'UserDefA', 's': 'str', 'i': 'int' } }
diff --git a/tests/qapi-schema/qapi-schema-test.out 
b/tests/qapi-schema/qapi-schema-test.out
index 95e9899..08d7304 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -6,9 +6,11 @@
  OrderedDict([('type', 'UserDefNested'), ('data', OrderedDict([('string0', 
'str'), ('dict1', OrderedDict([('string1', 'str'), ('dict2', 
OrderedDict([('userdef1', 'UserDefOne'), ('string2', 'str')])), ('*dict3', 
OrderedDict([('userdef2', 'UserDefOne'), ('string3', 'str')]))]))]))]),
  OrderedDict([('type', 'UserDefA'), ('data', OrderedDict([('boolean', 
'bool')]))]),
  OrderedDict([('type', 'UserDefB'), ('data', OrderedDict([('integer', 
'int')]))]),
+ OrderedDict([('type', 'UserDefC'), ('data', OrderedDict([('string1', 'str'), 
('string2', 'str')]))]),
  OrderedDict([('union', 'UserDefUnion'), ('base', 'UserDefZero'), ('data', 
OrderedDict([('a', 'UserDefA'), ('b', 'UserDefB')]))]),
  OrderedDict([('type', 'UserDefUnionBase'), ('data', OrderedDict([('string', 
'str'), ('enum1', 'EnumOne')]))]),
  OrderedDict([('union', 'UserDefFlatUnion'), ('base', 'UserDefUnionBase'), 
('discriminator', 'enum1'), ('data', OrderedDict([('value1', 'UserDefA'), 
('value2', 'UserDefB'), ('value3', 'UserDefB')]))]),
+ OrderedDict([('union', 'UserDefFlatUnion2'), ('base', 'UserDefUnionBase'), 
('discriminator', 'enum1'), ('data', OrderedDict([('value1', 'UserDefC'), 
('value2', 'UserDefB'), ('value3', 'UserDefA')]))]),
  OrderedDict([('union', 'UserDefAnonUnion'), ('discriminator', OrderedDict()), 
('data', OrderedDict([('uda', 'UserDefA'), ('s', 'str'), ('i', 'int')]))]),
  OrderedDict([('union', 'UserDefNativeListUnion'), ('data', 
OrderedDict([('integer', ['int']), ('s8', ['int8']), ('s16', ['int16']), 
('s32', ['int32']), ('s64', ['int64']), ('u8', ['uint8']), ('u16', ['uint16']), 
('u32', ['uint32']), ('u64', ['uint64']), ('number', ['number']), ('boolean', 
['bool']), ('string', ['str'])]))]),
  OrderedDict([('command', 'user_def_cmd'), ('data', OrderedDict())]),
@@ -32,6 +34,7 @@
  OrderedDict([('type', 'UserDefNested'), ('data', OrderedDict([('string0', 
'str'), ('dict1', OrderedDict([('string1', 'str'), ('dict2', 
OrderedDict([('userdef1', 'UserDefOne'), ('string2', 'str')])), ('*dict3', 
OrderedDict([('userdef2', 'UserDefOne'), ('string3', 'str')]))]))]))]),
  OrderedDict([('type', 'UserDefA'), ('data', OrderedDict([('boolean', 
'bool')]))]),
  OrderedDict([('type', 'UserDefB'), ('data', OrderedDict([('integer', 
'int')]))]),
+ OrderedDict([('type', 'UserDefC'), ('data', OrderedDict([('string1', 'str'), 
('string2', 'str')]))]),
  OrderedDict([('type', 'UserDefUnionBase'), ('data', OrderedDict([('string', 
'str'), ('enum1', 'EnumOne')]))]),
  OrderedDict([('type', 'UserDefOptions'), ('data', OrderedDict([('*i64', 
['int']), ('*u64', ['uint64']), ('*u16', ['uint16']), ('*i64x', 'int'), 
('*u64x', 'uint64')]))]),
  OrderedDict([('type', 'EventStructOne'), ('data', OrderedDict([('struct1', 
'UserDefOne'), ('string', 'str'), ('*enum2', 'EnumOne')]))])]
diff --git a/tests/test-qmp-input-strict.c b/tests/test-qmp-input-strict.c
index 0f77003..d5360c6 100644
--- a/tests/test-qmp-input-strict.c
+++ b/tests/test-qmp-input-strict.c
@@ -260,6 +260,21 @@ static void 
test_validate_fail_union_flat(TestInputVisitorData *data,
 qapi_free_UserDefFlatUnion(tmp);
 }
 
+static void test_vali

[Qemu-devel] [PATCH 2/3] qapi: dealloc visitor, implement visit_start_union

2014-09-11 Thread Michael Roth
If the .data field of a QAPI Union is NULL, we don't need to free
any of the union fields..

Make use of the new visit_start_union interface to access this
information and instruct the generated code to not visit these
fields when this occurs.

Cc: qemu-sta...@nongnu.org
Reported-by: Fam Zheng 
Suggested-by: Paolo Bonzini 
Signed-off-by: Michael Roth 
---
 qapi/qapi-dealloc-visitor.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
index dc53545..eb77078 100644
--- a/qapi/qapi-dealloc-visitor.c
+++ b/qapi/qapi-dealloc-visitor.c
@@ -162,6 +162,11 @@ static void qapi_dealloc_type_enum(Visitor *v, int *obj, 
const char *strings[],
 {
 }
 
+static bool qapi_dealloc_start_union(Visitor *v, bool data_present, Error 
**errp)
+{
+return data_present;
+}
+
 Visitor *qapi_dealloc_get_visitor(QapiDeallocVisitor *v)
 {
 return &v->visitor;
@@ -191,6 +196,7 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
 v->visitor.type_str = qapi_dealloc_type_str;
 v->visitor.type_number = qapi_dealloc_type_number;
 v->visitor.type_size = qapi_dealloc_type_size;
+v->visitor.start_union = qapi_dealloc_start_union;
 
 QTAILQ_INIT(&v->stack);
 
-- 
1.9.1




[Qemu-devel] [PATCH 1/3] qapi: add visit_start_union and visit_end_union

2014-09-11 Thread Michael Roth
In some cases an input visitor might bail out on filling out a
struct for various reasons, such as missing fields when running
in strict mode. In the case of a QAPI Union type, this may lead
to cases where the .kind field which encodes the union type
is uninitialized. Subsequently, other visitors, such as the
dealloc visitor, may use this .kind value as if it were
initialized, leading to assumptions about the union type which
in this case may lead to segfaults. For example, freeing an
integer value.

However, we can generally rely on the fact that the always-present
.data void * field that we generate for these union types will
always be NULL in cases where .kind is uninitialized (at least,
there shouldn't be a reason where we'd do this purposefully).

So pass this information on to Visitor implementation via these
optional start_union/end_union interfaces so this information
can be used to guard against the situation above. We will make
use of this information in a subsequent patch for the dealloc
visitor.

Cc: qemu-sta...@nongnu.org
Reported-by: Fam Zheng 
Suggested-by: Paolo Bonzini 
Signed-off-by: Michael Roth 
---
 include/qapi/visitor-impl.h |  2 ++
 include/qapi/visitor.h  |  2 ++
 qapi/qapi-visit-core.c  | 15 +++
 scripts/qapi-visit.py   |  6 ++
 4 files changed, 25 insertions(+)

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index ecc0183..09bb0fd 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -55,6 +55,8 @@ struct Visitor
 void (*type_int64)(Visitor *v, int64_t *obj, const char *name, Error 
**errp);
 /* visit_type_size() falls back to (*type_uint64)() if type_size is unset 
*/
 void (*type_size)(Visitor *v, uint64_t *obj, const char *name, Error 
**errp);
+bool (*start_union)(Visitor *v, bool data_present, Error **errp);
+void (*end_union)(Visitor *v, bool data_present, Error **errp);
 };
 
 void input_type_enum(Visitor *v, int *obj, const char *strings[],
diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 4a0178f..5934f59 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -58,5 +58,7 @@ void visit_type_size(Visitor *v, uint64_t *obj, const char 
*name, Error **errp);
 void visit_type_bool(Visitor *v, bool *obj, const char *name, Error **errp);
 void visit_type_str(Visitor *v, char **obj, const char *name, Error **errp);
 void visit_type_number(Visitor *v, double *obj, const char *name, Error 
**errp);
+bool visit_start_union(Visitor *v, bool data_present, Error **errp);
+void visit_end_union(Visitor *v, bool data_present, Error **errp);
 
 #endif
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 55f8d40..b66b93a 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -58,6 +58,21 @@ void visit_end_list(Visitor *v, Error **errp)
 v->end_list(v, errp);
 }
 
+bool visit_start_union(Visitor *v, bool data_present, Error **errp)
+{
+if (v->start_union) {
+return v->start_union(v, data_present, errp);
+}
+return true;
+}
+
+void visit_end_union(Visitor *v, bool data_present, Error **errp)
+{
+if (v->end_union) {
+v->end_union(v, data_present, errp);
+}
+}
+
 void visit_optional(Visitor *v, bool *present, const char *name,
 Error **errp)
 {
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index c129697..4520da9 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -357,6 +357,9 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, const 
char *name, Error **e
 if (err) {
 goto out_obj;
 }
+if (!visit_start_union(m, !!(*obj)->data, &err)) {
+goto out_obj;
+}
 switch ((*obj)->kind) {
 ''',
  disc_type = disc_type,
@@ -385,6 +388,9 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, const 
char *name, Error **e
 out_obj:
 error_propagate(errp, err);
 err = NULL;
+visit_end_union(m, !!(*obj)->data, &err);
+error_propagate(errp, err);
+err = NULL;
 }
 visit_end_struct(m, &err);
 out:
-- 
1.9.1




Re: [Qemu-devel] [PATCH 0/1] Add support for DRM IOCTLs to QEMU user mode virtualization.

2014-09-11 Thread Aaditya Chandrasekhar Azad
Hi qemu devs,

I was wondering if anyone would be willing to test this patch with intel
or, if it will be useful, I can add support nouveau, radeon, vmxgfx / any
other mesa + drm gfx driver. I won't be able to easily test the latter
config, but given the nature of this patch, adding those IOCTLs should be
similarly mechanical and just work (tm?). The benefit from within
qemu-user-* is pretty awesome - direct rendering and support for all vendor
supported gfx apis (GL 3.3 core + GLES 3.0 on Intel Mesa). I was really
hoping to get X and/or Wayland working within a foreign architecture chroot
from console so I could run KDE4 or Gnome3 as a WM, but they fail on a
linux system call (signalfd, which is currently unimplemented). Perhaps
there isn't much work to be done there, I will take a deeper look. All
chroot apps I tested ran correctly with native X though, even Weston
launched as a window and rendered correctly with accel.

To properly test this patch, one will need to build libdrm,
xf86-video-intel and mesa in a foreign chroot. Building is easy, though one
may need to override the (seeming meaningless) x86 check in
xf86-video-intel's configure script. After building and installing setting
LD_LIBRARY_PATH and LIBGL_DRIVERS_PATH is sufficient to pick up the new
drivers. Radeon and nouveau should not need this work though.

Also this patch is fairly trivial, it would be awesome to get it reviewed
and committed :). I wasn't sure if this was appropriate for trivial patches
since the impact is strong user-visible hence this list.

Thanks,
Aaditya

On Fri, Sep 5, 2014 at 1:46 PM, Aaditya Chandrasekhar Azad <
aadit...@gmail.com> wrote:

> This patch adds initial user-virtualization support for the DRM (type
> 'd') IOCTLs in linux. With it and a corresponding architecture chroot
> (say aarch64), I am able to successfully run a few 2D and 3D
> applications with native graphics acceleration. Some notes/caveats
> are:
>
> 1. It will only work with open drivers as their IOCTLs are documented.
> 2. i965+ is only supported. That is any haswell, ivybridge, etc. GPU will
> work.
> 3. X doesn't start yet, though this patch eliminates all the visible
> unsupported DRM IOCTL calls as observed by setting QEMU_STRACE=1.
> 4. Intel open drivers components except for Beignet are architecture
> independent AFAICT and compile cleanly in foreign architecture
> environments.
> 5. The DRM table provided is current as of linux-3.17-rc2, however,
> compiling it with older kernel headers might require conditional
> guards that this patch doesn't provide yet.
> 6. syscalls.c now includes  files, which should be available
> in any valid linux-user build environment.
>
> Using a Debian aarch64 chroot on ubuntu amd64, I have successfully run
> nexuiz and
> compiled and run qemu-system-i386 with SDL emulation (+ patch for
> forcing OpenGL).
>
> Aaditya Chandrasekhar Azad (1):
>   Add support for DRM IOCTLs to QEMU user mode virtualization.
>
>  linux-user/ioctls.h| 137 ++
>  linux-user/syscall.c   |   2 +
>  linux-user/syscall_defs.h  | 139 ++
>  linux-user/syscall_types.h | 651
> +
>  thunk.c|   2 +-
>  5 files changed, 930 insertions(+), 1 deletion(-)
>
> --
> 2.1.0.60.g85f0837
>



-- 
Aaditya Chandrasekhar Azad
Texas-Ex!
Tel: 650-862-4685


Re: [Qemu-devel] [PATCH v6 00/16] KVM platform device passthrough

2014-09-11 Thread Christoffer Dall
On Thu, Sep 11, 2014 at 04:51:14PM -0600, Alex Williamson wrote:
> On Thu, 2014-09-11 at 15:23 -0700, Christoffer Dall wrote:
> > On Thu, Sep 11, 2014 at 04:14:09PM -0600, Alex Williamson wrote:
> > > On Tue, 2014-09-09 at 08:31 +0100, Eric Auger wrote:
> > > > This RFC series aims at enabling KVM platform device passthrough.
> > > > It implements a VFIO platform device, derived from VFIO PCI device.
> > > > 
> > > > The VFIO platform device uses the host VFIO platform driver which must
> > > > be bound to the assigned device prior to the QEMU system start.
> > > > 
> > > > - the guest can directly access the device register space
> > > > - assigned device IRQs are transparently routed to the guest by
> > > >   QEMU/KVM (3 methods currently are supported: user-level eventfd
> > > >   handling, irqfd, forwarded IRQs)
> > > > - iommu is transparently programmed to prevent the device from
> > > >   accessing physical pages outside of the guest address space
> > > > 
> > > > This patch series is made of the following patch files:
> > > > 
> > > > 1-7) Modifications to PCI code to prepare for VFIO platform device
> > > > 8) split of PCI specific code and generic code (move)
> > > > 9-11) creation of the VFIO calxeda xgmac platform device, without irqfd
> > > >   support (MMIO direct access and IRQ assignment).
> > > > 12) fake injection test modality (to test multiple IRQ)
> > > > 13) addition of irqfd/virqfd support
> > > > 14-16) forwarded IRQ
> > > > 
> > > > Dependency List:
> > > > 
> > > > QEMU dependencies:
> > > > [1] [PATCH v2 0/9] Dynamic sysbus device allocation support, Alex Graf
> > > > http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
> > > > [2] [RFC v3] machvirt dynamic sysbus device instantiation, Eric Auger
> > > > [3] [PATCH v2 0/2] actual checks of KVM_CAP_IRQFD and 
> > > > KVM_CAP_IRQFD_RESAMPLE,
> > > > Eric Auger
> > > > 
> > > > http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
> > > > [4] [RFC] vfio: migration to trace points, Eric Auger
> > > > 
> > > > http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00569.html
> > > > 
> > > > Kernel Dependencies:
> > > > [5] [RFC Patch v6 0/20] VFIO support for platform devices, Antonios 
> > > > Motakis
> > > > https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
> > > > [6] [PATCH v3] ARM: KVM: add irqfd support, Eric Auger
> > > > https://lkml.org/lkml/2014/9/1/141
> > > > [7] arm/arm64: KVM: Various VGIC cleanups and improvements, Christoffer 
> > > > Dall
> > > > http://comments.gmane.org/gmane.linux.ports.arm.kernel/340430
> > > > [8] [RFC v2 0/9] KVM-VFIO IRQ forward control, Eric Auger
> > > > https://lkml.org/lkml/2014/9/1/344
> > > > [9] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
> > > > Marc Zyngier
> > > > http://lwn.net/Articles/603514/
> > > > 
> > > > kernel pieces can be found at:
> > > > http://git.linaro.org/people/eric.auger/linux.git
> > > > (branch 3.17rc3_irqfd_forward_integ_v2)
> > > > QEMU pieces can be found at:
> > > > http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v6)
> > > > 
> > > > The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
> > > > is assigned to KVM host while the second one is assigned to the guest.
> > > > Reworked PCI device is not tested.
> > > > 
> > > > Wiki for Calxeda Midway setup:
> > > > https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway
> > > > 
> > > > History:
> > > > 
> > > > v5->v6:
> > > > - rebase on 2.1rc5 PCI code
> > > > - forwarded IRQ first integraton
> > > 
> > > Why?  Are there acceleration paths that you're concerned cannot be
> > > implemented or we do not already have a proof of concept for?  The base
> > > kernel patch series you depend on is 3 months old yet this series
> > > continues to grow and add new dependencies.  Please let's prioritize
> > > getting something upstream instead of adding more blockers to prevent
> > > that.  Thanks,
> > > 
> > I'm not exactly sure what this changelog line was referring to
> > (depending on Marc's forwarding IRQ patches?), but just want to add that
> > there are a number of dependencies for the GIC that need to go in as
> > well (should happen within a few weeks), but I think it's unlikely that
> > the IRQ forwarding stuff goes in for v3.18 at this point.
> > 
> > It may make sense as you suggest to keep that part out of this patch set
> > and something merged sooner as opposed to later, but I'm too jet-lagged
> > to completely understand if that's going to be a horrible mess.
> 
> The point is that we're on v6 of a patch series and its first non-RFC
> posting and we're rolling in a first pass at a QEMU implementation that
> depends on a contested kernel RFC, which depends on another stagnant
> kernel RFC.  I'm fine with working on it in parallel, but give me some
> light at the end of the tunnel as a reviewer and maintainer that this
> code

Re: [Qemu-devel] [PATCH] qapi: Fix crash with enum dealloc when kind is invalid

2014-09-11 Thread Michael Roth
Quoting Paolo Bonzini (2014-09-11 09:35:58)
> Il 11/09/2014 16:26, Michael Roth ha scritto:
> > Also, the .kind field of a QAPI Union type is something we generate for use
> > by the generated visitor code. In the case of an unspecified discriminator
> > we generated the enum type for that field internally. In the case where it's
> > specified, we use an existing enum instead...
> > 
> > But nothing stops us from generating a new "shadow" enum in this case as 
> > well,
> > with the indexes/integer values of the corresponding strings shifted by one 
> > so
> > we can reserve the 0 index for _INVALID. I think we can reasonably expect 
> > that
> > nothing outside the generated code makes use of those integer values in this
> > special case, and don't have to change all enum types to make that work.
> 
> But how would users fill in structs if you have to use a different enum?

Argh, of course, we do still make direct use of these going in the other
direction. Those users would need to use the "shadow" enum values to make
it work, which is probably way too messy.

> 
> What about making adding visit_start_union/visit_end_union?
> visit_start_union can return false if the visit of the union has to be
> skipped.
> 
> The dealloc visitor can skip it if the data field is NULL; everything
> else can just use a default implementation which always returns true.

I forgot we had a void *data there as well. So we're basically relying
on .data != NULL implying that .kind has been properly initialized,
rather than needing to encode anything into .kind... nice.

I can imagine a case where we allocate memory for a set of union fields
(so .data != NULL) and then leave .kind uninitialized, which can still
lead to segfaults due to improper casts in the dealloc visitor, but I
don't really see a way around that. Even if we reserve .kind == 0 for
this purpose, it's still up to the user or visitor implementation to
0-initialize everything (though that's a bit easier to enforce).

So this seems like a good approach. I've ahead and hacked something up
which I'll send out shortly.

> 
> Paolo




Re: [Qemu-devel] [PATCH v6 00/16] KVM platform device passthrough

2014-09-11 Thread Alex Williamson
On Thu, 2014-09-11 at 15:23 -0700, Christoffer Dall wrote:
> On Thu, Sep 11, 2014 at 04:14:09PM -0600, Alex Williamson wrote:
> > On Tue, 2014-09-09 at 08:31 +0100, Eric Auger wrote:
> > > This RFC series aims at enabling KVM platform device passthrough.
> > > It implements a VFIO platform device, derived from VFIO PCI device.
> > > 
> > > The VFIO platform device uses the host VFIO platform driver which must
> > > be bound to the assigned device prior to the QEMU system start.
> > > 
> > > - the guest can directly access the device register space
> > > - assigned device IRQs are transparently routed to the guest by
> > >   QEMU/KVM (3 methods currently are supported: user-level eventfd
> > >   handling, irqfd, forwarded IRQs)
> > > - iommu is transparently programmed to prevent the device from
> > >   accessing physical pages outside of the guest address space
> > > 
> > > This patch series is made of the following patch files:
> > > 
> > > 1-7) Modifications to PCI code to prepare for VFIO platform device
> > > 8) split of PCI specific code and generic code (move)
> > > 9-11) creation of the VFIO calxeda xgmac platform device, without irqfd
> > >   support (MMIO direct access and IRQ assignment).
> > > 12) fake injection test modality (to test multiple IRQ)
> > > 13) addition of irqfd/virqfd support
> > > 14-16) forwarded IRQ
> > > 
> > > Dependency List:
> > > 
> > > QEMU dependencies:
> > > [1] [PATCH v2 0/9] Dynamic sysbus device allocation support, Alex Graf
> > > http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
> > > [2] [RFC v3] machvirt dynamic sysbus device instantiation, Eric Auger
> > > [3] [PATCH v2 0/2] actual checks of KVM_CAP_IRQFD and 
> > > KVM_CAP_IRQFD_RESAMPLE,
> > > Eric Auger
> > > http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
> > > [4] [RFC] vfio: migration to trace points, Eric Auger
> > > http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00569.html
> > > 
> > > Kernel Dependencies:
> > > [5] [RFC Patch v6 0/20] VFIO support for platform devices, Antonios 
> > > Motakis
> > > https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
> > > [6] [PATCH v3] ARM: KVM: add irqfd support, Eric Auger
> > > https://lkml.org/lkml/2014/9/1/141
> > > [7] arm/arm64: KVM: Various VGIC cleanups and improvements, Christoffer 
> > > Dall
> > > http://comments.gmane.org/gmane.linux.ports.arm.kernel/340430
> > > [8] [RFC v2 0/9] KVM-VFIO IRQ forward control, Eric Auger
> > > https://lkml.org/lkml/2014/9/1/344
> > > [9] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
> > > Marc Zyngier
> > > http://lwn.net/Articles/603514/
> > > 
> > > kernel pieces can be found at:
> > > http://git.linaro.org/people/eric.auger/linux.git
> > > (branch 3.17rc3_irqfd_forward_integ_v2)
> > > QEMU pieces can be found at:
> > > http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v6)
> > > 
> > > The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
> > > is assigned to KVM host while the second one is assigned to the guest.
> > > Reworked PCI device is not tested.
> > > 
> > > Wiki for Calxeda Midway setup:
> > > https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway
> > > 
> > > History:
> > > 
> > > v5->v6:
> > > - rebase on 2.1rc5 PCI code
> > > - forwarded IRQ first integraton
> > 
> > Why?  Are there acceleration paths that you're concerned cannot be
> > implemented or we do not already have a proof of concept for?  The base
> > kernel patch series you depend on is 3 months old yet this series
> > continues to grow and add new dependencies.  Please let's prioritize
> > getting something upstream instead of adding more blockers to prevent
> > that.  Thanks,
> > 
> I'm not exactly sure what this changelog line was referring to
> (depending on Marc's forwarding IRQ patches?), but just want to add that
> there are a number of dependencies for the GIC that need to go in as
> well (should happen within a few weeks), but I think it's unlikely that
> the IRQ forwarding stuff goes in for v3.18 at this point.
> 
> It may make sense as you suggest to keep that part out of this patch set
> and something merged sooner as opposed to later, but I'm too jet-lagged
> to completely understand if that's going to be a horrible mess.

The point is that we're on v6 of a patch series and its first non-RFC
posting and we're rolling in a first pass at a QEMU implementation that
depends on a contested kernel RFC, which depends on another stagnant
kernel RFC.  I'm fine with working on it in parallel, but give me some
light at the end of the tunnel as a reviewer and maintainer that this
code isn't going to live indefinitely on the mailing list.  Do we really
need those GIC patches do be able to have non-KVM accelerated VFIO
platform device assignment?  We certainly don't need IRQ forwarding.
Thanks,

Alex




Re: [Qemu-devel] [PATCH v6 00/16] KVM platform device passthrough

2014-09-11 Thread Christoffer Dall
On Thu, Sep 11, 2014 at 04:14:09PM -0600, Alex Williamson wrote:
> On Tue, 2014-09-09 at 08:31 +0100, Eric Auger wrote:
> > This RFC series aims at enabling KVM platform device passthrough.
> > It implements a VFIO platform device, derived from VFIO PCI device.
> > 
> > The VFIO platform device uses the host VFIO platform driver which must
> > be bound to the assigned device prior to the QEMU system start.
> > 
> > - the guest can directly access the device register space
> > - assigned device IRQs are transparently routed to the guest by
> >   QEMU/KVM (3 methods currently are supported: user-level eventfd
> >   handling, irqfd, forwarded IRQs)
> > - iommu is transparently programmed to prevent the device from
> >   accessing physical pages outside of the guest address space
> > 
> > This patch series is made of the following patch files:
> > 
> > 1-7) Modifications to PCI code to prepare for VFIO platform device
> > 8) split of PCI specific code and generic code (move)
> > 9-11) creation of the VFIO calxeda xgmac platform device, without irqfd
> >   support (MMIO direct access and IRQ assignment).
> > 12) fake injection test modality (to test multiple IRQ)
> > 13) addition of irqfd/virqfd support
> > 14-16) forwarded IRQ
> > 
> > Dependency List:
> > 
> > QEMU dependencies:
> > [1] [PATCH v2 0/9] Dynamic sysbus device allocation support, Alex Graf
> > http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
> > [2] [RFC v3] machvirt dynamic sysbus device instantiation, Eric Auger
> > [3] [PATCH v2 0/2] actual checks of KVM_CAP_IRQFD and 
> > KVM_CAP_IRQFD_RESAMPLE,
> > Eric Auger
> > http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
> > [4] [RFC] vfio: migration to trace points, Eric Auger
> > http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00569.html
> > 
> > Kernel Dependencies:
> > [5] [RFC Patch v6 0/20] VFIO support for platform devices, Antonios Motakis
> > https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
> > [6] [PATCH v3] ARM: KVM: add irqfd support, Eric Auger
> > https://lkml.org/lkml/2014/9/1/141
> > [7] arm/arm64: KVM: Various VGIC cleanups and improvements, Christoffer Dall
> > http://comments.gmane.org/gmane.linux.ports.arm.kernel/340430
> > [8] [RFC v2 0/9] KVM-VFIO IRQ forward control, Eric Auger
> > https://lkml.org/lkml/2014/9/1/344
> > [9] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
> > Marc Zyngier
> > http://lwn.net/Articles/603514/
> > 
> > kernel pieces can be found at:
> > http://git.linaro.org/people/eric.auger/linux.git
> > (branch 3.17rc3_irqfd_forward_integ_v2)
> > QEMU pieces can be found at:
> > http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v6)
> > 
> > The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
> > is assigned to KVM host while the second one is assigned to the guest.
> > Reworked PCI device is not tested.
> > 
> > Wiki for Calxeda Midway setup:
> > https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway
> > 
> > History:
> > 
> > v5->v6:
> > - rebase on 2.1rc5 PCI code
> > - forwarded IRQ first integraton
> 
> Why?  Are there acceleration paths that you're concerned cannot be
> implemented or we do not already have a proof of concept for?  The base
> kernel patch series you depend on is 3 months old yet this series
> continues to grow and add new dependencies.  Please let's prioritize
> getting something upstream instead of adding more blockers to prevent
> that.  Thanks,
> 
I'm not exactly sure what this changelog line was referring to
(depending on Marc's forwarding IRQ patches?), but just want to add that
there are a number of dependencies for the GIC that need to go in as
well (should happen within a few weeks), but I think it's unlikely that
the IRQ forwarding stuff goes in for v3.18 at this point.

It may make sense as you suggest to keep that part out of this patch set
and something merged sooner as opposed to later, but I'm too jet-lagged
to completely understand if that's going to be a horrible mess.

Thanks,
-Christoffer



Re: [Qemu-devel] [PATCH v6 00/16] KVM platform device passthrough

2014-09-11 Thread Alex Williamson
On Tue, 2014-09-09 at 08:31 +0100, Eric Auger wrote:
> This RFC series aims at enabling KVM platform device passthrough.
> It implements a VFIO platform device, derived from VFIO PCI device.
> 
> The VFIO platform device uses the host VFIO platform driver which must
> be bound to the assigned device prior to the QEMU system start.
> 
> - the guest can directly access the device register space
> - assigned device IRQs are transparently routed to the guest by
>   QEMU/KVM (3 methods currently are supported: user-level eventfd
>   handling, irqfd, forwarded IRQs)
> - iommu is transparently programmed to prevent the device from
>   accessing physical pages outside of the guest address space
> 
> This patch series is made of the following patch files:
> 
> 1-7) Modifications to PCI code to prepare for VFIO platform device
> 8) split of PCI specific code and generic code (move)
> 9-11) creation of the VFIO calxeda xgmac platform device, without irqfd
>   support (MMIO direct access and IRQ assignment).
> 12) fake injection test modality (to test multiple IRQ)
> 13) addition of irqfd/virqfd support
> 14-16) forwarded IRQ
> 
> Dependency List:
> 
> QEMU dependencies:
> [1] [PATCH v2 0/9] Dynamic sysbus device allocation support, Alex Graf
> http://lists.gnu.org/archive/html/qemu-ppc/2014-07/msg00047.html
> [2] [RFC v3] machvirt dynamic sysbus device instantiation, Eric Auger
> [3] [PATCH v2 0/2] actual checks of KVM_CAP_IRQFD and KVM_CAP_IRQFD_RESAMPLE,
> Eric Auger
> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00589.html
> [4] [RFC] vfio: migration to trace points, Eric Auger
> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg00569.html
> 
> Kernel Dependencies:
> [5] [RFC Patch v6 0/20] VFIO support for platform devices, Antonios Motakis
> https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html
> [6] [PATCH v3] ARM: KVM: add irqfd support, Eric Auger
> https://lkml.org/lkml/2014/9/1/141
> [7] arm/arm64: KVM: Various VGIC cleanups and improvements, Christoffer Dall
> http://comments.gmane.org/gmane.linux.ports.arm.kernel/340430
> [8] [RFC v2 0/9] KVM-VFIO IRQ forward control, Eric Auger
> https://lkml.org/lkml/2014/9/1/344
> [9] [RFC PATCH 0/9] ARM: Forwarding physical interrupts to a guest VM,
> Marc Zyngier
> http://lwn.net/Articles/603514/
> 
> kernel pieces can be found at:
> http://git.linaro.org/people/eric.auger/linux.git
> (branch 3.17rc3_irqfd_forward_integ_v2)
> QEMU pieces can be found at:
> http://git.linaro.org/people/eric.auger/qemu.git (branch vfio_integ_v6)
> 
> The patch series was tested on Calxeda Midway (ARMv7) where one xgmac
> is assigned to KVM host while the second one is assigned to the guest.
> Reworked PCI device is not tested.
> 
> Wiki for Calxeda Midway setup:
> https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway
> 
> History:
> 
> v5->v6:
> - rebase on 2.1rc5 PCI code
> - forwarded IRQ first integraton

Why?  Are there acceleration paths that you're concerned cannot be
implemented or we do not already have a proof of concept for?  The base
kernel patch series you depend on is 3 months old yet this series
continues to grow and add new dependencies.  Please let's prioritize
getting something upstream instead of adding more blockers to prevent
that.  Thanks,

Alex

> - vfio_device property renamed into host property
> - split IRQ setup in different functions that match the 3 supported
>   injection techniques (user handled eventfd, irqfd, forwarded IRQ):
>   removes dynamic switch between injection methods
> - introduce fake interrupts as a test modality:
>   x makes possible to test multiple IRQ user-side handling.
>   x this is a test feature only: enable to trigger a fd as if the
> real physical IRQ hit. No virtual IRQ is injected into the guest
> but handling is simulated so that the state machine can be tested
> - user handled eventfd:
>   x add mutex to protect IRQ state & list manipulation,
>   x correct misleading comment in vfio_intp_interrupt.
>   x Fix bugs using fake interrupt modality
> - irqfd no more advertised in this patchset (handled in [3])
> - VFIOPlatformDeviceClass becomes abstract and Calxeda xgmac device
>   and class is re-introduced (as per v4)
> - all DPRINTF removed in platform and replaced by trace-points
> - corrects compilation with configure --disable-kvm
> - simplifies the split for vfio_get_device and introduce a unique
>   specialized function named vfio_populate_device
> - group_list renamed into vfio_group_list
> - hw/arm/dyn_sysbus_devtree.c currently only support vfio-calxeda-xgmac
>   instantiation. Needs to be specialized for other VFIO devices
> - fix 2 bugs in dyn_sysbus_devtree(reg_attr index and compat)
> 
> v4->v5:
> - rebase on v2.1.0 PCI code
> - take into account Alex Williamson comments on PCI code rework
>   - trace updates in vfio_region_write/read
>   - remove fd from VFIORegion
>   - get/put ckeanup
> - bug f

Re: [Qemu-devel] [PATCH] cpu-exec: Don't mask out external interrupts when single-stepping an invalid instruction.

2014-09-11 Thread Peter Maydell
[cc'ing RTH who may have a better grasp on how the builtin single step is
supposed to work.]

On 11 September 2014 22:02, Martin Galvan
 wrote:
> When using Gdb to remote-debug a program, if we try to single-step an
> invalid instruction,
> Qemu will never return control to the remote Gdb.
> The source of this problem is external interrupts being masked out
> in cpu_exec if cpu->singlestep_enabled has the SSTEP_NOIRQ flag set.
> To solve this I've added an additional flag, SSTEP_EXCEPTION,
> that will be set in the exception_with_syndrome instruction generated
> when trying to translate the invalid instruction and will be cleared
> after checking for cpu->singlestep_enabled in cpu_exec.
>
> Signed-off-by: Martin Galvan 
> ---
>
> The long story: Qemu generates an exception_with_syndrome instruction
> when it realizes the instruction it's trying to translate is invalid.
> That instruction in turn modifies cs->exception_index and calls cpu_loop_exit.
> Normally, the value in cs->exception_index would cause do_interrupt to set
> the PC to point to the corresponding exception handler.
> However, since we're masking out IRQs, the PC will never be set correctly.
> Even worse, since Qemu will have generated an internal exception
> to return control back to the remote Gdb *after* it generated the syndrome 
> one,
> its code will appear after the call to cpu_loop_exit.
> Since the PC won't have changed, we'll try to excecute
> the same translation block as before, thus calling cpu_loop_exit
> again, and so on.
>
> Here's the bug tracker link: https://bugs.launchpad.net/qemu/+bug/1364501
>
>  cpu-exec.c | 6 +-
>  include/qom/cpu.h  | 5 +
>  target-arm/op_helper.c | 5 +
>  3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 5fa172c..5d9f231 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -448,9 +448,13 @@ int cpu_exec(CPUArchState *env)
>  for(;;) {
>  interrupt_request = cpu->interrupt_request;
>  if (unlikely(interrupt_request)) {
> -if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
> +if (unlikely((cpu->singlestep_enabled & SSTEP_NOIRQ) &&
> +!(cpu->singlestep_enabled & SSTEP_EXCEPTION))) {
>  /* Mask out external interrupts for this step. */
>  interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
> +} else { /* An exception occured; don't mask out
> this one */
> +/* Mask out any future external interrupts */
> +cpu->singlestep_enabled &= ~SSTEP_EXCEPTION;
>  }
>  if (interrupt_request & CPU_INTERRUPT_DEBUG) {
>  cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index f2df033..a57800f 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -607,6 +607,11 @@ void qemu_init_vcpu(CPUState *cpu);
>  #define SSTEP_ENABLE0x1  /* Enable simulated HW single stepping */
>  #define SSTEP_NOIRQ 0x2  /* Do not use IRQ while single stepping */
>  #define SSTEP_NOTIMER   0x4  /* Do not Timers while single stepping */
> +#define SSTEP_EXCEPTION 0x8  /* Don't mask out exception-related
> IRQs. Set only if
> +  * we have to process an exception while single-
> +  * stepping (such as when
> single-stepping an invalid
> +  * instruction).
> +  */
>
>  /**
>   * cpu_single_step:
> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
> index fe40358..2139ea6 100644
> --- a/target-arm/op_helper.c
> +++ b/target-arm/op_helper.c
> @@ -251,6 +251,11 @@ void HELPER(exception_with_syndrome)(CPUARMState
> *env, uint32_t excp,
>  CPUState *cs = CPU(arm_env_get_cpu(env));
>
>  assert(!excp_is_internal(excp));
> +
> +if (unlikely(cs->singlestep_enabled & SSTEP_NOIRQ)) {
> +cs->singlestep_enabled |= SSTEP_EXCEPTION;
> +}
> +
>  cs->exception_index = excp;
>  env->exception.syndrome = syndrome;
>  cpu_loop_exit(cs);
> --
> 1.9.1
>
> --
>
> Martín Galván
>
> Software Engineer
>
> Taller Technologies Argentina
>
> San Lorenzo 47, 3rd Floor, Office 5
>
> Córdoba, Argentina
>
> Phone: 54 351 4217888 / +54 351 4218211


thanks
-- PMM



Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-09-11 Thread Walid Nouri

Am 11.09.2014 19:44, schrieb Dr. David Alan Gilbert:


For keeping the complete system state consistent on the secondary system
there must be a possibility for MC to commit/discard block device state
changes. In normal operation the mirrored block device state changes (block
buffer) are committed to disk when the complete checkpoint is committed. In
case of a crash of the primary system while transferring a checkpoint the
data in the block buffer corresponding to the failed Checkpoint must be
discarded.


I think for COLO there's a requirement that the secondary can do reads/writes
in parallel with the primary, and the secondary can discard those reads/writes
- and that doesn't happen in MC (Yang or Eddie should be able to confirm that).


The storage architecture should be ???shared nothing??? so that no shared
storage is required and primary/secondary can have separate block device
images.


I admit that my formulation was unintentionally a bit ambiguous :)
I should have written that a shared storage should not be mandatory.
I'm comming from an SMB environment and (redundant) shared storage 
systems are still not usual in small companies :)


I looked for a storage agnostic approach which allows the number of 
system components to be as low as possible and still get redundancy and 
fault tolerance.




MC/COLO with shared storage still needs some stuff like this; but it's subtely
different.   They still need to be able to buffer/release modifications
to the shared storage; if any of this code can also be used in the
shared-storage configurations it would be good.


The proposed approach with block filter and the commit/discard protocol 
should be storage agnostic and will also work in a shared storage 
environment, but only with distinct images (because of the protocol).


In case of a shared storage and a common image used by the primary and 
secondary another storage protocol must be used.


It's not commit/discard but commit/rollback

The primary still sends asynchronously the block state changes. The 
secondary buffers block device state changes but doesn't apply them in 
normal operation. When the next checkpoint is complete the secondary 
clears the buffer and forgets about the old block state data.


If the primary fails the secondary must rollback the common image with 
the block state data corresponding to the actual checkpoint.
Otherwise the state of the image and rest of the system state on the 
secondary will not be in sync.


When there is no block state data corresponding to the actual 
checkpoint, then there is nothing to do on the storage for the secondary :)


There is a little danger in this though. When the secondary fails during 
rollback, the common image will be left in an inconsistent state.
I think this risk cannot be avoided when using a common image. But this 
unfortunate situation can also happen in other scenarios.


Sharing a common immage with this protocol will lead to a longer fail 
over time in case of existing block device state data for the actual 
checkpoint. The secondary must initiate the rollback and wait until all 
blocks of the actual checkpoint are commited to the common immage before 
taking over the active role.



Walid






[Qemu-devel] [PATCH] vhost-user: fix VIRTIO_NET_F_MRG_RXBUF negotiation

2014-09-11 Thread damarion
From: Damjan Marion 

Header length check should happen only if backend is kernel. For user
backend there is no reason to reset this bit.

vhost-user code does not define .has_vnet_hdr_len so
VIRTIO_NET_F_MRG_RXBUF cannot be negotiated even if both sides
support it.

Signed-off-by: Damjan Marion 
---
 hw/net/vhost_net.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index f87c798..9d1172a 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -162,11 +162,11 @@ struct vhost_net *vhost_net_init(VhostNetOptions *options)
 if (r < 0) {
 goto fail;
 }
-if (!qemu_has_vnet_hdr_len(options->net_backend,
-   sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
-net->dev.features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF);
-}
 if (backend_kernel) {
+if (!qemu_has_vnet_hdr_len(options->net_backend,
+   sizeof(struct virtio_net_hdr_mrg_rxbuf))) {
+net->dev.features &= ~(1 << VIRTIO_NET_F_MRG_RXBUF);
+}
 if (~net->dev.features & net->dev.backend_features) {
 fprintf(stderr, "vhost lacks feature mask %" PRIu64
" for backend\n",
-- 
1.9.1




Re: [Qemu-devel] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Alexander Graf


> Am 11.09.2014 um 22:46 schrieb Laszlo Ersek :
> 
>> On 09/11/14 22:16, Gabriel L. Somlo wrote:
>>> On Thu, Sep 11, 2014 at 06:40:38PM +0200, Paolo Bonzini wrote:
>>> Il 11/09/2014 18:35, Gabriel L. Somlo ha scritto:
>> Can you configure Chamaleon to avoid the boot prompt?
 Yes. After doing that, usb starts working once OS X is fully booted.
 
 Works with either piix or q35 just fine.
 
 Does this mean it's likely to be an OVMF uhci/ehci issue specific to Q35 ?
 (one from which Fedora can recover but OS X can't) ?
>>> 
>>> Yes, that's my interpretation too.
>>> 
>>> You did test an UHCI controller, I think, but I don't remember---did you
>>> test an EHCI controller without companions, using something like
>>> "-device ich9-usb-ehci1,id=myehci -device usb-keyboard,bus=myehci.0"?
>> 
>> Not only does that work (after applying Jan Vesely's 3-patch series from
>> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02175.html),
>> but my "regular" command line starts working as well, since now both
>> keyboard and mouse get routed to ehci instead of uhci.
>> 
>>> If that works, the issue would be specific to EHCI companion
>>> controllers.  If that doesn't work, there is at least a generic in the
>>> EHCI driver---of course there could possibly be another in the companion
>>> controllers, but I'd try getting EHCI alone to work.
>> 
>> So, here's my command line again:
>> 
>> bin/qemu-system-x86_64 -enable-kvm -m 2048 -cpu core2duo -smp 4,cores=2 \
>>  -machine q35 -device ide-drive,bus=ide.2,drive=MacHDD \
>>  -drive id=MacHDD,if=none,snapshot=on,file=./mac_10.9.img \
>>  -device isa-applesmc,osk="our...Inc" \
>>  -netdev user,id=usr0 -device e1000-82545em,netdev=usr0,id=vnet0 \
>>  -usb -device usb-kbd -device usb-mouse \
>>  [ -bios OVMF.fd  |  -kernel chameleon_boot ]
>> 
>> 
>> With this, I was able to get a better idea what the problem is. With
>> either OVMF or Chameleon, I still see UHCI 1..3 and EHCI in qtree,
>> with the mouse and keyboard hanging off of EHCI.
>> 
>> However, from inside the OS X guest's system profiler, I see a
>> difference:
>> 
>> With Chameleon (and SeaBIOS), I see all three UHCIs (with no attached
>> devices), and EHCI (with the now-high-speed keyboard and mouse).
>> 
>> With OVMF, I have EHCI with the high-speed (and working)
>> keyboard+mouse, but ONLY ONE UHCI device (UHCI3). UHCI1 UHCI2 do not
>> get detected.
>> 
>> Obviously, since QEMU was routing the slow keyboard+mouse to UHCI1,
>> which for some reason gets masked when OVMF is used, things weren't
>> working.
>> 
>> So now the question is, how come Fedora can find UHCI 1&2 on Q35+OVMF,
>> but OS X can't, and what can we do to OVMF (and/or QEMU) to fix it.
>> 
>> Obviously, Jan's patch set woud paper over the issue, as the keyboard
>> and mouse would go to ehci, but there's still the issue of the
>> disappearing UHCI's :)
> 
> Please save the OVMF debug log, maybe we can catch something in it.
> 
> -global isa-debugcon.iobase=0x402 -debugcon file:ovmf.log
> 
> It could be also worthwhile to boot OVMF (not OSX, just OVMF) on both
> i440fx and q35, with the otherwise same command line, and compare the
> two debug logs.

XNU also populates its device tree based on the DSDT. Maybe there's a subtle 
difference there?

Alex

> 
> Thanks
> Laszlo



Re: [Qemu-devel] New requirement for getting block layer patches merged

2014-09-11 Thread Benoît Canet
> EOF
> ---
> If you have feedback or questions, let us know.  The process can be
> tweaked as time goes on so we can continue to improve.

Great mail.

Now we need a wiki entry describing the process.
Also we need something reminding who is the maintainer of the current week.

Best regards

Benoît



Re: [Qemu-devel] [PATCH 0/2] vfio: Another try to fix ROM BAR endianness

2014-09-11 Thread Alex Williamson
On Tue, 2014-09-09 at 21:34 +1000, Alexey Kardashevskiy wrote:
> I did some tests on LE/BE guest/host combinations and came up with
> these 2 patches. Please comment. Thanks.

Have you tried BE guest/LE host?  IIRC, the g3beige tcg guest on x86
host used to work with a vfio-pci NIC.  The ROM is of course typically
useless in this mode, but it would be nice to make sure we haven't
broken BAR access for that combination.  Do we have any better BE tcg
targets with PCI support these days?  Thanks,

Alex




[Qemu-devel] [PATCH] cpu-exec: Don't mask out external interrupts when single-stepping an invalid instruction.

2014-09-11 Thread Martin Galvan
When using Gdb to remote-debug a program, if we try to single-step an
invalid instruction,
Qemu will never return control to the remote Gdb.
The source of this problem is external interrupts being masked out
in cpu_exec if cpu->singlestep_enabled has the SSTEP_NOIRQ flag set.
To solve this I've added an additional flag, SSTEP_EXCEPTION,
that will be set in the exception_with_syndrome instruction generated
when trying to translate the invalid instruction and will be cleared
after checking for cpu->singlestep_enabled in cpu_exec.

Signed-off-by: Martin Galvan 
---

The long story: Qemu generates an exception_with_syndrome instruction
when it realizes the instruction it's trying to translate is invalid.
That instruction in turn modifies cs->exception_index and calls cpu_loop_exit.
Normally, the value in cs->exception_index would cause do_interrupt to set
the PC to point to the corresponding exception handler.
However, since we're masking out IRQs, the PC will never be set correctly.
Even worse, since Qemu will have generated an internal exception
to return control back to the remote Gdb *after* it generated the syndrome one,
its code will appear after the call to cpu_loop_exit.
Since the PC won't have changed, we'll try to excecute
the same translation block as before, thus calling cpu_loop_exit
again, and so on.

Here's the bug tracker link: https://bugs.launchpad.net/qemu/+bug/1364501

 cpu-exec.c | 6 +-
 include/qom/cpu.h  | 5 +
 target-arm/op_helper.c | 5 +
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 5fa172c..5d9f231 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -448,9 +448,13 @@ int cpu_exec(CPUArchState *env)
 for(;;) {
 interrupt_request = cpu->interrupt_request;
 if (unlikely(interrupt_request)) {
-if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
+if (unlikely((cpu->singlestep_enabled & SSTEP_NOIRQ) &&
+!(cpu->singlestep_enabled & SSTEP_EXCEPTION))) {
 /* Mask out external interrupts for this step. */
 interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
+} else { /* An exception occured; don't mask out
this one */
+/* Mask out any future external interrupts */
+cpu->singlestep_enabled &= ~SSTEP_EXCEPTION;
 }
 if (interrupt_request & CPU_INTERRUPT_DEBUG) {
 cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index f2df033..a57800f 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -607,6 +607,11 @@ void qemu_init_vcpu(CPUState *cpu);
 #define SSTEP_ENABLE0x1  /* Enable simulated HW single stepping */
 #define SSTEP_NOIRQ 0x2  /* Do not use IRQ while single stepping */
 #define SSTEP_NOTIMER   0x4  /* Do not Timers while single stepping */
+#define SSTEP_EXCEPTION 0x8  /* Don't mask out exception-related
IRQs. Set only if
+  * we have to process an exception while single-
+  * stepping (such as when
single-stepping an invalid
+  * instruction).
+  */

 /**
  * cpu_single_step:
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index fe40358..2139ea6 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -251,6 +251,11 @@ void HELPER(exception_with_syndrome)(CPUARMState
*env, uint32_t excp,
 CPUState *cs = CPU(arm_env_get_cpu(env));

 assert(!excp_is_internal(excp));
+
+if (unlikely(cs->singlestep_enabled & SSTEP_NOIRQ)) {
+cs->singlestep_enabled |= SSTEP_EXCEPTION;
+}
+
 cs->exception_index = excp;
 env->exception.syndrome = syndrome;
 cpu_loop_exit(cs);
-- 
1.9.1

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211



Re: [Qemu-devel] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Laszlo Ersek
On 09/11/14 22:16, Gabriel L. Somlo wrote:
> On Thu, Sep 11, 2014 at 06:40:38PM +0200, Paolo Bonzini wrote:
>> Il 11/09/2014 18:35, Gabriel L. Somlo ha scritto:
> Can you configure Chamaleon to avoid the boot prompt?
>>> Yes. After doing that, usb starts working once OS X is fully booted.
>>>
>>> Works with either piix or q35 just fine.
>>>
>>> Does this mean it's likely to be an OVMF uhci/ehci issue specific to Q35 ?
>>> (one from which Fedora can recover but OS X can't) ? 
>>
>> Yes, that's my interpretation too.
>>
>> You did test an UHCI controller, I think, but I don't remember---did you
>> test an EHCI controller without companions, using something like
>> "-device ich9-usb-ehci1,id=myehci -device usb-keyboard,bus=myehci.0"?
> 
> Not only does that work (after applying Jan Vesely's 3-patch series from
> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02175.html),
> but my "regular" command line starts working as well, since now both
> keyboard and mouse get routed to ehci instead of uhci.
> 
>> If that works, the issue would be specific to EHCI companion
>> controllers.  If that doesn't work, there is at least a generic in the
>> EHCI driver---of course there could possibly be another in the companion
>> controllers, but I'd try getting EHCI alone to work.
> 
> So, here's my command line again:
> 
> bin/qemu-system-x86_64 -enable-kvm -m 2048 -cpu core2duo -smp 4,cores=2 \
>   -machine q35 -device ide-drive,bus=ide.2,drive=MacHDD \
>   -drive id=MacHDD,if=none,snapshot=on,file=./mac_10.9.img \
>   -device isa-applesmc,osk="our...Inc" \
>   -netdev user,id=usr0 -device e1000-82545em,netdev=usr0,id=vnet0 \
>   -usb -device usb-kbd -device usb-mouse \
>   [ -bios OVMF.fd  |  -kernel chameleon_boot ]
> 
> 
> With this, I was able to get a better idea what the problem is. With
> either OVMF or Chameleon, I still see UHCI 1..3 and EHCI in qtree,
> with the mouse and keyboard hanging off of EHCI.
> 
> However, from inside the OS X guest's system profiler, I see a
> difference:
> 
> With Chameleon (and SeaBIOS), I see all three UHCIs (with no attached
> devices), and EHCI (with the now-high-speed keyboard and mouse).
> 
> With OVMF, I have EHCI with the high-speed (and working)
> keyboard+mouse, but ONLY ONE UHCI device (UHCI3). UHCI1 UHCI2 do not
> get detected.
> 
> Obviously, since QEMU was routing the slow keyboard+mouse to UHCI1,
> which for some reason gets masked when OVMF is used, things weren't
> working.
> 
> So now the question is, how come Fedora can find UHCI 1&2 on Q35+OVMF,
> but OS X can't, and what can we do to OVMF (and/or QEMU) to fix it.
> 
> Obviously, Jan's patch set woud paper over the issue, as the keyboard
> and mouse would go to ehci, but there's still the issue of the
> disappearing UHCI's :)

Please save the OVMF debug log, maybe we can catch something in it.

-global isa-debugcon.iobase=0x402 -debugcon file:ovmf.log

It could be also worthwhile to boot OVMF (not OSX, just OVMF) on both
i440fx and q35, with the otherwise same command line, and compare the
two debug logs.

Thanks
Laszlo



Re: [Qemu-devel] [PATCH 04/23] block: Connect BlockBackend and DriveInfo

2014-09-11 Thread Eric Blake
On 09/11/2014 12:03 PM, Markus Armbruster wrote:
> Benoît Canet  writes:
> 
>> The Wednesday 10 Sep 2014 à 10:13:33 (+0200), Markus Armbruster wrote :
>>> Make the BlockBackend own the DriveInfo.  Change blockdev_init() to
>>> return the BlockBackend instead of the DriveInfo.
>>>
>>> Signed-off-by: Markus Armbruster 
>>> ---


>>> +static void drive_info_del(DriveInfo *dinfo);
>>
>> Is there any technical reason not to just put the
>> drive_info_del above the blk_delete function ?
>> I don't see any possible circular references between the two.
>>
>> Some people like Eric frown upon static function prototypes
>> in final code that's why I am asking.

I dislike it in code I write, and so I point it out in reviews, but I
also concede that it is a style, not technical issue, so I will never
reject a patch that uses forward declarations if the author thinks that
makes the presentation of the overall file easier to follow.

> 
> Placing functions before their callers makes the program easier to read
> when you need to see the functions definition before you can understand
> their use.
> 
> Placing the functions after callers makes the program easier to read
> when the gist of what they do is obvious from the call.  You're omitting
> unnecessary detail there, to be flesh it own further down.  Saving a
> function declaration is immaterial compared to that.
> 
> Before I put the function where I don't want it, I'd inline it :)

Then I won't try to convince you to paint the bikeshed any other color.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] New requirement for getting block layer patches merged

2014-09-11 Thread Stefan Hajnoczi
The time it takes for block layer patches to be reviewed and merged is a
concern.  Kevin and I want to discuss this issue, which was raised by
Benoit, on the mailing list.

The volume of patches is so high that Kevin and I cannot review them
all in a timely manner.  We are adjusting the process as a result.

Summary
---
From now on, 1 Reviewed-by from another contributor is required
before Kevin and I will look at patches.

  Hint: CC a regular contributor (not Kevin or me) who you think might
  be able to review your patches.

This will:
  1. Spread the load of patch review.
  2. Let Kevin and me focus on patches that are ready to be merged.

Caveats:
  1. Expect a few weeks before everyone has adjusted and improvement
 is visible.
  2. Exceptions may be made such as for security or critical bug fixes.

Read on for the rationale and more details...

Background
--
Both regular contributors and newcomers should experience quick code
review feedback so patches can be iterated and merged at a rapid pace.

Kevin and I co-maintain the block layer.  Each week one of us is on
full-time duty to review patches, investigate qemu.git build or test
failures, and participate in the release process.

The volume of patches is so high that we have a hard time consistently
providing review feedback within a reasonable timeframe.

A reminder why maintainers review patches: they are familiar with the
code and often identify bugs before the patches are committed.  They
also take a big picture view of the codebase and can advise on the
design of code, whereas individual contributors may not be aware of the
activities in other areas of the code.

The challenge is making this scale.  There are two ways to achieve this:

1. Increase Kevin and my productivity
2. Add more resources

Increasing productivity
---
There are 3 time-consuming activities:
1. Replying to patches that are fundamentally broken
2. Screening patches that do not meet the code submission guidelines
   http://qemu-project.org/Contribute/SubmitAPatch
3. Bisecting build or test failures

If Kevin and I spend our time dealing with patches that cannot be merged
we have less time to spend on realistic candidates.

#1 can be solved by requiring review from the community.  This gets
patches out of the way that need to be redone with more care.

#2 and #3 can be solved by automated builds and tests.  I have a plan
for a bot that evaluates every patch series on the mailing list.  This
way no human time is spent verifying that the patch meets the basic
requirements.

As a result of eliminating these time-consuming activities, Kevin and I
can focus on merging the good patch series.

Adding more resources
-
Unfortunately, it is not easy to add more co-maintainers without losing
the big picture and sacrificing code quality.  Instead the focus is to
delegate sections of the codebase, as was done successfully with nbd,
sheepdog, vmdk, etc.

Delegating has natural boundaries like block drivers or source files.
It is not possible to add more people arbitrarily and still get good
results.

The alternative is to scale patch review by requiring 1 Reviewed-by
before Kevin and I will look at a patch.  Contributors need to
participate in code review amongst each other in order to get code
merged.

To keep Reviewed-by tags meaningful, they can only be accepted from
regular contributors.  If Kevin or I find the review quality from a
person is consistently poor, we will talk to them and may not count
them.

Patch review priorities
---
Benoit asked what factors influence which patch series get reviewed.  I
think this is less relevant with the new 1 R-b rule, but others may be
wondering the same thing.

Kevin and I use multiple factors and make a subjective decision when
selecting patches to review.  This includes:
 * Is it a bug fix?
 * Is the patch series short?
 * Does it touch familiar parts of the codebase?
 * Does the contributor usually submit high-quality code?
 * Is the design straightforward?
 * Does the contributor participate in code review?

There is no formula or algorithm that Kevin or I want to commit to,
because it would make the already tough review process even less
motivating.

The problem is that traffic is too high for the current process, not
that the patch selection algorithm isn't fair.  That is why it's
necessary to scale the review process (what the rest of this email is
about).

EOF
---
If you have feedback or questions, let us know.  The process can be
tweaked as time goes on so we can continue to improve.


pgpzg8ICD4GqW.pgp
Description: PGP signature


Re: [Qemu-devel] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Gabriel L. Somlo
On Thu, Sep 11, 2014 at 06:40:38PM +0200, Paolo Bonzini wrote:
> Il 11/09/2014 18:35, Gabriel L. Somlo ha scritto:
> >> > Can you configure Chamaleon to avoid the boot prompt?
> > Yes. After doing that, usb starts working once OS X is fully booted.
> > 
> > Works with either piix or q35 just fine.
> > 
> > Does this mean it's likely to be an OVMF uhci/ehci issue specific to Q35 ?
> > (one from which Fedora can recover but OS X can't) ? 
> 
> Yes, that's my interpretation too.
> 
> You did test an UHCI controller, I think, but I don't remember---did you
> test an EHCI controller without companions, using something like
> "-device ich9-usb-ehci1,id=myehci -device usb-keyboard,bus=myehci.0"?

Not only does that work (after applying Jan Vesely's 3-patch series from
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02175.html),
but my "regular" command line starts working as well, since now both
keyboard and mouse get routed to ehci instead of uhci.

> If that works, the issue would be specific to EHCI companion
> controllers.  If that doesn't work, there is at least a generic in the
> EHCI driver---of course there could possibly be another in the companion
> controllers, but I'd try getting EHCI alone to work.

So, here's my command line again:

bin/qemu-system-x86_64 -enable-kvm -m 2048 -cpu core2duo -smp 4,cores=2 \
  -machine q35 -device ide-drive,bus=ide.2,drive=MacHDD \
  -drive id=MacHDD,if=none,snapshot=on,file=./mac_10.9.img \
  -device isa-applesmc,osk="our...Inc" \
  -netdev user,id=usr0 -device e1000-82545em,netdev=usr0,id=vnet0 \
  -usb -device usb-kbd -device usb-mouse \
  [ -bios OVMF.fd  |  -kernel chameleon_boot ]


With this, I was able to get a better idea what the problem is. With
either OVMF or Chameleon, I still see UHCI 1..3 and EHCI in qtree,
with the mouse and keyboard hanging off of EHCI.

However, from inside the OS X guest's system profiler, I see a
difference:

With Chameleon (and SeaBIOS), I see all three UHCIs (with no attached
devices), and EHCI (with the now-high-speed keyboard and mouse).

With OVMF, I have EHCI with the high-speed (and working)
keyboard+mouse, but ONLY ONE UHCI device (UHCI3). UHCI1 UHCI2 do not
get detected.

Obviously, since QEMU was routing the slow keyboard+mouse to UHCI1,
which for some reason gets masked when OVMF is used, things weren't
working.

So now the question is, how come Fedora can find UHCI 1&2 on Q35+OVMF,
but OS X can't, and what can we do to OVMF (and/or QEMU) to fix it.

Obviously, Jan's patch set woud paper over the issue, as the keyboard
and mouse would go to ehci, but there's still the issue of the
disappearing UHCI's :)

Thanks much,
--Gabriel




Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Eric Blake
On 09/11/2014 01:34 PM, Benoît Canet wrote:
> The Thursday 11 Sep 2014 à 21:12:44 (+0200), Markus Armbruster wrote :
>> Benoît Canet  writes:
>>
 +   blk_bs(blk_by_legacy_dinfo(dinfo)));
>>>
>>> This seems to be a fairly common pattern: "blk_bs(blk_by_legacy_dinfo())".
>>> How about a helper function ?
>>
>> Yes, except the pattern is going to evaporate in patch 14 :)
> 
> haha

Still, worth mentioning this fact in the commit message, rather than
making reviewers guess at it (that is, add a sentence something like:
This commit mechanically introduces long lines via repetitive pattern
replacement that will be shortened in a later patch


-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Benoît Canet
The Thursday 11 Sep 2014 à 21:12:44 (+0200), Markus Armbruster wrote :
> Benoît Canet  writes:
> 
> >> +   blk_bs(blk_by_legacy_dinfo(dinfo)));
> >
> > This seems to be a fairly common pattern: "blk_bs(blk_by_legacy_dinfo())".
> > How about a helper function ?
> 
> Yes, except the pattern is going to evaporate in patch 14 :)

haha



[Qemu-devel] [PATCH 1/2] target-ppc : Allow fc[tf]id[*] mnemonics for non TARGET_PPC64

2014-09-11 Thread Pierre Mallard
This patch remove limitation for fc[tf]id[*] on 32 bits targets and
add a new insn flag for signed integer 64 conversion PPC2_FP_CVT_S64
---
 target-ppc/cpu.h|5 -
 target-ppc/fpu_helper.c |6 --
 target-ppc/helper.h |4 +---
 target-ppc/translate.c  |   18 +++---
 target-ppc/translate_init.c |9 ++---
 5 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index b64c652..fa50c32 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -2008,13 +2008,16 @@ enum {
 PPC2_ALTIVEC_207   = 0x4000ULL,
 /* PowerISA 2.07 Book3s specification*/
 PPC2_ISA207S   = 0x8000ULL,
+/* Double precision floating point conversion for signed integer 64  */
+PPC2_FP_CVT_S64= 0x0001ULL,
 
 #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \
 PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \
 PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206 | \
 PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206 | \
 PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \
-PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP)
+PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \
+PPC2_FP_CVT_S64)
 };
 
 /*/
diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
index da93d12..7f74466 100644
--- a/target-ppc/fpu_helper.c
+++ b/target-ppc/fpu_helper.c
@@ -649,14 +649,10 @@ FPU_FCTI(fctiw, int32, 0x8000U)
 FPU_FCTI(fctiwz, int32_round_to_zero, 0x8000U)
 FPU_FCTI(fctiwu, uint32, 0xU)
 FPU_FCTI(fctiwuz, uint32_round_to_zero, 0xU)
-#if defined(TARGET_PPC64)
 FPU_FCTI(fctid, int64, 0x8000ULL)
 FPU_FCTI(fctidz, int64_round_to_zero, 0x8000ULL)
 FPU_FCTI(fctidu, uint64, 0xULL)
 FPU_FCTI(fctiduz, uint64_round_to_zero, 0xULL)
-#endif
-
-#if defined(TARGET_PPC64)
 
 #define FPU_FCFI(op, cvtr, is_single)  \
 uint64_t helper_##op(CPUPPCState *env, uint64_t arg)   \
@@ -678,8 +674,6 @@ FPU_FCFI(fcfids, int64_to_float32, 1)
 FPU_FCFI(fcfidu, uint64_to_float64, 0)
 FPU_FCFI(fcfidus, uint64_to_float32, 1)
 
-#endif
-
 static inline uint64_t do_fri(CPUPPCState *env, uint64_t arg,
   int rounding_mode)
 {
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 509eae5..52402ef 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -67,16 +67,14 @@ DEF_HELPER_2(fctiw, i64, env, i64)
 DEF_HELPER_2(fctiwu, i64, env, i64)
 DEF_HELPER_2(fctiwz, i64, env, i64)
 DEF_HELPER_2(fctiwuz, i64, env, i64)
-#if defined(TARGET_PPC64)
 DEF_HELPER_2(fcfid, i64, env, i64)
 DEF_HELPER_2(fcfidu, i64, env, i64)
 DEF_HELPER_2(fcfids, i64, env, i64)
 DEF_HELPER_2(fcfidus, i64, env, i64)
 DEF_HELPER_2(fctid, i64, env, i64)
-DEF_HELPER_2(fctidu, i64, env, i64)
 DEF_HELPER_2(fctidz, i64, env, i64)
+DEF_HELPER_2(fctidu, i64, env, i64)
 DEF_HELPER_2(fctiduz, i64, env, i64)
-#endif
 DEF_HELPER_2(frsp, i64, env, i64)
 DEF_HELPER_2(frin, i64, env, i64)
 DEF_HELPER_2(friz, i64, env, i64)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index c07bb01..1fe82ce 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2246,9 +2246,8 @@ GEN_FLOAT_B(ctiwz, 0x0F, 0x00, 0, PPC_FLOAT);
 GEN_FLOAT_B(ctiwuz, 0x0F, 0x04, 0, PPC2_FP_CVT_ISA206);
 /* frsp */
 GEN_FLOAT_B(rsp, 0x0C, 0x00, 1, PPC_FLOAT);
-#if defined(TARGET_PPC64)
 /* fcfid */
-GEN_FLOAT_B(cfid, 0x0E, 0x1A, 1, PPC_64B);
+GEN_FLOAT_B(cfid, 0x0E, 0x1A, 1, PPC2_FP_CVT_S64);
 /* fcfids */
 GEN_FLOAT_B(cfids, 0x0E, 0x1A, 0, PPC2_FP_CVT_ISA206);
 /* fcfidu */
@@ -2256,14 +2255,13 @@ GEN_FLOAT_B(cfidu, 0x0E, 0x1E, 0, PPC2_FP_CVT_ISA206);
 /* fcfidus */
 GEN_FLOAT_B(cfidus, 0x0E, 0x1E, 0, PPC2_FP_CVT_ISA206);
 /* fctid */
-GEN_FLOAT_B(ctid, 0x0E, 0x19, 0, PPC_64B);
+GEN_FLOAT_B(ctid, 0x0E, 0x19, 0, PPC2_FP_CVT_S64);
+/* fctidz */
+GEN_FLOAT_B(ctidz, 0x0F, 0x19, 0, PPC2_FP_CVT_S64);
 /* fctidu */
 GEN_FLOAT_B(ctidu, 0x0E, 0x1D, 0, PPC2_FP_CVT_ISA206);
-/* fctidz */
-GEN_FLOAT_B(ctidz, 0x0F, 0x19, 0, PPC_64B);
 /* fctidu */
 GEN_FLOAT_B(ctiduz, 0x0F, 0x1D, 0, PPC2_FP_CVT_ISA206);
-#endif
 
 /* frin */
 GEN_FLOAT_B(rin, 0x08, 0x0C, 1, PPC_FLOAT_EXT);
@@ -10050,16 +10048,14 @@ GEN_HANDLER_E(fctiwu, 0x3F, 0x0E, 0x04, 0, PPC_NONE, 
PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(ctiwz, 0x0F, 0x00, 0, PPC_FLOAT),
 GEN_HANDLER_E(fctiwuz, 0x3F, 0x0F, 0x04, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_FLOAT_B(rsp, 0x0C, 0x00, 1, PPC_FLOAT),
-#if defined(TARGET_PPC64)
-GEN_FLOAT_B(cfid, 0x0E, 0x1A, 1, PPC_64B),
+GEN_HANDLER_E(fcfid, 0x3F, 0x0E, 0x1A, 0x001F, PPC_NONE, PPC2_FP_CVT_S64),
 GEN_HANDLER_E(fcfids, 0x3B, 0x0E, 0x1A, 0, PPC_NONE, PPC2_FP_CVT_ISA206),
 GEN_HANDLER_E(fcfidu, 0x3F, 0

Re: [Qemu-devel] [PATCH 00/23] Split BlockBackend off BDS with an axe

2014-09-11 Thread Markus Armbruster
Review led to a couple of changes with ripple effects on later patches.
I suggest y'all await my v2 before you continue.

Thanks a lot for the review so far!



[Qemu-devel] [PATCH 2/2] target-ppc : Add new processor type 440x5wDFPU

2014-09-11 Thread Pierre Mallard
This patch add a new processor type 440x5wDFPU for Virtex 5 PPC440
with an external APU FPU in double precision mode
---
 target-ppc/cpu-models.c |3 +++
 target-ppc/translate_init.c |   38 ++
 2 files changed, 41 insertions(+)

diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
index 52ac6ec..91e9fac 100644
--- a/target-ppc/cpu-models.c
+++ b/target-ppc/cpu-models.c
@@ -309,6 +309,9 @@
 #endif
 POWERPC_DEF("440-Xilinx",CPU_POWERPC_440_XILINX, 440x5,
 "PowerPC 440 Xilinx 5")
+
+POWERPC_DEF("440-Xilinx-w-dfpu",CPU_POWERPC_440_XILINX, 
440x5wDFPU,
+"PowerPC 440 Xilinx 5 With a Double Prec. FPU")
 #if defined(TODO)
 POWERPC_DEF("440A5", CPU_POWERPC_440A5,  440x5,
 "PowerPC 440 A5")
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index ac4d12a..7d7dce7 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -3923,6 +3923,44 @@ POWERPC_FAMILY(440x5)(ObjectClass *oc, void *data)
  POWERPC_FLAG_DE | POWERPC_FLAG_BUS_CLK;
 }
 
+POWERPC_FAMILY(440x5wDFPU)(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
+
+dc->desc = "PowerPC 440x5 with double precision FPU";
+pcc->init_proc = init_proc_440x5;
+pcc->check_pow = check_pow_nocheck;
+pcc->insns_flags = PPC_INSNS_BASE | PPC_STRING |
+   PPC_FLOAT | PPC_FLOAT_FSQRT | 
+   PPC_FLOAT_STFIWX |
+   PPC_DCR | PPC_WRTEE | PPC_RFMCI |
+   PPC_CACHE | PPC_CACHE_ICBI |
+   PPC_CACHE_DCBZ | PPC_CACHE_DCBA |
+   PPC_MEM_TLBSYNC | PPC_MFTB |
+   PPC_BOOKE | PPC_4xx_COMMON | PPC_405_MAC |
+   PPC_440_SPEC;
+pcc->insns_flags2 = PPC2_FP_CVT_S64;
+pcc->msr_mask = (1ull << MSR_POW) |
+(1ull << MSR_CE) |
+(1ull << MSR_EE) |
+(1ull << MSR_PR) |
+(1ull << MSR_FP) |
+(1ull << MSR_ME) |
+(1ull << MSR_FE0) |
+(1ull << MSR_DWE) |
+(1ull << MSR_DE) |
+(1ull << MSR_FE1) |
+(1ull << MSR_IR) |
+(1ull << MSR_DR);
+pcc->mmu_model = POWERPC_MMU_BOOKE;
+pcc->excp_model = POWERPC_EXCP_BOOKE;
+pcc->bus_model = PPC_FLAGS_INPUT_BookE;
+pcc->bfd_mach = bfd_mach_ppc_403;
+pcc->flags = POWERPC_FLAG_CE | POWERPC_FLAG_DWE |
+ POWERPC_FLAG_DE | POWERPC_FLAG_BUS_CLK;
+}
+
 static void init_proc_460 (CPUPPCState *env)
 {
 /* Time base */
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH 11/23] block: Rename BlockDriverAIOCB* to BlockAIOCB*

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

> I think you should rebase this on top of Fam's null driver.

This patch is prone to conflicts, but I expect them to be trivial.  I'll
rebase onto whatever is current when we're ready to merge.



[Qemu-devel] [PATCH 0/2] Enabling floating point instruction to 440x5 CPUs

2014-09-11 Thread Pierre Mallard
This patch series enable floating point instruction in 440x5 CPUs
which have the capabilities to have optional APU FPU in double precision mode.

1) Allow fc[tf]id[*] mnemonics for non TARGET_PPC64 with a new insn2 flag
2) Create a new 440x5 implementing floating point instructions

Pierre Mallard (2):
  target-ppc : Allow fc[tf]id[*] mnemonics for non TARGET_PPC64
  target-ppc : Add new processor type 440x5wDFPU

 target-ppc/cpu-models.c |3 +++
 target-ppc/cpu.h|5 -
 target-ppc/fpu_helper.c |6 --
 target-ppc/helper.h |4 +---
 target-ppc/translate.c  |   18 +++--
 target-ppc/translate_init.c |   47 ---
 6 files changed, 59 insertions(+), 24 deletions(-)

-- 
1.7.10.4




Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

> I have trouble to review this as I don't understand the qdevs parts.
> Maybe someone else could have a look at it.

The patch is big, but all it really does is replacing

dinfo->bdrv

by

blk_bs(blk_legacy_dinfo(dinfo))

Line wrapping muddies the waters a bit.  I also omit tests whether
dinfo->bdrv is null, because it never is.



Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

>> +   blk_bs(blk_by_legacy_dinfo(dinfo)));
>
> This seems to be a fairly common pattern: "blk_bs(blk_by_legacy_dinfo())".
> How about a helper function ?

Yes, except the pattern is going to evaporate in patch 14 :)



Re: [Qemu-devel] [PATCH 08/23] block: Eliminate BlockDriverState member device_name[]

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

> The Thursday 11 Sep 2014 à 07:00:41 (-0600), Eric Blake wrote :
>> On 09/11/2014 05:34 AM, Benoît Canet wrote:
>> > The Wednesday 10 Sep 2014 à 10:13:37 (+0200), Markus Armbruster wrote :
>> >> device_name[] is can become non-empty only in bdrv_new_named() and
>> >> bdrv_move_feature_fields().  The latter is used only to undo damage
>> >> done by bdrv_swap().  The former is called only by blk_new_with_bs().
>> >> Therefore, when a BlockDriverState's device_name[] is non-empty, then
>> >> it's owned by a BlockBackend.
>> 
>> [lots of lines trimmed - it's not only okay, but desirable to trim out
>> portions of a patch that you are okay with, in order to call attention
>> to the problem spots that you are commenting on without making the
>> reader have to scroll through pages of quoted context]
>> 
>> >>  
>> >> -const char *bdrv_get_device_name(BlockDriverState *bs)
>> >> +const char *bdrv_get_device_name(const BlockDriverState *bs)
>> >>  {
>> >> -return bs->device_name;
>> >> +const char *name = bs->blk ? blk_name(bs->blk) : NULL;
>> >> +return name ?: "";
>> >>  }
>> > 
>> > Why not ?
>> > 
>> > return bs->blk ? blk_name(bs->blk) : "";
>> 
>> If I understand right, it was because blk_name(bs->blk) may return NULL,
>
> It think it can't: see patch 2 extract:
>
>> +BlockBackend *blk_new(const char *name, Error **errp)
>> +{
>> +BlockBackend *blk = g_new0(BlockBackend, 1);
>> + 
>> +assert(name && name[0]);
>
>> but this function is guaranteed to return non-NULL.

You're right, blk_new() always returns a non-null, non-empty string.

The condition to check here is "bs is not owned by a BB".  Benoît's

bs->blk ? blk_name(bs->blk) : ""

does that nicely.

Of course, Eric is right in that returning NULL on "not owned by a BB"
would be cleaner than returning "".  However, doing that in the middle
of a series with a four-digit diffstat doesn't strike me as a good
idea.  Better do it on top.



[Qemu-devel] [Bug 1361912] Re: qemu-mips64 Segmentation fault

2014-09-11 Thread Hill
I can see the problem with any simple program:
1. cat t.c
#include 
int main()
{
  printf("Hello QEMU.\n");
}

2. mips64-gcc -static t.c -o t
3. qemu-mips64 t
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

I built QEMU on Ubuntu 12.04 with the system GCC compiler, and the commands are:
./configure --enable-linux-user
make


** Attachment added: "the mips executable file"
   https://bugs.launchpad.net/qemu/+bug/1361912/+attachment/4201676/+files/t

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1361912

Title:
  qemu-mips64 Segmentation fault

Status in QEMU:
  New

Bug description:
  When I ran qemu-mips64 for any mips 64 executable , I got this error:

  $ ./qemu-mips64  ../lang
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)

  Is this a known issue?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1361912/+subscriptions



Re: [Qemu-devel] [PATCH 08/23] block: Eliminate BlockDriverState member device_name[]

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

> The Wednesday 10 Sep 2014 à 10:13:37 (+0200), Markus Armbruster wrote :
>> device_name[] is can become non-empty only in bdrv_new_named() and
>> bdrv_move_feature_fields().  The latter is used only to undo damage
>> done by bdrv_swap().  The former is called only by blk_new_with_bs().
>> Therefore, when a BlockDriverState's device_name[] is non-empty, then
>> it's owned by a BlockBackend.
>> 
>> The converse is also true, because blk_attach_bs() is called only by
>> blk_new_with_bs() so far.
>> 
>> Furthermore, blk_new_with_bs() keeps the two names equal.
>> 
>> Therefore, device_name[] is redundant.  Eliminate it.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  block-migration.c | 12 +
>>  block.c   | 63 
>> ++-
>>  block/block-backend.c | 12 -
>>  block/cow.c   |  2 +-
>>  block/mirror.c|  3 ++-
>>  block/qapi.c  |  6 ++---
>>  block/qcow.c  |  4 +--
>>  block/qcow2.c |  4 +--
>>  block/qed.c   |  2 +-
>>  block/quorum.c|  4 +--
>>  block/vdi.c   |  2 +-
>>  block/vhdx.c  |  2 +-
>>  block/vmdk.c  |  4 +--
>>  block/vpc.c   |  2 +-
>>  block/vvfat.c |  2 +-
>>  blockjob.c|  3 ++-
>>  include/block/block.h |  3 +--
>>  include/block/block_int.h |  2 --
>>  18 files changed, 53 insertions(+), 79 deletions(-)
>> 
>> diff --git a/block-migration.c b/block-migration.c
>> index cb3e16c..da30e93 100644
>> --- a/block-migration.c
>> +++ b/block-migration.c
>> @@ -14,7 +14,9 @@
>>   */
>>  
>>  #include "qemu-common.h"
>> -#include "block/block_int.h"
>> +#include "block/block.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/main-loop.h"
>>  #include "hw/hw.h"
>>  #include "qemu/queue.h"
>>  #include "qemu/timer.h"
>> @@ -130,9 +132,9 @@ static void blk_send(QEMUFile *f, BlkMigBlock * blk)
>>   | flags);
>>  
>>  /* device name */
>> -len = strlen(blk->bmds->bs->device_name);
>> +len = strlen(bdrv_get_device_name(blk->bmds->bs));
>>  qemu_put_byte(f, len);
>> -qemu_put_buffer(f, (uint8_t *)blk->bmds->bs->device_name, len);
>> +qemu_put_buffer(f, (uint8_t *)bdrv_get_device_name(blk->bmds->bs), len);
>>  
>>  /* if a block is zero we need to flush here since the network
>>   * bandwidth is now a lot higher than the storage device bandwidth.
>> @@ -382,9 +384,9 @@ static void init_blk_migration(QEMUFile *f)
>>  
>>  if (bmds->shared_base) {
>>  DPRINTF("Start migration for %s with shared base image\n",
>> -bs->device_name);
>> +bdrv_get_device_name(bs));
>>  } else {
>> -DPRINTF("Start full migration for %s\n", bs->device_name);
>> +DPRINTF("Start full migration for %s\n", 
>> bdrv_get_device_name(bs));
>>  }
>>  
>>  QSIMPLEQ_INSERT_TAIL(&block_mig_state.bmds_list, bmds, entry);
>> diff --git a/block.c b/block.c
>> index 593d89b..61ea15d 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -332,31 +332,6 @@ void bdrv_register(BlockDriver *bdrv)
>>  QLIST_INSERT_HEAD(&bdrv_drivers, bdrv, list);
>>  }
>>  
>> -/* create a new block device (by default it is empty) */
>> -BlockDriverState *bdrv_new_named(const char *device_name, Error **errp)
>> -{
>> -BlockDriverState *bs;
>> -
>> -assert(*device_name);
>> -
>> -if (bdrv_find(device_name)) {
>> -error_setg(errp, "Device with id '%s' already exists",
>> -   device_name);
>> -return NULL;
>> -}
>> -if (bdrv_find_node(device_name)) {
>> -error_setg(errp, "Device with node-name '%s' already exists",
>> -   device_name);
>> -return NULL;
>> -}
>> -
>> -bs = bdrv_new();
>> -
>> -pstrcpy(bs->device_name, sizeof(bs->device_name), device_name);
>> -
>> -return bs;
>> -}
>> -
>>  BlockDriverState *bdrv_new(void)
>>  {
>>  BlockDriverState *bs;
>> @@ -1159,7 +1134,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, 
>> BlockDriverState *backing_hd)
>>  } else if (backing_hd) {
>>  error_setg(&bs->backing_blocker,
>> "device is used as backing hd of '%s'",
>> -   bs->device_name);
>> +   bdrv_get_device_name(bs));
>>  }
>>  
>>  bs->backing_hd = backing_hd;
>> @@ -1533,7 +1508,7 @@ int bdrv_open(BlockDriverState **pbs, const char 
>> *filename,
>>  } else {
>>  error_setg(errp, "Block format '%s' used by device '%s' doesn't 
>> "
>> "support the option '%s'", drv->format_name,
>> -   bs->device_name, entry->key);
>> +   bdrv_get_device_name(bs), entry->key);
>>  }
>>  
>>  ret = -EINVAL;
>> @@ -1740,7 +1715,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
>> 

Re: [Qemu-devel] [virtio-dev] Re: [PATCH 1/2] virtio-gpu/2d: add hardware spec include file

2014-09-11 Thread Paolo Bonzini
> On Do, 2014-09-11 at 16:20 +0100, Peter Maydell wrote:
> > On 11 September 2014 16:09, Gerd Hoffmann  wrote:
> > > This patch adds the header file with structs and defines for
> > > the virtio based gpu device.  Covers 2d operations only.
> > 
> > Please don't cc subscriber only mailing lists
> > (virtio-...@lists.oasis-open.org) on posts to qemu-devel;
> > it just means everybody responding gets mailer bounce
> > junkmail.
> 
> I don't feel like splitting the discussion though.  Suggestions how to
> deal with that?  I think I've seen most active virtio-dev folks on the
> virtualization list too, is it ok to just drop virtio-dev for the next
> round and hope that folks see it via virtualization list?

There's also virtio-comment.

Paolo



Re: [Qemu-devel] [PATCH 08/23] block: Eliminate BlockDriverState member device_name[]

2014-09-11 Thread Markus Armbruster
Eric Blake  writes:

> On 09/10/2014 02:13 AM, Markus Armbruster wrote:
>> device_name[] is can become non-empty only in bdrv_new_named() and
>
> s/is //

Fixing, thanks!

>> bdrv_move_feature_fields().  The latter is used only to undo damage
>> done by bdrv_swap().  The former is called only by blk_new_with_bs().
>> Therefore, when a BlockDriverState's device_name[] is non-empty, then
>> it's owned by a BlockBackend.
>> 
>> The converse is also true, because blk_attach_bs() is called only by
>> blk_new_with_bs() so far.
>> 
>> Furthermore, blk_new_with_bs() keeps the two names equal.
>> 
>> Therefore, device_name[] is redundant.  Eliminate it.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---



Re: [Qemu-devel] [PATCH 07/23] block: Eliminate bdrv_iterate(), use bdrv_next()

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

> The Wednesday 10 Sep 2014 à 10:13:36 (+0200), Markus Armbruster wrote :
[...]
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -4208,24 +4208,6 @@ static void file_completion(Monitor *mon, const char 
>> *input)
>>  closedir(ffs);
>>  }
>>  
>> -typedef struct MonitorBlockComplete {
>> -Monitor *mon;
>> -const char *input;
>> -} MonitorBlockComplete;
>> -
>> -static void block_completion_it(void *opaque, BlockDriverState *bs)
>> -{
>> -const char *name = bdrv_get_device_name(bs);
>> -MonitorBlockComplete *mbc = opaque;
>> -Monitor *mon = mbc->mon;
>> -const char *input = mbc->input;
>> -
>> -if (input[0] == '\0' ||
>> -!strncmp(name, (char *)input, strlen(input))) {
>> -readline_add_completion(mon->rs, name);
>> -}
>> -}
>> -
>>  static const char *next_arg_type(const char *typestr)
>>  {
>>  const char *p = strchr(typestr, ':');
>> @@ -4663,9 +4645,9 @@ static void monitor_find_completion_by_table(Monitor 
>> *mon,
>>  {
>>  const char *cmdname;
>>  int i;
>> -const char *ptype, *str;
>> +const char *ptype, *str, *name;
>>  const mon_cmd_t *cmd;
>> -MonitorBlockComplete mbs;
>> +BlockDriverState *bs;
>>  
>>  if (nb_args <= 1) {
>>  /* command completion */
>> @@ -4717,10 +4699,13 @@ static void monitor_find_completion_by_table(Monitor 
>> *mon,
>>  break;
>>  case 'B':
>>  /* block device name completion */
>> -mbs.mon = mon;
>> -mbs.input = str;
>
>> -readline_set_completion_index(mon->rs, strlen(str));
>
> Why is this line removed ?

Accident!  Fixing...

> In monitor readline_set_completion_index seems to work in pair with
> readline_add_completion.
> Either this line should be removed and readline_add_completion too or
> the oposite.
>
>> -bdrv_iterate(block_completion_it, &mbs);
>> +for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
>> +name = bdrv_get_device_name(bs);
>> +if (str[0] == '\0' ||
>> +!strncmp(name, str, strlen(str))) {
>> +readline_add_completion(mon->rs, name);
>> +}
>> +}
>>  break;
>>  case 's':
>>  case 'S':
>> -- 
>> 1.9.3
>> 
>> 



Re: [Qemu-devel] [PATCH v5] virtio-pci: enable bus master for old guests

2014-09-11 Thread Greg Kurz
On Thu, 11 Sep 2014 21:20:03 +0300
"Michael S. Tsirkin"  wrote:

> commit cc943c36faa192cd4b32af8fe5edb31894017d35
> pci: Use bus master address space for delivering MSI/MSI-X messages
> breaks virtio-net for rhel6.[56] x86 guests because they don't
> enable bus mastering for virtio PCI devices. For the same reason,
> rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.
> 
> Old guests forgot to enable bus mastering, enable it automatically on
> DRIVER (guests use some devices before DRIVER_OK).
> 
> Reported-by: Greg Kurz 
> Signed-off-by: Michael S. Tsirkin 
> ---
> Sorry, v4 was broken - forgot to commit before git format-patch.
> Here's the correct version.
> Still not tested - will do next week.
> Testing reports appreciated meanwhile.
> 

Nice patch indeed ! :)

Reviewed-by: Greg Kurz 
Tested-by: Greg Kurz 

Cc'ing qemu-stable as Michael said he would queue the initial
bus master patch for 2.1.2.

>  hw/virtio/virtio-pci.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index ddb5da1..a827cd4 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -314,6 +314,16 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> addr, uint32_t val)
>  msix_unuse_all_vectors(&proxy->pci_dev);
>  }
> 
> +/* Linux before 2.6.34 drives the device without enabling
> +   the PCI device bus master bit. Enable it automatically
> +   for the guest. This is a PCI spec violation but so is
> +   initiating DMA with bus master bit clear. */
> +if (val == (VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER)) {
> +pci_default_write_config(&proxy->pci_dev, PCI_COMMAND,
> + proxy->pci_dev.config[PCI_COMMAND] |
> + PCI_COMMAND_MASTER, 1);
> +}
> +
>  /* Linux before 2.6.34 sets the device as OK without enabling
> the PCI device bus master bit. In this case we need to disable
> some safety checks. */



-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

"Anarchy is about taking complete responsibility for yourself."
Alan Moore.




[Qemu-devel] [Bug 1361912] Re: qemu-mips64 Segmentation fault

2014-09-11 Thread Leon Alrae
I forgot to add that qemu-mips64 works for me, that's why I asked for
the details to reproduce the issue (i.e. what is "lang", what tools you
used to build it, command line etc.)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1361912

Title:
  qemu-mips64 Segmentation fault

Status in QEMU:
  New

Bug description:
  When I ran qemu-mips64 for any mips 64 executable , I got this error:

  $ ./qemu-mips64  ../lang
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)

  Is this a known issue?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1361912/+subscriptions



Re: [Qemu-devel] [PATCH 05/23] block: Make BlockBackend own its BlockDriverState

2014-09-11 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 10.09.2014 um 10:13 hat Markus Armbruster geschrieben:
>> On BlockBackend destruction, unref its BlockDriverState.  Replaces the
>> callers' unrefs.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  block/block-backend.c |  9 ++---
>>  blockdev.c| 11 +++
>>  hw/block/xen_disk.c   |  6 +++---
>>  qemu-img.c| 35 +--
>>  qemu-io.c |  5 -
>>  5 files changed, 9 insertions(+), 57 deletions(-)
>> 
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index 2a22660..ae51f7f 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -58,10 +58,7 @@ BlockBackend *blk_new(const char *name, Error **errp)
>>   * @errp: return location for an error to be set on failure, or %NULL
>>   *
>>   * Create a new BlockBackend, with a reference count of one, and
>> - * attach a new BlockDriverState to it, also with a reference count of
>> - * one.  Caller owns *both* references.
>> - * TODO Let caller own only the BlockBackend reference
>> - * Fail if @name already exists.
>> + * a new BlockDriverState attached.  Fail if @name already exists.
>>   *
>>   * Returns: the BlockBackend on success, %NULL on error
>>   */
>> @@ -88,6 +85,7 @@ BlockBackend *blk_new_with_bs(const char *name, Error 
>> **errp)
>>  static void blk_delete(BlockBackend *blk)
>>  {
>>  assert(!blk->refcnt);
>> +bdrv_unref(blk->bs);
>>  blk_detach_bs(blk);
>
> I think the bdrv_unref() should really be part of blk_detach_bs().
>
> The same way it would be more logical to have bdrv_ref() as part of
> blk_attach_bs(). For blk_new_with_bs() this might mean bdrv_new,
> blk_attach_bs, bdrv_unref, which looks a bit odd, but if blk_attach_bs()
> is ever called from somewhere else, it probably makes more sense (if it
> isn't, it should be static).

My thinking was that you usually want to acquire a reference when you
detach, and you're usually ready to give yours up when you attach.

However, I now think I got a use-after-free hidden around there.  I'll
look into it tomorrow with a fresh mind.  Might lead to me accepting
your suggestion without further ado :)



[Qemu-devel] [PATCH v5] virtio-pci: enable bus master for old guests

2014-09-11 Thread Michael S. Tsirkin
commit cc943c36faa192cd4b32af8fe5edb31894017d35
pci: Use bus master address space for delivering MSI/MSI-X messages
breaks virtio-net for rhel6.[56] x86 guests because they don't
enable bus mastering for virtio PCI devices. For the same reason,
rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.

Old guests forgot to enable bus mastering, enable it automatically on
DRIVER (guests use some devices before DRIVER_OK).

Reported-by: Greg Kurz 
Signed-off-by: Michael S. Tsirkin 
---
Sorry, v4 was broken - forgot to commit before git format-patch.
Here's the correct version.
Still not tested - will do next week.
Testing reports appreciated meanwhile.

 hw/virtio/virtio-pci.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ddb5da1..a827cd4 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -314,6 +314,16 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 msix_unuse_all_vectors(&proxy->pci_dev);
 }
 
+/* Linux before 2.6.34 drives the device without enabling
+   the PCI device bus master bit. Enable it automatically
+   for the guest. This is a PCI spec violation but so is
+   initiating DMA with bus master bit clear. */
+if (val == (VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER)) {
+pci_default_write_config(&proxy->pci_dev, PCI_COMMAND,
+ proxy->pci_dev.config[PCI_COMMAND] |
+ PCI_COMMAND_MASTER, 1);
+}
+
 /* Linux before 2.6.34 sets the device as OK without enabling
the PCI device bus master bit. In this case we need to disable
some safety checks. */
-- 
MST



Re: [Qemu-devel] ui causes latest master build failure on rhel6

2014-09-11 Thread Michael S. Tsirkin
On Thu, Sep 11, 2014 at 05:21:13PM +0200, Gerd Hoffmann wrote:
> On Do, 2014-09-11 at 17:59 +0300, Michael S. Tsirkin wrote:
> > I observe two errors with latest master:
> > 
> > ui/qemu-pixman.c: In function ‘qemu_pixelformat_from_pixman’:
> > ui/qemu-pixman.c:42: error: ‘PIXMAN_TYPE_RGBA’ undeclared (first use in 
> > this function)
> > ui/qemu-pixman.c:42: error: (Each undeclared identifier is reported only 
> > once
> > ui/qemu-pixman.c:42: error: for each function it appears in.)
> > make: *** [ui/qemu-pixman.o] Error 1
> 
> Guess we need to raise the minimum pixman version in configure ...
> 
> pixman-0.21.8 is the oldest release which has PIXMAN_TYPE_RGBA.
> We don't check for a minimum version.
> RHEL-6 is probably older.
> 
> Try 'configure --without-system-pixman' as workaround (after checking
> out the pixman submodule).

I guess but is it a must?
We used to have this within an ifdef:
+#if PIXMAN_VERSION >= PIXMAN_VERSION_ENCODE(0, 21, 8)
+type = PIXMAN_TYPE_RGBA;
+#endif
+}

can't we keep system pixman working a bit longer?


> I'll cook up a patch,
>   Gerd
> 



[Qemu-devel] [PATCH v4] virtio-pci: enable bus master for old guests

2014-09-11 Thread Michael S. Tsirkin
commit cc943c36faa192cd4b32af8fe5edb31894017d35
pci: Use bus master address space for delivering MSI/MSI-X messages
breaks virtio-net for rhel6.[56] x86 guests because they don't
enable bus mastering for virtio PCI devices. For the same reason,
rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.

Old guests forgot to enable bus mastering, enable it automatically on
DRIVER (guests use some devices before DRIVER_OK).

Reported-by: Greg Kurz 
Signed-off-by: Michael S. Tsirkin 
---

OK this is the simpler patch that I had in mind.
Could you try this out please?  If this works we could drastically
simplify a bunch of code dealing with bus master.

 hw/virtio/virtio-pci.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ddb5da1..4e2ace9 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -314,6 +314,16 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 msix_unuse_all_vectors(&proxy->pci_dev);
 }
 
+/* Linux before 2.6.34 drives the device without enabling
+   the PCI device bus master bit. Enable it automatically
+   for the guest. This is a PCI spec violation but so is
+   initiating DMA with bus master bit clear. */
+if (val == VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER) {
+pci_default_write_config(pci_dev, PCI_COMMAND,
+ proxy->pci_dev.config[PCI_COMMAND] |
+ PCI_COMMAND_MASTER, 1);
+}
+
 /* Linux before 2.6.34 sets the device as OK without enabling
the PCI device bus master bit. In this case we need to disable
some safety checks. */
-- 
MST



Re: [Qemu-devel] [PATCH 3/3] xenfb: Add "feature-no-abs-rescale" for Windows PV frontend

2014-09-11 Thread Stefano Stabellini
On Thu, 11 Sep 2014, Owen Smith wrote:
> > -Original Message-
> > From: Stefano Stabellini [mailto:stefano.stabell...@eu.citrix.com]
> > Sent: 11 September 2014 02:23
> > To: Owen Smith
> > Cc: qemu-devel@nongnu.org; xen-de...@lists.xen.org; Stefano Stabellini
> > Subject: Re: [PATCH 3/3] xenfb: Add "feature-no-abs-rescale" for Windows
> > PV frontend
> > 
> > On Wed, 10 Sep 2014, Owen smith wrote:
> > > Windows PV frontend requires absolute mouse values in the range [0,
> > > 0x7fff]. Add a feature to not rescale the axes to DisplaySurface size.
> > > Also allows Windows PV frontend to connect without a vfb device. Moves
> > > the rescaling of axes to a seperate function with additional
> > > null-pointer checks.
> > >
> > > Signed-off-by: Owen smith 
> > > ---
> > >  hw/display/xenfb.c | 52
> > > +---
> > >  1 file changed, 37 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c index
> > > c4375c8..65c373b 100644
> > > --- a/hw/display/xenfb.c
> > > +++ b/hw/display/xenfb.c
> > > @@ -61,6 +61,7 @@ struct common {
> > >  struct XenInput {
> > >  struct common c;
> > >  int abs_pointer_wanted; /* Whether guest supports absolute
> > > pointer */
> > > +int no_abs_rescale; /* Whether guest wants rescaled abs axes */
> > >  int button_state;   /* Last seen pointer button state */
> > >  int extended;
> > >  QEMUPutKbdEntry *qkbd;
> > > @@ -325,6 +326,25 @@ static void xenfb_key_event(void *opaque, int
> > scancode)
> > >  xenfb_send_key(xenfb, down, scancode2linux[scancode]);  }
> > >
> > > +static void xenfb_rescale_mouse(struct XenInput *xenfb, int *dx, int
> > > +*dy) {
> > > +DisplaySurface *surface;
> > > +int dw, dh;
> > > +
> > > +if (xenfb->c.con == NULL) {
> > > +return;
> > > +}
> > > +surface = qemu_console_surface(xenfb->c.con);
> > > +if (surface == NULL) {
> > > +return;
> > > +}
> > > +dw = surface_width(surface);
> > > +dh = surface_height(surface);
> > > +
> > > +*dx = *dx * (dw - 1) / 0x7fff;
> > > +*dy = *dy * (dh - 1) / 0x7fff;
> > > +}
> > > +
> > >  /*
> > >   * Send a mouse event from the client to the guest OS
> > >   *
> > > @@ -338,18 +358,16 @@ static void xenfb_mouse_event(void *opaque,
> > > int dx, int dy, int dz, int button_state)  {
> > >  struct XenInput *xenfb = opaque;
> > > -DisplaySurface *surface = qemu_console_surface(xenfb->c.con);
> > > -int dw = surface_width(surface);
> > > -int dh = surface_height(surface);
> > >  int i;
> > >
> > > -if (xenfb->abs_pointer_wanted)
> > > - xenfb_send_position(xenfb,
> > > - dx * (dw - 1) / 0x7fff,
> > > - dy * (dh - 1) / 0x7fff,
> > > - dz);
> > > -else
> > > +if (xenfb->abs_pointer_wanted) {
> > > +if (!xenfb->no_abs_rescale) {
> > > +xenfb_rescale_mouse(xenfb, &dx, &dy);
> > > +}
> > > +xenfb_send_position(xenfb, dx, dy, dz);
> > > +} else {
> > >   xenfb_send_motion(xenfb, dx, dy, dz);
> > > +}
> > >
> > >  for (i = 0 ; i < 8 ; i++) {
> > >   int lastDown = xenfb->button_state & (1 << i); @@ -366,6 +382,7
> > @@
> > > static void xenfb_mouse_event(void *opaque,  static int
> > > input_init(struct XenDevice *xendev)  {
> > >  xenstore_write_be_int(xendev, "feature-abs-pointer", 1);
> > > +xenstore_write_be_int(xendev, "feature-no-abs-rescale", 1);
> > >  return 0;
> > >  }
> > >
> > > @@ -374,7 +391,17 @@ static int input_initialise(struct XenDevice
> > *xendev)
> > >  struct XenInput *in = container_of(xendev, struct XenInput, 
> > > c.xendev);
> > >  int rc;
> > >
> > > -if (!in->c.con) {
> > > +if (xenstore_read_fe_int(xendev, "request-abs-pointer",
> > > + &in->abs_pointer_wanted) == -1) {
> > > +in->abs_pointer_wanted = 0;
> > > +}
> > > +
> > > +if (xenstore_read_fe_int(xendev, "request-no-abs-rescale",
> > > + &in->no_abs_rescale) == -1) {
> > > +in->no_abs_rescale = 0;
> > > +}
> > 
> > I don't think is safe to move checking for "request-abs-pointer" so
> > early: what if the guest writes request-abs-pointer after initialise but 
> > before
> > connect?
> > 
> > Also it would be nice to document this protocol extension somewhere. I do
> > realize that the existing documentation on vfb is somewhat lacking.
> > 
> 
> I'll change this to read the feature flags during the input_connected for v2. 
> I've spotted that neither feature has anything in the xen header, so I'll 
> include 
> a patch to document this with v2.

Thanks


> > > +if (!in->c.con && !in->no_abs_rescale) {
> > 
> > Why this change?
> > 
> 
> Without an additional check here, the frontend would not connect. in->c.con is
> not initialised without a vfb device. Overloading no_abs_rescale here is

Re: [Qemu-devel] [PATCH 2/3] xenfb: Add option to use grant ref for shared page.

2014-09-11 Thread Stefano Stabellini
On Thu, 11 Sep 2014, Owen Smith wrote:
> > -Original Message-
> > From: Stefano Stabellini [mailto:stefano.stabell...@eu.citrix.com]
> > Sent: 11 September 2014 02:01
> > To: Owen Smith
> > Cc: qemu-devel@nongnu.org; xen-de...@lists.xen.org; Stefano Stabellini
> > Subject: Re: [PATCH 2/3] xenfb: Add option to use grant ref for shared page.
> > 
> > On Wed, 10 Sep 2014, Owen smith wrote:
> > > The vkbd device should be able to use a grant reference to map the
> > > shared page.
> > >
> > > Signed-off-by: Owen smith 
> > > ---
> > >  hw/display/xenfb.c | 39 +--
> > >  1 file changed, 29 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c index
> > > 2c39753..c4375c8 100644
> > > --- a/hw/display/xenfb.c
> > > +++ b/hw/display/xenfb.c
> > > @@ -54,6 +54,7 @@
> > >  struct common {
> > >  struct XenDevice  xendev;  /* must be first */
> > >  void  *page;
> > > +int   page_gref;
> > >  QemuConsole   *con;
> > >  };
> > >
> > > @@ -96,22 +97,36 @@ static int common_bind(struct common *c)  {
> > >  uint64_t mfn;
> > >
> > > -if (xenstore_read_fe_uint64(&c->xendev, "page-ref", &mfn) == -1)
> > > - return -1;
> > > -assert(mfn == (xen_pfn_t)mfn);
> > > -
> > >  if (xenstore_read_fe_int(&c->xendev, "event-channel", &c-
> > >xendev.remote_port) == -1)
> > >   return -1;
> > >
> > > -c->page = xc_map_foreign_range(xen_xc, c->xendev.dom,
> > > -XC_PAGE_SIZE,
> > > -PROT_READ | PROT_WRITE, mfn);
> > > +if (xenstore_read_fe_uint64(&c->xendev, "page-ref", &mfn) == -1) {
> > > +if (xenstore_read_fe_int(&c->xendev, "page-gref", &c->page_gref)
> > == -1) {
> > > +return -1;
> > > +}
> > > +c->page = xc_gnttab_map_grant_ref(c->xendev.gnttabdev,
> > > +  c->xendev.dom,
> > > +  c->page_gref,
> > > +  PROT_READ | PROT_WRITE);
> > > +} else {
> > > +assert(mfn == (xen_pfn_t)mfn);
> > > +
> > > +c->page = xc_map_foreign_range(xen_xc, c->xendev.dom,
> > > +   XC_PAGE_SIZE,
> > > +   PROT_READ | PROT_WRITE, mfn);
> > > +}
> > 
> > This is an extension of the protocol. It should be documented somewhere.
> > At least in the commit message and somewhere under docs/misc or
> > xen/include/public/io/fbif.h.
> > 
> 
> V2 will contain a better commit message, and I'll add a patch to the headers
> to document the changes.
> The same protocol extension docs should be in xen/include/public/io/fbif.h 
> and xen/include/public/io/kdbif.h, as this change covers the vfb and vkbd, or 
> would it be better to change the patch to only affect the vkbd?

I am OK with changing both vfb and vkbd but in that case you need to
document both.


> > 
> > >  if (c->page == NULL)
> > >   return -1;
> > >
> > >  xen_be_bind_evtchn(&c->xendev);
> > > -xen_be_printf(&c->xendev, 1, "ring mfn %"PRIx64", remote-port %d,
> > local-port %d\n",
> > > -   mfn, c->xendev.remote_port, c->xendev.local_port);
> > > +
> > > +if (c->page_gref) {
> > > +xen_be_printf(&c->xendev, 1, "ring gref %d, remote-port %d, 
> > > local-
> > port %d\n",
> > > +  c->page_gref, c->xendev.remote_port, 
> > > c->xendev.local_port);
> > > +} else {
> > > +xen_be_printf(&c->xendev, 1, "ring mfn %"PRIx64", remote-port %d,
> > local-port %d\n",
> > > +  mfn, c->xendev.remote_port, c->xendev.local_port);
> > > +}
> > >
> > >  return 0;
> > >  }
> > > @@ -120,8 +135,13 @@ static void common_unbind(struct common *c)  {
> > >  xen_be_unbind_evtchn(&c->xendev);
> > >  if (c->page) {
> > > - munmap(c->page, XC_PAGE_SIZE);
> > > +if (c->page_gref) {
> > > +xc_gnttab_munmap(c->xendev.gnttabdev, c->page, 1);
> > > +} else {
> > > +munmap(c->page, XC_PAGE_SIZE);
> > > +}
> > >   c->page = NULL;
> > > +c->page_gref = 0;
> > >  }
> > >  }
> > >
> > > @@ -954,6 +974,7 @@ static void fb_event(struct XenDevice *xendev)
> > >
> > >  struct XenDevOps xen_kbdmouse_ops = {
> > >  .size   = sizeof(struct XenInput),
> > > +.flags  = DEVOPS_FLAG_NEED_GNTDEV,
> > >  .init   = input_init,
> > >  .initialise = input_initialise,
> > >  .connected  = input_connected,
> > > --
> > > 2.1.0
> > >
> 



Re: [Qemu-devel] [PATCH 04/23] block: Connect BlockBackend and DriveInfo

2014-09-11 Thread Markus Armbruster
Benoît Canet  writes:

> The Wednesday 10 Sep 2014 à 10:13:33 (+0200), Markus Armbruster wrote :
>> Make the BlockBackend own the DriveInfo.  Change blockdev_init() to
>> return the BlockBackend instead of the DriveInfo.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  block/block-backend.c | 38 +++
>>  blockdev.c| 79 
>> +--
>>  include/sysemu/blockdev.h |  4 +++
>>  3 files changed, 78 insertions(+), 43 deletions(-)
>> 
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index deccb54..2a22660 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -12,14 +12,18 @@
>>  
>>  #include "sysemu/block-backend.h"
>>  #include "block/block_int.h"
>> +#include "sysemu/blockdev.h"
>>  
>>  struct BlockBackend {
>>  char *name;
>>  int refcnt;
>>  BlockDriverState *bs;
>> +DriveInfo *legacy_dinfo;
>>  QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
>>  };
>>  
>
>> +static void drive_info_del(DriveInfo *dinfo);
>
> Is there any technical reason not to just put the
> drive_info_del above the blk_delete function ?
> I don't see any possible circular references between the two.
>
> Some people like Eric frown upon static function prototypes
> in final code that's why I am asking.

Placing functions before their callers makes the program easier to read
when you need to see the functions definition before you can understand
their use.

Placing the functions after callers makes the program easier to read
when the gist of what they do is obvious from the call.  You're omitting
unnecessary detail there, to be flesh it own further down.  Saving a
function declaration is immaterial compared to that.

Before I put the function where I don't want it, I'd inline it :)

>> +
>>  static QTAILQ_HEAD(, BlockBackend) blk_backends =
>>  QTAILQ_HEAD_INITIALIZER(blk_backends);
>>  
>> @@ -87,6 +91,7 @@ static void blk_delete(BlockBackend *blk)
>>  blk_detach_bs(blk);
>>  QTAILQ_REMOVE(&blk_backends, blk, link);
>>  g_free(blk->name);
>> +drive_info_del(blk->legacy_dinfo);
>>  g_free(blk);
>>  }
>>  
>> @@ -119,6 +124,16 @@ void blk_unref(BlockBackend *blk)
>>  }
>>  }
>>  
>> +static void drive_info_del(DriveInfo *dinfo)
>> +{
>> +if (dinfo) {
>> +qemu_opts_del(dinfo->opts);
>> +g_free(dinfo->id);
>> +g_free(dinfo->serial);
>> +g_free(dinfo);
>> +}
>> +}
>> +
>>  const char *blk_name(BlockBackend *blk)
>>  {
>>  return blk->name;
>> @@ -187,3 +202,26 @@ BlockDriverState *blk_detach_bs(BlockBackend *blk)
>>  }
>>  return bs;
>>  }
>> +
>> +DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
>> +{
>> +return blk->legacy_dinfo;
>> +}
>> +
>> +DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo)
>> +{
>> +assert(!blk->legacy_dinfo);
>> +return blk->legacy_dinfo = dinfo;
>> +}
>> +
>> +BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
>> +{
>> +BlockBackend *blk;
>> +
>> +QTAILQ_FOREACH(blk, &blk_backends, link) {
>> +if (blk->legacy_dinfo == dinfo) {
>> +return blk;
>> +}
>> +}
>> +assert(0);
>> +}
>> diff --git a/blockdev.c b/blockdev.c
>> index 0a0b95e..73e2da9 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -47,8 +47,6 @@
>>  #include "trace.h"
>>  #include "sysemu/arch_init.h"
>>  
>> -static QTAILQ_HEAD(drivelist, DriveInfo) drives = 
>> QTAILQ_HEAD_INITIALIZER(drives);
>> -
>>  static const char *const if_name[IF_COUNT] = {
>>  [IF_NONE] = "none",
>>  [IF_IDE] = "ide",
>> @@ -89,7 +87,8 @@ static const int if_max_devs[IF_COUNT] = {
>>   */
>>  void blockdev_mark_auto_del(BlockDriverState *bs)
>>  {
>> -DriveInfo *dinfo = drive_get_by_blockdev(bs);
>> +BlockBackend *blk = bs->blk;
>> +DriveInfo *dinfo = blk_legacy_dinfo(blk);
>>  
>>  if (dinfo && !dinfo->enable_auto_del) {
>>  return;
>> @@ -105,7 +104,8 @@ void blockdev_mark_auto_del(BlockDriverState *bs)
>>  
>>  void blockdev_auto_del(BlockDriverState *bs)
>>  {
>> -DriveInfo *dinfo = drive_get_by_blockdev(bs);
>> +BlockBackend *blk = bs->blk;
>> +DriveInfo *dinfo = blk_legacy_dinfo(blk);
>>  
>>  if (dinfo && dinfo->auto_del) {
>>  drive_del(dinfo);
>> @@ -153,15 +153,15 @@ QemuOpts *drive_add(BlockInterfaceType type, int 
>> index, const char *file,
>>  
>>  DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit)
>>  {
>> +BlockBackend *blk;
>>  DriveInfo *dinfo;
>>  
>> -/* seek interface, bus and unit */
>> -
>> -QTAILQ_FOREACH(dinfo, &drives, next) {
>> -if (dinfo->type == type &&
>> -dinfo->bus == bus &&
>> -dinfo->unit == unit)
>> +for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
>
> Here I understand why you didn't made blk_next pure round robin
> circling in a loop.

It works exactly like bdrv_next().

> Maybe the comments of the blk_next function should say it's

[Qemu-devel] [PATCH v3] virtio-pci: enable bus master for old guests

2014-09-11 Thread Michael S. Tsirkin
commit cc943c36faa192cd4b32af8fe5edb31894017d35
pci: Use bus master address space for delivering MSI/MSI-X messages
breaks virtio-net for rhel6.[56] x86 guests because they don't
enable bus mastering for virtio PCI devices. For the same reason,
rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.

Old guests forgot to enable bus mastering, enable it automatically on
DRIVER (ppc64 guests may never reach DRIVER_OK). We also need to prevent
these old guests to disable bus mastering (this happens when the driver
probes the device). And we must also re-enable bus mastering after migration.

Cc: qemu-sta...@nongnu.org
Reported-by: Greg Kurz 
Signed-off-by: Michael S. Tsirkin 
[ old guest detection on DRIVER,
  squashed patch from Michael S. Tsirkin to re-enable bus mastering,
  Greg Kurz  ]
Tested-by: Cedric Le Goater 
Signed-off-by: Greg Kurz 

Signed-off-by: Michael S. Tsirkin 
---

So, below seems mostly correct.
I still think it's too complex, and I have an idea for
a much simpler patch.
Will try sending that next week.


 hw/virtio/virtio-pci.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ddb5da1..81a6ea6 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -314,12 +314,14 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 msix_unuse_all_vectors(&proxy->pci_dev);
 }
 
-/* Linux before 2.6.34 sets the device as OK without enabling
+/* Linux before 2.6.34 drives the device without enabling
the PCI device bus master bit. In this case we need to disable
some safety checks. */
-if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
+if ((val & VIRTIO_CONFIG_S_DRIVER) &&
 !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
 proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
+memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
+  true);
 }
 break;
 case VIRTIO_MSI_CONFIG_VECTOR:
@@ -473,10 +475,15 @@ static void virtio_write_config(PCIDevice *pci_dev, 
uint32_t address,
 pci_default_write_config(pci_dev, address, val, len);
 
 if (range_covers_byte(address, len, PCI_COMMAND) &&
-!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
-!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) {
-virtio_pci_stop_ioeventfd(proxy);
-virtio_set_status(vdev, vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
+!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
+if (proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG) {
+memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
+  true);
+} else {
+virtio_pci_stop_ioeventfd(proxy);
+virtio_set_status(vdev, vdev->status & ~(VIRTIO_CONFIG_S_DRIVER_OK|
+ VIRTIO_CONFIG_S_DRIVER));
+}
 }
 }
 
@@ -887,9 +894,11 @@ static void virtio_pci_vmstate_change(DeviceState *d, bool 
running)
 if (running) {
 /* Try to find out if the guest has bus master disabled, but is
in ready state. Then we have a buggy guest OS. */
-if ((vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+if ((vdev->status & VIRTIO_CONFIG_S_DRIVER) &&
 !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
 proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
+memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
+  true);
 }
 virtio_pci_start_ioeventfd(proxy);
 } else {
-- 
MST



Re: [Qemu-devel] [PATCH v2] virtio-pci: enable bus master for old guests

2014-09-11 Thread Michael S. Tsirkin
On Thu, Sep 11, 2014 at 06:00:39PM +0200, Greg Kurz wrote:
> On Wed, 10 Sep 2014 20:30:47 +0300
> "Michael S. Tsirkin"  wrote:
> 
> > commit cc943c36faa192cd4b32af8fe5edb31894017d35
> > pci: Use bus master address space for delivering MSI/MSI-X messages
> > breaks virtio-net for rhel6.[56] x86 guests because they don't
> > enable bus mastering for virtio PCI devices
> > 
> > Old guests forgot to enable bus mastering, enable it
> > automatically when driver discovers device.
> > 
> > Cc: qemu-sta...@nongnu.org
> > Reported-by: Greg Kurz 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> > 
> > OK, this should have better luck.
> 
> Unfortunately not... it is even worse than V1 as
> both x86 and ppc64 guests fail. I've added some
> debug messages in virtio_ioport_write() and
> virtio_write_config() to track:
> - VIRTIO_PCI_STATUS changes
> - bus master disablement
> 
> This is what I get for x86 with a virtio-net device:
> 
> virtio_write_config: PCI_COMMAND = 0x3
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x0
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x1 ==> v2 patch enables BM
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x3
> virtio_write_config: PCI_COMMAND = 0x3   ==> guest disables BM
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x7
> virtio_ioport_write: detected bus master bug ==> v1 patch would have
>  enabled BM here
> 
> and for ppc64 when booting from a virtio-blk disk:
> 
> virtio_write_config: PCI_COMMAND = 0x3
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x0
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x1 ==> v2 patch enables BM
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x3
> virtio_write_config: PCI_COMMAND = 0x3   ==> guest disables BM


So guest sets DRIVER then overwrites PCI_COMMAND.  Weird.
Any idea which guest code does this?

> I think we should:
> - detect the "old driver" when reaching DRIVER as it works for
>   both architectures
> - always re-enable BM when an "old driver" disables it
> 
> That is basically your v1 plus its follow-up patch:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg01918.html
> 
> plus:
> 
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -317,7 +317,7 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> addr, uint32_t val)
>  /* Linux before 2.6.34 sets the device as OK without enabling
> the PCI device bus master bit. In this case we need to disable
> some safety checks. */
> -if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +if ((val & VIRTIO_CONFIG_S_DRIVER) &&
>  !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
>  proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
>  
> memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> 
> I could verify it fixes the issues. I will send a patch shortly.
> 
> 
> Cheers.
> 
> --
> Greg
> 
> > This also makes it possible to simplify code,
> > will do that in a follow-up patch.
> > 
> > 
> >  hw/virtio/virtio-pci.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index ddb5da1..a29d94f 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -303,6 +303,14 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> > addr, uint32_t val)
> >  virtio_pci_stop_ioeventfd(proxy);
> >  }
> > 
> > +/* Linux before 2.6.34 uses the device without enabling
> > +   the PCI device bus master bit. As a work-around, enable it
> > +   automatically when driver detects the device. */
> > +if (val == VIRTIO_CONFIG_S_ACKNOWLEDGE) {
> > +
> > memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> > +  true);
> > +}
> > +
> >  virtio_set_status(vdev, val & 0xFF);
> > 
> >  if (val & VIRTIO_CONFIG_S_DRIVER_OK) {



Re: [Qemu-devel] [PATCH] virtio-pci: enable bus master for old guests

2014-09-11 Thread Michael S. Tsirkin
On Thu, Sep 11, 2014 at 08:47:01PM +0300, Michael S. Tsirkin wrote:
> On Thu, Sep 11, 2014 at 06:45:33PM +0200, Greg Kurz wrote:
> > From: Michael S. Tsirkin 
> > 
> > commit cc943c36faa192cd4b32af8fe5edb31894017d35
> > pci: Use bus master address space for delivering MSI/MSI-X messages
> > breaks virtio-net for rhel6.[56] x86 guests because they don't
> > enable bus mastering for virtio PCI devices. For the same reason,
> > rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.
> > 
> > Old guests forgot to enable bus mastering, enable it automatically on
> > DRIVER (ppc64 guests may never reach DRIVER_OK). We also need to prevent
> > these old guests to disable bus mastering (this happens when the driver
> > probes the device). And we must also re-enable bus mastering after 
> > migration.
> > 
> > Cc: qemu-sta...@nongnu.org
> > Reported-by: Greg Kurz 
> > Signed-off-by: Michael S. Tsirkin 
> > [ old guest detection on DRIVER,
> >   squashed patch from Michael S. Tsirkin to re-enable bus mastering,
> >   Greg Kurz  ]
> > Tested-by: Cedric Le Goater 
> > Signed-off-by: Greg Kurz 
> 
> I think this is too fragile. Let's try to figure out a strategy
> that does not depend on VIRTIO_PCI_FLAG_BUS_MASTER_BUG.

OTOH it does work, so I think I'll apply something similar (this one is
slightly broken wrt migration) and work on simplifications as a follow-up.

> > ---
> >  hw/virtio/virtio-pci.c |   18 +-
> >  1 file changed, 13 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index ddb5da1..f981841 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -317,9 +317,11 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> > addr, uint32_t val)
> >  /* Linux before 2.6.34 sets the device as OK without enabling
> > the PCI device bus master bit. In this case we need to disable
> > some safety checks. */
> > -if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > +if ((val & VIRTIO_CONFIG_S_DRIVER) &&
> >  !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
> >  proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
> > +
> > memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> > +  true);
> >  }
> >  break;
> >  case VIRTIO_MSI_CONFIG_VECTOR:
> > @@ -473,10 +475,14 @@ static void virtio_write_config(PCIDevice *pci_dev, 
> > uint32_t address,
> >  pci_default_write_config(pci_dev, address, val, len);
> >  
> >  if (range_covers_byte(address, len, PCI_COMMAND) &&
> > -!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
> > -!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) {
> > -virtio_pci_stop_ioeventfd(proxy);
> > -virtio_set_status(vdev, vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
> > +!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
> > +if (proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG) {
> > +
> > memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> > +  true);
> > +} else {
> > +virtio_pci_stop_ioeventfd(proxy);
> > +virtio_set_status(vdev, vdev->status & 
> > ~VIRTIO_CONFIG_S_DRIVER_OK);
> > +}
> >  }
> >  }
> >  
> > @@ -890,6 +896,8 @@ static void virtio_pci_vmstate_change(DeviceState *d, 
> > bool running)
> >  if ((vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
> >  !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
> >  proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
> > +
> > memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> > +  true);
> >  }
> >  virtio_pci_start_ioeventfd(proxy);
> >  } else {



Re: [Qemu-devel] Microcheckpointing: Memory-VCPU / Disk State consistency

2014-09-11 Thread Dr. David Alan Gilbert
(I've cc'd in Fam, Stefan, and Kevin for Block stuff, and 
  Yang and Eddie for Colo)

* Walid Nouri (walid.no...@gmail.com) wrote:
> Hello Michael, Hello Paolo
> i have ???studied??? the available documentation/Information and tried to
> get an idea of the QEMU live block operation possibilities.
> 
> I think the MC protocol doesn???t need synchronous block device replication
> because primary and secondary VM are not synchronous. The state of the
> primary is allays ahead of the state of the secondary. When the primary is
> in epoch(n) the secondary is in epoch(n-1).
> 
> What MC needs is a block device agnostic, controlled and asynchronous
> approach for replicating the contents of block devices and its state changes
> to the secondary VM while the primary VM is running. Asynchronous block
> transfer is important to allow maximum performance for the primary VM, while
> keeping the secondary VM updated with state changes.
> 
> The block device replication should be possible in two stages or modes.
> 
> The first stage is the live copy of all block devices of the primary to the
> secondary. This is necessary if the secondary doesn???t have an existing
> image which is in sync with the primary at the time MC has started. This is
> not very convenient but as far as I know actually there is no mechanism for
> persistent dirty bitmap in QEMU.
> 
> The second stage (mode) is the replication of block device state changes
> (modified blocks)  to keep the image on the secondary in sync with the
> primary. The mirrored blocks must be buffered in ram (block buffer) until
> the complete Checkpoint (RAM, vCPU, device state) can be committed.
> 
> For keeping the complete system state consistent on the secondary system
> there must be a possibility for MC to commit/discard block device state
> changes. In normal operation the mirrored block device state changes (block
> buffer) are committed to disk when the complete checkpoint is committed. In
> case of a crash of the primary system while transferring a checkpoint the
> data in the block buffer corresponding to the failed Checkpoint must be
> discarded.

I think for COLO there's a requirement that the secondary can do reads/writes
in parallel with the primary, and the secondary can discard those reads/writes
- and that doesn't happen in MC (Yang or Eddie should be able to confirm that).

> The storage architecture should be ???shared nothing??? so that no shared
> storage is required and primary/secondary can have separate block device
> images.

MC/COLO with shared storage still needs some stuff like this; but it's subtely
different.   They still need to be able to buffer/release modifications
to the shared storage; if any of this code can also be used in the
shared-storage configurations it would be good.

> I think this can be achieved by drive-mirror and a filter block driver.
> Another approach could be to exploit the block migration functionality of
> live migration with a filter block driver.
> 
> The drive-mirror (and live migration) does not rely on shared storage and
> allow live block device copy and incremental syncing.
> 
> A block buffer can be implemented with a QEMU filter block driver. It should
> sit at the same position as the Quorum driver in the block driver hierarchy.
> When using block filter approach MC will be transparent and block device
> agnostic.
> 
> The block buffer filter must have an Interface which allows MC control the
> commits or discards of block device state changes. I have no idea where to
> put such an interface to stay conform with QEMU coding style.
> 
> 
> I???m sure there are alternative and better approaches and I???m open for
> any ideas
> 
> 
> Walid
> 
> Am 17.08.2014 11:52, schrieb Paolo Bonzini:
> >Il 11/08/2014 22:15, Michael R. Hines ha scritto:
> >>Excellent question: QEMU does have a feature called "drive-mirror"
> >>in block/mirror.c that was introduced a couple of years ago. I'm not
> >>sure what the
> >>adoption rate of the feature is, but I would start with that one.
> >
> >block/mirror.c is asynchronous, and there's no support for communicating
> >checkpoints back to the master.  However, the quorum disk driver could
> >be what you need.
> >
> >There's also a series on the mailing list that lets quorum read only
> >from the primary, so that quorum can still do replication and fault
> >tolerance, but skip fault detection.
> >
> >Paolo
> >
> >>There is also a second fault tolerance implementation that works a
> >>little differently called
> >>"COLO" - you may have seen those emails on the list too, but their
> >>method does not require a disk replication solution, if I recall correctly.
> >
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH] virtio-pci: enable bus master for old guests

2014-09-11 Thread Michael S. Tsirkin
On Thu, Sep 11, 2014 at 06:45:33PM +0200, Greg Kurz wrote:
> From: Michael S. Tsirkin 
> 
> commit cc943c36faa192cd4b32af8fe5edb31894017d35
> pci: Use bus master address space for delivering MSI/MSI-X messages
> breaks virtio-net for rhel6.[56] x86 guests because they don't
> enable bus mastering for virtio PCI devices. For the same reason,
> rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.
> 
> Old guests forgot to enable bus mastering, enable it automatically on
> DRIVER (ppc64 guests may never reach DRIVER_OK). We also need to prevent
> these old guests to disable bus mastering (this happens when the driver
> probes the device). And we must also re-enable bus mastering after migration.
> 
> Cc: qemu-sta...@nongnu.org
> Reported-by: Greg Kurz 
> Signed-off-by: Michael S. Tsirkin 
> [ old guest detection on DRIVER,
>   squashed patch from Michael S. Tsirkin to re-enable bus mastering,
>   Greg Kurz  ]
> Tested-by: Cedric Le Goater 
> Signed-off-by: Greg Kurz 

I think this is too fragile. Let's try to figure out a strategy
that does not depend on VIRTIO_PCI_FLAG_BUS_MASTER_BUG.

> ---
>  hw/virtio/virtio-pci.c |   18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index ddb5da1..f981841 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -317,9 +317,11 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> addr, uint32_t val)
>  /* Linux before 2.6.34 sets the device as OK without enabling
> the PCI device bus master bit. In this case we need to disable
> some safety checks. */
> -if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +if ((val & VIRTIO_CONFIG_S_DRIVER) &&
>  !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
>  proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
> +
> memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> +  true);
>  }
>  break;
>  case VIRTIO_MSI_CONFIG_VECTOR:
> @@ -473,10 +475,14 @@ static void virtio_write_config(PCIDevice *pci_dev, 
> uint32_t address,
>  pci_default_write_config(pci_dev, address, val, len);
>  
>  if (range_covers_byte(address, len, PCI_COMMAND) &&
> -!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
> -!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) {
> -virtio_pci_stop_ioeventfd(proxy);
> -virtio_set_status(vdev, vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
> +!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
> +if (proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG) {
> +
> memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> +  true);
> +} else {
> +virtio_pci_stop_ioeventfd(proxy);
> +virtio_set_status(vdev, vdev->status & 
> ~VIRTIO_CONFIG_S_DRIVER_OK);
> +}
>  }
>  }
>  
> @@ -890,6 +896,8 @@ static void virtio_pci_vmstate_change(DeviceState *d, 
> bool running)
>  if ((vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
>  !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
>  proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
> +
> memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> +  true);
>  }
>  virtio_pci_start_ioeventfd(proxy);
>  } else {



Re: [Qemu-devel] [PATCH 04/23] block: Connect BlockBackend and DriveInfo

2014-09-11 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 10.09.2014 um 10:13 hat Markus Armbruster geschrieben:
>> Make the BlockBackend own the DriveInfo.  Change blockdev_init() to
>> return the BlockBackend instead of the DriveInfo.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  block/block-backend.c | 38 +++
>>  blockdev.c| 79 
>> +--
>>  include/sysemu/blockdev.h |  4 +++
>>  3 files changed, 78 insertions(+), 43 deletions(-)
>> 
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index deccb54..2a22660 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -12,14 +12,18 @@
>>  
>>  #include "sysemu/block-backend.h"
>>  #include "block/block_int.h"
>> +#include "sysemu/blockdev.h"
>>  
>>  struct BlockBackend {
>>  char *name;
>>  int refcnt;
>>  BlockDriverState *bs;
>> +DriveInfo *legacy_dinfo;
>>  QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
>>  };
>>  
>> +static void drive_info_del(DriveInfo *dinfo);
>> +
>>  static QTAILQ_HEAD(, BlockBackend) blk_backends =
>>  QTAILQ_HEAD_INITIALIZER(blk_backends);
>>  
>> @@ -87,6 +91,7 @@ static void blk_delete(BlockBackend *blk)
>>  blk_detach_bs(blk);
>>  QTAILQ_REMOVE(&blk_backends, blk, link);
>>  g_free(blk->name);
>> +drive_info_del(blk->legacy_dinfo);
>>  g_free(blk);
>>  }
>>  
>> @@ -119,6 +124,16 @@ void blk_unref(BlockBackend *blk)
>>  }
>>  }
>>  
>> +static void drive_info_del(DriveInfo *dinfo)
>> +{
>> +if (dinfo) {
>> +qemu_opts_del(dinfo->opts);
>> +g_free(dinfo->id);
>> +g_free(dinfo->serial);
>> +g_free(dinfo);
>> +}
>> +}
>> +
>>  const char *blk_name(BlockBackend *blk)
>>  {
>>  return blk->name;
>> @@ -187,3 +202,26 @@ BlockDriverState *blk_detach_bs(BlockBackend *blk)
>>  }
>>  return bs;
>>  }
>> +
>> +DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
>> +{
>> +return blk->legacy_dinfo;
>> +}
>> +
>> +DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, DriveInfo *dinfo)
>> +{
>> +assert(!blk->legacy_dinfo);
>> +return blk->legacy_dinfo = dinfo;
>> +}
>> +
>> +BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
>> +{
>> +BlockBackend *blk;
>> +
>> +QTAILQ_FOREACH(blk, &blk_backends, link) {
>> +if (blk->legacy_dinfo == dinfo) {
>> +return blk;
>> +}
>> +}
>> +assert(0);
>> +}
>> diff --git a/blockdev.c b/blockdev.c
>> index 0a0b95e..73e2da9 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -47,8 +47,6 @@
>>  #include "trace.h"
>>  #include "sysemu/arch_init.h"
>>  
>> -static QTAILQ_HEAD(drivelist, DriveInfo) drives = 
>> QTAILQ_HEAD_INITIALIZER(drives);
>> -
>>  static const char *const if_name[IF_COUNT] = {
>>  [IF_NONE] = "none",
>>  [IF_IDE] = "ide",
>> @@ -89,7 +87,8 @@ static const int if_max_devs[IF_COUNT] = {
>>   */
>>  void blockdev_mark_auto_del(BlockDriverState *bs)
>>  {
>> -DriveInfo *dinfo = drive_get_by_blockdev(bs);
>> +BlockBackend *blk = bs->blk;
>> +DriveInfo *dinfo = blk_legacy_dinfo(blk);
>>  
>>  if (dinfo && !dinfo->enable_auto_del) {
>>  return;
>> @@ -105,7 +104,8 @@ void blockdev_mark_auto_del(BlockDriverState *bs)
>>  
>>  void blockdev_auto_del(BlockDriverState *bs)
>>  {
>> -DriveInfo *dinfo = drive_get_by_blockdev(bs);
>> +BlockBackend *blk = bs->blk;
>> +DriveInfo *dinfo = blk_legacy_dinfo(blk);
>>  
>>  if (dinfo && dinfo->auto_del) {
>>  drive_del(dinfo);
>
> The if condition looks as if there were cases where dinfo was NULL. To
> be precise, this was introduced in commit 0fc0f1fa, so that hot unplug
> after a forced drive_del wouldn't segfault.
>
> Does such a device that doesn't have a dinfo at least have a
> BlockBackend? If it doesn't, we'll segfault in blk_legacy_dinfo(). I
> won't say that I understand that part of the code, but I can't see why
> BlockBackend would still be there when DriveInfo isn't any more.

Ah, our friend drive_del again.

As I explained in my reply to your review of PATCH 02, the initial
implementation of drive_del destroyed the root BDS and the DriveInfo
right away, leaving the device model with a dangling pointer to the root
BDS.  Naturally, this was prone to crash eventually, when the memory
gets reused.  But here, it crashes right away, because the DriveInfo no
longer exists.  Commit 0fc0f1f "fixed" that part.  The real fix is
commit d22b2f4.

However, my PATCH 18 will change blockdev-add not to create a DriveInfo,
and then this test will become necessary again.

>> @@ -153,15 +153,15 @@ QemuOpts *drive_add(BlockInterfaceType type, int 
>> index, const char *file,
>>  
>>  DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit)
>>  {
>> +BlockBackend *blk;
>>  DriveInfo *dinfo;
>>  
>> -/* seek interface, bus and unit */
>> -
>> -QTAILQ_FOREACH(dinfo, &drives, next) {
>> -if (dinfo->type == type &&
>> -dinfo->bus == bus &&
>>

Re: [Qemu-devel] [Bug 1361912] Re: qemu-mips64 Segmentation fault

2014-09-11 Thread Hill
This is a error in user mode, I think it should be very easy to
reproduce.

On Thu, Sep 11, 2014 at 4:55 AM, Leon Alrae 
wrote:

> Could you please provide backtrace and give more details to reproduce
> the issue?
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1361912
>
> Title:
>   qemu-mips64 Segmentation fault
>
> Status in QEMU:
>   New
>
> Bug description:
>   When I ran qemu-mips64 for any mips 64 executable , I got this error:
>
>   $ ./qemu-mips64  ../lang
>   qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>   Segmentation fault (core dumped)
>
>   Is this a known issue?
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/qemu/+bug/1361912/+subscriptions
>

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1361912

Title:
  qemu-mips64 Segmentation fault

Status in QEMU:
  New

Bug description:
  When I ran qemu-mips64 for any mips 64 executable , I got this error:

  $ ./qemu-mips64  ../lang
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)

  Is this a known issue?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1361912/+subscriptions



Re: [Qemu-devel] [edk2] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Paolo Bonzini
Il 11/09/2014 19:11, Gabriel L. Somlo ha scritto:
> On Thu, Sep 11, 2014 at 06:40:38PM +0200, Paolo Bonzini wrote:
>> Il 11/09/2014 18:35, Gabriel L. Somlo ha scritto:
> Can you configure Chamaleon to avoid the boot prompt?
>>> Yes. After doing that, usb starts working once OS X is fully booted.
>>>
>>> Works with either piix or q35 just fine.
>>>
>>> Does this mean it's likely to be an OVMF uhci/ehci issue specific to Q35 ?
>>> (one from which Fedora can recover but OS X can't) ? 
>>
>> Yes, that's my interpretation too.
>>
>> You did test an UHCI controller, I think, but I don't remember---did you
>> test an EHCI controller without companions, using something like
>> "-device ich9-usb-ehci1,id=myehci -device usb-keyboard,bus=myehci.0"?
> 
> That fails during QEMU initialization with:
> 
> qemu-system-x86_64: -device usb-kbd,bus=myehci.0: Warning:
> speed mismatch trying to attach usb device "QEMU USB Keyboard" (full
> speed) to bus "myehci.0", port "1" (high speed)
> qemu-system-x86_64: -device usb-kbd,bus=myehci.0: Device
> initialization failed.
> qemu-system-x86_64: -device usb-kbd,bus=myehci.0: Device 'usb-kbd'
> could not be initialized
> 
> So I guess OVMF doesn't have a chance to get involved here before we
> crash and burn :)

Ah, that needs this patch here:

http://article.gmane.org/gmane.comp.emulators.qemu/296216/raw

Paolo

> Thanks,
> --Gabriel
> 
>> If that works, the issue would be specific to EHCI companion
>> controllers.  If that doesn't work, there is at least a generic in the
>> EHCI driver---of course there could possibly be another in the companion
>> controllers, but I'd try getting EHCI alone to work.
> 
> --
> Want excitement?
> Manually upgrade your production database.
> When you want reliability, choose Perforce
> Perforce version control. Predictably reliable.
> http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
> 




Re: [Qemu-devel] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Gabriel L. Somlo
On Thu, Sep 11, 2014 at 06:40:38PM +0200, Paolo Bonzini wrote:
> Il 11/09/2014 18:35, Gabriel L. Somlo ha scritto:
> >> > Can you configure Chamaleon to avoid the boot prompt?
> > Yes. After doing that, usb starts working once OS X is fully booted.
> > 
> > Works with either piix or q35 just fine.
> > 
> > Does this mean it's likely to be an OVMF uhci/ehci issue specific to Q35 ?
> > (one from which Fedora can recover but OS X can't) ? 
> 
> Yes, that's my interpretation too.
> 
> You did test an UHCI controller, I think, but I don't remember---did you
> test an EHCI controller without companions, using something like
> "-device ich9-usb-ehci1,id=myehci -device usb-keyboard,bus=myehci.0"?

That fails during QEMU initialization with:

qemu-system-x86_64: -device usb-kbd,bus=myehci.0: Warning:
speed mismatch trying to attach usb device "QEMU USB Keyboard" (full
speed) to bus "myehci.0", port "1" (high speed)
qemu-system-x86_64: -device usb-kbd,bus=myehci.0: Device
initialization failed.
qemu-system-x86_64: -device usb-kbd,bus=myehci.0: Device 'usb-kbd'
could not be initialized

So I guess OVMF doesn't have a chance to get involved here before we
crash and burn :)

Thanks,
--Gabriel

> If that works, the issue would be specific to EHCI companion
> controllers.  If that doesn't work, there is at least a generic in the
> EHCI driver---of course there could possibly be another in the companion
> controllers, but I'd try getting EHCI alone to work.



Re: [Qemu-devel] [PATCH 10/23] block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()

2014-09-11 Thread Benoît Canet

> +   blk_bs(blk_by_legacy_dinfo(dinfo)));

This seems to be a fairly common pattern: "blk_bs(blk_by_legacy_dinfo())".
How about a helper function ?



Re: [Qemu-devel] [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-11 Thread Chris Webb
Paolo Bonzini  wrote:

> This is a hypercall that should have kicked VCPU 3 (see rcx).
> 
> Can you please apply this patch and gather a trace of the host
> (using "trace-cmd -e kvm qemu-kvm ")?

Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and
have put the resulting trace.dat at

  http://cdw.me.uk/tmp/trace.dat

This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having
trouble persuading the -smp 4 qemu to crash as reliably under tracing.
(Something timing related?) Otherwise the qemu-system-x86 command line is
exactly as before.

The guest kernel crash message which corresponds to this trace was:

divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 0 PID: 618 Comm: mkdir Not tainted 3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
task: 88007c997080 ti: 88007c614000 task.ti: 88007c614000
RIP: 0010:[]  [] kvm_unlock_kick+0x72/0x80
RSP: 0018:88007c617d40  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0001
RDX: 0001 RSI: 88007fd11c40 RDI: 
RBP: 88007fd11c40 R08: 81b98940 R09: 0001
R10:  R11: 0007 R12: 00f6
R13: 0001 R14: 0001 R15: 00011c40
FS:  7f43eb1ed700() GS:88007fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7f43eace0a30 CR3: 01a12000 CR4: 000406f0
Stack:
 88007c994380 88007c9949aa 0046 81689715
 810f3174 0001 ea0001f16320 ea0001f17860
  88007c99e1e8 88007c997080 0001
Call Trace:
 [] ? _raw_spin_unlock+0x45/0x70
 [] ? try_to_wake_up+0x2a4/0x330
 [] ? __wake_up_common+0x4c/0x80
 [] ? __wake_up_sync_key+0x38/0x60
 [] ? do_notify_parent+0x19a/0x280
 [] ? sched_move_task+0xb6/0x190
 [] ? do_exit+0xa1c/0xab0
 [] ? do_group_exit+0x34/0xb0
 [] ? SyS_exit_group+0xb/0x10
 [] ? system_call_fastpath+0x1a/0x1f
Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 
c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 <0f> 01 c1 0f 1f 00 5b 
5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 
RIP  [] kvm_unlock_kick+0x72/0x80
 RSP 
---[ end trace bf5a4445f9decdbb ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: mkdir/618/0x0006
Modules linked in:
CPU: 0 PID: 618 Comm: mkdir Tainted: G  D   3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
  c022d302 81684029 
 810ee956 81686266 00011c40 88007c617fd8
 00011c40 88007c997080 0006 0046
Call Trace:
 [] ? dump_stack+0x49/0x6a
 [] ? __schedule_bug+0x46/0x60
 [] ? __schedule+0x5a6/0x7c0
 [] ? printk+0x59/0x75
 [] ? do_exit+0x85b/0xab0
 [] ? printk+0x59/0x75
 [] ? oops_end+0x7a/0x100
 [] ? do_error_trap+0x85/0x110
 [] ? kvm_unlock_kick+0x72/0x80
 [] ? __alloc_pages_nodemask+0x108/0xa60
 [] ? divide_error+0x1e/0x30
 [] ? kvm_unlock_kick+0x72/0x80
 [] ? _raw_spin_unlock+0x45/0x70
 [] ? try_to_wake_up+0x2a4/0x330
 [] ? __wake_up_common+0x4c/0x80
 [] ? __wake_up_sync_key+0x38/0x60
 [] ? do_notify_parent+0x19a/0x280
 [] ? sched_move_task+0xb6/0x190
 [] ? do_exit+0xa1c/0xab0
 [] ? do_group_exit+0x34/0xb0
 [] ? SyS_exit_group+0xb/0x10
 [] ? system_call_fastpath+0x1a/0x1f

Best wishes,

Chris.


Re: [Qemu-devel] [PATCH v2] virtio-pci: enable bus master for old guests

2014-09-11 Thread Michael S. Tsirkin
On Thu, Sep 11, 2014 at 06:00:39PM +0200, Greg Kurz wrote:
> On Wed, 10 Sep 2014 20:30:47 +0300
> "Michael S. Tsirkin"  wrote:
> 
> > commit cc943c36faa192cd4b32af8fe5edb31894017d35
> > pci: Use bus master address space for delivering MSI/MSI-X messages
> > breaks virtio-net for rhel6.[56] x86 guests because they don't
> > enable bus mastering for virtio PCI devices
> > 
> > Old guests forgot to enable bus mastering, enable it
> > automatically when driver discovers device.
> > 
> > Cc: qemu-sta...@nongnu.org
> > Reported-by: Greg Kurz 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> > 
> > OK, this should have better luck.
> 
> Unfortunately not... it is even worse than V1 as
> both x86 and ppc64 guests fail. I've added some
> debug messages in virtio_ioport_write() and
> virtio_write_config() to track:
> - VIRTIO_PCI_STATUS changes
> - bus master disablement
> 
> This is what I get for x86 with a virtio-net device:
> 
> virtio_write_config: PCI_COMMAND = 0x3
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x0
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x1 ==> v2 patch enables BM
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x3
> virtio_write_config: PCI_COMMAND = 0x3   ==> guest disables BM

Weird. What code in guest causes this second write?
Any idea?

> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x7
> virtio_ioport_write: detected bus master bug ==> v1 patch would have
>  enabled BM here
> 
> and for ppc64 when booting from a virtio-blk disk:
> 
> virtio_write_config: PCI_COMMAND = 0x3
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x0
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x1 ==> v2 patch enables BM
> virtio_ioport_write: VIRTIO_PCI_STATUS = 0x3
> virtio_write_config: PCI_COMMAND = 0x3   ==> guest disables BM
> 
> I think we should:
> - detect the "old driver" when reaching DRIVER as it works for
>   both architectures

Basically take v2 and replace ACKNOWLEDGE with DRIVER.
Yea, maybe.
But I would like to understand what is going on here.


> - always re-enable BM when an "old driver" disables it
> 
> That is basically your v1 plus its follow-up patch:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg01918.html
> 
> plus:
> 
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -317,7 +317,7 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> addr, uint32_t val)
>  /* Linux before 2.6.34 sets the device as OK without enabling
> the PCI device bus master bit. In this case we need to disable
> some safety checks. */
> -if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +if ((val & VIRTIO_CONFIG_S_DRIVER) &&
>  !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
>  proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
>  
> memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> 
> I could verify it fixes the issues. I will send a patch shortly.
> 
> 
> Cheers.
> 
> --
> Greg
> 
> > This also makes it possible to simplify code,
> > will do that in a follow-up patch.
> > 
> > 
> >  hw/virtio/virtio-pci.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index ddb5da1..a29d94f 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -303,6 +303,14 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> > addr, uint32_t val)
> >  virtio_pci_stop_ioeventfd(proxy);
> >  }
> > 
> > +/* Linux before 2.6.34 uses the device without enabling
> > +   the PCI device bus master bit. As a work-around, enable it
> > +   automatically when driver detects the device. */
> > +if (val == VIRTIO_CONFIG_S_ACKNOWLEDGE) {
> > +
> > memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
> > +  true);
> > +}
> > +
> >  virtio_set_status(vdev, val & 0xFF);
> > 
> >  if (val & VIRTIO_CONFIG_S_DRIVER_OK) {



[Qemu-devel] [PATCH v3 1/4] hw/arm/boot: load DTB as a ROM image

2014-09-11 Thread Ard Biesheuvel
In order to make the device tree blob (DTB) available in memory not only at
first boot, but also after system reset, use rom_blob_add_fixed() to install
it into memory.

Reviewed-by: Peter Maydell 
Signed-off-by: Ard Biesheuvel 
---
 hw/arm/boot.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index e32f2f415885..50eca931e1a4 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -396,7 +396,10 @@ static int load_dtb(hwaddr addr, const struct 
arm_boot_info *binfo)
 
 qemu_fdt_dumpdtb(fdt, size);
 
-cpu_physical_memory_write(addr, fdt, size);
+/* Put the DTB into the memory map as a ROM image: this will ensure
+ * the DTB is copied again upon reset, even if addr points into RAM.
+ */
+rom_add_blob_fixed("dtb", fdt, size, addr);
 
 g_free(fdt);
 
-- 
1.8.3.2




[Qemu-devel] [PATCH v3 4/4] hw/arm/boot: enable DTB support when booting ELF images

2014-09-11 Thread Ard Biesheuvel
Add support for loading DTB images when booting ELF images using
-kernel. If there are no conflicts with the placement of the ELF
segments, the DTB image is loaded at the base of RAM.

Signed-off-by: Ard Biesheuvel 
---
 hw/arm/boot.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 1bed02db6fdd..c8dc34f0865b 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -482,7 +482,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
 int kernel_size;
 int initrd_size;
 int is_linux = 0;
-uint64_t elf_entry;
+uint64_t elf_entry, elf_low_addr, elf_high_addr;
 int elf_machine;
 hwaddr entry, kernel_load_offset;
 int big_endian;
@@ -549,7 +549,25 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
 
 /* Assume that raw images are linux kernels, and ELF images are not.  */
 kernel_size = load_elf(info->kernel_filename, NULL, NULL, &elf_entry,
-   NULL, NULL, big_endian, elf_machine, 1);
+   &elf_low_addr, &elf_high_addr, big_endian,
+   elf_machine, 1);
+if (kernel_size > 0 && have_dtb(info)) {
+/* If there is still some room left at the base of RAM, try and put
+ * the DTB there like we do for images loaded with -bios or -pflash.
+ */
+if (elf_low_addr > info->loader_start
+|| elf_high_addr < info->loader_start) {
+/* Pass elf_low_addr as address limit to load_dtb if it may be
+ * pointing into RAM, otherwise pass '0' (no limit)
+ */
+if (elf_low_addr < info->loader_start) {
+elf_low_addr = 0;
+}
+if (load_dtb(info->loader_start, info, elf_low_addr) < 0) {
+exit(1);
+}
+}
+}
 entry = elf_entry;
 if (kernel_size < 0) {
 kernel_size = load_uimage(info->kernel_filename, &entry, NULL,
-- 
1.8.3.2




[Qemu-devel] [PATCH v3 2/4] hw/arm/boot: pass an address limit to and return size from load_dtb()

2014-09-11 Thread Ard Biesheuvel
Add an address limit input parameter to load_dtb() so that we can
tell load_dtb() how much memory the dtb is allowed to consume. If
the dtb doesn't fit, return 0, otherwise return the actual size of
the loaded dtb.

Reviewed-by: Peter Maydell 
Signed-off-by: Ard Biesheuvel 
---
 hw/arm/boot.c | 34 +++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 50eca931e1a4..2083aeb95d80 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -312,7 +312,26 @@ static void set_kernel_args_old(const struct arm_boot_info 
*info)
 }
 }
 
-static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo)
+/**
+ * load_dtb() - load a device tree binary image into memory
+ * @addr:   the address to load the image at
+ * @binfo:  struct describing the boot environment
+ * @addr_limit: upper limit of the available memory area at @addr
+ *
+ * Load a device tree supplied by the machine or by the user  with the
+ * '-dtb' command line option, and put it at offset @addr in target
+ * memory.
+ *
+ * If @addr_limit contains a meaningful value (i.e., it is strictly greater
+ * than @addr), the device tree is only loaded if its size does not exceed
+ * the limit.
+ *
+ * Returns: the size of the device tree image on success,
+ *  0 if the image size exceeds the limit,
+ *  -1 on errors.
+ */
+static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
+hwaddr addr_limit)
 {
 void *fdt = NULL;
 int size, rc;
@@ -341,6 +360,15 @@ static int load_dtb(hwaddr addr, const struct 
arm_boot_info *binfo)
 }
 }
 
+if (addr_limit > addr && size > (addr_limit - addr)) {
+/* Installing the device tree blob at addr would exceed addr_limit.
+ * Whether this constitutes failure is up to the caller to decide,
+ * so just return 0 as size, i.e., no error.
+ */
+g_free(fdt);
+return 0;
+}
+
 acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
 scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
 if (acells == 0 || scells == 0) {
@@ -403,7 +431,7 @@ static int load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo)
 
 g_free(fdt);
 
-return 0;
+return size;
 
 fail:
 g_free(fdt);
@@ -572,7 +600,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
  */
 hwaddr dtb_start = QEMU_ALIGN_UP(info->initrd_start + initrd_size,
  4096);
-if (load_dtb(dtb_start, info)) {
+if (load_dtb(dtb_start, info, 0) < 0) {
 exit(1);
 }
 fixupcontext[FIXUP_ARGPTR] = dtb_start;
-- 
1.8.3.2




[Qemu-devel] [PATCH v3 3/4] hw/arm/boot: load device tree to base of DRAM if no -kernel option was passed

2014-09-11 Thread Ard Biesheuvel
If we are running the 'virt' machine, we may have a device tree blob but no
kernel to supply it to if no -kernel option was passed. In that case, copy it
to the base of RAM where it can be picked up by a bootloader.

Signed-off-by: Ard Biesheuvel 
---
 hw/arm/boot.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 2083aeb95d80..1bed02db6fdd 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -490,6 +490,16 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info 
*info)
 
 /* Load the kernel.  */
 if (!info->kernel_filename) {
+
+if (have_dtb(info)) {
+/* If we have a device tree blob, but no kernel to supply it to,
+ * copy it to the base of RAM for a bootloader to pick up.
+ */
+if (load_dtb(info->loader_start, info, 0) < 0) {
+exit(1);
+}
+}
+
 /* If no kernel specified, do nothing; we will start from address 0
  * (typically a boot ROM image) in the same way as hardware.
  */
-- 
1.8.3.2




[Qemu-devel] [PATCH] virtio-pci: enable bus master for old guests

2014-09-11 Thread Greg Kurz
From: Michael S. Tsirkin 

commit cc943c36faa192cd4b32af8fe5edb31894017d35
pci: Use bus master address space for delivering MSI/MSI-X messages
breaks virtio-net for rhel6.[56] x86 guests because they don't
enable bus mastering for virtio PCI devices. For the same reason,
rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.

Old guests forgot to enable bus mastering, enable it automatically on
DRIVER (ppc64 guests may never reach DRIVER_OK). We also need to prevent
these old guests to disable bus mastering (this happens when the driver
probes the device). And we must also re-enable bus mastering after migration.

Cc: qemu-sta...@nongnu.org
Reported-by: Greg Kurz 
Signed-off-by: Michael S. Tsirkin 
[ old guest detection on DRIVER,
  squashed patch from Michael S. Tsirkin to re-enable bus mastering,
  Greg Kurz  ]
Tested-by: Cedric Le Goater 
Signed-off-by: Greg Kurz 
---
 hw/virtio/virtio-pci.c |   18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ddb5da1..f981841 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -317,9 +317,11 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 /* Linux before 2.6.34 sets the device as OK without enabling
the PCI device bus master bit. In this case we need to disable
some safety checks. */
-if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
+if ((val & VIRTIO_CONFIG_S_DRIVER) &&
 !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
 proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
+memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
+  true);
 }
 break;
 case VIRTIO_MSI_CONFIG_VECTOR:
@@ -473,10 +475,14 @@ static void virtio_write_config(PCIDevice *pci_dev, 
uint32_t address,
 pci_default_write_config(pci_dev, address, val, len);
 
 if (range_covers_byte(address, len, PCI_COMMAND) &&
-!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER) &&
-!(proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG)) {
-virtio_pci_stop_ioeventfd(proxy);
-virtio_set_status(vdev, vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
+!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
+if (proxy->flags & VIRTIO_PCI_FLAG_BUS_MASTER_BUG) {
+memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
+  true);
+} else {
+virtio_pci_stop_ioeventfd(proxy);
+virtio_set_status(vdev, vdev->status & ~VIRTIO_CONFIG_S_DRIVER_OK);
+}
 }
 }
 
@@ -890,6 +896,8 @@ static void virtio_pci_vmstate_change(DeviceState *d, bool 
running)
 if ((vdev->status & VIRTIO_CONFIG_S_DRIVER_OK) &&
 !(proxy->pci_dev.config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
 proxy->flags |= VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
+memory_region_set_enabled(&proxy->pci_dev.bus_master_enable_region,
+  true);
 }
 virtio_pci_start_ioeventfd(proxy);
 } else {




[Qemu-devel] [PATCH v3 0/4] ARM: load_dtb() changes for -bios and ELF images

2014-09-11 Thread Ard Biesheuvel
I split off these patches from the series I sent last week. Peter's NOR flash
patch has been merged in the mean time, and the CPU reset patch needs to be
discussed more widely before we can move forward with it.

So what remains are changes in the DTB handling in hw/arm/boot.c, to load a
DTB even when using -bios or using -kernel but with an ELF image.

Changes since v2:
- handle ELF images below base of RAM
- improve comments
- add R-b to #2

Changes since v1:
- updated load_dtb() to take an address limit and return the dtb size, or 0
  if it didn't fit
- don't fail in the ELF case only because the DTB clashes with the ELF segments,
  in that case, just proceed without it
- reshuffled order, added R-b to #1

Ard Biesheuvel (4):
  hw/arm/boot: load DTB as a ROM image
  hw/arm/boot: pass an address limit to and return size from load_dtb()
  hw/arm/boot: load device tree to base of DRAM if no -kernel option was
passed
  hw/arm/boot: enable DTB support when booting ELF images

 hw/arm/boot.c | 71 ++-
 1 file changed, 65 insertions(+), 6 deletions(-)

-- 
1.8.3.2




Re: [Qemu-devel] OVMF, Q35 and USB keyboard/mouse

2014-09-11 Thread Paolo Bonzini
Il 11/09/2014 18:35, Gabriel L. Somlo ha scritto:
>> > Can you configure Chamaleon to avoid the boot prompt?
> Yes. After doing that, usb starts working once OS X is fully booted.
> 
> Works with either piix or q35 just fine.
> 
> Does this mean it's likely to be an OVMF uhci/ehci issue specific to Q35 ?
> (one from which Fedora can recover but OS X can't) ? 

Yes, that's my interpretation too.

You did test an UHCI controller, I think, but I don't remember---did you
test an EHCI controller without companions, using something like
"-device ich9-usb-ehci1,id=myehci -device usb-keyboard,bus=myehci.0"?
If that works, the issue would be specific to EHCI companion
controllers.  If that doesn't work, there is at least a generic in the
EHCI driver---of course there could possibly be another in the companion
controllers, but I'd try getting EHCI alone to work.

Paolo



  1   2   3   >